Joel Miller | 16 May 23:39
Picon

quickly pulling marbles out of urns

I'm looking for a faster way to do the following problem:

I have an urn with many different colors of marbles in it.  I pull one
out and note the color.  I do not replace it.

For the programming of this, I actually know how many are yellow,
green, etc.  So the way the code works right now is:

import math
...

randindex = math.randint(1,number_of_marbles)
for color in colors:
   if randindex<=marble_count[color]: #we've found what color it will be
       break
   else:                                          #try next color
       randindex -= marble_count[color]
marble_count[color] -= 1
number_of_marbles -= 1
return color

Unfortunately, I have hundreds of thousands of colors, so it spends a
while on this loop.  And I have to keep choosing marbles many times at
different points of the code.

Is there a quicker way to do this?

Thanks,
Joel
_______________________________________________
(Continue reading)

Kent Johnson | 17 May 06:15
Gravatar

Re: quickly pulling marbles out of urns

On Fri, May 16, 2008 at 5:43 PM, Joel Miller <joel.c.miller <at> gmail.com> wrote:
> I'm looking for a faster way to do the following problem:
>
> I have an urn with many different colors of marbles in it.  I pull one
> out and note the color.  I do not replace it.

You probably know, this is called sampling without replacement. If you
can put all the 'marbles' in a list, here is a promising approach:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/dff9425afdb744e5/6b9a84329c0815d4?lnk=raot

This is probably worth looking at too though I can't understand it at this hour:
http://safari.oreilly.com/0596007973/pythoncook2-CHP-18-SECT-4

Kent
_______________________________________________
Tutor maillist  -  Tutor <at> python.org
http://mail.python.org/mailman/listinfo/tutor

John Fouhy | 17 May 06:51

Re: quickly pulling marbles out of urns

On 17/05/2008, Joel Miller <joel.c.miller <at> gmail.com> wrote:
>  I have an urn with many different colors of marbles in it.  I pull one
>  out and note the color.  I do not replace it.

Kent's suggest seems simplest: represent the marbles as a list of
integers, where marbles[i] is the integer corresponding to marble i's
colour.

If that doesn't work (too many marbles?), maybe some kind of binary
tree approach?  e.g. a MarbleColour class which stores
 - the colour
 - the number of marbles with this colour
 - the number of marbles in the left subtree
 - the number of marbles in the right subtree

where either subtree could be None (and thus have 0 marbles).

Set the binary tree up initially so it's balanced by marble weight (if
the marble colours are all approximately the same size at the start,
you could approximate this by just filling in colours from the top in
any order).  Then you can randomly generate a colour by generating a
random int in (0,1) and comparing with the proportions in the node.
Then you just have to update the weights for the colour you choose and
all nodes above it.

Should be log_2(n) to pick one colour and n.log_2(n) to pick colours
until you run out.

--

-- 
John.
(Continue reading)


Gmane