[pythonvis] Re: Script Calculating Probability of Duplicate Birthdays Using a Dictionary

  • From: "Jeffrey Thompson" <jthomp@xxxxxxxxxxx>
  • To: <pythonvis@xxxxxxxxxxxxx>
  • Date: Tue, 23 Jun 2015 18:22:23 -0400

Pythonistas!,

Here is a short program that calculates the probability of a random
set of people from a crowd have at least 2 people with the same birthday.

Admittedly, This does not have a lot of practical usefulness.
However the underlying probabilities can be used similarly
to solve other problems.
To more clearly explain what I have done,

1) We do a ju-jitsu math trick,
and calculate the probability of
NOT getting at least one matching birthday.
Then we subtract this value from 1.0 to obtain the opposite probabilty of
having at least one pair of matching birthdays.
This turns out to be much easier than trying to directlycalculate the
probability.

I show here the numerator of selecting 20 people.
we calculate the numerator to be a nultipled sequence:
365*364*363* ... 345
The numerator represents the number of selections that have NO matching
birthdays.
So you can have 365 days to choose from for the initial selection.
But we cannot pick that number again so the next selection must be from
what's left,
which is 364,
and the one after that has 363 choices that will not match any of the
previous chosen.
The denominator is always 365,
because one can pick any number each timewhich is the number of selections
which may have repeats or not.The value we have calculated is the
probability of picking 20 numbers without any of them matching.
To get the probability of any match occurring
we subtract the calculated value from 1.0
So the final probability is:
1.0 - (365/365 * 364/365 * ... * 345/365)

I have not bothered to make the numbers floats in each of the fractions so
that must be done to retain accuracy.
e.g. 362.0/365 or something similar.
Also, it wouldn't be a bad idea to put paraentheses around each of the
fractionsto oblige python
to calculate things in the correct order.
\ (365*365*365*... *365)
Rather than using humongus numbers for numerator and denominator, which will
produce rounding errors,
we can calculate the value step-wise:
(365/365 * 364/365 * 363/365 ... * 345/365)

Here is the output from the code below:
22 people selected Have probability 0.475695307663 of having at least 2
people who have the same birthday.
23 people selected Have probability 0.507297234324 of having at least 2
people who have the same birthday.
24 people selected Have probability 0.538344257915 of having at least 2
people who have the same birthday.


Here is the code that does this:
def birth_day_probs(m, n = 365):
"""
This function calculates the probability of having duplicates when
picking m values out of n possible values. """

the_result = 1.0

for i in xrange(1, m+1):
#the_result = (the_result * (n+ 1-i))/ (float(i) * n)
the_result *= (n + 1 -i) /( 0.0 + n)
# end for loop

return 1.0 - the_result
# end function birth_days_dups()

def main(n = 365, m = [22, 23, 24]):
the_probs = []
for i in m:
the_prob = birth_day_probs(i)
print str(i) + " people selected Have probability " +
str(the_prob) +\
" of having at least 2 people who have the same
birthday."
the_probs.append ((i, the_prob))

# end for loop
return the_probs
# end main()
main()

-----Original Message-----
From: pythonvis-bounce@xxxxxxxxxxxxx [mailto:pythonvis-bounce@xxxxxxxxxxxxx]
On Behalf Of Jeffrey Turner
Sent: Monday, June 22, 2015 12:27 PM
To: pythonvis@xxxxxxxxxxxxx
Subject: [pythonvis] Script Calculating Probability of Duplicate Birthdays
Using a Dictionary

Hello,

This is a rework of a previously submitted solution. The exercise was to
revise the previous solution so it used a dictionary to test for duplicates.

The script is also attached for anyone who would like to run it.

Enjoy,
JDog

#Script to calculate the probability that there will be a match of birthdays
of any 23 people.

from random import randint # to generate random integers

'''Modified function to test for duplicates using a dictionary. Start with
an empty dictionary. Start a for loop and check to see if the list items are
in it. The first one that is, return True.'''
def hasDups(t):
d = {} #dictionary to hold the values
for i in t:
if i in d:
return True
#End of if in d test, so we know it was not in the dictionary
d[i] = True #So now we put it in and set an arbitrary value
#End of For loop
return False #If we drop past the for loop, there are no duplicates
#End of function

#Function to create a list holding 23 integers def get23Rand():
t = [] #list to hold the 23 integers
for i in xrange(23): #do the following 23 times
t.append(randint(1,366)) # get an integer corresponding to a
birthday and append it to the list
#End of for loop
return t
#end of function

match = 0 # to tally number of lists with duplicates
noMatch = 0 # to tally number of lists with nno duplicates

for j in xrange(100000): #Run the test 100,000 times
bDays = get23Rand() #get a list of 23 birthdays (really integers)
if hasDups(bDays):
match += 1
else:
noMatch += 1
#end of for loop
print str(match)+' match and '+str(noMatch)+' without equals
'+str(match/1000)+' percent.'

List web page is
//www.freelists.org/webpage/pythonvis

To unsubscribe, send email to
pythonvis-request@xxxxxxxxxxxxx with "unsubscribe" in the Subject field.

Other related posts: