Big Data, Probability, and Birthdays: Part 2 of 2

Mike Perkins

In Part One of this blog post I discussed how to state an experiment in the form of probability spaces. Determining the sample space and the event space is necessary to be able to talk intelligently about probability measures, which is the topic of this post.

Approach 1: Counting

We’ve figured out the sample space and event space for the birthday problem already. So how about the probability measure? Let’s assume that people are equally likely to be born on any day of the year (this turns out not to be true, but let’s go with it). Then we should assign a probability of 1/365 to each outcome in the simple sample space underlying our vector sample space. We’ll further assume that people’s birthdays are independent of each other so that we can assign a probability to each vector as the product of the probabilities of each of its components. We see that the probability of each outcome vector is:


where omega is our notation for a single outcome vector. Furthermore it is easy to see that the size of the sample space is given as

eq2Now, how many outcomes are there in the sample space where no two components of the birthday vector are identical?

We need to answer two sub questions to answer this question:

  1. How many ways can we select N distinct days from 365?
  2. Given a selection of N distinct days, how many ways can we assign those days to N people?

The answer to the first question is a combination, while the answer to the second question is a permutation. The product of the two is the number of vectors in the sample space with distinct components:


Now, since the probability of each outcome vector is the same, summing up the probabilities assigned to every outcome in our event yields the following product:


Note that this answer corresponds to our intuition about probabilities: the numerator is the total number of vectors satisfying the distinctness criteria, and the denominator is the total number of outcome vectors.

There is another way to count the number of vectors in the numerator, and it may or may not feel more intuitive. It is a recursive argument that starts with N = 2 and goes like this: There are 365 ways to choose the first birthday; for each one of those 365 ways, there are 364 ways of choosing the second birthday such that it is distinct from the first. So there are 365×364 total ways for the first and second birthdays to be unique. In the case of N = 3, for each of the 365×364 ways of choosing the first two birthdays to be distinct, there are 363 ways to choose the third birthday to be different from both of the first two. So there are 365×364×363 ways in which all 3 birthdays can be different. Continuing the argument in this manner we see that for the general case of N distinct birthdays we get the numerator in the above expression.

Approach 2: Conditional Probability

In this section I assume you’ve been exposed to the concepts of joint probability, conditional probability and independent events. (I did not discuss these in Part One of the blog post.) We seek the probability of the event that no two components of the birthday vector are equal. We can work our way up to that by starting with the probability that Birthday 2 does not equal Birthday 1. Since the events are independent, this probability is given as follows


Here B denotes birthday, and the subscript indicates the birthday of the person in question. Now, what is the probability that the third person’s birthday is distinct from both the first and second person’s? In a notation I hope is obvious (because I don’t want to define it!) we have


Here we expressed the desired joint probability as the product of a conditional probability and a probability we already know. Then we used independence in order to “drop” the conditioning term. (Intuitively this is possible because the conditioning term depends on the birthdays of Persons 1 and 2, and the birthday of Person 3 is independent of those birthdays. So the probability that Person 3 is born on any given day is not changed by the fact that we know the birthdays of Persons 1 and 2.) Continuing the argument recursively the next step becomes:


Continuing the argument recursively in this manner we obtain the following expression for the joint probability we desire


Multiplying this equation by 1 in the form of 365 over 365 yields the previously derived equation from Approach 1.

Cool Fact

When N = 23, the probability that all the birthdays are distinct is 0.493. This means that the probability of the complementary set—the event where at least two birthdays are identical—is given by 1 − 0.493 = 0.507. So when just 23 people assemble, there is already a 50% chance that at least two of them have the same birthday. Clearly, if there are 366 people in the same room, at least two people must have the same birthday. Does it match your intuition that by the time 23 people have assembled, the probability that at least two of them share a birthday is already ½?

Categories: Perk, Theory

Cardinal Peak
Learn more about our Audio & Video capabilities.

Dive deeper into our IoT portfolio

Take a look at the clients we have helped.

We’re always looking for top talent, check out our current openings. 

Contact Us

Please fill out the contact form below and our engineering services team will be in touch soon.

We rely on Cardinal Peak for their ability to bolster our patent licensing efforts with in-depth technical guidance. They have deep expertise and they’re easy to work with.
Diego deGarrido Sr. Manager, LSI
Cardinal Peak has a strong technology portfolio that has complemented our own expertise well. They are communicative, drive toward results quickly, and understand the appropriate level of documentation it takes to effectively convey their work. In…
Jason Damori Director of Engineering, Biamp Systems
We asked Cardinal Peak to take ownership for an important subsystem, and they completed a very high quality deliverable on time.
Matt Cowan Chief Scientific Officer, RealD
Cardinal Peak’s personnel worked side-by-side with our own engineers and engineers from other companies on several of our key projects. The Cardinal Peak staff has consistently provided a level of professionalism and technical expertise that we…
Sherisse Hawkins VP Software Development, Time Warner Cable
Cardinal Peak was a natural choice for us. They were able to develop a high-quality product, based in part on open source, and in part on intellectual property they had already developed, all for a very effective price.
Bruce Webber VP Engineering, VBrick
We completely trust Cardinal Peak to advise us on technology strategy, as well as to implement it. They are a dependable partner that ultimately makes us more competitive in the marketplace.
Brian Brown President and CEO, Decatur Electronics
The Cardinal Peak team started quickly and delivered high-quality results, and they worked really well with our own engineering team.
Charles Corbalis VP Engineering, RGB Networks
We found Cardinal Peak’s team to be very knowledgeable about embedded video delivery systems. Their ability to deliver working solutions on time—combined with excellent project management skills—helped bring success not only to the product…
Ralph Schmitt VP, Product Marketing and Engineering, Kustom Signals
Cardinal Peak has provided deep technical insights, and they’ve allowed us to complete some really hard projects quickly. We are big fans of their team.
Scott Garlington VP Engineering, xG Technology
We’ve used Cardinal Peak on several projects. They have a very capable engineering team. They’re a great resource.
Greg Read Senior Program Manager, Symmetricom
Cardinal Peak has proven to be a trusted and flexible partner who has helped Harmonic to deliver reliably on our commitments to our own customers. The team at Cardinal Peak was responsive to our needs and delivered high quality results.
Alex Derecho VP Professional Services, Harmonic
Yonder Music was an excellent collaboration with Cardinal Peak. Combining our experience with the music industry and target music market, with Cardinal Peak’s technical expertise, the product has made the mobile experience of Yonder as powerful as…
Adam Kidron founder and CEO, Yonder Music
The Cardinal Peak team played an invaluable role in helping us get our first Internet of Things product to market quickly. They were up to speed in no time and provided all of the technical expertise we lacked. They interfaced seamlessly with our i…
Kevin Leadford Vice President of Innovation, Acuity Brands Lighting
We asked Cardinal Peak to help us address a number of open items related to programming our systems in production. Their engineers have a wealth of experience in IoT and embedded fields, and they helped us quickly and diligently. I’d definitely…
Ryan Margoles Founder and CTO, notion