What is the probability of someone sharing my name & date of birth?

Welshguy

Hi. I'm trying to work out the probability of someone sharing the same name & date of birth as me.

I know that there are roughly 620 people in the UK with my name. How do I work out the rest?

Thanks in advance
 
Does "name" mean first name and family name (and maybe a middle name)?
 
Thanks for your reply. Yes, just first name and family name.

I know there are approx 620 people in the UK who share my first name & family name, so I will take that figure as exact.

I probably should be able to work out the answer myself, but it's been a fair while since I did my A level maths!!

Cheers
 
Since you already know that "there are 620 people in the UK with my name", you just need the probability that one of those has the same birthday as you. Ignoring leap years, there are 365 days in a year, so the probability a given person has the same birthday as you is 1/365. Of the 620 people with the same name, we expect that 620/365, or approximately 1.7 people, will have the same name and the same birthday as you. If you want the probability that a randomly chosen person in the U.K. shares both, divide that by the number of people in the U.K.
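
A quick sketch of that estimate in Python, assuming (as the post does) 365 equally likely birthdays and treating the 620 figure as exact. The second number is an extra step that is not in the post: it treats the other 619 people as independent and asks how likely it is that at least one of them shares your birthday, which is a probability rather than an expected count. The variable names are only illustrative.

```python
same_name = 620   # people in the UK with the name (figure from the thread)
days = 365        # ignoring leap years

# Expected number of the 620 whose birthday falls on one given day.
expected_same_birthday = same_name / days

# Chance that at least one of the other 619 shares that day,
# assuming independent, uniformly distributed birthdays.
others = same_name - 1
p_at_least_one = 1 - (1 - 1 / days) ** others

print(f"expected: {expected_same_birthday:.2f}")
print(f"P(at least one other): {p_at_least_one:.3f}")
```

Note that this only covers day and month, not the year of birth.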
 
Thanks. That would just give the same birthday. I was looking for the probability of someone else having the same date of birth (day, month & year).

Cheers
 
We need to clarify what you want.

Are you looking for the probability that any given person you meet in the U.K. has both the same name and the same birthdate; or that there is someone in the U.K. who shares both; or something else? These are very different things.
 
Thanks for your reply. I'll try to clarify what I'm trying to do:

I'm doing some research on online casinos. Now they say if someone closes down an account and opens a new one using a different address & email, they are unable to link the accounts, even if that person uses the same name and date of birth.

They say that just linking accounts based on First Name, Surname & Date of Birth would return too many false positives. I'm trying to work out whether or not that is likely to be true (for the UK only).

Is that any clearer?

Thanks again!
 
Okay. You are looking for the probability that a random individual has the same first and last name and full date of birth as a given individual (whose account is being investigated).

Presumably you are actually thinking only about yourself as the individual, since you said you know the number of people with your name. That will vary considerably from person to person. If you are trying to suggest a particular general policy, and not just to make some private argument, then you really need to consider anyone, and not just your own name. I would go with the worst case -- what is the most common name in the country?

As far as the birthdate is concerned, you have to find how many people were born on that date (and are still alive), as a fraction of the entire current population. That, too, will vary considerably; there will be fewer with the birthdate Feb 28, 1918 than with, say, Jan 1, 1968. For the worst case, you might need to find the date on which the most people were born (maybe 9 months after some major event).

So, in order to argue that there will not be many false positives, you should probably take (A) the number of people with the most common name, over (P) the total population, times (B) the number of people with the most common birthdate, over (P) the total population. Both parts require research; you can't just assume some arbitrary probability distribution (e.g. that every date since 100 years ago, and every possible name, are equally likely).
 
Thank you very much!

So the most common name in the UK is David Smith (6300 people). The most babies born in a single day is 2000.

So if I take the total population over 18 (49.74 million) and use your formula, it gives me a result of 5.09.

Apologies for sounding daft, but does that mean there are likely to be 5 David Smiths with the same date of birth?

Thanks again
 
If you want an exact answer, do the kind of research that Dr. Peterson suggested.

If you want a rough estimate, make a Fermi estimate.

Population of the UK, about 70 million

Percentage of UK population called John Smith, about 0.1%

So number of John Smiths in UK is about 70 thousand.

So say that there are about 1000 born each year.

So 70 John Smiths born each year.

The probability that at least two were born on the same day of that year is greater than 99%, a virtual certainty in short. This is not intuitive and is sometimes called the birthday paradox.

https://betterexplained.com/articles/understanding-the-birthday-paradox/

EDIT: Fermi estimates are designed to give you quick orders of magnitude without doing any hard research. The hard part here is understanding the so-called birthday paradox. False positives are almost certain on common names. Of course, they will probably not occur at all with a name like Darcy Wentworth Thompson.
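
For anyone who wants to check the birthday-paradox figure, here is a minimal Python sketch. It assumes birthdays are independent and spread evenly over 365 days, which is the usual textbook simplification and not something the thread verifies. The function name `p_shared_birthday` is made up for this example, and 70 is the per-year count used in the estimate above.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday,
    assuming birthdays are uniform and independent over `days` days."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k days already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

for n in (23, 70):
    print(n, round(p_shared_birthday(n), 4))
```

With 23 people the chance is already just over one half, and with 70 it is well above 99%.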
 
Trusting your numbers, what I get is (6300/49,740,000)*(2000/49,740,000) = 5.09e-9, which means 5 billionths. Possibly you did it on a calculator as I did, and didn't notice the bit at the end. (A probability can't be greater than 1.)

So one particular person, even a David Smith, is not at all likely to be a false positive. But as JeffM pointed out, it's still almost certain that someone will be a false positive, so if they want to avoid ever having any, they probably won't go along with you. But that may be excessive caution, depending on exactly what the implications of a false positive would be.
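
As a rough check on that arithmetic, here is a small Python sketch. It uses the figures quoted in the thread (6300 David Smiths, 2000 births on the busiest day, 49.74 million adults) and assumes, as the formula does, that name and date of birth are independent. The variable names are only illustrative. The last step is an extra that is not in the post: multiplying the per-person probability by the population gives the expected number of adults who have that particular name and date of birth.

```python
# Figures quoted in the thread; treated here as exact for illustration.
people_with_name = 6300        # "David Smith"
births_on_date = 2000          # busiest single date of birth
adult_population = 49_740_000  # UK adults (over 18)

# Assumes name and date of birth are independent.
p_name = people_with_name / adult_population
p_dob = births_on_date / adult_population
p_match = p_name * p_dob

# Expected number of adults with that name and that date of birth.
expected_matches = p_match * adult_population

print(f"p_match = {p_match:.3g}")
print(f"expected matches = {expected_matches:.3f}")
```

Under these assumptions the expected count comes out at roughly a quarter of a person, even for the most common name and the busiest date.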
 
Working with your numbers, there are probably 8 people with the same birth year named David Smith.

Thus there are 8 * 7 / 2 = 28 possible pairs.

AB, AC, AD, AE, AF, AG, AH, BC, BD, BE, BF, BG, BH, CD, CE, CF, CG, CH, DE, DF, DG, DH, EF, EG, EH, FG, FH, GH

Thus, the probability of a single match on David Smith on the same day in the same year is about

\(\displaystyle 28 * \dfrac{1}{365} \approx 7\%.\)

That may seem low, but that is just one name. Then you have to add in the probabilities for John Smith and Mary Smith plus the Joneses and the Johnsons and Thompsons and Richardsons etc.

False positives are virtually certain. How many false positives per year there are and what the costs of each false positive are then becomes the issue.
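
To spell out the pair count: there are 8 ways to pick the first person and 7 ways to pick the second, and dividing by 2 stops AB and BA being counted twice. The Python sketch below lists the pairs and compares the "pairs divided by 365" shortcut with the exact calculation. It assumes the eight birthdays are independent and equally likely on each of 365 days, and the names `approx` and `exact` are just labels for this example.

```python
from itertools import combinations

people = "ABCDEFGH"  # eight David Smiths born in the same year
pairs = ["".join(p) for p in combinations(people, 2)]

# Shortcut from the post: each pair matches with chance 1/365.
approx = len(pairs) / 365

# Exact chance that at least one pair matches
# (uniform, independent birthdays assumed).
p_all_distinct = 1.0
for k in range(len(people)):
    p_all_distinct *= (365 - k) / 365
exact = 1.0 - p_all_distinct

print(len(pairs), f"{approx:.4f}", f"{exact:.4f}")
```

The shortcut slightly overstates the exact figure because it ignores the small chance of more than one pair matching, but for a handful of people the two stay close.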
 
That is assuming that all those David Smiths wanted to "cheat". If we assume 10% wanted to cheat, then the number of false positives will go down further.

By the way, I have not yet found anybody with my name (first + family). My first name is very, very rare. I can safely assume the world can "stand" only one "Subhotosh Khan": no chance of a false positive, i.e. less than \(1/10^{10}\).
 
Thanks all for your help. Much appreciated!

Maybe I'm being twp, but I think this is partly a foolish exercise. What is the probability that your existence has NO INFLUENCE on ANYONE else picking a birthdate or naming a child, or that your situation was influenced by no one? Can you REALLY hide and influence NO ONE? In particular, if YOU have a child who happens to be born on your birthdate, mightn't you be tempted to name the child after YOU? It's not random. Also, names and birthdates are not independent. For example, is there an unusual propensity to name children Noel or Noelle or Noél (because non-French don't know any better) or Noël etc. when they are born on Dec 25?

Anyway...
 
If I have a child, it is unlikely to have the same date of birth as me.

This is actually a very serious exercise.

Gambling companies claim that they cannot identify players who have self-excluded due to gambling addiction and then open a new account, if those players use a different email, address or phone number but their actual date of birth. They say that identifying such accounts using only name & DOB would bring up too many false positives. I am trying to establish whether or not that is likely to be the case. If someone opens a new account with the same name & DOB, a quick manual check could identify if it is the same person or not. How often are they going to be doing this? Once a day, hundreds of times a day?

Thanks for your input

Cheers
 
Just for the record, correlation and dependence will make the assumptions of no correlation and independence produce an incorrect result, but perhaps not so incorrect that the result fails to be useful. :) Gambling is always a serious exercise. That's why I just stay away from it. There are enough unavoidable risks. Obviously, not everyone follows this avoidance philosophy. Keep up your good work!
 
Thanks!

It doesn't have to be an exact figure. Obviously the companies involved know exactly how many false positives they do get, but they are openly using that as a reason for not being able to link accounts of people with serious problems. We just want to be confident, which we are, that they won't pull out evidence in court of hundreds of false positives a day.
 
Sorry to bother you again Jeff, but would you mind when you have time explaining how you get to that 7% figure please? More specifically what you mean by "Thus there are 8 * 7 / 2 = 28 possible pairs".

I thought I was ok at maths but this is making my head hurt! But I need to be able to explain it in layman's terms if asked how I got there!

Many thanks
 
Do you understand that this drastically changes your original question? For one thing, you used as your universe the entire adult population of the UK. That universe is entirely irrelevant unless every adult in the UK is an online punter with an admitted gambling addiction. I'd venture a guess that the relevant universe is one or two orders of magnitude smaller, which completely alters the math.

It is of course still true that if there are 25 David Smiths registered as gambling addicts, the probability that at least two will have been born on the same day of the same month will be in excess of 50%. (This is true even though tkhunny is correct that we live in a deterministic world so that it is never strictly true that human behavior is perfectly random.) But once you add year of birth, the probability that any two of 25 almost random adults will have been born on the same day of the same month of the same year becomes very small.
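
To put rough numbers on that contrast, here is a Python sketch that runs the same "any two share a slot" calculation twice: once over 365 days, and once over every day in a span of birth years. The 50-year span is purely a hypothetical figure chosen for illustration and does not come from the thread. The sketch also assumes that dates are equally likely and independent, and `p_any_shared` is a made-up helper name.

```python
def p_any_shared(n: int, slots: int) -> float:
    """Chance that at least two of n people land on the same slot,
    assuming each slot is equally likely and people are independent."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (slots - k) / slots
    return 1.0 - p_all_distinct

n = 25            # David Smiths in the example above
birth_years = 50  # hypothetical spread of birth years (assumption)

p_day_only = p_any_shared(n, 365)
p_full_dob = p_any_shared(n, 365 * birth_years)

print(f"day and month only: {p_day_only:.3f}")
print(f"full date of birth: {p_full_dob:.4f}")
```

With those assumptions, the day-and-month chance is a little over one half, while the full date of birth chance drops below 2%.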

The real problem, as the wise S. Khan has pointed out, is that addicts will lie about their birthdays if the truth will prevent them from gambling. DOB is absolutely useless if it cannot be verified. The same is true of name. I have experience in the US with the so-called OFAC list, a government list of the names of suspected terrorists and drug smugglers that banks must check before sending money out of the country. The rate of false positives is negligibly different from 100% because terrorists and drug smugglers tend not to be excessively truthful with the authorities. Anyone can lie about his or her birthday.

Because gambling addicts have less incentive and far fewer means to evade detection than terrorists and drug smugglers, what is likely to be more effective than DOB is payment address: gamblers want to be paid when they win, and the bookies want to be paid when the gamblers lose. No one in the gambling business cares about birthdays; everyone cares about payment.
 