Probability = (number of desired outcomes)/(number of possible outcomes), provided every outcome is equally likely
Assumptions:
- I'm assuming the human genome has an exact sequence of the 3.2 million base pairs that does not vary from individual to individual
Therefore number of desired outcomes = 1
- I'm assuming that since A always pairs with T and C always pairs with G, you only need to know one partner in the pair, at each position in the sequence, in order to uniquely determine the sequence of pairs.
Therefore the number of possible outcomes is just the number of ways of choosing 3.2 million bases out of a set of four, with repetition allowed, and where the ordering matters.
For the first position in the sequence, there are 4 possibilities. For each of those possibilities, there are 4 possibilities for what to put in the second position. Therefore there are 4*4 = 4^2 possibilities for the first two bases in the sequence. There are 4*4*4 = 4^3 possibilities for the first three bases, and 4^4 possibilities for the first four bases in the sequence. Et cetera.
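As a quick sanity check on the counting argument, here is a minimal sketch that lists every short sequence by brute force and compares the count with the power of 4. The names BASES and count_sequences are my own, made up for illustration, and this only works for very short sequences.

```python
from itertools import product

BASES = "ACGT"  # one base per position; its partner is implied by pairing


def count_sequences(n):
    """Count all length-n base sequences by enumerating them one by one."""
    return sum(1 for _ in product(BASES, repeat=n))


for n in range(1, 5):
    # The brute-force count should match 4 multiplied by itself n times.
    print(n, count_sequences(n), 4 ** n)
```

Each line printed should show the same number twice, which is the pattern the argument extends to the full sequence length.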
So it would seem that the number of possible outcomes for the whole sequence is 4^(3 200 000), and the probability is 1 over that. I think this is an impossibly big number to compute.
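Even if the number itself is too big to write out or to hold in an ordinary floating-point value, its size can be estimated with logarithms, since log10(4^n) = n * log10(4). The sketch below assumes the sequence length of 3 200 000 used above; the variable names are my own.

```python
import math

N_POSITIONS = 3_200_000  # sequence length assumed in this post
N_BASES = 4              # A, C, G, T

# log10 of the number of possible outcomes: log10(4^n) = n * log10(4)
log10_outcomes = N_POSITIONS * math.log10(N_BASES)

# Split into a power of ten and a leading factor between 1 and 10.
exponent = math.floor(log10_outcomes)
mantissa = 10 ** (log10_outcomes - exponent)

print(f"possible outcomes ~ {mantissa:.2f} x 10^{exponent}")
print(f"log10(probability) ~ {-log10_outcomes:.2f}")
```

Under these assumptions the count comes out at roughly 10^1 926 592, a number with nearly two million digits, so the probability is about 1 in 10^1 926 592. That is far smaller than the smallest positive value a double can represent, which is why the sketch works with the logarithm instead of the number itself.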