Probability paradox

Zermelo · Jan 8, 2022

Hello there, I was thinking about some continuous probabilities, and came across this idea:

Let [imath]l \subset \mathbb{R}^2[/imath] be a smooth non-intersecting curve defined by the equation [imath]l: y = f(x), \ x\in [a, b][/imath], where [imath]f(x)[/imath] is a bijective function. What is the probability that a point selected from the curve has [imath] x \in [c, d] \subseteq [a, b] [/imath]?
I have 2 ways of thinking about this:
First, using the geometric definition of probability: let's define a curve [imath]l_0 \subseteq l,\ l_0: y=f(x), x\in [c, d][/imath]. Then the probability of picking said point is the probability of picking a point from the curve [imath] l_0 [/imath] when choosing a random point from [imath]l[/imath]. So, the probability is: [math]p = \frac{\text{Length}(l_0)}{\text{Length}(l)} = \frac{\int_c^d \sqrt{1+y'^2}\,dx}{\int_a^b \sqrt{1+y'^2}\,dx}[/math].
On the other hand, the probability is simply [math]p = \frac{d-c}{b-a}[/math], because a point [imath](x, y(x))[/imath] will be on the curve [imath]l_0[/imath] when [imath]x\in [c, d][/imath], and it will be on the curve [imath]l[/imath] when [imath]x\in [a, b][/imath].
This is contradictory, because generally, [imath]\frac{\int_c^d \sqrt{1+y'^2}dx}{\int_a^b \sqrt{1+y'^2}dx} \neq \frac{d-c}{b-a}[/imath].

What is the right answer here, and why? Or is this a case of Bertrand's Paradox?

I myself lean more towards the second solution because I clearly stated in the problem "what is the probability of getting a point on the curve such that [imath] x \in [c, d] [/imath]?", (the set of all possibilities is [imath]\Omega = [a, b][/imath]). If this is the right answer (because of the problem formulation), what kind of problem would have the first answer as a solution? Could anybody give me an example?
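To make the mismatch concrete, here is a minimal numerical sketch. The curve y = x^2 on [a, b] = [0, 2] with [c, d] = [0, 1] is an assumption chosen purely for illustration, and the helper name arc_length is hypothetical.

```python
import math

def arc_length(df, lo, hi, n=10_000):
    """Approximate the arc length of y = f(x) on [lo, hi] with the midpoint rule.

    df is the derivative f'(x); the integrand is sqrt(1 + f'(x)^2).
    """
    h = (hi - lo) / n
    return sum(math.sqrt(1.0 + df(lo + (i + 0.5) * h) ** 2) for i in range(n)) * h

def df(x):
    # Derivative of the assumed example curve f(x) = x^2.
    return 2.0 * x

a, b, c, d = 0.0, 2.0, 0.0, 1.0

p_arc = arc_length(df, c, d) / arc_length(df, a, b)  # first way: ratio of lengths
p_x = (d - c) / (b - a)                              # second way: ratio of x-intervals

print(f"arc-length ratio: {p_arc:.4f}")
print(f"x-interval ratio: {p_x:.4f}")
```

For this example the arc-length ratio comes out near 0.32 while the x-interval ratio is 0.5, so the two formulas really do disagree.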

BigBeachBanana · Jan 8, 2022

The second method is a particular case of the first method, where the curve is the horizontal line y = k for some constant k.

Zermelo · Jan 8, 2022

BigBeachBananas said:
The second method is a particular case of the first method, where the curve is the horizontal line y = k for some constant k.

Yes, but I don’t see why it couldn’t work for the first case! Let’s pick a random point from a curve L. When we pick a random point x from [a, b], (x, f(x)) is a point on the curve, so I think this is a valid way of solving the problem. The more I think about it, the more I think this is a case of Bertrand’s paradox, because when dealing with infinite sets there is more than one way of choosing at random. I just need someone to confirm this, or to prove why the second method is wrong.

BigBeachBanana · Jan 8, 2022

Zermelo said:
Yes, but I don’t see why it couldn’t work for the first case!

What is "it" referring to?
I'm saying that method 2 is a direct result of method 1 where the curve is some constant k. So method 1 always works, whereas method 2 is a shortcut for method 1 where y = k.

y = k -> y' = 0 -> (y')^2 = 0. Evaluate the definite integrals, and the result is method 2.
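Spelled out as a sketch, substituting y' = 0 into the arc-length ratio from method 1 gives:

[math]p = \frac{\int_c^d \sqrt{1+0^2}\,dx}{\int_a^b \sqrt{1+0^2}\,dx} = \frac{\int_c^d 1\,dx}{\int_a^b 1\,dx} = \frac{d-c}{b-a}[/math]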

Method 2 doesn't work every time because not all points in the interval have the same probability density. Let's look at two common probability distributions: the normal distribution and the uniform distribution.

[Attachments: Screen Shot 2022-01-08 at 8.23.51 AM.png, Screen Shot 2022-01-08 at 8.36.58 AM.png]

Under the normal curve, every point in the interval has a different probability density because the height of the curve varies. As a result, the cumulative distribution function (the area under the curve) does not grow uniformly either. Thus we need to integrate the function f(x).

Under the uniform curve (y = k), every point in the interval has the same probability density. It follows that the cumulative distribution function (the area under the curve) grows uniformly, i.e. it is the area of a rectangle.

Dr.Peterson · Jan 8, 2022

I think in any case you need to define what you mean by "random". That's true even for discrete probability. In this case, randomly choosing an x-coordinate is different from randomly choosing a distance along the curve.
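A small simulation can illustrate the difference between the two notions of "random". This is only a sketch: the curve y = x^2 on [0, 2], the target interval [0, 1], and the function names are assumptions for illustration, and the arc length is inverted with a simple lookup table rather than an exact formula.

```python
import bisect
import math
import random

def f_prime(x):
    # Derivative of the assumed example curve f(x) = x^2.
    return 2.0 * x

a, b, c, d = 0.0, 2.0, 0.0, 1.0

# Table of cumulative arc length on a grid of x values (midpoint rule per cell).
n = 2000
xs = [a + (b - a) * i / n for i in range(n + 1)]
cum = [0.0]
for x0, x1 in zip(xs, xs[1:]):
    xm = 0.5 * (x0 + x1)
    cum.append(cum[-1] + math.sqrt(1.0 + f_prime(xm) ** 2) * (x1 - x0))

def sample_x_uniform(rng):
    """Choose the x-coordinate uniformly on [a, b]."""
    return rng.uniform(a, b)

def sample_arc_uniform(rng):
    """Choose a distance along the curve uniformly, then return the matching x."""
    s = rng.uniform(0.0, cum[-1])
    i = min(bisect.bisect_right(cum, s) - 1, n - 1)
    t = (s - cum[i]) / (cum[i + 1] - cum[i])  # linear interpolation inside the cell
    return xs[i] + t * (xs[i + 1] - xs[i])

rng = random.Random(0)
trials = 100_000
freq_x = sum(c <= sample_x_uniform(rng) <= d for _ in range(trials)) / trials
freq_arc = sum(c <= sample_arc_uniform(rng) <= d for _ in range(trials)) / trials

print(f"uniform in x:          {freq_x:.3f}")
print(f"uniform in arc length: {freq_arc:.3f}")
```

The first frequency should land close to (d - c)/(b - a) and the second close to the ratio of arc lengths, so each formula is correct for its own sampling rule.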

Zermelo · Jan 9, 2022

BigBeachBananas said:
What is "it" referring to?
I'm saying that method 2 is a direct result of method 1 where the curve is some constant k. So method 1 always works, whereas method 2 is a shortcut for method 1 where y = k.

y = k -> y' = 0 -> (y')^2 = 0. Evaluate the definite integrals, and the result is method 2.

Method 2 doesn't work every time because not all points in the interval have the same probability density. Let's look at two common probability distributions: the normal distribution and the uniform distribution.

Under the normal curve, every point in the interval has a different probability density because the height of the curve varies. As a result, the cumulative distribution function (the area under the curve) does not grow uniformly either. Thus we need to integrate the function f(x).

Under the uniform curve (y = k), every point in the interval has the same probability density. It follows that the cumulative distribution function (the area under the curve) grows uniformly, i.e. it is the area of a rectangle.

Oh, now it makes perfect sense, thank you for the detailed explanation!

Zermelo · Jan 9, 2022

Dr.Peterson said:
I think in any case you need to define what you mean by "random". That's true even for discrete probability. In this case, randomly choosing an x-coordinate is different from randomly choosing a distance along the curve.

True, I think this is the main point of this problem. Thanks for your help!
