Probabilities of binary classification problem

Philip K. · Apr 19, 2021

Hello,

I have a time series dataset of approximately 3000 observations, which are split fairly evenly, 52 % to 50 %, between 0's and 1's.
My general goal is to find the probability that the next outcome in the time series is 0 or 1.

First, I thought that if I found the maximum length of a consecutive run of 1's and of 0's, I could determine the probability of a change in the next outcome. For example, if my longest run of 1's is 10, then 1/10 = 0.10, i.e. 10 %. So for every additional consecutive 1 in my time series, the probability of a reversal (getting a 0 next) increases by 10 %. But when I thought about it, I found a couple of problems I am not sure how to solve.

1 - The increase in the probability should not be linear, judging from the nature of the data. I am not sure how to determine the rate of exponential growth I need to apply for each additional consecutive outcome.
2 - I am not sure how to determine the frequencies and their probabilities. What I mean is that a sequence with frequent changes (010110100101) is far more likely than one with long consecutive runs (11111111110110). I am not sure how to measure these probabilities quantitatively.
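
Both points can be examined empirically instead of assuming a linear or exponential shape. The sketch below is one possible approach, not a definitive method: it splits the series into runs and, for each run length k, estimates the probability that a run which has reached length k ends at exactly k. The function names and the toy series (built from the two example strings above) are illustrative assumptions, and the final run is dropped because the series ends before it is reversed.

```python
from collections import Counter
from itertools import groupby


def run_lengths(seq):
    """Return (value, length) for each maximal run of equal values."""
    return [(value, len(list(group))) for value, group in groupby(seq)]


def reversal_probability(seq, value):
    """Empirical P(next outcome differs | current run of `value` has length k).

    Assumption: the last run is discarded, since its true length is unknown
    (the series stops before the run is reversed).
    """
    runs = [n for v, n in run_lengths(seq)[:-1] if v == value]
    if not runs:
        return {}
    counts = Counter(runs)
    result = {}
    for k in range(1, max(runs) + 1):
        # Runs that survived to length k, and those that stopped exactly there.
        reached_k = sum(c for n, c in counts.items() if n >= k)
        ended_at_k = counts.get(k, 0)
        result[k] = ended_at_k / reached_k
    return result


# Toy data only: the two example strings from the post, concatenated.
series = [int(c) for c in "010110100101" + "11111111110110"]
print(reversal_probability(series, 1))
```

If these estimates stay roughly constant as k grows, the data give no evidence that a long run makes a reversal more likely. With about 3000 observations, the estimates for large k rest on very few runs and should be read with caution.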

Mainly, I am interested in help with the statistics, but any Python code would be greatly appreciated as well.
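
As a starting point for the Python side, here is a minimal sketch that estimates the probability of the next outcome given only the current one, by counting adjacent pairs. It assumes the series is a list of 0's and 1's and that a first-order (one-step) dependence is a reasonable first model; the function name and the sample data are hypothetical.

```python
from collections import Counter


def transition_probabilities(seq):
    """Estimate P(next = b | current = a) from adjacent pairs in seq."""
    pair_counts = Counter(zip(seq, seq[1:]))  # counts of (current, next)
    from_counts = Counter(seq[:-1])           # how often each value has a successor
    return {
        (a, b): pair_counts[(a, b)] / from_counts[a]
        for a in sorted(from_counts)
        for b in (0, 1)
    }


# Illustrative data only, not the real series.
sample = [0, 1, 1, 0, 1, 0, 0, 1]
for (current, nxt), p in transition_probabilities(sample).items():
    print(f"P(next={nxt} | current={current}) = {p:.3f}")
```

If all four estimates come out close to the overall share of 0's and 1's, the series behaves like independent draws, and the previous outcomes say little about the next one.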

Dr.Peterson · Apr 19, 2021

Philip K. said:
I have a time series dataset of approximately 3000 observations, which are split fairly evenly, 52 % to 50 %, between 0's and 1's.

You don't really mean this, do you? You seem to have 2% that are both 0 and 1.

Philip K. · Apr 19, 2021

Dr.Peterson said:
You don't really mean this, do you? You seem to have 2% that are both 0 and 1.

I meant 52 % to 48 %
