Fitting a Gaussian distribution to data with 100 patients, mean 13.3, standard deviation 2

Exodus2727

I have data for 100 patients, with a mean of 13.3 and a standard deviation of 2.

concentration   number of patients   Gaussian
9               3                    1.98
11              22                   10.30
13              46                   19.72
15              19                   13.90
17              8                    3.60
19              2                    0.34
total           100                  49.82

I fit a Gaussian (shown in the table), but the total number of patients now comes out different (49.8 vs. 100). Below is the equation I used. Could you tell me where I'm going wrong? Thanks.

\(\displaystyle y\, =\, \dfrac{1}{\sigma\, \sqrt{2\pi \,}}\, e^{-\dfrac{(x\, -\, \mu)^2}{2\sigma^2}}\)

\(\displaystyle \mu:\, \mbox{mean}\)

\(\displaystyle \sigma:\, \mbox{standard deviation}\)

\(\displaystyle \pi\, \approx\, 3.14159...\)

\(\displaystyle e\, \approx\, 2.71828\)
 

Attachment: gaussian distribution.jpg
Hi,

The pre-factor of \(\displaystyle 1/\sqrt{ 2\pi\sigma^2} \) in the equation you used is there to normalize the Gaussian function so that its integral (over all x) is unity (one). This is because it's supposed to be a probability density function, i.e. the total probability of getting some value should be 1 (100%).

One thing you could do is scale the pre-factor by 100 so that the Gaussian integrates to 100 instead of 1.
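As a rough numerical check of this suggestion, the sketch below integrates the posted formula with and without the factor of 100, using a simple midpoint sum. The names `gaussian_pdf` and `integrate`, the integration range, and the step count are illustrative choices, not anything from the original post.

```python
import math

MU = 13.3         # mean from the original post
SIGMA = 2.0       # standard deviation from the original post
N_PATIENTS = 100  # total number of patients from the original post

def gaussian_pdf(x, mu=MU, sigma=SIGMA):
    """Normalized Gaussian, as in the posted formula."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, lo, hi, steps=20000):
    """Midpoint Riemann sum of f over [lo, hi] (hypothetical helper)."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

# Integrate over mu +/- 10 sigma, which captures essentially all of the area.
lo, hi = MU - 10 * SIGMA, MU + 10 * SIGMA
area_unscaled = integrate(gaussian_pdf, lo, hi)
area_scaled = integrate(lambda x: N_PATIENTS * gaussian_pdf(x), lo, hi)

print(f"unscaled area: {area_unscaled:.6f}")
print(f"scaled area:   {area_scaled:.4f}")
```

Note that this checks the integral of the curve, not the sum of its values at a handful of points, which is a different quantity.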

A better approach would be to think about this more carefully: normalize the patient count in each concentration bin to a probability density, and fit the Gaussian to that. For example, I assume the first value in the table is meant to be the number of patients whose actual concentration was in a bin centered on 9, ranging from 8 to 10. If so, the probability is

(number of counts in bin)/(total number of counts)

and the probability density would be the above, divided by the bin width of 2.
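A minimal sketch of that normalization, using the counts from the table in the question. The bin width of 2 (bins centered on the listed concentrations) is the assumption stated above, not something the original poster confirmed, and the variable names are illustrative.

```python
# Table from the original post: bin center -> number of patients.
counts = {9: 3, 11: 22, 13: 46, 15: 19, 17: 8, 19: 2}
BIN_WIDTH = 2.0  # assumed: each bin is centered on the listed value, e.g. 8 to 10

total = sum(counts.values())

# Probability of landing in each bin, then probability density per unit concentration.
probability = {x: n / total for x, n in counts.items()}
density = {x: p / BIN_WIDTH for x, p in probability.items()}

for x in counts:
    print(f"{x:>3}  count={counts[x]:>3}  P={probability[x]:.3f}  density={density[x]:.3f}")

# Density times bin width, summed over bins, should be 1, like a normalized PDF.
area = sum(d * BIN_WIDTH for d in density.values())
print(f"area under histogram: {area:.3f}")
```

The `density` values are the ones to compare against (or fit with) the normalized Gaussian.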

Or am I supposed to assume that all 3 patients had a concentration of EXACTLY 9? If so, that's weird.
 
But also, the values in your third column are nowhere near what you'd get if you just plugged mu, sigma, and x (from the leftmost column) into the equation you posted. So now I have no idea what you actually did. Did you use software to compute a best fit to the data? If so, was the amplitude (pre-factor) scaling the Gaussian function a free parameter in the fit, or a fixed one?
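One way to check how the third column relates to the posted formula is to evaluate the formula at each listed concentration, both as-is and with candidate scalings. The sketch below does that; the two scalings shown (by the patient count, and by the patient count times an assumed bin width of 2) are guesses at what might have been done, not something the original poster stated.

```python
import math

MU, SIGMA = 13.3, 2.0   # mean and standard deviation from the original post
N_PATIENTS = 100        # total number of patients from the original post
BIN_WIDTH = 2.0         # assumed bin width (bins centered on each listed value)
concentrations = [9, 11, 13, 15, 17, 19]

def gaussian_pdf(x, mu=MU, sigma=SIGMA):
    """Normalized Gaussian, as in the posted formula."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

raw = [gaussian_pdf(x) for x in concentrations]
scaled_by_n = [N_PATIENTS * y for y in raw]                   # guess 1: formula times 100
expected_counts = [N_PATIENTS * BIN_WIDTH * y for y in raw]   # guess 2: also times bin width

print("x    formula   x100     x100 x width")
for x, y, s, c in zip(concentrations, raw, scaled_by_n, expected_counts):
    print(f"{x:<4} {y:<9.4f} {s:<8.2f} {c:.2f}")
print("sums:", round(sum(scaled_by_n), 2), round(sum(expected_counts), 2))
```

Comparing each printed column with the table in the question should show which scaling, if any, was applied.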
 