
I have a database that contains samples from two families:
\(\displaystyle X_1,\dots,X_n\) i.i.d.
\(\displaystyle Y_1,\dots,Y_m\) i.i.d.


I assume (even though it isn't true) that the samples are from normal distributions:


X~\(\displaystyle N(\mu_1,\sigma^2_1)\)
Y~\(\displaystyle N(\mu_2,\sigma^2_2)\)


And I use MLE to estimate those parameters:


\(\displaystyle \widehat{\mu_1}=\frac{1}{n}\sum_{i=1}^{n}X_i\)
\(\displaystyle \widehat{\mu_2}=\frac{1}{m}\sum_{i=1}^{m}Y_i\)
\(\displaystyle \widehat{\sigma^2_1}=\frac{1}{n}\sum_{i=1}^{n}(X_i-\widehat{\mu_1})^2\)
\(\displaystyle \widehat{\sigma^2_2}=\frac{1}{m}\sum_{i=1}^{m}(Y_i-\widehat{\mu_2})^2\)
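These plug-in estimates are straightforward to compute; here is a minimal sketch in Python (the data and seed are made up for illustration), noting that the MLE variance divides by \(n\), not \(n-1\):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=200)   # stand-in for X_1..X_n
y = rng.normal(7.0, 1.0, size=150)   # stand-in for Y_1..Y_m

# MLE for a normal distribution: the sample mean, and the variance
# that divides by n (ddof=0), not the unbiased n-1 version.
mu1_hat, var1_hat = x.mean(), x.var(ddof=0)
mu2_hat, var2_hat = y.mean(), y.var(ddof=0)
print(mu1_hat, var1_hat, mu2_hat, var2_hat)
```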


Then, using a Bayesian decision rule with equal priors, I will decide that a new sample "w" belongs to family X if:


\(\displaystyle \frac{1}{\sqrt{2\pi }\widehat{\sigma _{1}}}e^{-\frac{(w-\widehat{\mu_1})^2}{2\widehat{\sigma^2_1}}}>\frac{1}{\sqrt{2\pi }\widehat{\sigma _{2}}}e^{-\frac{(w-\widehat{\mu_2})^2}{2\widehat{\sigma^2_2}}}\)
Otherwise, decide that the new sample "w" belongs to family Y.
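As a sanity check, the decision rule above can be written directly from the two plug-in densities; this is a sketch with hypothetical fitted parameters, not values from any real data set:

```python
import numpy as np

def normal_pdf(w, mu, var):
    """Gaussian density evaluated with plug-in (MLE) parameters."""
    return np.exp(-(w - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def classify(w, mu1, var1, mu2, var2):
    """Equal-prior Bayes rule: pick the family with the larger density at w."""
    return 'X' if normal_pdf(w, mu1, var1) > normal_pdf(w, mu2, var2) else 'Y'

# Hypothetical fitted parameters: X ~ N(5, 4), Y ~ N(7, 1).
print(classify(4.0, mu1=5.0, var1=4.0, mu2=7.0, var2=1.0))  # prints 'X'
```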


My questions are:
1. Can I use a likelihood ratio test (LRT) to get a different decision rule?
2. Can I use an LRT to decide between this parametric model and another one (not necessarily parametric)?
3. If neither of the above applies, what information can the LRT give me here?


Thank you!!
 
I have a database that contains samples from two families:
\(\displaystyle X_1,\dots,X_n\) i.i.d.
\(\displaystyle Y_1,\dots,Y_m\) i.i.d.


I assume (even though it isn't true) that the samples are from normal distributions:


X~\(\displaystyle N(\mu_1,\sigma^2_1)\)
Y~\(\displaystyle N(\mu_2,\sigma^2_2)\)


And I use MLE to estimate those parameters:


\(\displaystyle \widehat{\mu_1}=\frac{1}{n}\sum_{i=1}^{n}X_i\)
\(\displaystyle \widehat{\mu_2}=\frac{1}{m}\sum_{i=1}^{m}Y_i\)
\(\displaystyle \widehat{\sigma^2_1}=\frac{1}{n}\sum_{i=1}^{n}(X_i-\widehat{\mu_1})^2\)
\(\displaystyle \widehat{\sigma^2_2}=\frac{1}{m}\sum_{i=1}^{m}(Y_i-\widehat{\mu_2})^2\)
I want to break in here and explore your next statement as it relates to Bayes' Theorem and conditional probabilities.

\(\displaystyle P(w\ |\ X) = \dfrac{1}{\sqrt{2\pi}\,\widehat{\sigma_1}}\,\exp\left(-\dfrac{(w-\widehat{\mu_1})^2}{2\widehat{\sigma^2_1}}\right), \qquad P(w\ |\ Y) = \dfrac{1}{\sqrt{2\pi}\,\widehat{\sigma_2}}\,\exp\left(-\dfrac{(w-\widehat{\mu_2})^2}{2\widehat{\sigma^2_2}}\right)\)

\(\displaystyle P(X\ |\ w) = \dfrac{P(w\ |\ X)\,P(X)}{P(w\ |\ X)\,P(X) + P(w\ |\ Y)\,P(Y)}, \qquad P(Y\ |\ w) = \dfrac{P(w\ |\ Y)\,P(Y)}{P(w\ |\ X)\,P(X) + P(w\ |\ Y)\,P(Y)}\)

OK - if \(\displaystyle P(X)=P(Y)\), then \(\displaystyle P(X\ |\ w)\propto P(w\ |\ X)\), so comparing posteriors reduces to comparing likelihoods.

Then, using a Bayesian decision rule with equal priors, I will decide that a new sample "w" belongs to family X if:

\(\displaystyle \frac{1}{\sqrt{2\pi }\widehat{\sigma _{1}}}e^{-\frac{(w-\widehat{\mu_1})^2}{2\widehat{\sigma^2_1}}}>\frac{1}{\sqrt{2\pi }\widehat{\sigma _{2}}}e^{-\frac{(w-\widehat{\mu_2})^2}{2\widehat{\sigma^2_2}}}\)

Otherwise, decide that the new sample "w" belongs to family Y.

My questions are:
1. Can I use a likelihood ratio test (LRT) to get a different decision rule?
2. Can I use an LRT to decide between this parametric model and another one (not necessarily parametric)?
3. If neither of the above applies, what information can the LRT give me here?

Thank you!!
Thanks for giving us something to think about . . .

I think the answer is that the LRT does not apply, unless the two models X and Y are related:

"The test requires nested models, that is, models in which the more complex one can be transformed into the simpler model by imposing a set of linear constraints on the parameters."
[http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Likelihood-ratio_test.html]
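To illustrate the nested-models requirement, here is a sketch of one LRT that does fit this setup: testing whether the two families share a common variance. The reduced model imposes one linear constraint, \(\sigma^2_1=\sigma^2_2\), so Wilks' theorem gives an approximate \(\chi^2_1\) null distribution. The function names and simulated data below are my own illustration, not anything from the thread:

```python
import numpy as np

def normal_loglik(data, mu, var):
    """Log-likelihood of an N(mu, var) sample."""
    n = len(data)
    return -0.5 * n * np.log(2 * np.pi * var) - np.sum((data - mu) ** 2) / (2 * var)

def lrt_equal_variance(x, y):
    """Wilks LRT of H0: sigma1^2 == sigma2^2, nested inside the
    unequal-variance model (one constraint, so 1 degree of freedom)."""
    mu1, mu2 = x.mean(), y.mean()
    # Full model: each family gets its own MLE variance.
    ll_full = (normal_loglik(x, mu1, x.var(ddof=0))
               + normal_loglik(y, mu2, y.var(ddof=0)))
    # Reduced model: one pooled MLE variance under H0.
    pooled = (np.sum((x - mu1) ** 2) + np.sum((y - mu2) ** 2)) / (len(x) + len(y))
    ll_reduced = normal_loglik(x, mu1, pooled) + normal_loglik(y, mu2, pooled)
    return 2 * (ll_full - ll_reduced)  # compare to chi^2_1, e.g. 3.84 at 5%

rng = np.random.default_rng(1)
stat = lrt_equal_variance(rng.normal(0, 1, 100), rng.normal(2, 3, 100))
print(stat)  # large value => reject equal variances
```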

Since the LRT depends on the number of degrees of freedom, would it be equivalent to modify the decision rule by choosing priors proportional to the sizes of the two data sets?
 
Thank you.
Do you have a good example of where the LRT is useful?
I will have to admit that I had never heard of LRT before looking it up to answer this question!

The example that immediately came to mind was a paper I wrote in 1963 (yes, 50 years ago) in which I really needed it. I was working on the "semi-empirical atomic mass formula," which involved fitting all known mass data to a model with 15 to 20 terms based on theory. Least-squares analysis was used to find the arbitrary coefficients, along with their variances and covariances. I needed a statistical test to decide whether some of the proposed terms "significantly" improved the goodness of fit - which is exactly what the LRT tells you. Instead, I managed to use an F-test, which does account for the difference in degrees of freedom. Adding another term - even if it is not "true" - always improves the fit, but the LRT helps you decide whether the improvement is significant.
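For concreteness, the F-test for added terms compares two nested least-squares fits through their residual sums of squares. This is a generic sketch with invented numbers, not the formula from that paper:

```python
def f_statistic(rss_small, p_small, rss_big, p_big, n):
    """F-statistic for nested least-squares fits: do the extra
    (p_big - p_small) terms significantly reduce the residual
    sum of squares, given n data points?"""
    num = (rss_small - rss_big) / (p_big - p_small)
    den = rss_big / (n - p_big)
    return num / den

# Hypothetical numbers: a 2-parameter fit vs a 3-parameter fit on n = 50 points.
stat = f_statistic(rss_small=12.3, p_small=2, rss_big=9.8, p_big=3, n=50)
print(round(stat, 2))  # compare against the F(1, 47) distribution
```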
 
Thank you for the answer - I wasn't aware I was talking with such an accomplished researcher, to say the least.
I think I can change my model to a Gaussian mixture model instead of a single normal distribution, and perhaps use the LRT to decide how many parameters I need in order to stay within a confidence interval. I have to think about it...
Would you mind giving a reference for your paper on atomic masses? I'm a physicist and I'm curious to see how you managed it, probably more than a decade before the first IBM personal computer.
 
Not quite as far back as I said .. the paper where I first used an F-test for significance of added terms was in 1967:

P.A. Seeger in Barber, R.C., ed.: Proc. 3rd Intern. Conf. on Atomic Masses, University of Manitoba Press, Winnipeg, 1967, p. 85.

There is a pretty good version in the Third Edition of the American Institute of Physics Handbook (1972), p. 8.92-8.142

The most-often referenced is
P. A. Seeger and W. M. Howard, “Semiempirical Atomic Mass Formula,” Nucl. Phys. A238 (1975) 491-532.

BTW, we used least squares for fitting, as the procedure was well developed. The computers used were an IBM 7094 and a CDC 6600. The key routine was a double-precision matrix-inversion subroutine from Brookhaven, for up to a 20×20 matrix. Doing the actual matrix inversion gives the variance-covariance matrix as a result.

Nowadays you might consider maximum likelihood, based on "information" or entropy, rather than least squares. This is a logarithmic approach, as is the likelihood ratio test.
 
Very interesting, Thank you. I'm going to read one of those papers just for fun.
 