Self-designed objective for multiple linear regression

baiyang11

Multiple linear regression uses several predictor variables to predict the outcome of a response variable, via a relationship like the following:


\(\displaystyle y_{i}=\beta_{1}x_{i1}+\dots+\beta_{p}x_{ip}+\epsilon_{i},\qquad i=1,\dots,n\)


I understand that the typical objective for learning the \(\displaystyle \beta\) parameters is least squares, i.e. minimizing the sum of the squares of the \(\displaystyle \epsilon_{i}\). Now I want a different kind of objective, for example maximizing the Shannon entropy of the sequence of \(\displaystyle \epsilon_{i}\) (or some other self-specified objective). I googled in this direction but had no luck. Is there a known problem formulation (and, if possible, a tool that solves it) that I could look into? The sketch below shows the kind of thing I mean.
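For concreteness, here is a minimal sketch of the kind of fitting I have in mind, assuming numpy/scipy are available; the histogram-based entropy estimator and the toy data are my own illustrative choices, not a standard recipe:

```python
import numpy as np
from scipy.optimize import minimize

def neg_entropy(beta, X, y, bins=20):
    """Negative plug-in Shannon entropy of the residuals eps = y - X @ beta,
    estimated from a normalized histogram (one of many possible estimators)."""
    eps = y - X @ beta
    counts, _ = np.histogram(eps, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # drop empty bins: 0 * log 0 := 0
    return np.sum(p * np.log(p))      # this is -H(p); minimizing it maximizes H

# toy data: n = 200 observations, p = 3 predictors (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

beta0 = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares starting point
res = minimize(neg_entropy, beta0, args=(X, y), method="Nelder-Mead")
print("beta:", res.x)
```

Because the histogram makes the objective non-smooth in \(\displaystyle \beta\), a derivative-free method (Nelder-Mead) is used here; a smooth entropy estimator (e.g. a kernel density) would allow gradient-based solvers.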


Thank you for your help.
 
https://en.wikipedia.org/wiki/Principle_of_maximum_entropy
...
General solution for the maximum entropy distribution with linear constraints

Main article: maximum entropy probability distribution
Discrete case

We have some testable information \(\displaystyle I\) about a quantity \(\displaystyle x\) taking values in \(\displaystyle \{x_{1},x_{2},\dots,x_{n}\}\). We assume this information has the form of \(\displaystyle m\) constraints on the expectations of the functions \(\displaystyle f_{k}\); that is, we require our probability distribution to satisfy
\(\displaystyle \sum_{i=1}^{n}\Pr(x_{i})f_{k}(x_{i})=F_{k},\qquad k=1,\dots,m.\)
Furthermore, the probabilities must sum to one, giving the constraint
\(\displaystyle \sum_{i=1}^{n}\Pr(x_{i})=1.\)
The probability distribution with maximum information entropy subject to these constraints is
\(\displaystyle \Pr(x_{i})=\frac{1}{Z(\lambda_{1},\dots,\lambda_{m})}\exp\left[\lambda_{1}f_{1}(x_{i})+\dots+\lambda_{m}f_{m}(x_{i})\right].\)
It is sometimes called the Gibbs distribution. The normalization constant is determined by
\(\displaystyle Z(\lambda_{1},\dots,\lambda_{m})=\sum_{i=1}^{n}\exp\left[\lambda_{1}f_{1}(x_{i})+\dots+\lambda_{m}f_{m}(x_{i})\right]\)
and is conventionally called the partition function. (Interestingly, the Pitman–Koopman theorem states that the necessary and sufficient condition for a sampling distribution to admit sufficient statistics of bounded dimension is that it have the general form of a maximum entropy distribution.)
The \(\displaystyle \lambda_{k}\) parameters are Lagrange multipliers whose particular values are determined by the constraints according to
\(\displaystyle F_{k}=\frac{\partial}{\partial\lambda_{k}}\log Z(\lambda_{1},\dots,\lambda_{m}),\qquad k=1,\dots,m.\)
These m simultaneous equations do not generally possess a closed form solution, and are usually solved by numerical methods.
Continuous case

For continuous distributions, the Shannon entropy cannot be used, as it is only defined for discrete probability spaces. Instead Edwin Jaynes (1963, 1968, 2003) gave the following formula, which is closely related to the relative entropy (see also differential entropy). ...
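To make the discrete case quoted above concrete, here is a hedged sketch of solving for the multiplier numerically, assuming numpy/scipy; the single constraint \(\displaystyle E[x]=4.5\) on \(\displaystyle \{1,\dots,6\}\) (Jaynes's die example) is chosen purely for illustration:

```python
import numpy as np
from scipy.optimize import brentq

x = np.arange(1, 7, dtype=float)   # support {x_1, ..., x_n}
F = 4.5                            # required expectation E[f(x)], with f(x) = x

def probs(lam):
    """Gibbs distribution Pr(x_i) = exp(lam * f(x_i)) / Z(lam)."""
    w = np.exp(lam * x)
    return w / w.sum()             # dividing by Z = w.sum() enforces sum-to-one

def constraint_gap(lam):
    """E_lam[f(x)] - F; the Lagrange multiplier is the root of this."""
    return probs(lam) @ x - F

lam = brentq(constraint_gap, -10.0, 10.0)  # 1-D root find for the single lambda
p = probs(lam)
print("lambda =", lam, " p =", p, " E[x] =", p @ x)
```

With \(\displaystyle m>1\) constraints this becomes an \(\displaystyle m\)-dimensional root find (equivalently, minimizing the convex function \(\displaystyle \log Z-\sum_{k}\lambda_{k}F_{k}\)), which is what the remark above about numerical methods refers to.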
 
Thanks for the reply.

I read the wiki page, but I am not sure how it helps solve my problem. Could you please elaborate?
Please note that I am trying to compute (estimate) the entropy of a sequence.

Normal curve fitting is somewhat different from doing a maximum entropy estimate: the constraints are a little different, and the form of the solution is different. Try reading these lecture notes:
https://www.google.com/search?q=brigita+urbanc&ie=utf-8&oe=utf-8#q=brigita+urbanc+lecture+entropy
 