If [imath]X[/imath] is a matrix of variables, [imath]g(X)[/imath] is a scalar-valued function of [imath]X[/imath], and [imath]<\cdot,\ \cdot>_F[/imath] is the Frobenius Inner Product, then [imath]dg\ =\ <\nabla g,\ dX>_F[/imath]. Some examples I've seen derived are [imath]d(||X||_F)\ =\ <\frac{X}{||X||_F},\ dX>_F[/imath] and [imath]d(\vec v^T X \vec w)\ =\ <\vec v \vec w^T,\ dX>_F[/imath].
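As a sanity check on the two examples above, here is a minimal numerical sketch in NumPy. The sizes, the random seed, and the helper name `frobenius_inner` are my own choices for illustration. It compares the actual change in [imath]g[/imath] under a small perturbation [imath]dX[/imath] with the value predicted by [imath]<\nabla g,\ dX>_F[/imath].

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative sizes: X is 3x4, v has length 3, w has length 4.
X = rng.standard_normal((3, 4))
dX = 1e-6 * rng.standard_normal((3, 4))  # small perturbation of X
v = rng.standard_normal(3)
w = rng.standard_normal(4)


def frobenius_inner(A, B):
    """Frobenius inner product: sum of entrywise products (a scalar)."""
    return float(np.sum(A * B))


# Example 1: g(X) = ||X||_F, with claimed gradient X / ||X||_F.
def g1(M):
    return np.linalg.norm(M, "fro")


grad_g1 = X / np.linalg.norm(X, "fro")
actual_1 = g1(X + dX) - g1(X)
predicted_1 = frobenius_inner(grad_g1, dX)


# Example 2: g(X) = v^T X w, with claimed gradient v w^T.
def g2(M):
    return v @ M @ w


grad_g2 = np.outer(v, w)
actual_2 = g2(X + dX) - g2(X)
predicted_2 = frobenius_inner(grad_g2, dX)

print(actual_1, predicted_1)  # should agree to first order in dX
print(actual_2, predicted_2)  # g2 is linear in X, so these agree up to rounding
```

In both cases the gradient has the same shape as [imath]X[/imath], and the predicted change is a single number.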
In general (and thus if [imath]g(X)[/imath] is still scalar-valued in particular), if [imath]J(g)[/imath] is the Jacobian Matrix for [imath]g(X)[/imath], then [imath]dg\ =\ J(g)dX[/imath].
However, the Frobenius Inner Product returns a scalar, so the RHS of the first equation, and thus its LHS [imath]dg[/imath], must be a scalar. On the other hand, for any [imath]X[/imath] that is a "non-trivial matrix" ([imath]2[/imath] by [imath]2[/imath] or larger), each of [imath]J(g)[/imath] and [imath]dX[/imath] should also be a non-trivial matrix. Since the matrix product of two non-trivial matrices is never a scalar, the RHS of the second equation, and thus its LHS [imath]dg[/imath], must not be a scalar. Therefore [imath]dg[/imath] is both a scalar and not a scalar.
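To make the shape mismatch concrete, here is a small NumPy sketch. `G` is only a random stand-in for a matrix with the same shape as [imath]X[/imath] (not a gradient of any particular function), and the 2 by 3 size is arbitrary. I use `G.T @ dX` for the matrix product because `G @ dX` is not even defined when [imath]X[/imath] is not square.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins, both with the same shape as a 2x3 matrix X.
G = rng.standard_normal((2, 3))
dX = rng.standard_normal((2, 3))

# Frobenius inner product: sum of entrywise products.
frob = np.sum(G * dX)

# Ordinary matrix product (transposing G so the inner dimensions match).
prod = G.T @ dX

print(np.shape(frob))  # ()
print(prod.shape)      # (3, 3)
```

So the first expression really is a scalar and the second really is a matrix, which is exactly the conflict I am asking about.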
I have not referenced the fact that [imath]\nabla g = (J(g))^T[/imath]. This is also true of course, but I can't even get the shapes to make sense. The problem seems to be that the Frobenius Inner Product returns a very different shape than matrix multiplication. What is my misconception here?