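If $X$ is a matrix of variables, $g(X)$ is a scalar-valued function of $X$, and $\langle \cdot, \cdot \rangle_F$ is the Frobenius inner product, then $dg = \langle \nabla g, dX \rangle_F$. Some examples I've seen derived are $d(\|X\|_F) = \left\langle \frac{X}{\|X\|_F}, dX \right\rangle_F$ and $d(v^T X w) = \langle v w^T, dX \rangle_F$.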
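As a sanity check on the two examples, here is a minimal finite-difference sketch in plain Python. The helper names (`frobenius_inner`, `frobenius_norm`, `bilinear`, `add`) and the test values are made up for illustration. It only assumes that the gradient of the Frobenius norm is X / ||X||_F and that the gradient of v^T X w is the outer product v w^T.

```python
import math

def frobenius_inner(A, B):
    """Frobenius inner product: sum of elementwise products (a scalar)."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def frobenius_norm(X):
    """Frobenius norm, i.e. sqrt(<X, X>_F)."""
    return math.sqrt(frobenius_inner(X, X))

def bilinear(v, X, w):
    """g(X) = v^T X w for a vector v, matrix X, and vector w."""
    return sum(v[i] * X[i][j] * w[j] for i in range(len(v)) for j in range(len(w)))

def add(X, dX):
    """Elementwise sum of two matrices stored as nested lists."""
    return [[x + d for x, d in zip(rx, rd)] for rx, rd in zip(X, dX)]

# Hypothetical test data: a 2x2 matrix and a small perturbation of it.
X = [[1.0, 2.0], [3.0, 4.0]]
dX = [[1e-6, -2e-6], [3e-6, 5e-7]]
v = [1.0, -1.0]
w = [2.0, 0.5]

# Example 1: g(X) = ||X||_F, with assumed gradient X / ||X||_F.
norm_X = frobenius_norm(X)
grad_norm = [[x / norm_X for x in row] for row in X]
dg_norm_actual = frobenius_norm(add(X, dX)) - norm_X
dg_norm_predicted = frobenius_inner(grad_norm, dX)

# Example 2: g(X) = v^T X w, with assumed gradient v w^T (outer product).
grad_bilinear = [[vi * wj for wj in w] for vi in v]
dg_bil_actual = bilinear(v, add(X, dX), w) - bilinear(v, X, w)
dg_bil_predicted = frobenius_inner(grad_bilinear, dX)

print(dg_norm_actual, dg_norm_predicted)
print(dg_bil_actual, dg_bil_predicted)
```

In both cases the predicted change is a single float, and it should agree with the actual change in g up to second-order terms in dX.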
In general (and thus, in particular, when $g(X)$ is scalar-valued), if $J(g)$ is the Jacobian matrix of $g(X)$, then $dg = J(g)\,dX$.
However, the Frobenius inner product returns a scalar, so the RHS of the first equation, and thus its LHS, must be a scalar. Meanwhile, for any $X$ that is a "non-trivial matrix" (2 by 2 or larger), each of $J(g)$ and $dX$ should also be a non-trivial matrix. Since the matrix product of two non-trivial matrices is never a scalar, the RHS of the second equation, and thus its LHS, must not be a scalar. Therefore $dg$ is both a scalar and not a scalar.
I have not referenced the fact that $\nabla g = (J(g))^T$. This is also true of course, but I can't even get the shapes to make sense. The problem seems to be that the Frobenius inner product returns a very different shape than matrix multiplication does. What is my misconception here?