Exercise:
Assume that 4 vectors in \(\mathbb{R}^3\) satisfy \(A+B+C+D=0\).
Simplify
$$ A\times B-B\times C+C \times D-D\times A$$
Solution: By the anti-symmetry \(-B\times C = C\times B\)
and \(-D\times A = A\times D\) we can rewrite the expression as
$$ A\times B+C\times B+C \times D+A\times D$$
Using the distributive law of the cross product, we have
$$ (A+C)\times B+(C+A) \times D$$
and again
$$ (A+C)\times (B+D) =** $$
However, as \(F=(A+C) = -(B+D)\) we get
$$ **= F\times (-F) = 0 $$
Exercise: Integrate \(dx/y\) over the circle (oriented counterclockwise)
\(C:=\{x^2+y^2=R^2\}\).
Solution: use the counterclockwise parametrization
$$ x(t)=R\cos(2\pi t),~~~~~y(t)=R\sin(2\pi t),~~~~ t\in [0,1) $$
The differential becomes
$$ \frac{dx}{y} = \frac{ -2\pi R\sin(2\pi t) dt}{R\sin(2\pi t)}= -2\pi dt $$
(except two removable singularities at \(t=0,\frac{1}{2}\)).
The overall integral would then be
$$ \int_{C} \frac{dx}{y} = \int_{0}^1 (-2\pi dt) = -2\pi $$
An alternative approach is to integrate over \(x\) and use the explicit
form of \(y=\pm \sqrt{R^2-x^2}\) (and account for integration direction):
$$
\underbrace{\int_{R}^{-R} \frac{dx}{\sqrt{R^2-x^2}}}_{
\begin{array}{l}
\text{Upper half integral} \\
\text{Note: right to left direction}
\end{array}} +
\underbrace{\int_{-R}^{R} \frac{dx}{(-\sqrt{R^2-x^2})}}_{
\text{Lower half integral}}
= -2\int_{-R}^{R} \frac{dx}{\sqrt{R^2-x^2}}
=** $$
Then, by substitution \(x = -R\cos(\pi t) \) with end-points \(t=0,1\) we get
$$ ** = -2\int_{0}^1 \frac{\pi R\sin(\pi t)}{R\sin(\pi t)} dt = -2\pi $$
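As a quick numerical sanity check (numpy assumed; not part of the original solution), one can approximate the integral with the counterclockwise parametrization above and watch it come out to \(-2\pi\):

```python
import numpy as np

# Midpoint rule for the line integral of dx/y over the circle of radius R,
# using x = R cos(2 pi t), y = R sin(2 pi t), t in [0, 1).
R = 3.0
N = 100_000
t = (np.arange(N) + 0.5) / N                  # midpoints avoid the zeros of y
y = R * np.sin(2 * np.pi * t)
dx_dt = -2 * np.pi * R * np.sin(2 * np.pi * t)
integral = np.mean(dx_dt / y)                 # approximates int_0^1 (dx/dt)/y dt
print(integral)  # -6.2831853... = -2*pi
```

Note that the integrand is the constant \(-2\pi\) away from the removable singularities, so the answer is independent of \(R\).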
Exercise: let
\(\omega = dx\wedge dy + ds \wedge dt + du\wedge dv \). Find \(\omega^3\).
Solution: This is a \(2\)-form on \(\mathbb{R}^6\), so \(\omega^3\)
would be a \(6\)-form on \(\mathbb{R}^6\), hence a multiple of the volume form:
$$ \omega^3 = c\, dx \wedge dy \wedge du \wedge dv \wedge ds \wedge dt $$
So we just need to find \(c\).
First compute the \(4\)-form \(\omega^2\):
$$
\begin{align*}
\omega\wedge\omega &= (dx\wedge dy + ds \wedge dt + du\wedge dv )\wedge
(dx\wedge dy + ds \wedge dt + du\wedge dv ) \\
&= (dx\wedge dy)\wedge (dx\wedge dy) +
(dx\wedge dy)\wedge (ds \wedge dt) +
(dx\wedge dy)\wedge (du\wedge dv) \\
& +(ds \wedge dt)\wedge (dx\wedge dy)
+(ds \wedge dt)\wedge (ds \wedge dt)
+(ds \wedge dt)\wedge (du\wedge dv) \\
& +(du\wedge dv)\wedge (dx\wedge dy)
+(du\wedge dv)\wedge (ds \wedge dt)
+(du\wedge dv)\wedge (du\wedge dv) \end{align*} $$
We can drop the parentheses since the wedge product is associative
(in contrast to the cross product). The diagonal terms vanish
since \(dx\wedge dx = ds\wedge ds = du\wedge du =0\), and the
off-diagonal terms are pairwise equal, as \(2\)-forms commute under the wedge:
\(\alpha\wedge \beta = (-1)^{2\cdot 2} \beta\wedge \alpha = \beta\wedge\alpha \),
so that
$$
\begin{align*}
\omega^2 &= 2\left[ (dx\wedge dy)\wedge (ds \wedge dt)
+ (dx\wedge dy)\wedge (du\wedge dv)
+ (ds \wedge dt)\wedge (du\wedge dv) \right]
\end{align*} $$
Then by the same rules we get
$$
\begin{align*} \omega^3 =
\omega \wedge \omega^2 &= 2\left[ dx\wedge dy + ds \wedge dt + du\wedge dv\right] \wedge \left[(dx\wedge dy)\wedge (ds \wedge dt)
+ (dx\wedge dy)\wedge (du\wedge dv)
+ (ds \wedge dt)\wedge (du\wedge dv) \right] \\
&= 2 \left[(dx\wedge dy)\wedge (du \wedge dv) \wedge (ds\wedge dt) +
(ds\wedge dt)\wedge(dx\wedge dy)\wedge (du \wedge dv) +
(du \wedge dv) \wedge (dx\wedge dy)\wedge (ds\wedge dt) \right] \\
& = 6 dx\wedge dy\wedge du \wedge dv \wedge ds\wedge dt
\end{align*} $$
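The coefficient \(6\) can be double-checked by brute force. Below is a small sketch (an illustration assumed here, not notation from the notes): a form is stored as a dict mapping sorted index tuples to coefficients, and the wedge product merges index tuples with the sign of the sorting permutation.

```python
def wedge(f, g):
    """Wedge product of forms given as {sorted index tuple: coefficient}."""
    out = {}
    for a, ca in f.items():
        for b, cb in g.items():
            if set(a) & set(b):
                continue  # a repeated differential kills the term
            merged = a + b
            # sign of the permutation that sorts the concatenated indices
            inv = sum(1 for i in range(len(merged))
                      for j in range(i + 1, len(merged))
                      if merged[i] > merged[j])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: c for k, c in out.items() if c != 0}

# omega = dx^dy + ds^dt + du^dv on R^6, coordinates indexed 0..5
omega = {(0, 1): 1, (2, 3): 1, (4, 5): 1}
omega3 = wedge(omega, wedge(omega, omega))
print(omega3)  # {(0, 1, 2, 3, 4, 5): 6}
```

The single surviving key is the volume form, with coefficient \(6\), as computed by hand above.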
By parameterization, this extends to integration over curves, surfaces and volumes, giving rise
to the beautiful theory of vector calculus.
This theory can be elegantly generalized to the theory of integrals of differential
forms.
Our new type of integral, denoted
$$ \int_{C} \omega = \text{ The integral of the form } \omega \text{ over the manifold } C, $$
has two main components: the manifold \(C\) and the form \(\omega\).
Note: we require that the dimension of \(C\) equal the degree of \(\omega\).
Surface integrals (on two dimensional manifolds) take \(2\)-forms, volume integrals (on three dimensional
manifolds) take \(3\)-forms, and in general, if the tangent space of \(C\subset \mathbb{R}^n\) at a point \(x\)
is \(k\)-dimensional, then \(\omega\) should be a \(k\)-form.
We first define the integral on a \(k\)-dimensional cube \(C\), mapped by \(\psi\) into \(\Omega \subset
\mathbb{R}^n\). The cube can be partitioned into many small cubes of size \(\epsilon\), and
we define our integral over the image of the cube as the limit of the sum
$$ \int_{\psi C}\omega = \lim_{\epsilon\to 0} \sum_{\text{cubes}} \epsilon^k \omega(v_1,\ldots,v_k) $$
where \( v_i:=\frac{\partial \psi}{\partial e_i} \) is the tangent vector in
the direction of the \(i\)-th coordinate.
To generalize to smooth manifolds, assume that a patch \(C’\) smoothly maps to the
cube \(C\) through \(\phi\):
$$ C’ \xrightarrow{\phi} C \xrightarrow{\psi} \Omega $$
Then the integral of \(C\) can be pulled back to \(C’\)
by defining a differential form:
\( \psi^* \omega = \sum_I \overbrace{c_I(\psi(y))}^{\psi^*c_I} dx_{i_1} \wedge\ldots\wedge dx_{i_k} \),
(known as a pullback form), such that
$$ \int_{C’} \psi^* \omega = \int_{\psi C} \omega $$
If every coordinate \(x\in \Omega \) is an image of \(y\in C\),
\( x_i = \psi_i(y)\), then the corresponding \(1\)-form is the sum
\( dx_i = \sum \frac{\partial \psi_i}{\partial y_j} dy_j \).
We can re-arrange the form as
$$ \omega = \frac{x_1 dx_2 - x_2 dx_1}{|r|^2}\wedge \frac{dx_3}{|r|} + \frac{ x_3 dx_1\wedge dx_2}{|r|^3} $$
The left part of this form vanishes due to the spherical symmetry, leaving only the
right-hand form \(\frac{ x_3 dx_1\wedge dx_2}{|r|^3}\), whose integral over the sphere
is the area of the unit sphere: \( \int_C \omega = 4\pi \).
In general, for any \(k\)-form \(\omega\), there exists a unique
\((k+1)\)-form (denoted \(d\omega\)) such that the flux through the boundary
of a small cube \(C_{\epsilon}^{k+1}\) is
the integral of \(d\omega\) over the interior of the cube.
This \(d\omega\), called the differential mapping (or just the differential)
of \(\omega\), admits several properties:
1.
Given a bilinear form \(Q(\cdot,\cdot)\), or, equivalently, a mapping \(Q:U\to U^*\), one can easily bake new bilinear forms from linear operators \(U\to U\): just take \(Q_A(u,v):=Q(u,Av)\).
This opens the door for exploiting the interplay between the normal forms of operators and the properties of the corresponding bilinear forms. One problem, though, is that the normal forms of endomorphisms involve complex eigenpairs, and the most natural bilinear form on Euclidean space, the scalar product, is not positive definite if one extends the ground field to complex numbers: for a quadratic form, \(q(\lambda v)=\lambda^2q(v)\), which cannot be nonnegative for all \(\lambda\in\Comp\).
2.
The solution is to redefine the notion of bilinearity so that it does not affect the definitions on the real space, but remains positive definite when we allow for complex coefficients. This leads to the notion of sesquilinear form, the one that satisfies
\[
Q(u_1+u_2,v)=Q(u_1,v)+Q(u_2,v), \mathrm{\ same\ for\ } Q(u,v_1+v_2), \mathrm{\ and\ }
Q(\lambda u, \mu v)=\bar{\lambda}\mu\, Q(u,v).
\]
A (complex) vector space with a positive definite sesquilinear (a.k.a. Hermitian) form is called a Hermitian space. The standard example is the space of vectors (columns) with the Hermitian form
\[
(u,v):=\sum \bar{u}_kv_k.
\]
In general, this is how one makes a Hermitian space out of a Euclidean one: allow complex scalars, but make the scalar product sesquilinear rather than bilinear.
3.
As mentioned, any endomorphism \(A:U\to U\) of a Hermitian space engenders a sesquilinear form
\(Q_A(u,v):=(u,Av)\). One can, of course, apply \(A\) to the first argument; the result, in general, will be different:
\[
(Au,v)\neq (u,Av).
\]
But one can always, for a given operator \(A\), define an operator \(B\) such that
\[
(Bu,v)=(u,Av).
\]
Such a \(B\) is called the adjoint operator and is denoted \(A^*\): in other words, by definition,
\[
(A^*u,v)=(u,Av) \mathrm{\ for\ all\ } u,v.
\]
4.
One can easily see that in an orthonormal basis, the matrix of the operator adjoint to \(A\) is obtained from the matrix of \(A\) by transposing and complex conjugating.
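A quick numerical check of this (numpy assumed; the matrices are random examples, not from the notes): with the standard Hermitian product \((u,v)=\sum\bar{u}_kv_k\), the conjugate transpose of \(A\) behaves exactly as the adjoint should.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
u = rng.normal(size=4) + 1j * rng.normal(size=4)
v = rng.normal(size=4) + 1j * rng.normal(size=4)

A_star = A.conj().T           # candidate adjoint: transpose + complex conjugate
lhs = np.vdot(A_star @ u, v)  # (A* u, v); np.vdot conjugates its first argument
rhs = np.vdot(u, A @ v)       # (u, A v)
print(np.isclose(lhs, rhs))   # True
```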
One can make an analogue of quadratic form out of an operator \(A\) by
\[
q(v):=(v,Av);
\]
the result will be real for all \(v\) if \(A=A^*\), i.e. if \(A\) is self-adjoint.
5.
The fundamental fact about self-adjoint operators is the following
Theorem: Any self-adjoint operator \(A\) in a Hermitian space has real spectrum, and there exists an orthonormal basis consisting of eigenvectors of \(A\). In particular, the Jordan normal form of a self-adjoint operator has no cells of size 2 or more.
We note that if the operator \(A\) is self-adjoint and real (that is, it commutes with complex conjugation: \(\overline{Av}=A\bar{v}\)), then all the eigenvectors can be chosen to be real as well.
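The theorem can be illustrated numerically (numpy assumed; the Hermitian matrix below is a random example): the spectrum comes out real and the eigenvector matrix unitary.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
A = (M + M.conj().T) / 2       # force A = A*

w, V = np.linalg.eigh(A)       # eigh is tailored to Hermitian matrices
print(w.dtype)                                  # float: the spectrum is real
print(np.allclose(V.conj().T @ V, np.eye(5)))   # True: orthonormal eigenbasis
print(np.allclose(A @ V, V @ np.diag(w)))       # True: A V = V diag(w)
```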
6.
A similar result can be proven about the skew-adjoint matrices \(A^*=-A\): again, going to the complex domain, we notice that \(iA\) is self-adjoint, and therefore there exists a basis consisting of the eigenvectors of \(A\), and all eigenvalues of \(A\) are purely imaginary.
If \(A\) is skew-adjoint and real, then its non-vanishing eigenvalues split into complex-conjugate pairs \(\pm\lambda_k\). In particular, a real skew-adjoint operator is necessarily degenerate in an odd-dimensional space.
1. Bilinear forms are functions \(Q:U\times U\to k \) that depend on each of the arguments linearly.
Alternatively, one can think of them as the linear operators
\[
A:U\to U^*, \mathrm{ \ with\ } Q(u,v)=A(u) (v).
\]
If \(U\) has a basis, the bilinear form can be identified with the matrix of its coefficients:
\[
Q_{ij}=Q(e_i,e_j).
\]
(Notice that the order matters!)
Bilinear forms of rank 1 are just products of linear functions, \(\bra{u}\bra{v}\).
2. In the space of continuous functions on the interval \([a,b]\), a kernel \(K: [a,b]\times[a,b]\to\Real\) defines a bilinear form
\[
Q(f,g)=\int_a^b \int_a^b K(s,t)f(s)g(t) ds dt.
\]
3. A bilinear form can be symmetric, \(Q(u,v)=Q(v,u)\), or skew-symmetric, \(Q(u,v)=-Q(v,u)\), and each bilinear form is a sum of a symmetric and a skew-symmetric form. (Skew-)symmetric forms correspond to (skew-)symmetric matrices.
4. Passing to a new basis is straightforward:
\[
Q\mapsto C^\top Q C,
\]
where \(C \) is the matrix of passing from the new to the old basis, and \(C^\top\) is its transpose.
5. Given a bilinear form \(Q\), one can derive a quadratic form \(q(x):=Q(x,x)\). Adding a skew-symmetric form to a bilinear form leaves the associated quadratic form unchanged. Hence, one can assume that a quadratic form comes from a symmetric bilinear form, which can be recovered using the polarization process:
\[
Q(x,y)=\frac12(q(x+y)-q(x)-q(y)).
\]
6. A quadratic form on a real vector space is called positive definite if
\[
q(x)>0 \mathrm{\ for\ any\ } x\neq 0.
\]
An example of positive definite quadratic form is the standard Euclidean norm (in a basis \(\{e_k\}_{k=1}^n\)):
\[
q(x)=\sum x_k^2.
\]
One can use any positive definite quadratic form as the defining building block for a Euclidean space (define the norm of a vector as \(\|x\|^2:=q(x)\), the scalar product as \((x,y):=Q(x,y)\), etc). Any theorem about Euclidean spaces can be reformulated and proven in terms of a positive-definite quadratic form without any loss. Say, the Cauchy-Schwarz inequality
\[
(x,y)\leq \|x\|\|y\|
\]
translates into
\[
Q(x,y)\leq \sqrt{q(x)q(y)}.
\]
Still, one needs convenient coordinates. Bringing a positive-definite form to a normal form is done by the familiar process of Gram-Schmidt orthogonalization. The process is straightforward: given a basis \(f_k, k=1,\ldots, n\), set \(e_1:=f_1\), and then iterate for \(k\gt 1\):
\[
e_k=f_k+\sum_{l\lt k}c_{kl}e_l,
\]
where the coefficients \(c_{kl}\) are chosen so that \(Q(e_k,e_l)=0\) for \(l<k\), i.e.
\[
c_{kl}=-\frac{Q(f_k,e_l)}{Q(e_l,e_l)}.
\]
(Here we use the fact that \(e_l\neq 0\), thanks to the linear independence of the \(f\)’s, and the positive definiteness of \(Q\).)
Exercise: Consider the space of real polynomial functions of degree at most 3. Consider the form
\[
q_1(f):=|f(-1)|^2+|f(0)|^2+|f(1)|^2.
\]
Is this form positive definite? Consider another form,
\[
q(f)=\int_0^\infty e^{-x}|f(x)|^2dx.
\]
Diagonalize the form using Gram-Schmidt procedure starting with the standard monomial basis \(\{1,x,x^2,x^3\}\).
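A symbolic sketch of the requested Gram-Schmidt run (sympy assumed; not part of the exercise statement), using \(Q(f,g)=\int_0^\infty e^{-x}f(x)g(x)\,dx\) on the monomial basis:

```python
import sympy as sp

x = sp.symbols('x')

def Q(f, g):
    # the bilinear form of the exercise
    return sp.integrate(sp.exp(-x) * f * g, (x, 0, sp.oo))

basis = [sp.Integer(1), x, x**2, x**3]
e = []
for f in basis:
    # subtract the Q-projections onto the already-built vectors
    ek = f - sum(Q(f, el) / Q(el, el) * el for el in e)
    e.append(sp.expand(ek))
print(e)  # [1, x - 1, x**2 - 4*x + 2, x**3 - 9*x**2 + 18*x - 6]
```

The output is (up to normalization) the first few Laguerre polynomials, which are exactly the orthogonal polynomials for the weight \(e^{-x}\) on \([0,\infty)\).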
7. Of course, not only positive definite quadratic forms can be brought to a simple normal form; the Jacobi diagonalization process always works. It creates a sequence of linear functions \(e^*_k, k=1,\ldots,n\) (that is, elements of the dual space) such that
\[
q(v)=\sum_k a_k e^*_k(v)^2.
\]
The procedure works like this: in the original basis, the form is
\[
\sum_{k,l} Q_{kl}x_kx_l.
\]
If \(Q_{11}\neq 0\), one can represent
\[
q(x)=Q_{11}\Bigl(x_1+\sum_{l\geq 2}(Q_{1l}/Q_{11}) x_l\Bigr)^2+q^{(1)},
\]
where \(q^{(1)}\) depends only on the coordinates \(x_2,\ldots,x_n\). Set \(y_1=x_1+\sum_{l\geq 2}(Q_{1l}/Q_{11}) x_l\); continuing inductively, we get a diagonalized quadratic form.
If at some step all diagonal elements vanish, \(Q_{ll}=0\), pass to the coordinates \(x_l-x_m,\ x_l+x_m\).
8. The numbers \(n_+, n_-\) of positive and negative coefficients in a diagonal representation of the quadratic form are called its signature. The signature is an invariant of a quadratic form (that is, whatever diagonalization process is used, the signature of the result is always the same). In fact, \(n_+\) (respectively, \(n_-\)) is the largest dimension of a subspace on which the restriction of the quadratic form is positive (negative) definite. At the same time, \(n_0:=n-n_+-n_-\) is the dimension of \(\ker Q\).
9. The diagonalization process can be made perfectly efficient if the leading principal minors of the Gram matrix, that is the determinants
\[
\Delta_0:=1, \Delta_1:=Q_{11}, \Delta_2=\left|\begin{array}{cc}Q_{11}& Q_{12}\\Q_{21}&Q_{22}\\\end{array}\right|,\ldots
\]
are non-vanishing. In this case, the diagonalization results in the quadratic form
\[
\sum_{k=1}^n\frac{\Delta_k}{\Delta_{k-1}}y_k^2,
\]
implying the Sylvester criterion for a quadratic form to be positive definite: all the leading principal minors of its Gram matrix (with respect to any basis) should be positive.
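Here is a small numerical illustration of the criterion (numpy assumed; the matrix is a standard example, not from the notes): the leading principal minors are all positive, in agreement with the eigenvalue test.

```python
import numpy as np

Q = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])   # a classic positive definite matrix

# leading principal minors Delta_1, Delta_2, Delta_3
minors = [np.linalg.det(Q[:k, :k]) for k in range(1, 4)]
print(minors)                             # [2.0, 3.0, 4.0] up to rounding
print(np.all(np.linalg.eigvalsh(Q) > 0))  # True, agreeing with the criterion
```

Note the diagonal coefficients \(\Delta_k/\Delta_{k-1}\) here would be \(2,\ 3/2,\ 4/3\), all positive.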
Exercise: Find the signature of the quadratic form \(q_1\) above.
Exercise: do there exist matrices \(A,B\) such that \(AB-BA=E\)?
Solution: let \(f\) be some central function, say, the trace \(f(A) = \text{tr}(A)\). If \(AB-BA = E\), then $$ \underbrace{\text{tr}(AB-BA)}_{=0} = \underbrace{\text{tr}(E)}_{=N} $$ However, the left hand side vanishes (as the trace is a central function), whereas the right hand side is \(N\), the dimension of the square (!) matrix \(A\), hence no such matrices exist.
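The key identity \(\operatorname{tr}(AB-BA)=0\) is easy to see numerically (numpy assumed; random matrices as an example):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

print(np.trace(A @ B - B @ A))  # ~0 up to rounding: trace kills commutators
print(np.trace(np.eye(4)))      # 4.0 = N, which is why AB - BA = E is impossible
```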
1.
A linear operator \(A:U\to U\) that maps a space into itself is called an endomorphism. Such operators can be composed with impunity. In elevated language, they form an algebra. (It is a generalization of our representation of complex numbers as \(2\times 2\)-matrices.)
2.
This means we can form some functions of operators. Polynomials are the easiest ones: if \(P=a_0x^n+\ldots+a_n\), then
\[
P(A)=a_0A^n+\ldots+a_n E.
\]
Other functions can be defined as well, if they can be approximated by polynomials: the exponential is a familiar example:
\[
\exp(A)=\sum_{k\geq 0} A^k/k!
\]
Of course, one needs to make sure that this expression makes sense, that is, converges (it does, as can be proved using the notion of a matrix norm). So one can form functions like \(\sin(A)\). These functions can be very useful, but it is hard to beat the matrix exponential.
3.
Indeed, just like the usual exponential function \(\e(t)=\exp(at)\) satisfies the differential equation
\(d\e/dt=a\e\) with initial condition \(\e(0)=1\), the matrix exponential \(\e(t)=\exp(At)\) satisfies
\[
\frac{d\e(t)}{dt}=A\e(t),\quad \e(0)=E.
\]
One example of matrix exponential we know already:
\[
\exp\left(t\left(\begin{array}{cc}0&1\\-1&0\end{array}\right)\right)
\]
is the matrix of rotation by \(t\). This carries over to dimension 3: take any skew-symmetric matrix
\[
\left(\begin{array}{ccc}
0&c&-b\\
-c&0&a\\
b&-a&0
\end{array}
\right)
\]
then its matrix exponential is a rotation around the axis \((a,b,c)\)…
Compositions (matrix products) of rotations are rotations again. So the rotations around various axes form a group (the group of rotations), called SO(3). We’ll encounter it (and, more generally, discuss what a group is) later on.
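This can be verified numerically (scipy assumed; the values of \(a,b,c\) are an arbitrary example): the exponential of the skew-symmetric matrix above is orthogonal with determinant 1, and it fixes the axis \((a,b,c)\).

```python
import numpy as np
from scipy.linalg import expm

a, b, c = 1.0, 2.0, 3.0
S = np.array([[0.0,   c,  -b],
              [ -c, 0.0,   a],
              [  b,  -a, 0.0]])   # the skew-symmetric matrix from the text
R = expm(S)
axis = np.array([a, b, c])

print(np.allclose(R.T @ R, np.eye(3)))     # True: R is orthogonal
print(np.isclose(np.linalg.det(R), 1.0))   # True: a proper rotation
print(np.allclose(R @ axis, axis))         # True: the axis (a, b, c) is fixed
```

The axis is fixed because \(S\,(a,b,c)^\top = 0\), so every power of \(S\) kills it except the identity term of the exponential series.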
4.
Unlike products of numbers (elements of a field, like reals or rationals), the product of matrices is in general noncommutative: \(AB\) is not always the same as \(BA\). The difference is called the commutator:
\[
[A,B]=AB-BA.
\]
This leads to some complications: say,
\(\exp(A+B)\) is not always the same as \(\exp(A)\exp(B)\) (but it is if \(AB=BA\)).
This noncommutativity is not always bad: the noncommutativity of rotation operators in 3D is responsible for a big chunk of our physics.
Exercise: let \(A\) be the diagonal matrix with elements \(1,2,\ldots, d\) on the diagonal. Find all matrices commuting with \(A\).
5.
Quite often knowing that a function of an operator vanishes allows one to characterize the operator.
Example: If \(A^2-A=0\), then the operator is a projector: the space splits as \(U=U_0\oplus U_1\), and
\[
A|_{U_1}=E_{U_1}, A|_{U_0}=0.
\]
Another example: nilpotent operators, that is, operators a power of which vanishes: \(A\) is nilpotent if \(A\neq 0\) and \(A^n=0 \) for some \(n>1\). One can show that any nilpotent operator is a strictly upper-triangular matrix in some basis.
6.
If we have an operator in \(L(U,V)\) and bases in both of the spaces, we obtain the matrix of the operator.
Changing the basis (in \(U\) or in \(V\)) leads to multiplications of the matrix \(A_{ij}\) on the right and on the left by the change of the basis matrices:
\[
A=\sum_{i,j}\ket{f_i}A_{ij}\bra{e_j}=\sum_{i’}\ket{f’_{i’}}\bra{f’_{i’}}\sum_{i,j}\ket{f_i}A_{ij}\bra{e_j}\sum_{j’}\ket{e’_{j’}}\bra{e’_{j’}}=\sum_{i’j’}A’_{i’j’}\ket{f’_{i’}}\bra{e’_{j’}},
\]
where the coefficients \(A’_{i’j’}\) of the matrix of operator \(A\) in the new bases are given by
\[
A’_{i’j’}=\sum_{i,j}B_{i’i}A_{ij}C_{jj’}=\sum_{i,j}\braket{f’_{i’}}{f_i}A_{ij}\braket{e_{j}}{e’_{j’}}
\]
(with \(B\) and \(C\) the invertible basis change matrices).
In the case of endomorphisms, i.e. when \(U=V\), the two change of basis matrices are inverse to each other: \(BC=E\). Alternatively, if \((A_{ij})\) is the matrix of the endomorphism \(A\) in a basis \(e=f\), then the matrix in the basis \(e’=f’\) is given by
\[
A’=BAB^{-1},
\]
where \(B\) is the basis change matrix.
One nice property of the functions of operators approximable by polynomials is that they commute with the change of basis:
\[
f(BAB^{-1})=Bf(A)B^{-1}.
\]
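A quick check of this identity (numpy assumed; the polynomial \(f(A)=A^2+A\) and the random matrices are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))       # generically invertible
Binv = np.linalg.inv(B)

def f(M):
    return M @ M + M              # the polynomial x^2 + x applied to a matrix

# f(B A B^{-1}) = B f(A) B^{-1}: conjugation passes through the polynomial
print(np.allclose(f(B @ A @ Binv), B @ f(A) @ Binv))  # True
```

The reason is visible term by term: \((BAB^{-1})^k = BA^kB^{-1}\), since the inner \(B^{-1}B\) factors cancel.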
7.
If a \(\Real\)- or \(\Comp\)-valued function of a matrix is invariant under change of basis,
\[
f(BAB^{-1})=f(A),
\]
such a function is called central.
Immediate examples are \(\det(A)\) and the coefficients of the characteristic polynomial,
\[
p_A(z)=\det(zE-A)=z^n-\tr(A)z^{n-1}+\ldots+(-1)^n\det(A).
\]
8.
In fact, all the central functions of a matrix (or operator) are representable as functions of the coefficients of the characteristic polynomial, or as \(\det(f(A))\) for some operator function of \(A\), or as functions of the spectrum of \(A\) (the roots of the characteristic polynomial).
Another characterization, for linear central functions (the multiples of the trace):
\[
f(AB-BA)=0
\]
for any \(A,B\).
Do there exist matrices \(A,B: AB-BA=E\)?
9.
The characteristic polynomial is useful as it allows one to capture eigenvectors, spanning the 1-dimensional subspaces left invariant by the operator: we say that \(\lambda\) is an eigenvalue, and \(v\neq 0\) the corresponding eigenvector, if
\[
Av=\lambda v.
\]
Example: if the characteristic polynomial has only simple roots, there exists a basis consisting of eigenvectors.
For a polynomial (or analytic function) \(f\), and an eigenvalue/eigenvector pair \(\lambda,v\), one can immediately see that
\[
f(A)v=f(\lambda)v.
\]
In particular, if \(\lambda\) is an element of the spectrum of \(A\) (i.e., if \(p_A(\lambda)=0\)), then there exists a corresponding eigenvector.
10.
One can prove the following remarkable
Theorem (Cayley-Hamilton):
\[
p_A(A)=0.
\]
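The theorem is easy to test numerically (numpy assumed; the matrix is a random example): evaluating the characteristic polynomial at \(A\) itself, by a Horner scheme with matrix argument, gives the zero matrix up to rounding.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4))

coeffs = np.poly(A)              # characteristic polynomial coefficients,
                                 # leading coefficient first
P = np.zeros_like(A)
for ck in coeffs:                # Horner scheme: P <- P*A + c_k*E
    P = P @ A + ck * np.eye(4)

print(np.linalg.norm(P))  # ~0 up to rounding: p_A(A) = 0
```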
Those are Fibonacci numbers with initial conditions obtained by the first two determinants: $$ j_1 = \text{det}(a) = a = 1,~~~ ~~~ j_2 = \left| \begin{matrix} a & b \\ c & a \end{matrix} \right| = a^2-bc = 1+1 = 2, $$ and the rest follows the standard Fibonacci sequence (\(1,2,3,5,8,13,\ldots\)). For the second case, \(a=-1,bc=-1\), we similarly get $$ j_{k+1}=aj_k-bcj_{k-1} = -j_k+j_{k-1}$$ with initial conditions \(j_1 = -1\) and \(j_2 = 1+1 = 2 \), which is the alternating Fibonacci sequence (\(-1,2,-3,5,-8,13,\ldots\)).
Exercise: Find the determinant of
$$
A(x_1,\ldots,x_n)=\left( \begin{matrix} 1 & 1 & 1 & \ldots & 1 \\ x_1 & x_2 & x_3 & & x_n \\ x_1^2 & x_2^2 & x_3^2 & & x_n^2 \\ \vdots & & & \ddots & \\
x_1^{n-2} & x_2^{n-2} & x_3^{n-2} & \ldots & x_n^{n-2}\\
x_1^n & x_2^n & x_3^n & \ldots & x_n^n \end{matrix} \right)
$$
Solution: Experiments with small matrices should convince you that this determinant is equal to
\[
(x_1+\ldots+x_n)V(x_1,\ldots,x_n),
\]
where \(V(x_1,\ldots,x_n)\) is the standard Vandermonde determinant.
This can be proved by multiplying the Vandermonde matrix from the left by
\[
\left(
\begin{array}{cccccc}
1&0&\ldots&0&0&0\\
0&1&\ldots&0&0&0\\
&&\ddots&&\vdots&\\
0&0&\ldots&0&1&0\\
(-1)^{n-1}s_n &(-1)^{n-2}s_{n-1} & \ldots &s_3 &-s_2 & s_1\\
\end{array}
\right)
\]
where
\[
s_1=\sum_{i=1}^n x_i,\quad s_2=\sum_{i<j}x_i x_j,\quad s_3=\sum_{i<j<k}x_ix_jx_k,\ldots
\]
are elementary symmetric functions. One can easily see that this product is \(A(x_1,\ldots,x_n)\), whence the result.
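A symbolic spot check for \(n=3\) (sympy assumed; not part of the solution): the determinant with the top power bumped from \(n-1\) to \(n\) indeed equals \((x_1+\ldots+x_n)\) times the Vandermonde determinant.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
A = sp.Matrix([[1,     1,     1],
               [x1,    x2,    x3],
               [x1**3, x2**3, x3**3]])   # row of squares replaced by cubes (n = 3)
V = (x2 - x1) * (x3 - x1) * (x3 - x2)    # standard Vandermonde determinant

print(sp.expand(A.det() - (x1 + x2 + x3) * V))  # 0
```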
1. The determinant is a function of square matrices (it can be defined for any operator \(A:U\to U\), but we’ll avoid this abstract detour). It can be defined as a function that has the following properties:
2. It is quite easy to see that these properties have the following important implication:
\[
\det(AB)=\det(A)\det(B).
\]
In particular, this implies that if \(A\) is invertible, then \(\det(A)\det(A^{-1})=\det(E)=1\), so the determinant is non-zero!
Conversely, if \(A\) is not invertible, there exists a vector \(x\neq 0:Ax=0\), and hence one of the columns can be represented as a linear combination of the others. This, by the properties of the determinants, immediately implies that \(\det A = 0\). In other words,
\[
A \mathrm{\ is\ invertible\ }\Leftrightarrow \det A\neq 0.
\]
3. One can also derive Cramer’s rule:
If the vector \(x\) with coordinates \(x_1,\ldots, x_n\) solves the system of linear equations
\(Ax=b\) (with square matrix \(A\)), that is if
\[
\begin{array}{cccc}
a_{1,1}x_1+&\ldots&+a_{1,n}x_n&=b_1\\
\vdots&&\vdots&\vdots\\
a_{n,1}x_1+&\ldots&+a_{n,n}x_n&=b_n\\
\end{array}
\]
then for any \(k=1,\ldots,n\),
\[
x_k\det(A)=\det(a_1,\ldots,a_{k-1},b,a_{k+1},\ldots,a_n),
\]
where \((a_1,\ldots,a_{k-1},b,a_{k+1},\ldots,a_n)\) is the matrix \(A\) with \(k\)-th column \(a_k\) replaced by the column-vector \(b\).
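The rule in action (numpy assumed; the system below is a small example): each coordinate is a ratio of determinants, and the result matches the direct solve.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

x = np.empty(3)
for k in range(3):
    Ak = A.copy()
    Ak[:, k] = b                 # replace the k-th column by b
    x[k] = np.linalg.det(Ak) / np.linalg.det(A)

print(np.allclose(A @ x, b))  # True: Cramer's rule solves the system
```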
4. This is an awfully useful result. In particular, it implies a formula for the inverse matrix: if one denotes by \(M_{kl}\) the determinant of the matrix obtained from \(A\) by deleting its \(k\)-th row and \(l\)-th column (called the \(k,l\)-th minor), then the matrix with coefficients
\[
B_{kl}=(-1)^{k+l}M_{lk} \mathrm{(notice\ switched\ indices!)}
\]
satisfies
\[
AB=BA=\det(A)E.
\]
In other words, the inverse matrix to \(A\) is a polynomial in its coefficients, divided by \(\det(A)\)…
5. One can also obtain
\[
\det(A)=\sum_\sigma (-1)^{s(\sigma)}a_{1\sigma_1}\cdot\ldots\cdot a_{n\sigma_n}.
\]
From here one can obtain that the determinant of a block-triangular matrix is
\[
\det\left(
\begin{array}{cc}
A&B\\
0&D\\
\end{array}
\right)=\det(A)\det(D)
\]
for square-sized \(A,D\).
6. There is an important way to reduce large determinants to smaller ones: the Schur complement:
\[
\mathrm{If\ }\det(A)\neq 0, \det\left(
\begin{array}{cc}
A&B\\
C&D\\
\end{array}
\right)=\det(A)\det(D-CA^{-1}B).
\]
(Notice that all the matrices make sense!)
7. One more general result (Binet-Cauchy formula) about determinants:
Consider two collections of functions, \(f_i,g_i, i=1,\ldots,n\), and the matrix whose coefficients are integrals of products of these functions:
\[
A_{ij}=\int_a^b f_i(x)g_j(x)dx.
\]
Then
\[
\det(A)=\int\ldots\int_{a<x_1<x_2\ldots<x_n} \det(F(x_1,\ldots,x_n))\det(G(x_1,\ldots,x_n))dx_1\ldots dx_n.
\]
(Here \(F(x_1,\ldots,x_n)\) is the matrix with the entries
\[
F_{ij}=f_i(x_j),
\]
and similarly for \(G\).)
8. There are way too many interesting and useful determinantal formulae… We’ll cover just a few.
A great compendium of results and methods can be found in this survey by Krattenthaler.
Vandermonde: well-known. It appears when one solves the Lagrange interpolation problem:
to find a polynomial \(p=a_0x^{n}+a_1x^{n-1}+\ldots+a_n\) of degree \(n\) which takes given values \(y_0, y_1,\ldots,y_n\) at given points \(x_0, x_1,\ldots,x_n\).
The result is easy to obtain in different ways (it is
\[
\sum_k y_k\frac{(x-x_0)\ldots(x-x_{k-1})(x-x_{k+1})\ldots(x-x_n)}{(x_k-x_0)\ldots(x_k-x_{k-1})(x_k-x_{k+1})\ldots(x_k-x_n)},
\]
as one can verify easily), and this also gives many interesting identities…
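The displayed formula translates directly into code (numpy assumed; the nodes and values below are sample data): the resulting polynomial reproduces the prescribed values at the nodes.

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])   # interpolation nodes x_0, ..., x_n
ys = np.array([1.0, 2.0, 0.0, 5.0])   # prescribed values y_0, ..., y_n

def lagrange(t):
    # sum over k of y_k times the basis polynomial that is 1 at x_k, 0 elsewhere
    total = 0.0
    for k in range(len(xs)):
        others = np.delete(xs, k)
        total += ys[k] * np.prod(t - others) / np.prod(xs[k] - others)
    return total

print([lagrange(xk) for xk in xs])  # recovers [1.0, 2.0, 0.0, 5.0]
```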
Very useful is also the Cauchy determinant, for \(A_{kl}=\frac{1}{x_k+y_l}\):
\[
\det A=\frac{\prod_{k<l}(x_{k}-x_{l})(y_{k}-y_{l})}{\prod_{k,l}(x_k+y_l)}.
\]
Jacobi matrices appear in many problems:
\[
J_k=\left(
\begin{array}{ccccc}
a&b&0&\ldots&0\\
c&a&b&\ldots&0\\
0&c&a&\ldots&0\\
\vdots&\vdots&\vdots&\ddots&\vdots\\
0&0&0&\ldots&a\\
\end{array}
\right)
\]
Their determinants satisfy the recursion
\[
j_{k+1}=aj_k-bc j_{k-1}.
\]
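This recursion (expansion of \(\det J_{k+1}\) along the first row) is easy to confirm numerically; the sketch below (numpy assumed) takes \(a=1\), \(bc=-1\), the first case discussed earlier, and recovers the Fibonacci numbers:

```python
import numpy as np

a, b, c = 1.0, 1.0, -1.0   # with bc = -1 the recursion j_{k+1} = j_k + j_{k-1}

def J(k):
    # tridiagonal k x k matrix: a on the diagonal, b above, c below
    M = a * np.eye(k) + b * np.eye(k, k=1) + c * np.eye(k, k=-1)
    return np.linalg.det(M)

dets = [round(J(k)) for k in range(1, 7)]
print(dets)  # [1, 2, 3, 5, 8, 13]
```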
Exercises: