Background Course in Mathematics
European University Institute
Department of Economics
Fall 2012
Antonio Villanacci
September 20, 2012

I would like to thank the following friends for helpful comments and discussions: Laura Carosi, Michele Gori, Vincent Maurin, Michal Markun, Marina Pireddu, Kiril Shakhnov, and all the students of the courses I have used these notes for in the past several years.
Chapter 1
Systems of linear equations
1.1 Linear equations and solutions
Definition 1¹ A linear equation in the unknowns x1, x2, ..., xn is an equation of the form

    a1 x1 + a2 x2 + ... + an xn = b,    (1.1)

where b ∈ R and ∀j ∈ {1, ..., n}, aj ∈ R. The real number aj is called the coefficient of xj, and b is called the constant of the equation. The aj for j ∈ {1, ..., n} and b are also called parameters of equation (1.1).
Definition 2 A solution to the linear equation (1.1) is an ordered n-tuple (x̄1, ..., x̄n) := (x̄j)_{j=1}^n such² that the following statement (obtained by substituting x̄j in the place of xj for any j) is true:

    a1 x̄1 + a2 x̄2 + ... + an x̄n = b.

The set of all such solutions is called the solution set or the general solution or, simply, the solution of equation (1.1).
The following fact is well known.
Proposition 3 Let the linear equation

    ax = b    (1.2)

in the unknown (variable) x ∈ R with parameters a, b ∈ R be given. Then,

1. if a ≠ 0, then x = b/a is the unique solution to (1.2);

2. if a = 0 and b ≠ 0, then (1.2) has no solutions;

3. if a = 0 and b = 0, then any real number is a solution to (1.2).
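The three cases of Proposition 3 translate directly into a small branch. The following is a minimal sketch in Python; the helper name `solve_ax_eq_b` and the tagged return values are illustrative conventions, not part of the text.

```python
def solve_ax_eq_b(a, b):
    """Solve a*x = b over the reals, following Proposition 3.

    Returns ("unique", x) when a != 0, ("none", None) when the equation
    is inconsistent, and ("all", None) when every real number solves it."""
    if a != 0:
        return ("unique", b / a)   # case 1: x = b/a is the unique solution
    if b != 0:
        return ("none", None)      # case 2: 0*x = b with b != 0
    return ("all", None)           # case 3: 0*x = 0
```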
Definition 4 A linear equation (1.1) is said to be degenerate if ∀j ∈ {1, ..., n}, aj = 0, i.e., it has the form

    0x1 + 0x2 + ... + 0xn = b.    (1.3)

Clearly,

1. if b ≠ 0, then equation (1.3) has no solution;

2. if b = 0, then any n-tuple (xj)_{j=1}^n is a solution to (1.3).
Definition 5 Let a nondegenerate equation of the form (1.1) be given. The leading unknown of the linear equation (1.1) is the first unknown with a nonzero coefficient, i.e., xp is the leading unknown if

    ∀j ∈ {1, ..., p − 1}, aj = 0 and ap ≠ 0.

For any j ∈ {1, ..., n} \ {p}, xj is called a free variable - consistently with the following obvious result.

¹ In this part, I often follow Lipschutz (1991).
² “:=” means “equal by definition”.
Proposition 6 Consider a nondegenerate linear equation a1 x1 + a2 x2 + ... + an xn = b with leading unknown xp. Then the set of solutions to that equation is

    { (xk)_{k=1}^n : ∀j ∈ {1, ..., n} \ {p}, xj ∈ R and xp = (b − Σ_{j∈{1,...,n}\{p}} aj xj) / ap }.
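Proposition 6 says that once arbitrary values are chosen for the free variables, the leading unknown is pinned down. A minimal sketch of this recipe follows; the helper name `complete_solution` and the 0-based indexing are our own conventions.

```python
def complete_solution(a, b, free_values):
    """Given a nondegenerate equation a1*x1 + ... + an*xn = b, fill in
    the chosen values of the free variables and solve for the leading
    unknown, as in Proposition 6.

    `a` is the list of coefficients; `free_values` maps the (0-based)
    index of each free variable to its chosen value."""
    p = next(j for j, aj in enumerate(a) if aj != 0)  # leading unknown index
    x = [0.0] * len(a)
    for j, v in free_values.items():
        x[j] = v
    # solve for the leading unknown given the free-variable values
    x[p] = (b - sum(a[j] * x[j] for j in range(len(a)) if j != p)) / a[p]
    return x
```

For instance, for 2x2 + 3x3 = 12 with x1 and x3 free, choosing x1 = 5 and x3 = 2 forces x2 = 3.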
1.2 Systems of linear equations, equivalent systems and elementary operations
Definition 7 A system of m linear equations in the n unknowns x1, x2, ..., xn is a system of the form

    a11 x1 + ... + a1j xj + ... + a1n xn = b1
    ...
    ai1 x1 + ... + aij xj + ... + ain xn = bi      (1.4)
    ...
    am1 x1 + ... + amj xj + ... + amn xn = bm

where ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, aij ∈ R, and ∀i ∈ {1, ..., m}, bi ∈ R. We call Li the i-th linear equation of system (1.4).
A solution to the above system is an ordered n-tuple (xj)_{j=1}^n which is a solution of each equation of the system. The set of all such solutions is called the solution set of the system.
Definition 8 Systems of linear equations are equivalent if their solution sets are the same.
The following fact is obvious.
Proposition 9 Assume that a system of linear equations contains the degenerate equation

    L : 0x1 + 0x2 + ... + 0xn = b.

1. If b = 0, then L may be deleted from the system without changing the solution set;

2. if b ≠ 0, then the system has no solutions.
A way to solve a system of linear equations is to transform it into an equivalent system whose solution set is “easy” to find. In what follows, we make this statement precise.
Definition 10 An elementary operation on a system of linear equations (1.4) is one of the following operations:

[E1] Interchange Li with Lj, an operation denoted by Li ↔ Lj (which we can read “put Li in the place of Lj and Lj in the place of Li”);

[E2] Multiply Li by k ∈ R \ {0}, denoted by kLi → Li, k ≠ 0 (which we can read “put kLi in the place of Li, with k ≠ 0”);

[E3] Replace Li by (k times Lj plus Li), denoted by (Li + kLj) → Li (which we can read “put Li + kLj in the place of Li”).

Sometimes we apply [E2] and [E3] in one step, i.e., we perform the following operation:

[E] Replace Li by (k′ times Lj plus k ∈ R \ {0} times Li), denoted by (k′Lj + kLi) → Li, k ≠ 0.
Elementary operations are important because of the following obvious result.
Proposition 11 If S1 is a system of linear equations obtained from a system S2 of linear equations using a finite number of elementary operations, then systems S1 and S2 are equivalent.
In what follows, we first define two types of “simple” systems (systems in triangular and echelon form) and see why those systems are in fact “easy” to solve. Then, we show how to transform any system into one of those “simple” forms.
1.3 Systems in triangular and echelon form
Definition 12 A linear system (1.4) is in triangular form if the number of equations is equal to the number n of unknowns and ∀i ∈ {1, ..., n}, xi is the leading unknown of equation i, i.e., the system has the following form:

    a11 x1 + a12 x2 + ... + a1,n−1 xn−1 + a1n xn = b1
             a22 x2 + ... + a2,n−1 xn−1 + a2n xn = b2
                     ...                              (1.5)
                     an−1,n−1 xn−1 + an−1,n xn = bn−1
                                        ann xn = bn

where ∀i ∈ {1, ..., n}, aii ≠ 0.
Proposition 13 System (1.5) has a unique solution.
Proof. We can compute the solution of system (1.5) using the following procedure, known as back-substitution.
First, since by assumption ann ≠ 0, we solve the last equation with respect to the last unknown, i.e., we get

    xn = bn / ann.

Second, we substitute that value of xn in the next-to-last equation and solve it for the next-to-last unknown, i.e.,

    xn−1 = (bn−1 − an−1,n · bn/ann) / an−1,n−1,

and so on. The process ends when we have determined the first unknown, x1.
Observe that the above procedure shows that the solution to a system in triangular form is unique since, at each step of the algorithm, the value of each xi is uniquely determined, as a consequence of Proposition 3, conclusion 1.
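The back-substitution procedure in the proof can be sketched in a few lines of Python. The function name `back_substitute` is ours; the matrix is assumed to satisfy the triangular-form conditions of Definition 12.

```python
def back_substitute(A, b):
    """Solve a triangular system (1.5) by back-substitution.

    A is the n x n coefficient matrix with A[i][i] != 0 and zeros
    below the main diagonal; b is the list of constants."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                   # last equation first
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]                  # unique by Proposition 3.1
    return x
```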
Definition 14 A linear system (1.4) is said to be in echelon form if
1. no equation is degenerate, and
2. the leading unknown in each equation is to the right of the leading unknown of the preceding
equation.
In other words, the system is of the form

    a11 x1 + ... + a1j2 xj2 + ...                    + a1s xs = b1
             a2j2 xj2 + ... + a2j3 xj3 + ...         + a2s xs = b2
                      a3j3 xj3 + ...                 + a3s xs = b3      (1.6)
                               ...
                                      arjr xjr + ... + ars xs = br

with j1 := 1 < j2 < ... < jr and a11, a2j2, ..., arjr ≠ 0. Observe that the above system has r equations and s unknowns and that s ≥ r. The leading unknown in equation i ∈ {1, ..., r} is xji.
Remark 15 Systems with no degenerate equations are the “interesting” ones. If an equation is
degenerate and the right hand side term is zero, then you can erase it; if the right hand side term
is not zero, then the system has no solutions.
Definition 16 An unknown xk in system (1.6) is called a free variable if xk is not the leading unknown in any equation, i.e., ∀i ∈ {1, ..., r}, xk ≠ xji.
In system (1.6), there are r leading unknowns, r equations and s − r ≥ 0 free variables.
Proposition 17 Let a system in echelon form with r equations and s variables be given. Then,
the following results hold true.
1. If s = r, i.e., the number of unknowns is equal to the number of equations, then the system
has a unique solution;
2. if s > r, i.e., the number of unknowns is greater than the number of equations, then we can arbitrarily assign values to the s − r > 0 free variables and obtain solutions of the system.
Proof. We prove the theorem by induction on the number r of equations of the system.
Step 1. r = 1.
In this case, we have a single, nondegenerate linear equation, to which Proposition 6 applies if
s > r = 1, and Proposition 3 applies if s = r = 1.
Step 2.
Assume that r > 1 and the desired conclusion is true for a system with r −1 equations. Consider
the given system in the form (1.6) and erase the first equation, so obtaining the following system:

    a2j2 xj2 + ... + a2j3 xj3 + ... + a2s xs = b2
              a3j3 xj3 + ...        + a3s xs = b3      (1.7)
                       ...
              arjr xjr + ...        + ars xs = br

in the unknowns xj2, ..., xs. First of all, observe that the above system is in echelon form and has r − 1 equations; therefore we can apply the induction hypothesis, distinguishing the two cases s > r and s = r.
If s > r, then we can assign arbitrary values to the free variables, whose number is (the “old” number minus the erased ones)

    s − r − (j2 − j1 − 1) = s − r − j2 + 2,
and obtain a solution of system (1.7). Consider the first equation of the original system

    a11 x1 + a12 x2 + ... + a1,j2−1 xj2−1 + a1j2 xj2 + ... = b1.    (1.8)

We immediately see that the values found above, together with arbitrary values for the additional j2 − 2 free variables of equation (1.8), yield a solution of that equation, as desired. Observe also that the values given to the variables x1, ..., xj2−1 from the first equation satisfy the other equations simply because their coefficients are zero there.
If s = r, the system in echelon form is in fact a system in triangular form, and then the solution exists and is unique.
Remark 18 From the proof of the previous Proposition, if the echelon system (1.6) contains more
unknowns than equations, i.e., s > r, then the system has an infinite number of solutions since each
of the s − r ≥ 1 free variables may be assigned an arbitrary real number.
1.4 Reduction algorithm
The following algorithm (sometimes called row reduction) reduces system (1.4) of m equations in n unknowns to either echelon form or triangular form, or shows that the system has no solution.
The algorithm then gives a proof of the following result.
Proposition 19 Any system of linear equations has either

1. infinitely many solutions, or

2. a unique solution, or

3. no solutions.
Reduction algorithm.
Consider a system of the form (1.4) such that

    ∀j ∈ {1, ..., n}, ∃i ∈ {1, ..., m} such that aij ≠ 0,    (1.9)

i.e., a system in which each variable has a nonzero coefficient in at least one equation. If that is not the case, the remaining variables can be renamed in order to have (1.9) satisfied.
Step 1. Interchange equations so that the first unknown, x1 , appears with a nonzero coefficient in
the first equation; i.e., arrange that a11 6= 0.
Step 2. Use a11 as a “pivot” to eliminate x1 from all equations but the first equation. That is, for each i > 1, apply the elementary operation

    [E3]: (−ai1/a11) L1 + Li → Li

or

    [E]: −ai1 L1 + a11 Li → Li.
Step 3. Examine each new equation L:

1. If L has the form

    0x1 + 0x2 + ... + 0xn = 0,

or if L is a multiple of another equation, then delete L from the system.³

2. If L has the form

    0x1 + 0x2 + ... + 0xn = b

with b ≠ 0, then exit the algorithm: the system has no solutions.
Step 4. Repeat Steps 1, 2 and 3 with the subsystem formed by all the equations, excluding the
first equation.
Step 5. Continue the above process until the system is in echelon form or a degenerate equation
is obtained in Step 3.2.
Summarizing, our method for solving system (1.4) consists of two steps:
Step A. Use the above reduction algorithm to reduce system (1.4) to an equivalent simpler
system (in triangular form, system (1.5) or echelon form (1.6)).
Step B. If the system is in triangular form, use back-substitution to find the solution; if the
system is in echelon form, bring the free variables on the right hand side of each equation, give
them arbitrary values (say, the name of the free variable with an upper bar), and then use back-
substitution.
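Steps 1-5 of the reduction algorithm can be sketched as follows. The function name `reduce_system` is ours, and each equation Li is represented by an augmented row [ai1, ..., ain, bi]; this is a minimal sketch, not a full solver (it omits the "multiple of another equation" deletion of Step 3.1, which exact elimination makes redundant here).

```python
def reduce_system(M):
    """Reduce an augmented matrix (one row [ai1, ..., ain, bi] per
    equation) toward echelon form following Steps 1-5. Returns the
    echelon rows, or None as soon as a degenerate equation
    0 = b with b != 0 shows up (Step 3.2: no solutions)."""
    rows = [r[:] for r in M]
    ncols = len(M[0]) - 1              # number of unknowns
    out = []
    col = 0
    while rows and col < ncols:
        piv = next((i for i, r in enumerate(rows) if r[col] != 0), None)
        if piv is None:                # no remaining equation uses this unknown
            col += 1
            continue
        rows[0], rows[piv] = rows[piv], rows[0]      # Step 1: interchange
        head = rows.pop(0)
        for r in rows:                               # Step 2: eliminate below
            f = r[col] / head[col]
            for j in range(len(r)):
                r[j] -= f * head[j]
        kept = []
        for r in rows:                               # Step 3: inspect
            if all(v == 0 for v in r[:-1]):
                if r[-1] != 0:
                    return None                      # 0 = b with b != 0
            else:
                kept.append(r)
        rows = kept
        out.append(head)
        col += 1                                     # Steps 4-5: repeat
    if any(r[-1] != 0 for r in rows):                # leftover degenerate rows
        return None
    return out
```

Running it on the system of Example 20 returns None, matching the conclusion reached by hand there.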
Example 20
    x1 + 2x2 + (−3)x3 = −1
    3x1 + (−1)x2 + 2x3 = 7
    5x1 + 3x2 + (−4)x3 = 2
Step A.
Step 1. Nothing to do.
³ The justification of Step 3 is Proposition 9 and the fact that if L = kL′ for some other equation L′ in the system, then the operation −kL′ + L → L replaces L by 0x1 + 0x2 + ... + 0xn = 0, which again may be deleted by Proposition 9.
Step 2. Apply the operations
−3L1 + L2 → L2
and
−5L1 + L3 → L3 ,
to get

    x1 + 2x2 + (−3)x3 = −1
         (−7)x2 + 11x3 = 10
         (−7)x2 + 11x3 = 7
Step 3. Examine the new equations L2 and L3:

1. L2 and L3 do not have the form

    0x1 + 0x2 + ... + 0xn = 0;

L2 is not a multiple of L3;

2. L2 and L3 do not have the form

    0x1 + 0x2 + ... + 0xn = b with b ≠ 0.
Step 4.
Step 1.1 Nothing to do.
Step 2.1 Apply the operation
−L2 + L3 → L3
to get

    x1 + 2x2 + (−3)x3 = −1
         (−7)x2 + 11x3 = 10
    0x1 + 0x2 + 0x3 = −3
Step 3.1 L3 has the form

    0x1 + 0x2 + ... + 0xn = b

with b = −3 ≠ 0; we therefore exit the algorithm: the system has no solutions.
1.5 Matrices
Definition 21 Given m, n ∈ N \ {0}, a matrix (of real numbers) of order m × n is a table of real numbers with m rows and n columns, as displayed below:

    [ a11 a12 ... a1j ... a1n ]
    [ a21 a22 ... a2j ... a2n ]
    [ ...                     ]
    [ ai1 ai2 ... aij ... ain ]
    [ ...                     ]
    [ am1 am2 ... amj ... amn ]
For any i ∈ {1, ..., m} and any j ∈ {1, ..., n}, the real numbers aij are called entries of the matrix; the first subscript i denotes the row the entry belongs to, and the second subscript j denotes the column the entry belongs to. We will usually denote matrices by capital letters, and we will write Am×n to denote a matrix of order m × n. Sometimes it is useful to denote a matrix by its “typical” element, and we write [aij]_{i∈{1,...,m}, j∈{1,...,n}}, or simply [aij] if no ambiguity arises about the number of rows and columns. For i ∈ {1, ..., m},

    [ ai1 ai2 ... aij ... ain ]
is called the i-th row of A and is denoted by Ri(A). For j ∈ {1, ..., n},

    [ a1j ]
    [ a2j ]
    [ ... ]
    [ aij ]
    [ ... ]
    [ amj ]

is called the j-th column of A and is denoted by Cj(A).
We denote the set of m × n matrices by Mm,n, and we write, in an equivalent manner, Am×n or A ∈ Mm,n.
Definition 22 The matrix

    Am×1 = [ a1 ]
           [ ...]
           [ am ]

is called a column vector, and the matrix

    A1×n = [ a1 ... an ]

is called a row vector. We usually denote row or column vectors by lowercase Latin letters.
Definition 23 The first nonzero entry in a row R of a matrix Am×n is called the leading nonzero entry of R. If R has no leading nonzero entry, i.e., if every entry in R is zero, then R is called a zero row. If all the rows of A are zero rows, i.e., each entry of A is zero, then A is called a zero matrix, denoted by 0m×n or simply 0, if no confusion arises.
In the previous sections, we defined triangular and echelon systems of linear equations. Below, we define triangular matrices, echelon matrices, and a special kind of echelon matrix. In Section 1.6, we will see that there is a simple relationship between systems and matrices.
Definition 24 A matrix Am×n is square if m = n. A square matrix A belonging to Mm,m is called a square matrix of order m.

Definition 25 Given A = [aij] ∈ Mm,m, the main diagonal of A consists of the entries aii with i ∈ {1, ..., m}.

Definition 26 A square matrix A = [aij] ∈ Mm,m is an upper triangular matrix, or simply a triangular matrix, if all entries below the main diagonal are equal to zero, i.e., ∀i, j ∈ {1, ..., m}, if i > j, then aij = 0.

Definition 27 A ∈ Mm,m is called a diagonal matrix of order m if every element outside the main diagonal is equal to zero, i.e., ∀i, j ∈ {1, ..., m} such that i ≠ j, aij = 0.
Definition 28 A matrix A ∈ Mm,n is called an echelon (form) matrix, or is said to be in echelon form, if the following two conditions hold:

1. All zero rows, if any, are at the bottom of the matrix.

2. The leading nonzero entry of each row is to the right of the leading nonzero entry in the preceding row.

Definition 29 If a matrix A is in echelon form, then its leading nonzero entries are called pivot entries, or simply pivots.
Remark 30 If a matrix A ∈ Mm,n is in echelon form and r is the number of its pivot entries, then r ≤ min{m, n}. Indeed, r ≤ m because the matrix may have zero rows, and r ≤ n because the leading nonzero entry of the first row may not be in the first column, and each subsequent leading nonzero entry must be strictly to the right of the previous one.
Definition 31 A matrix A ∈ Mm,n is said to be in row canonical form if
1. it is in echelon form,
2. each pivot is 1, and
3. each pivot is the only nonzero entry in its column.
Example 32 1. All the matrices below are echelon matrices; only the fourth one is in row canonical form.

    [ 0 7 0 0  1 2 ]      [ 2 3 2 0  1 2 4 ]
    [ 0 0 0 1 −3 3 ]      [ 0 0 1 1 −3 3 0 ]
    [ 0 0 0 0  0 7 ]      [ 0 0 0 0  0 7 1 ]
    [ 0 0 0 0  0 0 ]      [ 0 0 0 0  0 0 0 ]

    [ 1 2 3 ]      [ 0 1 3 0 0  4 ]
    [ 0 0 1 ]      [ 0 0 0 1 0 −3 ]
    [ 0 0 0 ]      [ 0 0 0 0 1  2 ]
2. Any zero matrix is in row canonical form.
Remark 33 Let a matrix Am×n in row canonical form be given. As a consequence of the definition,
we have what follows.
1. If some rows from A are erased, the resulting matrix is still in row canonical form.
2. If some columns of zeros are added, the resulting matrix is still in row canonical form.
Definition 34 Denote by Ri the i-th row of a matrix A. An elementary row operation is one of the following operations on the rows of A:

[E1] (Row interchange) Interchange Ri with Rj, an operation denoted by Ri ↔ Rj (which we can read “put Ri in the place of Rj and Rj in the place of Ri”);

[E2] (Row scaling) Multiply Ri by k ∈ R \ {0}, denoted by kRi → Ri, k ≠ 0 (which we can read “put kRi in the place of Ri, with k ≠ 0”);

[E3] (Row addition) Replace Ri by (k times Rj plus Ri), denoted by (Ri + kRj) → Ri (which we can read “put Ri + kRj in the place of Ri”).

Sometimes we apply [E2] and [E3] in one step, i.e., we perform the following operation:

[E] Replace Ri by (k′ times Rj plus k ∈ R \ {0} times Ri), denoted by (k′Rj + kRi) → Ri, k ≠ 0.
Definition 35 A matrix A ∈ Mm,n is said to be row equivalent to a matrix B ∈ Mm,n if B can
be obtained from A by a finite number of elementary row operations.
It is hard not to recognize the similarity between the above operations and those used in solving systems of linear equations.
We use the expression “row reduce” to mean “transform a given matrix into another matrix using row operations”. The following algorithm “row reduces” a matrix A into a matrix in echelon form.
Row reduction algorithm to echelon form.
Consider a matrix A = [aij ] ∈ Mm,n .
Step 1. Find the first column with a nonzero entry. Suppose it is column j1 .
Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1 , i.e., so
that a1j1 6= 0.
Step 3. Use a1j1 as a “pivot” to obtain zeros below a1j1, i.e., for each i > 1, apply the row operation

    [E3]: (−aij1/a1j1) R1 + Ri → Ri

or

    [E]: −aij1 R1 + a1j1 Ri → Ri.
Step 4. Repeat Steps 1, 2 and 3 with the submatrix formed by all the rows, excluding the first
row.
Step 5. Continue the above process until the matrix is in echelon form.
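Steps 1-5 above can be sketched directly in Python; the function name `to_echelon` is ours, and the matrix is represented as a list of row lists. Example 36 below performs the same computation by hand.

```python
def to_echelon(A):
    """Row reduce matrix A (a list of rows) to echelon form, using the
    row operations of Steps 1-5; zero rows end up at the bottom."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    r = 0                          # row that will receive the next pivot
    for j in range(n):             # Step 1: first column with a nonzero entry
        piv = next((i for i in range(r, m) if A[i][j] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]        # Step 2: interchange rows
        for i in range(r + 1, m):          # Step 3: zeros below the pivot
            f = A[i][j] / A[r][j]
            A[i] = [A[i][k] - f * A[r][k] for k in range(n)]
        r += 1                             # Steps 4-5: move to the submatrix
    return A
```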
Example 36 Let’s apply the above algorithm to the following matrix:

    [ 1  2 −3 −1 ]
    [ 3 −1  2  7 ]
    [ 5  3 −4  2 ]
Step 1. Find the first column with a nonzero entry: that is C1 , and therefore j1 = 1.
Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1, i.e., so that a1j1 ≠ 0: here a1j1 = a11 = 1 ≠ 0, so no interchange is needed.
Step 3. Use a11 as a “pivot” to obtain zeros below a11 . Apply the row operations
−3R1 + R2 → R2
and
−5R1 + R3 → R3 ,
to get

    [ 1  2 −3 −1 ]
    [ 0 −7 11 10 ]
    [ 0 −7 11  7 ]
Step 4. Apply the operation

    −R2 + R3 → R3

to get

    [ 1  2 −3 −1 ]
    [ 0 −7 11 10 ]
    [ 0  0  0 −3 ]

which is in echelon form.
Row reduction algorithm from echelon form to row canonical form.
Consider a matrix A = [aij] ∈ Mm,n in echelon form, say with pivots

    a1j1, a2j2, ..., arjr.

Step 1. Multiply the last nonzero row Rr by 1/arjr, so that the leading nonzero entry of that row becomes 1.

Step 2. Use arjr as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1}, apply the row operation

    [E3]: −ai,jr Rr + Ri → Ri.

Step 3. Repeat Steps 1 and 2 for rows Rr−1, Rr−2, ..., R2.

Step 4. Multiply R1 by 1/a1j1.
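The backward phase above can also be sketched in Python; `echelon_to_rref` is a hypothetical name, and the input is assumed to already be in echelon form.

```python
def echelon_to_rref(A):
    """Turn an echelon-form matrix into row canonical form: working from
    the last nonzero row up, scale each pivot to 1 and clear the entries
    above it (Steps 1-4 of the algorithm)."""
    A = [row[:] for row in A]
    n = len(A[0])
    for r in range(len(A) - 1, -1, -1):
        jr = next((j for j in range(n) if A[r][j] != 0), None)
        if jr is None:
            continue                       # zero row: nothing to do
        p = A[r][jr]
        A[r] = [v / p for v in A[r]]       # Step 1: make the pivot 1
        for i in range(r):                 # Step 2: zeros above the pivot
            f = A[i][jr]
            A[i] = [A[i][k] - f * A[r][k] for k in range(n)]
    return A
```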
Example 37 Consider the matrix

    [ 1  2 −3 −1 ]
    [ 0 −7 11 10 ]
    [ 0  0  0 −3 ]

in echelon form, with leading nonzero entries

    a11 = 1, a22 = −7, a34 = −3.
Step 1. Multiply the last nonzero row R3 by 1/(−3), so that the leading nonzero entry becomes 1:

    [ 1  2 −3 −1 ]
    [ 0 −7 11 10 ]
    [ 0  0  0  1 ]
Step 2. Use arjr = a34 as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1} = {2, 1}, apply the row operation

    [E3]: −ai,jr Rr + Ri → Ri,

which in our case are

    −a24 R3 + R2 → R2, i.e., −10R3 + R2 → R2,
    −a14 R3 + R1 → R1, i.e., R3 + R1 → R1.

Then, we get

    [ 1  2 −3 0 ]
    [ 0 −7 11 0 ]
    [ 0  0  0 1 ]
Step 3. Multiply R2 by 1/(−7), and get

    [ 1 2  −3   0 ]
    [ 0 1 −11/7 0 ]
    [ 0 0   0   1 ]
Use a22 as a “pivot” to obtain zeros above the pivot, applying the operation

    −2R2 + R1 → R1,

to get

    [ 1 0   1/7  0 ]
    [ 0 1 −11/7  0 ]
    [ 0 0    0   1 ]

which is in row canonical form.
Proposition 38 Any matrix A ∈ Mm,n is row equivalent to a matrix in row canonical form.
Proof. The two above algorithms show that any matrix is row equivalent to at least one matrix
in row canonical form.
Remark 39 In fact, in Proposition 152, we will show that: Any matrix A ∈ Mm,n is row equivalent
to a unique matrix in row canonical form.
1.6 Systems of linear equations and matrices
Definition 40 Given system (1.4), i.e., a system of m linear equations in the n unknowns x1, x2, ..., xn,

    a11 x1 + ... + a1j xj + ... + a1n xn = b1
    ...
    ai1 x1 + ... + aij xj + ... + ain xn = bi
    ...
    am1 x1 + ... + amj xj + ... + amn xn = bm,

the matrix

    [ a11 ... a1j ... a1n  b1 ]
    [ ...                     ]
    [ ai1 ... aij ... ain  bi ]
    [ ...                     ]
    [ am1 ... amj ... amn  bm ]

is called the augmented matrix M of system (1.4).
Each row of M corresponds to an equation of the system, and each column of M corresponds to the coefficients of an unknown, except the last column, which corresponds to the constants of the system.
In an obvious way, given an arbitrary matrix M, we can find a unique system whose associated matrix is M; moreover, given a system of linear equations, there is only one matrix M associated with it. We can therefore identify systems of linear equations with (augmented) matrices.
The coefficient matrix of the system is

    A = [ a11 ... a1j ... a1n ]
        [ ...                 ]
        [ ai1 ... aij ... ain ]
        [ ...                 ]
        [ am1 ... amj ... amn ]
One way to solve a system of linear equations is as follows:

1. Reduce its augmented matrix M to echelon form, which tells us whether the system has a solution: if M has a row of the form (0, 0, ..., 0, b) with b ≠ 0, then the system has no solution and you can stop. If the system admits solutions, go to the step below.

2. Reduce the matrix in echelon form obtained in the above step to its row canonical form. Write the corresponding system. In each equation, bring the free variables to the right hand side, obtaining a triangular system. Solve by back-substitution.
The simple justification of this process comes from the following facts:

1. Applying any elementary row operation to the augmented matrix M of the system is equivalent to applying the corresponding operation to the system itself.

2. The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form (0, 0, ..., 0, b) with b ≠ 0, simply because such a row corresponds to a degenerate equation with no solutions.

3. In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each nonfree variable is a leading nonzero entry which is equal to one and is the only nonzero entry in its column; hence the free-variable form of the solution is obtained by simply transferring the free-variable terms to the other side of each equation.
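The whole procedure (reduce, test consistency, read off the solution) can be sketched compactly; `solve_augmented` is our own name, and exact `Fraction` arithmetic is used so that the consistency test is not disturbed by rounding. This minimal sketch reports only whether the solution is unique, absent, or infinite, returning the solution itself in the unique case.

```python
from fractions import Fraction

def solve_augmented(M):
    """Solve a linear system given as an augmented matrix by full row
    reduction to row canonical form. Returns ("none", None),
    ("infinite", None), or ("unique", x)."""
    A = [[Fraction(v) for v in row] for row in M]
    m, n = len(A), len(A[0]) - 1
    r = 0                                  # number of pivots found so far
    for j in range(n):
        piv = next((i for i in range(r, m) if A[i][j] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        A[r] = [v / A[r][j] for v in A[r]]            # pivot becomes 1
        for i in range(m):
            if i != r and A[i][j] != 0:               # clear the column
                A[i] = [a - A[i][j] * p for a, p in zip(A[i], A[r])]
        r += 1
    # a row (0, ..., 0, b) with b != 0 means no solutions
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in A):
        return ("none", None)
    if r < n:                              # free variables remain
        return ("infinite", None)
    return ("unique", [A[i][-1] for i in range(n)])
```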
Example 41 Consider the system presented in Example 20:

    x1 + 2x2 + (−3)x3 = −1
    3x1 + (−1)x2 + 2x3 = 7
    5x1 + 3x2 + (−4)x3 = 2

The associated augmented matrix is:

    [ 1  2 −3 −1 ]
    [ 3 −1  2  7 ]
    [ 5  3 −4  2 ]
In Example 36, we saw that the echelon form of the above matrix is

    [ 1  2 −3 −1 ]
    [ 0 −7 11 10 ]
    [ 0  0  0 −3 ]

whose last row has the form (0, 0, ..., 0, b) with b = −3 ≠ 0; therefore the system has no solution.
Chapter 2
The Euclidean Space Rn
2.1 Sum and scalar multiplication
It is well known that the real line is a representation of the set R of real numbers. Similarly, an ordered pair (x, y) of real numbers can be used to represent a point in the plane, and a triple (x, y, z) or (x1, x2, x3) a point in space. In general, if n ∈ N+ := {1, 2, ...}, we can define (x1, x2, ..., xn) or (xi)_{i=1}^n as a point in n-space.
Definition 42 Rn := R × ... × R (n factors). In other words, Rn is the Cartesian product of n copies of R.
Definition 43 The elements of Rn are ordered n-tuples of real numbers and are denoted by

    x = (x1, x2, ..., xn) or x = (xi)_{i=1}^n.

xi is called the i-th component of x ∈ Rn.

Definition 44 x = (xi)_{i=1}^n ∈ Rn and y = (yi)_{i=1}^n ∈ Rn are equal if

    ∀i ∈ {1, ..., n}, xi = yi.

In that case we write x = y.
Let us introduce two operations on Rn and analyze some properties they satisfy.
Definition 45 Given x ∈ Rn, y ∈ Rn, we call the addition or sum of x and y the element x + y ∈ Rn obtained as follows:

    x + y := (xi + yi)_{i=1}^n.

Definition 46 An element λ ∈ R is called a scalar.

Definition 47 Given x ∈ Rn and λ ∈ R, we call the scalar multiplication of x by λ the element λx ∈ Rn obtained as follows:

    λx := (λxi)_{i=1}^n.
Geometrical interpretation of the two operations in the case n = 2.
From the well known properties of the sum and product of real numbers it is possible to verify
that the following properties of the above operations do hold true.
Properties of addition.
A1. (associative) ∀x, y, z ∈ Rn, (x + y) + z = x + (y + z);
A2. (existence of a null element) there exists an element e ∈ Rn such that for any x ∈ Rn, x + e = x; in fact such an element is unique, and it is denoted by 0;
A3. (existence of inverse elements) ∀x ∈ Rn ∃y ∈ Rn such that x + y = 0; in fact, that element is unique, and it is denoted by −x;
A4. (commutative) ∀x, y ∈ Rn, x + y = y + x.
Properties of scalar multiplication.
M1. (distributive) ∀α ∈ R, ∀x, y ∈ Rn, α(x + y) = αx + αy;
M2. (distributive) ∀α, β ∈ R, ∀x ∈ Rn, (α + β)x = αx + βx;
M3. ∀α, β ∈ R, ∀x ∈ Rn, (αβ)x = α(βx);
M4. ∀x ∈ Rn, 1x = x.
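The two operations and a couple of the properties above can be checked mechanically on sample vectors. A minimal sketch, with our own helper names `vec_add` and `vec_scale`:

```python
def vec_add(x, y):
    """Componentwise sum of two n-tuples (Definition 45)."""
    return tuple(a + b for a, b in zip(x, y))

def vec_scale(lam, x):
    """Scalar multiplication of x by the scalar lam (Definition 47)."""
    return tuple(lam * a for a in x)

# Checking A4 (commutativity) and M1 (distributivity) on sample vectors:
x, y = (1, 2, 3), (4, 5, 6)
assert vec_add(x, y) == vec_add(y, x)                                   # A4
assert vec_scale(2, vec_add(x, y)) == vec_add(vec_scale(2, x), vec_scale(2, y))  # M1
```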
2.2 Scalar product
Definition 48 Given x = (xi)_{i=1}^n, y = (yi)_{i=1}^n ∈ Rn, we call the dot, scalar or inner product of x and y, denoted by xy or x · y, the scalar

    Σ_{i=1}^n xi · yi ∈ R.
Remark 49 The scalar product of elements of Rn satisfies the following properties.

1. ∀x, y ∈ Rn, x · y = y · x;
2. ∀α, β ∈ R, ∀x, y, z ∈ Rn, (αx + βy) · z = α(x · z) + β(y · z);
3. ∀x ∈ Rn, x · x ≥ 0;
4. ∀x ∈ Rn, x · x = 0 ⇔ x = 0.
Definition 50 The set Rn with the three operations described above (addition, scalar multiplication and dot product) is usually called the Euclidean space of dimension n.
Definition 51 Given x = (xi)_{i=1}^n ∈ Rn, we denote the (Euclidean) norm or length of x by

    ‖x‖ := (x · x)^{1/2} = √( Σ_{i=1}^n (xi)² ).
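Definitions 48 and 51 translate into two one-liners; `dot` and `norm` are our own names for them.

```python
import math

def dot(x, y):
    """Scalar (inner) product of Definition 48."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm of Definition 51: the square root of x · x."""
    return math.sqrt(dot(x, x))
```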
Geometrical interpretation of scalar products in R².
Given x = (x1, x2) ∈ R² \ {0}, from elementary trigonometry we know that

    x = (‖x‖ cos α, ‖x‖ sin α)    (2.1)

where α is the measure of the angle between the positive part of the horizontal axis and x itself.
Using the above observation, we can verify that, given x = (x1, x2) and y = (y1, y2) in R² \ {0},

    xy = ‖x‖ · ‖y‖ · cos γ

where γ is an¹ angle between x and y.

[Picture to be inserted: see Marcellini-Sbordone, page 179.]

From the picture and (2.1), we have

    x = (‖x‖ cos α1, ‖x‖ sin α1)

and

    y = (‖y‖ cos α2, ‖y‖ sin α2).

Then²

    xy = ‖x‖ ‖y‖ (cos α1 cos α2 + sin α1 sin α2) = ‖x‖ ‖y‖ cos(α2 − α1).

Taking x and y not belonging to the same line, define θ* := the angle between x and y with minimum measure. From the above equality, it follows that

    θ* = π/2 ⇔ x · y = 0,
    θ* < π/2 ⇔ x · y > 0,
    θ* > π/2 ⇔ x · y < 0.
Definition 52 x, y ∈ Rn \ {0} are orthogonal if xy = 0.
2.3 Norms and Distances
Proposition 53 (Properties of the norm). Let α ∈ R and x, y ∈ Rn.

1. ‖x‖ ≥ 0, and ‖x‖ = 0 ⇔ x = 0;
2. ‖αx‖ = |α| · ‖x‖;
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality);
4. |xy| ≤ ‖x‖ · ‖y‖ (Cauchy-Schwarz inequality).
Proof. 1. By definition, ‖x‖ = √(Σ_{i=1}^n (xi)²) ≥ 0. Moreover, ‖x‖ = 0 ⇔ ‖x‖² = 0 ⇔ Σ_{i=1}^n (xi)² = 0 ⇔ x = 0.

2. ‖αx‖ = √(Σ_{i=1}^n α²(xi)²) = |α| √(Σ_{i=1}^n (xi)²) = |α| · ‖x‖.

4. (3 is proved using 4.) We want to show that |xy| ≤ ‖x‖ · ‖y‖, or |xy|² ≤ ‖x‖² · ‖y‖², i.e.,

    (Σ_{i=1}^n xi yi)² ≤ (Σ_{i=1}^n xi²) · (Σ_{i=1}^n yi²).

Define X := Σ_{i=1}^n xi², Y := Σ_{i=1}^n yi² and Z := Σ_{i=1}^n xi yi; we have to prove that

    Z² ≤ XY.    (2.2)

Observe that, ∀a ∈ R,

1. Σ_{i=1}^n (axi + yi)² ≥ 0, and
2. Σ_{i=1}^n (axi + yi)² = 0 ⇔ ∀i ∈ {1, ..., n}, axi + yi = 0.
¹ Recall that ∀x ∈ R, cos x = cos(−x) = cos(2π − x).
² Recall that cos(x1 ± x2) = cos x1 cos x2 ∓ sin x1 sin x2.
Moreover,

    Σ_{i=1}^n (axi + yi)² = a² Σ_{i=1}^n xi² + 2a Σ_{i=1}^n xi yi + Σ_{i=1}^n yi² = a²X + 2aZ + Y ≥ 0.    (2.3)

If X > 0, we can take a = −Z/X, and from (2.3) we get

    0 ≤ Z²/X − 2Z²/X + Y

or

    Z² ≤ XY,

as desired.
If X = 0, then x = 0 and Z = 0, and (2.2) is true simply because 0 ≤ 0.
3. It suffices to show that ‖x + y‖² ≤ (‖x‖ + ‖y‖)². Now,

    ‖x + y‖² = Σ_{i=1}^n (xi + yi)² = Σ_{i=1}^n ((xi)² + 2xi·yi + (yi)²) =
             = ‖x‖² + 2xy + ‖y‖² ≤ ‖x‖² + 2|xy| + ‖y‖² ≤ ‖x‖² + 2‖x‖·‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

where the last inequality follows from part 4 above.
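Both inequalities of the proof can be exercised numerically on random vectors, which is a useful sanity check when translating them into code. A minimal sketch (the small tolerance guards against floating-point rounding):

```python
import math
import random

def dot(x, y):
    """Scalar product (Definition 48)."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm (Definition 51)."""
    return math.sqrt(dot(x, x))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(4)]
    y = [random.uniform(-10, 10) for _ in range(4)]
    # Cauchy-Schwarz: |xy| <= ||x|| * ||y||
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-9
    # triangle inequality: ||x + y|| <= ||x|| + ||y||
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-9
```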
Remark 54 |‖x‖ − ‖y‖| ≤ ‖x − y‖.
Recall that ∀a, b ∈ R,

    −b ≤ a ≤ b ⇔ |a| ≤ b.

From Proposition 53.3, identifying x with x − y and y with y, we get ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖, i.e.,

    ‖x‖ − ‖y‖ ≤ ‖x − y‖.

From Proposition 53.3, identifying x with y − x and y with x, we get ‖y − x + x‖ ≤ ‖y − x‖ + ‖x‖, i.e.,

    ‖y‖ − ‖x‖ ≤ ‖y − x‖ = ‖x − y‖,

and therefore

    −‖x − y‖ ≤ ‖x‖ − ‖y‖ ≤ ‖x − y‖, i.e., |‖x‖ − ‖y‖| ≤ ‖x − y‖.
Definition 55 For any n ∈ N \ {0} and for any i ∈ {1, ..., n}, e_n^i := (e_{n,j}^i)_{j=1}^n ∈ Rn with

    e_{n,j}^i = 0 if i ≠ j,
    e_{n,j}^i = 1 if i = j.

In other words, e_n^i is an element of Rn whose components are all zero, except the i-th component, which is equal to 1. The vector e_n^i is called the i-th canonical vector in Rn.
Remark 56 ∀x ∈ Rn,

    ‖x‖ ≤ Σ_{i=1}^n |xi|,

as verified below:

    ‖x‖ = ‖Σ_{i=1}^n xi e_n^i‖ ≤(1) Σ_{i=1}^n ‖xi e_n^i‖ =(2) Σ_{i=1}^n |xi| · ‖e_n^i‖ = Σ_{i=1}^n |xi|,

where (1) follows from the triangle inequality, i.e., Proposition 53.3, and (2) from Proposition 53.2.
Definition 57 Given x, y ∈ Rn, we denote the (Euclidean) distance between x and y by

    d(x, y) := ‖x − y‖.
Proposition 58 (Properties of the distance). Let x, y, z ∈ Rn.

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

Proof. 1. It follows from property 1 of the norm.
2. It follows from the definition of the distance as a norm.
3. Identifying x with x − y and y with y − z in property 3 of the norm, we get ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖, i.e., the desired result.
Remark 59 Let a set X be given (for the norm case, X must be closed under addition and scalar multiplication, e.g., X = Rn).

• Any function n : X → R which satisfies the properties listed below is called a norm. More precisely: let α ∈ R and x, y ∈ X, and assume that

1. n(x) ≥ 0, and n(x) = 0 ⇔ x = 0;
2. n(αx) = |α| · n(x);
3. n(x + y) ≤ n(x) + n(y).

Then n is called a norm and (X, n) is called a normed space.

• Any function d : X × X → R satisfying properties 1, 2 and 3 presented for the Euclidean distance in Proposition 58 is called a distance or a metric. More precisely: let x, y, z ∈ X, and assume that

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z).

Then d is called a distance and (X, d) a metric space.
Chapter 3
Matrices
We presented the concept of matrix in Definition 21. In this chapter, we study further properties
of matrices.
Definition 60 The transpose of a matrix A ∈ Mm,n , denoted by AT , belongs to Mn,m , and it is
the matrix obtained by writing the rows of A, in order, as columns:

         ⎡ a11 ... a1j ... a1n ⎤T   ⎡ a11 ... ai1 ... am1 ⎤
         ⎢ ...                 ⎥    ⎢ ...                 ⎥
    AT = ⎢ ai1 ... aij ... ain ⎥  = ⎢ a1j ... aij ... amj ⎥ .
         ⎢ ...                 ⎥    ⎢ ...                 ⎥
         ⎣ am1 ... amj ... amn ⎦    ⎣ a1n ... ain ... amn ⎦

In other words, row 1 of the matrix A becomes column 1 of AT , row 2 of A becomes column 2 of
AT , and so on, up to row m, which becomes column m of AT . The same result is obtained proceeding
as follows: column 1 of A becomes row 1 of AT , column 2 of A becomes row 2 of AT , and so on, up
to column n, which becomes row n of AT . More formally, given A = [aij ] i∈{1,...,m}, j∈{1,...,n} ∈ Mm,n , then
AT = [aji ] j∈{1,...,n}, i∈{1,...,m} ∈ Mn,m .
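As an illustration (not part of the original notes), the row-to-column rule can be sketched in Python, with matrices stored as lists of rows:

```python
def transpose(A):
    # Row i of A becomes column i of A^T: (A^T)_ji = a_ij.
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

A = [[1, 2, 3],
     [4, 5, 6]]          # A is 2x3, so A^T is 3x2
print(transpose(A))      # [[1, 4], [2, 5], [3, 6]]
```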
Definition 61 A matrix A ∈ Mn,n is said to be symmetric if A = AT , i.e., ∀i, j ∈ {1, ..., n}, aij = aji .
Remark 62 We can write a matrix Am×n = [aij ] as

        ⎡ R1 (A) ⎤
        ⎢ ...    ⎥
    A = ⎢ Ri (A) ⎥ = [C1 (A) , ..., Cj (A) , ..., Cn (A)]
        ⎢ ...    ⎥
        ⎣ Rm (A) ⎦

where

    Ri (A) = [ai1 , ..., aij , ..., ain ] := [R_i^1 (A) , ..., R_i^j (A) , ..., R_i^n (A)] ∈ Rn

for i ∈ {1, ..., m}, and

             ⎡ a1j ⎤    ⎡ C_j^1 (A) ⎤
             ⎢ ... ⎥    ⎢ ...       ⎥
    Cj (A) = ⎢ aij ⎥ := ⎢ C_j^i (A) ⎥ ∈ Rm    for j ∈ {1, ..., n} .
             ⎢ ... ⎥    ⎢ ...       ⎥
             ⎣ amj ⎦    ⎣ C_j^m (A) ⎦

In other words, Ri (A) denotes row i of the matrix A and Cj (A) denotes column j of the matrix A.
3.1 Matrix operations
Definition 63 Two matrices Am×n := [aij ] and Bm×n := [bij ] are equal if
∀i ∈ {1, ..., m} , ∀j ∈ {1, ..., n} , aij = bij .
Definition 64 Given the matrices Am×n := [aij ] and Bm×n := [bij ], the sum of A and B, denoted
by A + B, is the matrix Cm×n = [cij ] such that
∀i ∈ {1, ..., m} , ∀j ∈ {1, ..., n} , cij = aij + bij .
Definition 65 Given the matrix Am×n := [aij ] and the scalar α, the product of the matrix A by
the scalar α, denoted by α · A or αA, is the matrix obtained by multiplying each entry of A by α:
αA := [αaij ] .
Remark 66 It is easy to verify that the set of matrices Mm,n with the above defined sum and
scalar multiplication satisfies all the properties listed for elements of Rn in Section 2.1.
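The two operations above act entrywise; as a quick illustration (not part of the original notes), a Python sketch with matrices stored as lists of rows:

```python
def mat_add(A, B):
    # Entrywise sum: (A + B)_ij = a_ij + b_ij; A and B must have the same shape.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(alpha, A):
    # (alpha * A)_ij = alpha * a_ij
    return [[alpha * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_add(A, B))      # [[6, 8], [10, 12]]
print(scalar_mul(2, A))   # [[2, 4], [6, 8]]
```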
Definition 67 Given A = [aij ] ∈ Mm,n , B = [bjk ] ∈ Mn,p , the product A · B is the matrix
C = [cik ] ∈ Mm,p such that

    ∀i ∈ {1, ..., m} , ∀k ∈ {1, ..., p} , cik := Σ^n_{j=1} aij bjk = Ri (A) · Ck (B) ,

i.e., since

        ⎡ R1 (A) ⎤
        ⎢ ...    ⎥
    A = ⎢ Ri (A) ⎥ , B = [C1 (B) , ..., Ck (B) , ..., Cp (B)]                      (3.1)
        ⎢ ...    ⎥
        ⎣ Rm (A) ⎦

         ⎡ R1 (A) · C1 (B) ... R1 (A) · Ck (B) ... R1 (A) · Cp (B) ⎤
         ⎢ ...                                                     ⎥
    AB = ⎢ Ri (A) · C1 (B) ... Ri (A) · Ck (B) ... Ri (A) · Cp (B) ⎥               (3.2)
         ⎢ ...                                                     ⎥
         ⎣ Rm (A) · C1 (B) ... Rm (A) · Ck (B) ... Rm (A) · Cp (B) ⎦
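The "row i dotted with column k" recipe translates directly into code; a minimal Python sketch (not part of the original notes), using the same 2×3 and 3×2 example matrices that appear later in this section:

```python
def dot(u, v):
    # scalar product of two vectors of equal length
    return sum(a * b for a, b in zip(u, v))

def matmul(A, B):
    # c_ik = R_i(A) . C_k(B): row i of A dotted with column k of B
    return [[dot(row, [B[j][k] for j in range(len(B))])
             for k in range(len(B[0]))] for row in A]

A = [[1, 2, 1],
     [-1, 1, 3]]         # 2x3
B = [[1, 0],
     [2, 1],
     [0, 1]]             # 3x2
print(matmul(A, B))      # [[5, 3], [1, 4]]
```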
Remark 68 If A ∈ M1,n , B ∈ Mn,1 , the above definition coincides with the definition of the scalar
product between elements of Rn . In what follows, we often identify an element of Rn with a row or
a column vector (see Definition 22), consistently with what we write. In other words, Am×n x = y
means that x is a column vector with n entries and y is a column vector with m entries, and
wAm×n = z means that w is a row vector with m entries and z is a row vector with n entries.
Definition 69 If two matrices are such that a given operation between them is well defined, we say
that they are conformable with respect to that operation.
Remark 70 If A, B ∈Mm,n , they are conformable with respect to matrix addition. If A ∈Mm,n
and B ∈Mn,p , they are conformable with respect to multiplying A on the left of B. We often say the
two matrices are conformable and let the context define precisely the sense in which conformability
is to be understood.
Remark 71 (For future use) ∀k ∈ {1, ..., p},

                 ⎡ R1 (A) ⎤             ⎡ R1 (A) · Ck (B) ⎤
                 ⎢ ...    ⎥             ⎢ ...             ⎥
    A · Ck (B) = ⎢ Ri (A) ⎥ · Ck (B) =  ⎢ Ri (A) · Ck (B) ⎥                        (3.3)
                 ⎢ ...    ⎥             ⎢ ...             ⎥
                 ⎣ Rm (A) ⎦             ⎣ Rm (A) · Ck (B) ⎦
Then, just comparing (3.2) and (3.3), we get

    AB = [ A · C1 (B) ... A · Ck (B) ... A · Cp (B) ]                              (3.4)

Similarly, ∀i ∈ {1, ..., m},

    Ri (A) · B = Ri (A) · [ C1 (B) ... Ck (B) ... Cp (B) ] =
               = [ Ri (A) · C1 (B) ... Ri (A) · Ck (B) ... Ri (A) · Cp (B) ]       (3.5)

Then, just comparing (3.2) and (3.5), we get

         ⎡ R1 (A) B ⎤
         ⎢ ...      ⎥
    AB = ⎢ Ri (A) B ⎥                                                              (3.6)
         ⎢ ...      ⎥
         ⎣ Rm (A) B ⎦
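Identities (3.4) and (3.6) can be checked mechanically; a Python sketch (not part of the original notes) with an arbitrary example pair:

```python
def matmul(A, B):
    # (AB)_ik = sum_j a_ij * b_jk
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def col(M, k):
    # column k of M, kept as a one-column matrix
    return [[row[k]] for row in M]

A = [[1, 2, 1], [-1, 1, 3]]
B = [[1, 0], [2, 1], [0, 1]]
AB = matmul(A, B)

# (3.4): column k of AB equals A * C_k(B)
for k in range(len(B[0])):
    assert col(AB, k) == matmul(A, col(B, k))

# (3.6): row i of AB equals R_i(A) * B
for i in range(len(A)):
    assert [AB[i]] == matmul([A[i]], B)
```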
Definition 72 A submatrix of a matrix A ∈ Mm,n is a matrix obtained from A by erasing some
rows and/or columns.
Definition 73 A matrix A ∈ Mm,n is partitioned in blocks if it is written in terms of submatrices,
using a system of horizontal and vertical lines.
The reason for partitioning into blocks is that the result of operations on block matrices can be
obtained by carrying out the computation with the blocks, just as if they were actual scalar entries
of the matrices, as described below.
Remark 74 We verify below that for matrix multiplication, we do not commit an error if, upon
conformably partitioning two matrices, we proceed to regard the partitioned blocks as real numbers
and apply the usual rules.

1. Take a := (ai )^{n1}_{i=1} ∈ Rn1 , b := (bj )^{n2}_{j=1} ∈ Rn2 , c := (ci )^{n1}_{i=1} ∈ Rn1 , d := (dj )^{n2}_{j=1} ∈ Rn2 . Then

                           ⎡ c ⎤
    [ a | b ]1×(n1+n2)  ·  ⎢ − ⎥              = Σ^{n1}_{i=1} ai ci + Σ^{n2}_{j=1} bj dj = a · c + b · d.        (3.7)
                           ⎣ d ⎦ (n1+n2)×1

2. Take A ∈ Mm,n1 , B ∈ Mm,n2 , C ∈ Mn1,p , D ∈ Mn2,p , with

        ⎡ R1 (A) ⎤        ⎡ R1 (B) ⎤
    A = ⎢ ...    ⎥ ,  B = ⎢ ...    ⎥ ,
        ⎣ Rm (A) ⎦        ⎣ Rm (B) ⎦

    C = [C1 (C) , ..., Cp (C)] ,    D = [C1 (D) , ..., Cp (D)] .

Then,

                          ⎡ C ⎤
    [ A  B ]m×(n1+n2)  ·  ⎣ D ⎦ (n1+n2)×p  =

      ⎡ R1 (A)  R1 (B) ⎤   ⎡ C1 (C) , ..., Cp (C) ⎤
    = ⎢ ...            ⎥ · ⎣ C1 (D) , ..., Cp (D) ⎦ =
      ⎣ Rm (A)  Rm (B) ⎦

      ⎡ R1 (A) · C1 (C) + R1 (B) · C1 (D)  ...  R1 (A) · Cp (C) + R1 (B) · Cp (D) ⎤
    = ⎢ ...                                                                       ⎥ =
      ⎣ Rm (A) · C1 (C) + Rm (B) · C1 (D)  ...  Rm (A) · Cp (C) + Rm (B) · Cp (D) ⎦

      ⎡ R1 (A) · C1 (C)  ...  R1 (A) · Cp (C) ⎤   ⎡ R1 (B) · C1 (D)  ...  R1 (B) · Cp (D) ⎤
    = ⎢ ...                                   ⎥ + ⎢ ...                                   ⎥ =
      ⎣ Rm (A) · C1 (C)  ...  Rm (A) · Cp (C) ⎦   ⎣ Rm (B) · C1 (D)  ...  Rm (B) · Cp (D) ⎦

    = AC + BD.
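The identity [A | B] [C; D] = AC + BD can be verified on a concrete instance; a Python sketch (not part of the original notes), with arbitrary small example blocks:

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def hstack(A, B):
    # [A | B]: glue blocks side by side
    return [ra + rb for ra, rb in zip(A, B)]

def vstack(C, D):
    # [C; D]: stack blocks vertically
    return C + D

A = [[1, 2], [3, 4]]      # m x n1
B = [[5], [6]]            # m x n2
C = [[1, 0], [0, 1]]      # n1 x p
D = [[2, 3]]              # n2 x p

left = matmul(hstack(A, B), vstack(C, D))    # [A | B] [C; D]
right = mat_add(matmul(A, C), matmul(B, D))  # AC + BD
assert left == right
print(left)   # [[11, 17], [15, 22]]
```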
Definition 75 Let the matrices Ai ∈ M (ni , ni ) for i ∈ {1, ..., K} be given; then the matrix

        ⎡ A1              ⎤
        ⎢    A2           ⎥
        ⎢       ..        ⎥
    A = ⎢         Ai      ⎥ ∈ M ( Σ^K_{i=1} ni , Σ^K_{i=1} ni ) ,
        ⎢            ..   ⎥
        ⎣              AK ⎦

with all the entries outside the diagonal blocks equal to zero, is called a block diagonal matrix.
Very often, having information on the matrices Ai gives information on A.
Remark 76 It is easy, but cumbersome, to verify the following properties.
1. (associative) ∀A ∈ Mm,n , ∀B ∈ Mn,p , ∀C ∈ Mp,q , A (BC) = (AB) C;
2. (distributive) ∀A ∈ Mm,n , ∀B ∈ Mm,n , ∀C ∈ Mn,p , (A + B) C = AC + BC;
3. ∀A ∈ Mm,n , ∀x, y ∈ Rn and ∀α, β ∈ R,
    A (αx + βy) = A (αx) + A (βy) = αAx + βAy.
It is false that:
1. (commutative) ∀A ∈ Mm,n , ∀B ∈ Mn,p , AB = BA;
2. ∀A ∈ Mm,n , ∀B, C ∈ Mn,p , ⟨A ≠ 0, AB = AC⟩ =⇒ ⟨B = C⟩;
3. ∀A ∈ Mm,n , ∀B ∈ Mn,p , ⟨A ≠ 0, AB = 0⟩ =⇒ ⟨B = 0⟩.
Let’s show why the above statements are false.
1. Take

         ⎡  1  2  1 ⎤        ⎡ 1  0 ⎤
    A =  ⎣ −1  1  3 ⎦ ,  B = ⎢ 2  1 ⎥ .
                             ⎣ 0  1 ⎦

Then

          ⎡  1  2  1 ⎤ ⎡ 1  0 ⎤   ⎡ 5  3 ⎤
    AB =  ⎣ −1  1  3 ⎦ ⎢ 2  1 ⎥ = ⎣ 1  4 ⎦ ,
                       ⎣ 0  1 ⎦

          ⎡ 1  0 ⎤                  ⎡  1  2  1 ⎤
    BA =  ⎢ 2  1 ⎥ ⎡  1  2  1 ⎤  = ⎢  1  5  5 ⎥ ,
          ⎣ 0  1 ⎦ ⎣ −1  1  3 ⎦    ⎣ −1  1  3 ⎦

so that AB ≠ BA (indeed, AB and BA do not even have the same dimensions). For square matrices, take

         ⎡  1  2 ⎤        ⎡ 1  0 ⎤
    C =  ⎣ −1  1 ⎦ ,  D = ⎣ 3  2 ⎦ .

Then

          ⎡  1  2 ⎤ ⎡ 1  0 ⎤   ⎡ 7  4 ⎤
    CD =  ⎣ −1  1 ⎦ ⎣ 3  2 ⎦ = ⎣ 2  2 ⎦ ,

          ⎡ 1  0 ⎤ ⎡  1  2 ⎤   ⎡ 1  2 ⎤
    DC =  ⎣ 3  2 ⎦ ⎣ −1  1 ⎦ = ⎣ 1  8 ⎦ ,

and CD ≠ DC.
Observe that since the commutative property does not hold true, we have to distinguish between
“left factor out” and “right factor out”:

    AB + AC = A (B + C) ,
    EF + GF = (E + G) F ,

while, in general,

    AB + CA ≠ A (B + C) ,
    AB + CA ≠ (B + C) A.
2. Given

         ⎡ 3  1 ⎤        ⎡  4  1 ⎤        ⎡ 1  2 ⎤
    A =  ⎣ 6  2 ⎦ ,  B = ⎣ −5  6 ⎦ ,  C = ⎣ 4  3 ⎦ ,

we have

          ⎡ 3  1 ⎤ ⎡  4  1 ⎤   ⎡  7   9 ⎤
    AB =  ⎣ 6  2 ⎦ ⎣ −5  6 ⎦ = ⎣ 14  18 ⎦ ,

          ⎡ 3  1 ⎤ ⎡ 1  2 ⎤    ⎡  7   9 ⎤
    AC =  ⎣ 6  2 ⎦ ⎣ 4  3 ⎦  = ⎣ 14  18 ⎦ ,

so that AB = AC, while B ≠ C.
3. Observe that 3 ⇒ 2 (if AB = AC, then A (B − C) = 0, and statement 3 would give B − C = 0),
and therefore ¬2 ⇒ ¬3. Alternatively, a counterexample to 3 is obtained from the counterexample
to 2, choosing A in 3 equal to A in 2, and B in 3 equal to B − C in 2:

                  ⎡ 3  1 ⎤   ⎛ ⎡  4  1 ⎤   ⎡ 1  2 ⎤ ⎞   ⎡ 3  1 ⎤ ⎡  3  −1 ⎤   ⎡ 0  0 ⎤
    A (B − C) =   ⎣ 6  2 ⎦ · ⎝ ⎣ −5  6 ⎦ − ⎣ 4  3 ⎦ ⎠ = ⎣ 6  2 ⎦ ⎣ −9   3 ⎦ = ⎣ 0  0 ⎦ .
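The counterexamples in points 2 and 3 are small enough to check by machine; a Python sketch (not part of the original notes), using the same matrices:

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[3, 1], [6, 2]]
B = [[4, 1], [-5, 6]]
C = [[1, 2], [4, 3]]

# AB = AC although B != C: no cancellation
assert matmul(A, B) == matmul(A, C) == [[7, 9], [14, 18]]
assert B != C

# equivalently, A(B - C) = 0 although A != 0 and B - C != 0
BmC = [[b - c for b, c in zip(rb, rc)] for rb, rc in zip(B, C)]
assert matmul(A, BmC) == [[0, 0], [0, 0]]
```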
Since the associative property of the product between matrices does hold true, we can give the
following definition.

Definition 77 Given A ∈ Mm,m and k ∈ N \ {0},

    Ak := A · A · ... · A    (k times).

Observe that if A ∈ Mm,m and k, l ∈ N \ {0}, then

    Ak · Al = Ak+l .
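A Python sketch of the power rule (not part of the original notes), with an arbitrary example matrix:

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def matpow(A, k):
    # A^k := A * A * ... * A, with k >= 1 factors
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

A = [[1, 1], [0, 1]]
# A^2 * A^3 = A^5
assert matmul(matpow(A, 2), matpow(A, 3)) == matpow(A, 5)
```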
Remark 78 Properties of transpose matrices:
1. ∀A ∈ Mm,n , (AT )T = A;
2. ∀A, B ∈ Mm,n , (A + B)T = AT + B T ;
3. ∀α ∈ R, ∀A ∈ Mm,n , (αA)T = αAT ;
4. ∀A ∈ Mm,n , ∀B ∈ Mn,p , (AB)T = B T AT .
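Property 4, with its reversal of the factors, is the one most worth checking on an example; a Python sketch (not part of the original notes):

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

A = [[1, 2, 1], [-1, 1, 3]]    # 2x3
B = [[1, 0], [2, 1], [0, 1]]   # 3x2
# (AB)^T = B^T A^T; note A^T B^T is not even conformable here
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```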
Matrices and linear systems.
In Section 1.6, we have seen that a system of m linear equations in the n unknowns x1 , x2 , ..., xn and
parameters aij , for i ∈ {1, ..., m}, j ∈ {1, ..., n}, and (bi )^m_{i=1} ∈ Rm can be displayed as follows:

    ⎧ a11 x1 + ... + a1j xj + ... + a1n xn = b1
    ⎪ ...
    ⎨ ai1 x1 + ... + aij xj + ... + ain xn = bi                                    (3.8)
    ⎪ ...
    ⎩ am1 x1 + ... + amj xj + ... + amn xn = bm

Moreover, the matrix

    ⎡ a11 ... a1j ... a1n  b1 ⎤
    ⎢ ...                     ⎥
    ⎢ ai1 ... aij ... ain  bi ⎥
    ⎢ ...                     ⎥
    ⎣ am1 ... amj ... amn  bm ⎦

is called the augmented matrix M of system (3.8). The coefficient matrix A of the system is

        ⎡ a11 ... a1j ... a1n ⎤
        ⎢ ...                 ⎥
    A = ⎢ ai1 ... aij ... ain ⎥ .
        ⎢ ...                 ⎥
        ⎣ am1 ... amj ... amn ⎦
Using the notation we described in the present section, we can rewrite linear equations and
systems of linear equations in a convenient and short manner, as described below.
The linear equation in the unknowns x1 , ..., xn and parameters a1 , ..., ai , ..., an , b ∈ R

    a1 x1 + ... + ai xi + ... + an xn = b

can be rewritten as

    Σ^n_{i=1} ai xi = b

or

    a · x = b,

                                  ⎡ x1  ⎤
where a = [a1 , ..., an ] and x = ⎢ ... ⎥ .
                                  ⎣ xn  ⎦

The linear system (3.8) can be rewritten as

    ⎧ Σ^n_{j=1} a1j xj = b1
    ⎨ ...
    ⎩ Σ^n_{j=1} amj xj = bm

or

    ⎧ R1 (A) x = b1
    ⎨ ...
    ⎩ Rm (A) x = bm

or

    Ax = b,

where A = [aij ].
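The compact form Ax = b is also how one works with systems in code; a minimal Python sketch (not part of the original notes), with an arbitrary 2×2 example system:

```python
def matvec(A, x):
    # (Ax)_i = R_i(A) . x: row i of A dotted with x
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# the system  x1 + 2*x2 = 5,  3*x1 + 4*x2 = 11,  written as Ax = b
A = [[1, 2], [3, 4]]
b = [5, 11]

x = [1, 2]                  # a candidate solution
assert matvec(A, x) == b    # x solves the system
```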
Definition 79 The trace of A ∈ Mm,m , written tr A, is the sum of the diagonal entries, i.e.,

    tr A = Σ^m_{i=1} aii .
Definition 80 The identity matrix Im is a diagonal matrix of order m with each element on the
principal diagonal equal to 1. If no confusion arises, we simply write I in the place of Im .
Remark 81 1. ∀n ∈ N \ {0} , (Im )^n = Im ;
2. ∀A ∈ Mm,n , Im A = A In = A.
Proposition 82 Let A, B ∈ M (m, m) and k ∈ R. Then
1. tr (A + B) = tr A + tr B;
2. tr kA = k · tr A;
3. tr AB = tr BA.
Proof. Exercise.
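As a hint for the exercise, the properties are easy to test numerically before proving them; a Python sketch (not part of the original notes) checking property 3 on an arbitrary example pair:

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    # tr A = sum of the diagonal entries a_ii
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]
# tr AB = tr BA, even though AB != BA in general
assert trace(matmul(A, B)) == trace(matmul(B, A))
```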
3.2 Inverse matrices
Definition 83 Given a matrix An×n , a matrix Bn×n is called an inverse of A if
    AB = BA = In .
We then say that A is invertible, or that A admits an inverse.
Proposition 84 If A admits an inverse, then the inverse is unique.
Proof. Suppose B and C are both inverses of A. Then
    AB = BA = In                                                                   (3.9)
and