\( \newcommand{\mat}[1]{\begin{bmatrix}#1\end{bmatrix}} \newcommand{\vxy}[2]{\begin{bmatrix}#1 \\ #2\end{bmatrix}} \newcommand{\vxyz}[3]{\begin{bmatrix}#1 \\ #2 \\ #3\end{bmatrix}} \newcommand{\rxy}[2]{\begin{bmatrix}#1 & #2\end{bmatrix}} \newcommand{\rxyz}[3]{\begin{bmatrix}#1 & #2 & #3\end{bmatrix}} \newcommand{\uu}[0]{\mathbf{u}} \newcommand{\uuu}[0]{\mathbf{\hat{u}}} \newcommand{\vv}[0]{\mathbf{v}} \newcommand{\vvv}[0]{\mathbf{\hat{v}}} \newcommand{\ww}[0]{\mathbf{w}} \newcommand{\sm}[1]{\bigl[\begin{smallmatrix}#1\end{smallmatrix}\bigr]} \newcommand{\dotp}[2]{#1~{\cdot}~#2} \)

Linear Algebra, Part 4: Elimination

posted May 2, 2013

In Part 3 of this series we looked at how systems of equations can be expressed in matrix notation. In this post we'll look at how to eliminate unknowns in a system of linear equations in a matrix setting.

For a system of linear equations we usually want to know whether the system has a solution. In particular, we want to know whether the system has a unique solution. A system of linear equations may have no solution, exactly one solution, or infinitely many solutions.

The case we are interested in is the single-solution case: if the system has a single solution, we want to compute it. But a system of linear equations is hard to solve if we attack all of the unknowns simultaneously, so we need a way to simplify the process. If we can work out one unknown at a time, then finding the solution (if it exists) is straightforward.

Elimination Example

Consider the following system of equations and corresponding matrix representation:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 4x + 5y + 6z &= 3 \\ 7x + 8y + 10z &= 11 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10}\vxyz{x}{y}{z} = \vxyz{2}{3}{11} \]

The above system would be easy to solve if we could eliminate the \(x\) term from the second equation and \(x\) and \(y\) terms from the third equation. We can get there by subtracting the right multiples of some of the equations from others to eliminate unknowns. To eliminate the \(x\) term from the second equation, we can subtract \(4\) times the first equation from the second (on both sides). Then for the second equation we have

\[ \begin{align*} 4x + 5y + 6z - 4(x + 2y + 3z) &= 3 - 4(2) \\ 4x - 4x + 5y - 8y + 6z - 12z &= -5 \\ -3y - 6z &= -5. \end{align*} \]

After eliminating the \(x\) term from the second equation we have a new system:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x - 3y - 6z &= -5 \\ 7x + 8y + 10z &= 11 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & -3 & -6 \\ 7 & 8 & 10}\vxyz{x}{y}{z} = \vxyz{2}{-5}{11} \]

Then we can repeat the process for the \(x\) term of the third equation by subtracting \(7\) times the first equation from the third:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x - 3y - 6z &= -5 \\ 0x - 6y - 11z &= -3 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -11}\vxyz{x}{y}{z} = \vxyz{2}{-5}{-3} \]

Lastly, to eliminate the \(y\) term from the third equation, we can subtract \(2\) times the second equation from the third:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x - 3y - 6z &= -5 \\ 0x + 0y + z &= 7 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 1}\vxyz{x}{y}{z} = \vxyz{2}{-5}{7} \]

What we end up with is a simplified form of the original system in which it is easy to tell that \(z = 7\). Given that, we can back-substitute \(z\) into the second equation to solve for \(y\) and similarly for the first equation to solve for \(x\).
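Carrying out that back-substitution for this example:

\[ \begin{align*} z &= 7 \\ -3y - 6(7) &= -5 \implies y = -\tfrac{37}{3} \\ x + 2\bigl(-\tfrac{37}{3}\bigr) + 3(7) &= 2 \implies x = \tfrac{17}{3} \end{align*} \]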

The final form of the matrix after elimination is called an upper-triangular matrix: it has zeros below the main diagonal. Once we have an upper-triangular matrix in an equation like the one above, it is easier to solve for all of the unknowns by back-substitution. But reaching that form might not be possible, as we'll see next.
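To see the whole procedure in one place, here is a minimal Python sketch of forward elimination followed by back-substitution. It is only an illustration, and it assumes every pivot is non-zero, which is exactly the condition we examine next:

```python
def solve(A, b):
    """Solve Ax = b by forward elimination and back-substitution.
    A sketch: assumes every pivot is non-zero (see below for what
    happens when that assumption fails). Mutates A and b in place."""
    n = len(A)
    # Forward elimination: zero out the entries below each pivot.
    for i in range(n):
        for j in range(i + 1, n):
            m = A[j][i] / A[i][i]        # multiplier for row j
            for k in range(i, n):
                A[j][k] -= m * A[i][k]   # subtract m times row i from row j
            b[j] -= m * b[i]             # same operation on the right-hand side
    # Back-substitution: solve for the unknowns from the bottom up.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - known) / A[i][i]
    return x

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
b = [2.0, 3.0, 11.0]
print(solve(A, b))  # [5.666..., -12.333..., 7.0], i.e. x = 17/3, y = -37/3, z = 7
```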

When Elimination Fails

The example above worked out nicely: we kept subtracting multiples of rows from other rows until we ended up with an upper-triangular matrix. But elimination doesn't always go so smoothly: what if we inadvertently eliminate too many unknowns during an elimination step?

Consider the above example, slightly modified so that the final row of the original system has \(9\) instead of \(10\) for the coefficient of \(z\):

\[ \begin{align*} x + 2y + 3z &= 2 \\ 4x + 5y + 6z &= 3 \\ 7x + 8y + 9z &= 11 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9}\vxyz{x}{y}{z} = \vxyz{2}{3}{11} \]

After eliminating the \(x\) terms from the second and third equations we'll have:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x - 3y - 6z &= -5 \\ 0x - 6y - 12z &= -3 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12}\vxyz{x}{y}{z} = \vxyz{2}{-5}{-3} \]

But when we go to eliminate the \(y\) term from the third row by subtracting \(2\) times the second row as before, we'll get:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x - 3y - 6z &= -5 \\ 0x + 0y + 0z &= 7 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 0}\vxyz{x}{y}{z} = \vxyz{2}{-5}{7} \]

That step left us with the equation \(0z = 7\), i.e. \(0 = 7\), which is clearly false. So this system has no solution: there is no value of \(z\) such that \(0z = 7\).

This kind of failure in elimination reveals an important point: all of the entries on the main diagonal after elimination (\(1\), \(-3\), and \(0\) in this case) must be non-zero or elimination fails. These diagonal entries are called pivots. As we saw, a zero pivot in an equation like \(0z = 7\) means there is no solution at all, while a zero pivot in an equation like \(0z = 0\) would mean there are an infinite number of solutions! So elimination must yield a full set of non-zero pivots or we won't be able to find a unique solution.
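In the sketch above, a zero pivot would surface as a division by zero. A small guard, again purely illustrative, makes the failure explicit:

```python
def check_pivot(A, i, eps=1e-12):
    # A pivot that is zero (or numerically indistinguishable from zero)
    # means elimination cannot continue with the current row ordering.
    if abs(A[i][i]) < eps:
        raise ZeroDivisionError(f"zero pivot in row {i}; try a row exchange")
```

Calling `check_pivot(A, i)` at the top of the elimination loop in `solve` turns a silent failure into a clear error, and the error message points at the remedy we turn to next.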

Row Exchanges

As we proceed through elimination of unknowns we construct an upper-triangular matrix. But depending on the matrix we may end up eliminating some unknowns too "early" in the process, yielding a matrix that can't be used for back-substitution as-is. Consider the following example:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x + 0y - 2z &= 6 \\ 0x + y + z &= 7 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & 0 & -2 \\ 0 & 1 & 1}\vxyz{x}{y}{z} = \vxyz{2}{6}{7} \]

This matrix doesn't fit the pattern we need to continue elimination: the pivot position in the second row is zero, so we can't use the second equation to eliminate the \(y\) term from the third equation. We could be given such a matrix directly, or we could arrive at it as the intermediate state of some process of elimination. Either way, elimination can't proceed on this matrix until we perform a row exchange and swap the second and third rows:

\[ \begin{align*} x + 2y + 3z &= 2 \\ 0x + y + z &= 7 \\ 0x + 0y - 2z &= 6 \end{align*} \hskip{0.5in} \mat{1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & -2}\vxyz{x}{y}{z} = \vxyz{2}{7}{6} \]

Now elimination can proceed as usual (although in this case elimination is done and we can just solve).
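In code, a row exchange is just a swap of two coefficient rows together with the corresponding right-hand-side entries. The sketch below (hypothetical helper names, not from this series) also shows partial pivoting, a standard refinement that always swaps in the row with the largest available pivot:

```python
def exchange_rows(A, b, i, j):
    # Swap equations i and j: the coefficient rows and the
    # right-hand-side entries must move together.
    A[i], A[j] = A[j], A[i]
    b[i], b[j] = b[j], b[i]

def choose_pivot_row(A, i):
    # Partial pivoting: among row i and the rows below it, pick the
    # one with the largest entry (in absolute value) in column i.
    return max(range(i, len(A)), key=lambda r: abs(A[r][i]))
```

Inserting `r = choose_pivot_row(A, i)` followed by `exchange_rows(A, b, i, r)` at the top of the elimination loop handles examples like the one above automatically; if the chosen pivot is still zero, the column has no usable pivot at all and no exchange can save us.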

Why Add or Subtract Whole Equations?

At first it might seem arbitrary that we decide to start adding and subtracting equations to eliminate unknowns. But this is a perfectly legal thing to do with systems of linear equations for two reasons. First, both sides of an equation stand for the same value, so subtracting a multiple of one equation from another subtracts equal quantities from both sides and preserves equality. Second, each elimination step is reversible, so no solutions are gained or lost along the way.
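In symbols: if \(a = b\) and \(c = d\), then subtracting \(\lambda\) times the second equation from the first gives

\[ a - \lambda c = b - \lambda d, \]

which still holds because we subtracted equal quantities (\(\lambda c = \lambda d\)) from both sides. And since we can add \(\lambda c = \lambda d\) right back, the step is reversible: the new system has exactly the same solutions as the old one.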

Afterword

In this installment we've looked at the process of elimination for solving systems of linear equations. We've also looked at elimination's two failure modes: zero pivots (permanent failure) and row ordering problems (temporary failure). Next time we'll look at matrix multiplication and how we can express elimination as a product of matrices.