\( \newcommand{\vxy}[2]{\begin{bmatrix}#1 \\ #2\end{bmatrix}} \newcommand{\vxyz}[3]{\begin{bmatrix}#1 \\ #2 \\ #3\end{bmatrix}} \newcommand{\uu}[0]{\mathbf{u}} \newcommand{\uuu}[0]{\mathbf{\hat{u}}} \newcommand{\vv}[0]{\mathbf{v}} \newcommand{\vvv}[0]{\mathbf{\hat{v}}} \newcommand{\ww}[0]{\mathbf{w}} \newcommand{\sm}[1]{\bigl[\begin{smallmatrix}#1\end{smallmatrix}\bigr]} \)

Linear Algebra, Part 2: the Dot Product

posted February 20, 2013

In Part 1 of this series, we looked at vectors along with some tools for reasoning about them in a geometric setting. In this post we'll go forward with a big idea about vectors: the dot product.

Motivation

One thing we want to be able to do when dealing with two vectors is to measure the angle between them. We might use vectors to model light reflectance or moving objects; in these situations we need to be able to compute the angle between the vectors involved.

We'll start, as pictured in Figure 1, with two unit vectors. The vectors involved are

\[ \uu = \vxy{\cos \theta}{\sin \theta} ~~~~ \vv = \vxy{1}{0}. \]

Just looking at the picture we can tell that the angle between this particular pair of vectors is \(\theta\), which can be obtained by computing \(\cos^{-1}(\cos \theta) = \cos^{-1}{\uu_1}\).
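
As a quick check of this simple case, here's a minimal sketch in Python (the particular angle of \(40\) degrees is an arbitrary choice for illustration):

```python
import math

# An arbitrary angle for illustration.
theta = math.radians(40.0)

# u lies on the unit circle at angle theta; v lies along the x axis.
u = (math.cos(theta), math.sin(theta))
v = (1.0, 0.0)

# Recover the angle from u's first component alone, as in Figure 1.
print(math.degrees(math.acos(u[0])))  # ~40.0
```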

If, as in Figure 2, \(\uu\) and \(\vv\) were any unit vectors, how would we compute the angle between them? Instead of having one vector conveniently on the \(x\) axis, \(\vv\) and \(\uu\) are rotated through angles \(\alpha\) and \(\beta\), respectively. In that case we have

\[ \vv = \vxy{\cos \alpha}{\sin \alpha} ~~~~ \uu = \vxy{\cos \beta}{\sin \beta}. \]

It turns out that we can use an identity from trigonometry to deal with this situation:

\[ \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta) = \cos({\beta - \alpha}). \]

The angle on the right-hand side is exactly the one we're trying to compute: \(\theta = \beta - \alpha\). In terms of the components of the vectors under consideration, the identity becomes

\[ \uu_1\vv_1 + \uu_2\vv_2 = \cos(\theta). \]

So that gives us a general way to find the angle between two unit vectors, regardless of how they are oriented.
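
Here's a small numeric check of that result in Python; the angles \(\alpha\) and \(\beta\) below are arbitrary choices, picked only so the answer is easy to recognize:

```python
import math

# Arbitrary angles for the two unit vectors.
alpha, beta = math.radians(25.0), math.radians(70.0)
v = (math.cos(alpha), math.sin(alpha))
u = (math.cos(beta), math.sin(beta))

lhs = u[0] * v[0] + u[1] * v[1]      # u1*v1 + u2*v2
rhs = math.cos(beta - alpha)         # cos(theta)
print(lhs, rhs)                      # both ~0.7071
print(math.degrees(math.acos(lhs)))  # ~45.0, the angle between u and v
```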

What if \(\uu\) and \(\vv\) aren't unit vectors? We know by now that we can make them unit vectors by dividing them by their lengths. So that gives us the following (notice the transition from unit vectors to the original \(\uu\) and \(\vv\)):

\[ \begin{align*} \cos \theta &= \uuu_1\vvv_1 + \uuu_2\vvv_2 \\ &= \frac{\uu_1}{\|\uu\|}\frac{\vv_1}{\|\vv\|} + \frac{\uu_2}{\|\uu\|}\frac{\vv_2}{\|\vv\|} \\ &= \frac{\uu_1\vv_1 + \uu_2\vv_2}{\|\uu\|\|\vv\|}. \end{align*} \]

Written another way,

\[ \|\uu\|\|\vv\| \cos \theta = \uu_1\vv_1 + \uu_2\vv_2. \]

The right-hand side of the equation is so common that there is a shorthand for it called the dot product, written

\[ \uu \cdot \vv = \uu_1\vv_1 + \uu_2\vv_2. \]

(In fact, this is just the version for two-element vectors; it applies similarly for vectors in any number of dimensions.)
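
As a sketch of that generalization, a dot product for vectors with any number of components might look like this in Python (the function name dot is mine, not anything standard):

```python
def dot(u, v):
    """Dot product of two same-length sequences of numbers."""
    assert len(u) == len(v)
    return sum(ui * vi for ui, vi in zip(u, v))

print(dot((1, 2), (3, 4)))        # 1*3 + 2*4 = 11
print(dot((1, 2, 3), (4, 5, 6)))  # 4 + 10 + 18 = 32
```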

So we can rewrite the formula above as

\[ \uu \cdot \vv = \|\uu\|\|\vv\| \cos \theta. \]

One way to think about the dot product of two vectors is that it expresses the cosine of the angle between them, scaled by their lengths.
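
That interpretation gives a direct way to compute the angle between two arbitrary vectors. Here's a minimal sketch in Python; the helper names are my own, and the example vectors are arbitrary:

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle between u and v in degrees, from u . v = ||u|| ||v|| cos(theta)."""
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

print(angle_between((3.0, 0.0), (0.0, 2.0)))  # 90.0
print(angle_between((1.0, 1.0), (1.0, 0.0)))  # ~45.0
```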

Dot Product Properties

The dot product has some properties we'll need to rely on soon. They are:

\[ \begin{align*} \uu \cdot \vv &= \vv \cdot \uu \\ \uu_1\vv_1 + \uu_2\vv_2 &= \vv_1\uu_1 + \vv_2\uu_2. \end{align*} \]

\[ \begin{align*} \uu \cdot (\vv + \ww) &= \uu_1(\vv_1 + \ww_1) + \uu_2(\vv_2 + \ww_2) \\ &= \uu_1\vv_1 + \uu_1\ww_1 + \uu_2\vv_2 + \uu_2\ww_2 \\ &= (\uu_1\vv_1 + \uu_2\vv_2) + (\uu_1\ww_1 + \uu_2\ww_2) \\ &= (\uu \cdot \vv) + (\uu \cdot \ww). \end{align*} \]

\[ \begin{align*} (c\uu) \cdot \vv &= c\uu_1\vv_1 + c\uu_2\vv_2 \\ &= c(\uu_1\vv_1 + \uu_2\vv_2) \\ &= c(\uu \cdot \vv). \end{align*} \]
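
None of these properties are hard to check numerically. A quick sketch in Python, with arbitrarily chosen vectors and scalar:

```python
# Arbitrary vectors and scalar for the check.
u, v, w, c = (2.0, -1.0), (3.0, 4.0), (-1.0, 5.0), 2.5

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def scale(s, a):
    return (s * a[0], s * a[1])

print(dot(u, v) == dot(v, u))                      # commutativity
print(dot(u, add(v, w)) == dot(u, v) + dot(u, w))  # distributes over addition
print(dot(scale(c, u), v) == c * dot(u, v))        # scalars factor out
```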

The Schwarz Inequality

Think back to the two unit vectors in Figure 1. If we start with both of those vectors at \(\sm{1\\0}\) and rotate one of them counter-clockwise around the unit circle, the angle between them will go from \(0\) degrees to \(360\) degrees. At the same time, the cosine of the angle will go from \(1\) to \(-1\) (at \(180\) degrees) and back to \(1\) at \(360\) degrees; see Figure 3.

Given that the dot product is a cosine "scaled" by the lengths of the vectors \(\uu\) and \(\vv\), this means that for any two such vectors, as we let the angle between them range from \(0\) to \(360\) degrees, the dot product will range from \(\|\uu\|\|\vv\|\) down to \(-\|\uu\|\|\vv\|\) and back.

See Figure 4 for the behavior of \(\uu \cdot \vv\) where, say, \(\|\uu\| = 3\) and \(\|\vv\| = 1\).
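
To reproduce that behavior, we can sweep the angle and watch the dot product move between \(3\) and \(-3\). A sketch in Python, keeping \(\vv\) fixed along the \(x\) axis for simplicity:

```python
import math

# As in Figure 4: ||u|| = 3 and ||v|| = 1, with v fixed along the x axis.
len_u, v = 3.0, (1.0, 0.0)

for degrees in range(0, 361, 45):
    theta = math.radians(degrees)
    u = (len_u * math.cos(theta), len_u * math.sin(theta))
    d = u[0] * v[0] + u[1] * v[1]
    print(degrees, round(d, 3))  # 3 at 0, down to -3 at 180, back to 3 at 360
```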

When does the dot product equal zero? Exactly when the cosine equals zero, that is, when the vectors are perpendicular (see Figure 4). To see why this happens, assume that two vectors \(\uu\) and \(\vv\) are perpendicular. Then they can be arranged as the legs of a right triangle whose third side is \(\uu - \vv\), so the Pythagorean Theorem applies. Given that (for any vector \(\vv\)) \(\|\vv\|^2 = \vv \cdot \vv\), and given the properties of the dot product discussed above,

\[ \begin{align*} \|\uu - \vv\|^2 &= \|\uu\|^2 + \|\vv\|^2 \\ (\uu - \vv) \cdot (\uu - \vv) &= (\uu \cdot \uu) + (\vv \cdot \vv) \\ \uu \cdot \uu - \uu \cdot \vv - \vv \cdot \uu + \vv \cdot \vv &= (\uu \cdot \uu) + (\vv \cdot \vv) \\ \cancel{\uu \cdot \uu} - 2(\uu \cdot \vv) + \cancel{\vv \cdot \vv} &= \cancel{\uu \cdot \uu} + \cancel{\vv \cdot \vv} \\ -2(\uu \cdot \vv) &= 0 \\ \uu \cdot \vv &= 0. \end{align*} \]
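
A quick numeric version of the same argument, using an arbitrary vector and its \(90\)-degree rotation:

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

# u is arbitrary; v is u rotated by 90 degrees, so the two are perpendicular.
u = (2.0, 1.0)
v = (-u[1], u[0])

print(dot(u, v))  # 0.0

# The Pythagorean relationship holds for the triangle with sides u, v, and u - v.
diff = (u[0] - v[0], u[1] - v[1])
print(math.isclose(dot(diff, diff), dot(u, u) + dot(v, v)))  # True
```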

So we get a lot of information from the dot product: it is positive when the angle between the vectors is less than \(90\) degrees, zero when they are perpendicular, and negative when the angle is greater than \(90\) degrees; and its magnitude can never exceed the product \(\|\uu\|\|\vv\|\).

All of these facts come together in the Schwarz inequality and in Figure 4:

\[ |\uu \cdot \vv| \le \|\uu\|\|\vv\|. \]

This captures the idea that the dot product of two vectors is a cosine, which oscillates between \(1\) and \(-1\), scaled by the product of their lengths.
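
The inequality is easy to spot-check with random vectors. A sketch in Python (the dimension, range, and seed are arbitrary choices):

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

random.seed(0)  # arbitrary seed, just for repeatability
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in range(3)]
    v = [random.uniform(-10, 10) for _ in range(3)]
    # Schwarz inequality, with a small tolerance for floating-point error.
    assert abs(dot(u, v)) <= norm(u) * norm(v) + 1e-9

print("the Schwarz inequality held in every trial")
```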

The Law of Cosines

The relationship between the dot product and the angle between vectors goes deeper. Given two vectors \(\uu\) and \(\vv\) separated by an angle \(\theta\) as we had above, call the third side of the triangle they form \(\ww = \uu - \vv\). Then we can expand \(\ww \cdot \ww\) as

\[ \begin{align*} \ww \cdot \ww &= (\uu - \vv) \cdot (\uu - \vv) \\ &= (\uu \cdot (\uu - \vv)) - (\vv \cdot (\uu - \vv)) \\ &= \uu \cdot \uu - \uu \cdot \vv - \vv \cdot \uu + \vv \cdot \vv \\ &= \uu \cdot \uu - 2\uu \cdot \vv + \vv \cdot \vv. \end{align*} \]

Since we know that \(\uu \cdot \vv = \|\uu\|\|\vv\| \cos \theta\), we can rewrite the last step as

\[ \|\ww\|^2 = \|\uu\|^2 - 2\|\uu\|\|\vv\| \cos \theta + \|\vv\|^2. \]

This is a generalization of the Pythagorean theorem! In the special case \(\theta = 90^\circ\), we have \(\cos \theta = 0\), so the term \(2\|\uu\|\|\vv\|\cos \theta\) vanishes, giving the familiar equation

\[ \|\ww\|^2 = \|\uu\|^2 + \|\vv\|^2. \]
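
Here's a numeric check of the general identity for a pair of arbitrarily chosen vectors, again in Python:

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def norm(a):
    return math.sqrt(dot(a, a))

# Arbitrary vectors; w is the third side of the triangle they form.
u, v = (4.0, 1.0), (1.0, 3.0)
w = (u[0] - v[0], u[1] - v[1])

cos_theta = dot(u, v) / (norm(u) * norm(v))
lhs = norm(w) ** 2
rhs = norm(u) ** 2 - 2 * norm(u) * norm(v) * cos_theta + norm(v) ** 2
print(math.isclose(lhs, rhs))  # True
```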

The Triangle Inequality

The triangle inequality is a statement of something you've heard before: the shortest path between two points is a straight line between them. But here we state it in terms of vectors:

\[ \|\uu + \vv\| \le \|\uu\| + \|\vv\|. \]

In vector terms, the length of the vector to the point \(\uu + \vv\) is never greater than the length of the path taken by going along one vector \(\uu\), then along the other vector \(\vv\). The two sides are equal only when \(\uu\) and \(\vv\) point in the same direction; otherwise \(\|\uu\| + \|\vv\|\) is greater.
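
A last sketch in Python, checking both the inequality and the equality case (the vectors are arbitrary; the second pair point in the same direction):

```python
import math

def norm(a):
    return math.sqrt(sum(x * x for x in a))

# An arbitrary pair of vectors: the sum's length is strictly smaller.
u, v = (3.0, 4.0), (5.0, -2.0)
s = (u[0] + v[0], u[1] + v[1])
print(norm(s) <= norm(u) + norm(v))  # True: ~8.25 <= ~10.39

# A pair pointing in the same direction: the two sides are equal.
u, v = (1.0, 2.0), (2.0, 4.0)
s = (u[0] + v[0], u[1] + v[1])
print(math.isclose(norm(s), norm(u) + norm(v)))  # True
```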

Afterword

The vector is the basic building block of geometric intuition about solving systems of linear equations, as we'll see in future posts in this series. This post and Part 1 should give a pretty good foundation.