Mathematical education in grade school (and, to a certain extent, college) teaches the mechanics of mathematical problem-solving. Students learn to identify specific types of problems and to mechanically apply techniques to solve them. As a result, this type of math is little more than pattern recognition. You might recall this sort of pattern recognition when doing everything from word problems on up to applying the rules of differentiation and integration in Calculus. However, these problem-solving techniques are based on underlying principles. They work because those principles are logically consistent.

By "consistent" I refer to the logical consistency of mathematical ideas in relation to one another. In school you learn about the various arithmetic operations -- addition, subtraction, etc. -- and how they relate to each other using distributivity, associativity, and commutativity. These operations always work and they always yield the same results when applied in the same ways. So we say that these rules together are consistent.

When problems and solutions become more complex than mere arithmetic problems, and when new ideas are produced or new structures invented, we need to be sure that we can take advantage of these new phenomena with confidence that they will behave correctly. So consistency is a central concern in mathematics at every level.

The way that mathematicians handle this concern is by making convincing, step-by-step arguments about the tools they employ. These arguments are called **proofs**.

A proof, as noted above, is just an argument. However, it must be a clear, convincing one that draws a firm conclusion from one step to the next, leaving no room for guessing or speculation. This is done in plain prose, often with the help of the symbols used to represent the concepts being addressed.

A proof may provide a solid argument, but often it does not explain *why* the result holds; it might not seem to provide the *intuition* needed to understand it. That intuition is part of the thought process that goes into constructing the proof: it's the background understanding of the problem at hand. Most proofs are terse and will not provide this understanding explicitly, because the purpose of a proof is to provide a sound foundation. All of the so-called "scratch work" that went into actually constructing the proof is gone, leaving a final result that is trimmed and clean. You'll have to learn how to read proofs to build your own understanding, and that takes practice in both reading and writing proofs.

A fundamental tool in mathematics is the notion of a "set." Many mathematics texts include a first chapter on how to use sets. This branch of mathematics -- set theory -- is the study of the behaviors and rules that govern how we can use and think about sets.

A set is just a collection of elements, possibly empty. Unlike with some other mathematical structures, its elements are not *ordered*. A set is denoted with braces, and some example sets are \(\{\},\) \(\{1\},\) \(\{\mbox{cat}, \mbox{dog}, \mbox{fox}\},\) and \(\{\ldots, -4, -2, 0, 2, 4, \ldots\}.\)

Like numbers, sets support various operations. One such operation is the "union" operation, written \(\cup\). The union of two sets is the set of all elements that appear in either of the original sets. We write a union as \(C = A \cup B.\) For example:

\[\{1, 3, 5\} \cup \{3, 5, 7, 9\} = \{1,3,5,7,9\}\]

\[\{14\} \cup \{\} = \{14\}\]
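These union examples can be reproduced with Python's built-in `set` type (one of many possible choices; any language with a set library would work the same way):

```python
# Python sets behave like mathematical sets: unordered, no duplicate elements.
A = {1, 3, 5}
B = {3, 5, 7, 9}

# The | operator computes the union of two sets.
print(A | B)          # {1, 3, 5, 7, 9}

# Note: the empty set must be written set() in Python,
# because {} denotes an empty dictionary instead.
print({14} | set())   # {14}
```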

Furthermore, we write the size of a set -- the number of elements contained in it -- as \(|A|\). Thus \(|\{\}| = 0\) and \(|\{1, 2\}| = 2\). We also refer to elements in the set by saying that \(x \in A\) if \(x\) is one of the elements in \(A\), and similarly, \(x \notin A\) if \(x\) is not an element of \(A\).
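The size and membership notation also has direct counterparts in Python, as a quick sketch shows: \(|A|\) corresponds to `len`, and \(\in\) / \(\notin\) correspond to the `in` and `not in` operators.

```python
A = {1, 2}

# |A| is the number of elements in A.
print(len(set()))   # 0, the size of the empty set
print(len(A))       # 2

# x in A mirrors x ∈ A; x not in A mirrors x ∉ A.
print(1 in A)       # True
print(3 not in A)   # True
```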

Now, let's suppose (for some reason) that it's useful to investigate the following *conjecture*, which says that given two sets, the size of their union is *no bigger than* their sizes combined:

\[|A \cup B| \le |A| + |B|\]

Common sense might tell you that this is true, but we need to prove it beyond a shadow of a doubt *for all possible sets \(A\) and \(B\)*.

Let's start with a few concrete examples to check the conjecture.

(Example 1) What if \(A\) and \(B\) are both empty?

\[|\{\} \cup \{\}| \le |\{\}| + |\{\}|\]

\[0 \le 0 + 0\]

(Example 2) If only \(A\) is empty:

\[|\{\} \cup \{1, 2\}| \le |\{\}| + |\{1, 2\}|\]

\[2 \le 0 + 2\]

(Example 3) If \(A\) and \(B\) are equal and non-empty:

\[|\{1, 2\} \cup \{1, 2\}| \le |\{1, 2\}| + |\{1, 2\}|\]

\[2 \le 2 + 2\]
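The three examples above can be checked mechanically. Here is a small sketch, with a helper function (`check`, a name chosen for illustration) that tests the conjecture for one particular pair of sets:

```python
def check(A, B):
    """Return True if |A ∪ B| <= |A| + |B| holds for this pair of sets."""
    return len(A | B) <= len(A) + len(B)

print(check(set(), set()))      # Example 1: both sets empty -> True
print(check(set(), {1, 2}))     # Example 2: only A empty -> True
print(check({1, 2}, {1, 2}))    # Example 3: equal, non-empty -> True
```

Of course, no number of passing examples proves the conjecture; they only build confidence that it is worth proving.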

Now look at the relation \(\le\). It suggests that we can partition the inequality into two cases:

\[\mbox{(Case 1) } |A \cup B| < |A| + |B|\]

and

\[\mbox{(Case 2) } |A \cup B| = |A| + |B|\]

In Case 1, \(A\) and \(B\) have some elements in common; they must, since the size of the union is smaller, which means that some elements are counted twice on the right-hand side of the inequality (see Example 3 above). In Case 2, however, the sets are *disjoint*, meaning they share no common elements. In this case, each element of \(A\) and of \(B\) accounts for exactly one element of \(A \cup B\), so the sizes are the same.
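The two cases correspond to whether or not the sets share elements, which a brief sketch makes concrete:

```python
A, B = {1, 2}, {3, 4}   # disjoint sets
C, D = {1, 2}, {2, 3}   # overlapping sets (they share the element 2)

# Case 2: disjoint -- the union's size equals the sum of the sizes.
print(len(A | B) == len(A) + len(B))   # True

# Case 1: overlapping -- the shared element 2 is counted twice in
# len(C) + len(D) but only once in the union, so the union is strictly smaller.
print(len(C | D) < len(C) + len(D))    # True
```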

Based on the examples and our understanding above, we can make an argument about the conjecture:

We show that for any sets \(A\) and \(B\), \(|A \cup B| \le |A| + |B|\). Suppose that \(A\) and \(B\) are disjoint. Then \(A\) and \(B\) share no common elements, so \(|A \cup B| = |A| + |B|\), since every element present in either set accounts for exactly one element of the union. Now instead suppose that \(A\) and \(B\) are not disjoint. Then there is at least one element \(x \in A\) such that \(x \in B\). Each such shared element is counted twice in \(|A| + |B|\) but appears only once in \(A \cup B\), so \(|A \cup B| < |A| + |B|\). In both cases, \(|A \cup B| \le |A| + |B|\). \(\square\)

This sort of argument is in fact a proof; it covers all of the possible cases and illustrates how the conjecture holds in each. Thus, we have proven it and now call it a *theorem*. Now, when we encounter a problem that relies on set sizes and unions, we may be able to use this result to make further proofs possible.
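A proof covers all infinitely many cases at once, but a brute-force check over every pair of subsets of a small universe can serve as a sanity check (not a substitute for the proof). A sketch, using a small helper (`subsets`, a name chosen here for illustration) to enumerate all subsets:

```python
from itertools import chain, combinations

def subsets(universe):
    """Yield every subset of the universe as a frozenset."""
    items = list(universe)
    return (frozenset(c) for c in
            chain.from_iterable(combinations(items, r)
                                for r in range(len(items) + 1)))

universe = {1, 2, 3, 4}

# Verify the theorem for all 16 x 16 pairs of subsets of the universe.
assert all(len(A | B) <= len(A) + len(B)
           for A in subsets(universe)
           for B in subsets(universe))
print("conjecture holds for every pair of subsets of", universe)
```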