Mathematics review for CS 465: Language, Logic, and Computation
These informal notes constitute a quick review of the math background you should have seen before coming to CS 465. It’s not necessary that you have seen everything here before, but you should review it all and cover more thoroughly those concepts you haven’t seen before or which you feel a little unsteady about. We will in any case have some short homework exercises that will help you recover or practice those topics.
Many of the topics covered here will be revisited in slightly different forms as we proceed through the semester—for example, we will look at numbers and functions from a perspective which takes into account the concerns of computer science, as well as the usual ideas based on the treatment they get in mathematics. Overall, the treatment given below is partly an attempt to set the stage for issues that will arise later in the semester.
Arithmetic and high school algebra
You should be familiar with basic arithmetic from before high school: things like counting numbers (0, 1, 2, …), as well as negative numbers, “fractions”, and the real numbers. (You might also recognize complex numbers, those built from i, the square root of –1.) These sets of numbers are usually given the following names, in “blackboard bold” font, in math classes:
• ℕ: the natural numbers or whole counting numbers (starting at either 0 or 1, depending on context & taste);
• ℤ: the integers, which add in the negative “whole” numbers;
• ℚ: the rational numbers, those expressible as fractions of integers (or with eventually repeating decimal expansions);
• ℝ: the “real” numbers, corresponding to points on a line, and including irrational numbers such as √2, e and π.
You should have some idea about how numbers can be written in different bases, e.g., the usual base 10, but also base 2 or 16 (used a lot in computing) or even bases like 7 or 29. (We will in any case review these ideas when we talk about numbers and numerals.)
In addition (heh :) ), you should be familiar with the usual arithmetic operators, as defined to various extents over the several types of numbers: addition, multiplication, exponentiation, subtraction, division, roots, and logarithms (the latter two are “inverses” of exponentiation, so to speak, on the left and right). If you have trouble with logarithms, recall that the “log” of a number is roughly the number of digits you need to write that number out in the same base as the log.
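For instance, the digit-counting intuition is easy to play with in a quick Python sketch (the function name digits is just ours, for illustration):

    # Counting the digits of n written in base b: roughly the log, base b, of n.
    import math

    def digits(n, b):
        """Number of digits of a positive integer n written in base b."""
        count = 0
        while n > 0:
            n //= b
            count += 1
        return count

    for n in [7, 100, 1024, 10**6]:
        print(n, digits(n, 10), math.log10(n))  # digits(n, 10) == floor(log10(n)) + 1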
the idea of a function
Moving from arithmetic to algebra, recall that when we write out an arithmetic expression using the operators mentioned above, but also include a variable, like x or y, we can manipulate the expression in various ways (see below), but we (usually) can’t evaluate it fully until we know what value the variable should take: in this sense the expression defines a function, i.e., a rule which provides a “result value” (or output) for every possible value of the variable (or input), modulo some concerns about things like division by 0 (or similar issues, depending on the operations and types of numbers involved). You are probably used to graphing these functions as lines or curves in the plane based on a system of Cartesian co-ordinates (“x and y axes”), and used to expressing them in terms such as f(x) = x² + 3 or y = x² + 3.
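(If you prefer code to graphs, the same expression becomes a one-line Python function; a trivial sketch, just to connect the notations:)

    # The expression x² + 3 read as a rule from inputs to outputs.
    def f(x):
        return x**2 + 3

    print(f(2))   # 7
    print(f(-1))  # 4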
generalizing functions away from numbers
In higher-level mathematics and computer science, we often consider functions from and to other sets, beyond the usual types of numbers, for example from numbers to truth values, or on sets of numbers, strings of characters, and various “data structures”. Note that computer science “functions” (also called sub-routines, procedures or methods) are different from the usual math ones, in that they are defined in different terms, may access data beyond their inputs (e.g., value fields of an object or class), and may cause various side-effects (output, changes to variables, etc.), including the possibility that they “throw an exception” or otherwise fail to terminate, thus not returning a result at all (more on this below).
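As a small, made-up Python illustration of the contrast, the “function” below reads and changes state beyond its input, and may fail to return a value at all:

    # Not a math-style function: it has side-effects and can raise an exception.
    log = []  # state living outside the function

    def record_and_divide(x, y):
        log.append((x, y))  # side-effect: changes a variable elsewhere
        return x / y        # may “throw” (ZeroDivisionError) instead of returning

    record_and_divide(6, 2)    # returns 3.0, and log now holds one entry
    # record_and_divide(1, 0)  # would raise an exception, returning no result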
equations with variables & their solution
In addition to using variables as a way of defining functions, you have probably also used them in equations that you “solve” by applying various rules to both sides, “canceling” and simplifying until you have just a variable on one side (usually the left). Equations so solved can then be thought of as defining a function in terms of another variable (or more), for example by getting the Cartesian y in terms of x. In more advanced math, equations are used to express various laws or rules that hold in certain situations. In abstract algebra, such equations are used to characterize the behavior of various abstract sets or structures. This is something like the way interfaces are used in programming languages such as Java: for example, we might characterize the behavior of stacks by writing out various laws regarding how the operations push and pop behave on stacks. Of course, in Java we don't write the equations themselves as part of the code: rather, a Java interface just defines which actual operations (and their types) are defined for the structures involved, and when we describe the behavior we usually do so informally, if at all. In math, equations are used to give a more formal expression to the idea that, for example, the last item that gets pushed on a stack will be the next one to be popped, as long as nothing else is pushed in between.
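As a sketch of the idea (in Python rather than Java, and with invented names), we can express the push/pop law as a checked assertion rather than as prose:

    # A tiny stack, plus a check of the law: the last item pushed is the next popped.
    class Stack:
        def __init__(self):
            self.items = []
        def push(self, x):
            self.items.append(x)
        def pop(self):
            return self.items.pop()  # removes and returns the most recent push

    s = Stack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2  # nothing was pushed after 2, so 2 pops first
    assert s.pop() == 1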
basic laws of algebra (especially exponents)
Returning now to the rules of algebra mentioned above, you should be familiar with the following laws that help to characterize the behavior of numbers (of various types); you should now think of the variables as “universally quantified”, meaning that each equation is true for all relevant values of its variables. (Eventually we will have to allow for some exceptions, depending on the operators and numbers involved, e.g., division by zero.)
• 0 acts as the “identity” of addition: x+0 = x
• addition is commutative: x+y = y+x
• addition is associative: x+(y+z) = (x+y)+z
• 1 acts as the identity of multiplication: x·1 = x
• multiplication is commutative: x·y = y·x
• multiplication is associative: x·(y·z) = (x·y)·z
• 0 “cancels” in multiplication: x·0 = 0
• multiplication distributes across addition: x·(y+z) = x·y + x·z
• 1 cancels on the left of exponentiation: 1^x = 1
• 1 is the identity on the right of exponentiation: x^1 = x
• exponentiation distributes on the right of multiplication: (x·y)^z = x^z · y^z
• exponentiation “distributes as multiplication” over addition in the exponent: x^(y+z) = (x^y) · (x^z)
• iterated exponents multiply: x^(y·z) = (x^y)^z
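If you like, you can spot-check these laws on small natural numbers with a few lines of Python (tests, of course, not proofs):

    # Spot-checking some of the laws above over small natural numbers.
    for x in range(1, 5):
        for y in range(1, 5):
            for z in range(1, 5):
                assert x * (y + z) == x * y + x * z      # distributivity
                assert (x * y) ** z == x ** z * y ** z   # (x·y)^z = x^z · y^z
                assert x ** (y + z) == x ** y * x ** z   # x^(y+z) = x^y · x^z
                assert x ** (y * z) == (x ** y) ** z     # x^(y·z) = (x^y)^z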
Set theory

general idea (foundations for mathematics)
Set theory is a mathematical theory used to talk about collections of things in a precise way. It is a very powerful and flexible theory, so much so that it can be used as a foundation for all of mathematics (or at least so the story goes …). Most mathematicians learn a good amount of basic set theory as part of their undergraduate coursework, seeing how many math concepts can be expressed in terms of sets (we sometimes say “encoded” as sets). However, most working mathematicians then return to working with the concepts relevant to their sub-field of math directly, without thinking about them so much in terms of sets, unless it is particularly convenient or necessary for the proper level of rigor. For example, number theorists work with numbers, topologists with “shapes”, and analysts with functions, not usually thinking of these things as sets. (And, yes, numbers can be thought of as sets; see below). In other words, although set theory provides a kind of default foundation for math, most mathematicians are content to know that the foundations are there, without necessarily appealing to the details in their day-to-day work, secure in the knowledge that their ideas can be worked out in more detail in those terms if needed. (The exceptions to this rule are mathematicians concerned with foundations, or with set theory as a subject in itself.) In fact, the origins and early development of set theory in this rôle were part of a “crisis in foundations”, a period when mathematicians were particularly concerned to show that their work could be firmly grounded in logic, so that there was less doubt that their abstract concepts were coherent and consistent.
We will often consider theories which are similar to set theory but that contrast with it in various ways: some of these theories, such as type theory, were first proposed as ways to “fix up” math’s set-theoretic foundations, before other techniques became more popular. But type theory and other general theories about structures and computation, such as domain theory and category theory, have been developed and put to use in computer science in order to address some of the ways in which computation differs from ordinary math (for example, in order to better address issues of non-terminating functions and the like).
sets have a minimal structure
Sets have a kind of minimal structure: a set is completely defined by the elements that are members of it (i.e., the “things” that are in it). Unlike various computer science structures such as lists, stacks, queues and trees, sets have no order or relationships pre-defined between their members: a list, for example, contains values in some order, and may have repetitions amongst its “members”, but a set has only the members, with no further structure. This is reflected in the single primitive operation (actually: a relation) defined in set theory, set membership, written x ∈ S, to describe that x is a member of the set S. Every value or object under consideration is either a member of a given set or not, and that is all we can directly say about a set (what its members are). Of course, we can define all sorts of interesting things in terms of set membership, including things like lists … but everything we build in terms of sets has to be built in terms of set membership.
In fact, in mathematics we can build everything we need (numbers, functions, relations, structures, etc.) from a basic scheme starting from just the empty set! That is, we don’t need to make use of any primitive elements whatsoever; counting numbers, for example, can be constructed by nesting sets of sets in various ways, so that only the empty set is at the core of it all. How this is done will be sketched below but then reviewed later during the semester, often by comparison and contrast with other ways of thinking about structures. In some sense, set theory is like a dynamically-typed language (think of Lisp or Python, if you are familiar with those): it provides a kind of very flexible or fluid medium in which to make definitions of other concepts, with no restrictions on how sets are put together … so much so that just about anything (mathematical) can be built in this fashion. Other theories we will study are more like statically-typed languages: their notion of collections is based on things which share some common structure—in set theory, such structure can be imposed by a definition, but it is not required. Another good intuition is to think of set theory as like the binary sequences underlying data structures in a computer: sets provide a kind of primitive substrate in terms of which everything else is defined: underneath programming, a big puddle of bits; underneath math, a puddle of sets. Sometimes the set version of things will technically allow “non-sensical” operations and comparisons to be made, simply because disparate objects are all defined or encoded in terms of the same primitives, just as you might be able to “multiply a character by a pointer” if you looked at their bit strings as representing numbers.
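Python’s built-in sets (if you know them) exhibit exactly this minimal structure:

    # Membership is the only structure: no order, no repetition.
    S = {1, 2, 3}
    assert 2 in S                  # the one primitive relation: x ∈ S
    assert {1, 2, 3} == {3, 1, 2}  # order among members is irrelevant
    assert {1, 1, 2} == {1, 2}     # repetition collapses
    assert [1, 2] != [2, 1]        # lists, by contrast, do carry order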
finite set notation—issues of order & repetition
What is the language of set theory? First, we write finite sets down by writing their elements between “curly braces”, separated by commas (this assumes we have a way to write down each element itself). For example, we can write down a finite set of five numbers as follows: {1,2,4,8,16}. Note that we shouldn’t really write repeated elements in the set, and the order doesn’t really matter (although people usually write numbers, for example, in numerical order). Thus, in some sense we would say that, as sets, {1,2,4,8,16} = {8,2,16,1,4} (in another sense, it would just be bad form to write the same finite set in two different places in different ways: not incorrect, exactly, but a poor or unfortunate choice). This is because sets are defined to be equal when they have precisely the same members: A = B, for sets A and B, when x∈A if and only if x∈B, for all (relevant) values of x. A special case, in some sense, is the empty set, which has no members at all and which is thus unique (any “other” empty set that has no members is clearly the same, since the only discernible thing about a set is which members are in it). The empty set is sometimes written using the usual braces, but with nothing between, and sometimes with a special symbol; notationally, {} = ∅, though of course these are merely different ways of writing the same thing.
If we want to talk about sets which have more than a finite number of elements, we need a better notation (otherwise it would take “forever to write them down”). The usual way of doing this is sometimes called Z-F comprehension (after the set-theorists Zermelo and Fraenkel) or just set comprehension; we write: { x | P(x) }, read as “the set of x such that P of x”, where here P is some expression of logic involving x, i.e., what is called a predicate over x (see more in the section on logic below). For example, we might write: { x | x>2 and x<10 }. Notice how your interpretation of this will depend on what set of things x ranges over in the first place: it would be finite if x could only take on whole number values (in fact, it would be the same as {3,4,5,6,7,8,9}). For this reason, as well as one other, we often write down which set the variable ranges over as part of the comprehension notation. For example, note the difference between { x∈ℕ | x>2 and x<10 } and { x∈ℝ | x>2 and x<10 }: the former is finite and equal to the previous finite example, whereas the latter is infinite, since there are uncountably many real numbers between 2 and 10. Of course, if the intended range is understood, we can leave it off without too much worry, and clarify if necessary later.
(The other reason for writing down the range is to avoid certain paradoxes which can otherwise arise; see Russell’s Paradox below.)
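Python’s set comprehensions borrow this notation almost directly; here is the finite example from above, with the range of x made explicit:

    # { x ∈ ℕ | x > 2 and x < 10 }, with ℕ cut off at 100 for the sketch.
    A = {x for x in range(100) if x > 2 and x < 10}
    assert A == {3, 4, 5, 6, 7, 8, 9}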
relations on sets
As we said above, the only basic relation on sets is membership, which we must take as primitive. We can relate it in a kind of trivial way to the comprehension notation by noting that: x ∈ {y | P(y)} if and only if P(x). As above, we can define equality between sets in terms of membership. We are also sometimes interested in noting when sets are not quite equal, but one is contained within the other. We write A ⊆ B when every member of A is also a member of B; that is, when for all x, if x∈A, then x∈B. (A subset is “proper” if it is not actually equal to the full set, written A⊂B.) Note that A = B, for sets A and B, just when both A ⊆ B and also B ⊆ A.
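In Python, these relations come built-in as comparison operators on sets:

    # Subset, proper subset, and equality as mutual containment.
    A, B = {1, 2}, {1, 2, 3}
    assert A <= B                        # A ⊆ B
    assert A < B                         # A ⊂ B: contained but not equal
    C, D = {1, 2}, {2, 1}
    assert C <= D and D <= C and C == D  # A = B just when A ⊆ B and B ⊆ A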
operations on sets
We can define a number of useful operations on sets using the notation above and the logic which is implicit in the predicates of set comprehensions. Typical binary operators defined on sets include the following:
• A ∪ B = { x | x∈A or x∈B } (set union)
• A ∩ B = { x | x∈A and x∈B } (set intersection)
• A ∖ B = { x | x∈A but x∉B } (set difference)
Note that we are being a bit informal about how logical predicates are written out, here using English “operators”. (The “but” in the last example is really “and” in disguise, and the ∉ operator on sets is just the logical negation of ∈; in other words, each is true when the other is false, and vice versa.) Finally, note that we can in some sense “lift” a binary Boolean operator into an operator on sets by following the pattern above, so that union and intersection are the “liftings” of the “or” and “and” operators.
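In Python, the three operations above are written |, & and -, mirroring the “lifted” or, and, and but-not:

    # Union, intersection and difference on Python sets.
    A, B = {1, 2, 3}, {3, 4}
    assert A | B == {1, 2, 3, 4}  # x ∈ A or x ∈ B
    assert A & B == {3}           # x ∈ A and x ∈ B
    assert A - B == {1, 2}        # x ∈ A but x ∉ B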
Venn diagrams

complement, implicit universes and the power set operator
Often we wish to consider the set of all things that are not members of some given set: this is called the complement of that set. We sometimes have to be careful with complements, as with the set comprehension notation, to specify the “universe” of values under consideration, if it is not obvious from context. For example, the complement of the set of positive numbers might be considered just the set containing 0, relative to ℕ, but would be much larger for other possible contexts (e.g., the reals). The complement of a set S is often written with a line or bar across the top of S (something I can’t easily do here—you’ll have to wait for the LaTeX).
The power set operator, written ℘(S) for some set S, is the set of all subsets of S; that is, ℘(S) = { T | T⊆S }. (This is a very informal definition; to see why, ask what set the variable T should range over … .)
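Here is an informal Python sketch of ℘ (the helper name powerset is ours); the members must be frozensets, since Python’s ordinary mutable sets cannot themselves be elements of a set:

    # All subsets of S, built from combinations of every size.
    from itertools import chain, combinations

    def powerset(S):
        return {frozenset(c) for c in
                chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))}

    assert powerset({1, 2}) == {frozenset(), frozenset({1}),
                                frozenset({2}), frozenset({1, 2})}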
Russell’s Paradox considers the following “set”: R = { S | S∉S }. In other words, the set of all sets which are not members of themselves. The question is whether R∈R. From the “definition” of R, it would seem that it both is and isn’t a member of itself: we generally prefer that our logic and theories not leave us in a situation where some proposition is both true and false! One way to react to this is to question which universe S ranges over in the definition: the set of all sets? In general, the example shows that we must be careful about unrestricted comprehension: proposed solutions to this problem include the aforementioned use of explicit ranges for the variables in the comprehension notation, but also other possible approaches.
representing math concepts
As mentioned above, we can use the concepts and language of set theory to define many math concepts, thus potentially providing a foundation for mathematics. (The idea is that set theory is in some sense bare-bones and minimal, conceptually, and thus can serve as a clear, no-nonsense arena in which to define and decide subtle issues. Examples like Russell’s Paradox suggest that there is still some care to be taken.) In the following sections we give some examples of how math concepts are “translated” or encoded in set theory. Again, for computer scientists, a good analogy is the way in which various data structures and other CS concepts are “implemented” inside a computer using binary strings.
products and ordered pairs
In math we often want to talk about product spaces, i.e., sets of ordered pairs (or triples, etc.); this came up above, for example, when we spoke of functions and Cartesian co-ordinates. As we saw above, sets do not inherently have any order amongst their members, but we can force the issue by defining an ordered pair 〈x,y〉 as a more complex set than just {x,y}. More specifically, we can define it as follows: 〈x,y〉 = {{x}, {x,y}}. In other words, we “code” the pair as a set of sets, in such a way that the x and y of the pair are handled differently, and thus their rôles are distinguishable. This will allow us to show, for example, that 〈x,y〉 = 〈a,b〉 if and only if both x=a and y=b. Furthermore, we will have that 〈x,y〉 = 〈y,x〉 only when x=y.
At the level of sets (as opposed to just their elements), we can then define an operator that takes two sets as arguments and returns a third, the set of all pairs of elements drawn from these arguments:
A × B = { 〈x,y〉 | x ∈ A and y ∈ B }
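Both ideas are easy to sketch in Python (the helper names pair and product are ours; frozensets again stand in for sets-of-sets, and Python’s built-in tuples play the rôle of ordered pairs in the product):

    # Kuratowski pairs, and the product of two sets.
    def pair(x, y):
        return frozenset({frozenset({x}), frozenset({x, y})})

    assert pair(1, 2) == pair(1, 2)
    assert pair(1, 2) != pair(2, 1)  # the encoding distinguishes the two rôles
    assert pair(1, 1) == frozenset({frozenset({1})})  # {{x},{x,x}} collapses to {{x}}

    def product(A, B):
        return {(x, y) for x in A for y in B}

    assert product({1, 2}, {"a"}) == {(1, "a"), (2, "a")}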
numbers
Natural numbers can be encoded using sets in many ways. In most accounts, 0 is coded (or defined) as the empty set. One way to count up is to simply enclose a set in “one deeper set of braces”, so that n+1 is defined as {n}. Another possibility is to define n+1 as n ∪ {n}. Finally, one can define every natural number as the set of all sets with that number of elements—if done carefully to avoid issues of circularity and the “set of all sets”. Although many such codings are adequate for the needs of mathematics, it has been pointed out that the plurality of codings suggests that numbers have some independent ontological status, since no one coding can thus be considered definitive: numbers cannot “be” these sets, since there is no reason to prefer one coding as more fundamental than the others.
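As a sketch, the second coding (n+1 as n ∪ {n}) is easy to play with in Python:

    # Von Neumann-style naturals: 0 = ∅, n+1 = n ∪ {n}.
    def zero():
        return frozenset()

    def succ(n):
        return n | frozenset({n})

    two = succ(succ(zero()))  # {∅, {∅}}
    assert len(two) == 2      # pleasantly, the set coding n has exactly n members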
Once the natural numbers are defined (however that is done), the other usual number types can then be defined in terms of ℕ, in terms of various pairing and equivalence ideas (see above for pairs and below for equivalence classes). We won’t spend time on these issues here, but instead look at them more formally later on in the semester.
disjoint union (co-product)
A notion related to products is that of a disjoint union of sets. Note that the usual union operation on sets is unlike a “physical union” of sets of objects, for example by pushing together two piles of things, in that two sets might overlap in their membership. In such cases the number of elements in the union is not just the sum of the numbers of elements in the two sets, as would be the case for the “piles of objects” example. This matters because a lot of our intuitions about numbers and math are ultimately tied to physical actions: we sometimes want a form of union that better respects these intuitions.
In any case, we can define a kind of union that nevertheless respects the disparate origins of common elements, as an element of either the left or right argument set, by “tagging” the members of the union with some arbitrary-but-fixed values (typically, we use whatever codes are in use for the natural numbers 0 and 1). That is, we define:
A + B = { 〈i,x〉 | (i = 0 and x ∈ A) or (i = 1 and x ∈ B) }
Although somewhat arbitrary and tedious in the details of the encoding, this definition suffices to give us an alternative to the usual union operation which nevertheless respects our intuitions in situations where we would like to honor the rôle that a given value might play. For example, a disjoint union of Willamette students and Willamette employees in some situations might best be served when we can tell what rôle a person is playing in the union, even if they happen to be both a student and an employee.
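A minimal Python sketch of the definition, using the tags 0 and 1 directly:

    # Disjoint union: members remember which side they came from.
    def disjoint_union(A, B):
        return {(0, x) for x in A} | {(1, x) for x in B}

    students, employees = {"ana", "bo"}, {"bo"}
    U = disjoint_union(students, employees)
    assert len(U) == 3     # “bo” appears twice, once in each rôle
    assert (1, "bo") in U  # and we can ask about a specific rôle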
relations and functions: properties, operations

predicates, relations, and functions
In math and logic we speak of predicates and relations as properties of things and “connections” between things, respectively. For example, redness is a property of certain things, and thus the predicate “red” holds (or is true) of those things. Some people are “connected” by being cousins, so “cousin” (or “is a cousin of”) is a relation that holds between those people. We can define or formalize predicates as subsets of a set: the subset of all red things, amongst the set of all things (well, perhaps it would be better to restrict our range of things to something more specific: red cars amongst cars, red birds amongst birds, etc.). A relation between sets of things can be defined as a subset of the set of pairs of such things: two things that are related will then be a member of the subset, as a pair.
In general, there may be many things related to some given one; e.g., you may have many cousins. But some relations are special in that there is always exactly one related thing, no more and no less. For example, everyone has exactly one biological mother. (Well, this was true up until recent developments in genetics—a young woman named Alana Saarinen, for example, has genetic contributions from two “mothers”.) Certainly in mathematics we would often like to distinguish situations where this special kind of relation exists, and then bless it with a special notation. Specifically, we call such relations functions, and think of the related entity as defined in terms of its connection to the original; notationally, we write f(x) for the value or thing related to the value x in this unique (and well-defined) way. In set-theoretic terms, a function is thus just a special kind of relation, one in which there is a unique pair in the relation set for every left-hand component of the pairs.
A constant concern with predicates, relations and functions is that we need to be clear about the sets over which they are defined. In general, unless the context makes things crystal clear, one should mention the underlying sets involved explicitly. Given this, we could then define:
• a predicate P over a set S is given as some P ⊆ S
• a relation R between sets A and B is given as some R ⊆ A × B
• a function f from a set A to a set B is given as some f ⊆ A × B such that for every x ∈ A, there is exactly one pair (x,y) ∈ f. (We write “f(x)” for the y which corresponds to a given x.)
Especially when the relation R is denoted by a suitable symbol, we often write xRy for (x,y) ∈ R. (For example, x < y or x ≥ y.)
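These definitions are concrete enough to compute with; here is a Python sketch of a relation as a set of pairs, together with the “exactly one pair per input” test (the helper name is ours):

    # A relation that relates 1 to two different things is not a function.
    R = {(0, "a"), (1, "b"), (1, "c")}

    def is_function(R, A):
        return all(len({y for (x, y) in R if x == a}) == 1 for a in A)

    assert not is_function(R, {0, 1})
    f = {(0, "a"), (1, "b")}
    assert is_function(f, {0, 1})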
composition and inverse of relations (and functions)
Given a relation R between sets A and B, and a relation S between sets B and C, we can define operations of composition and inverse as follows:
• composition: R ◦ S = { (x,z) ∈ A × C | there is some y ∈ B with (x,y) ∈ R and (y,z) ∈ S }
• inverse: R⁻¹ = { (y,x) ∈ B × A | (x,y) ∈ R }
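Both definitions translate directly into comprehensions; a quick Python sketch for finite relations:

    # Composition (first R, then S) and inverse, from the definitions above.
    def compose(R, S):
        return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

    def inverse(R):
        return {(y, x) for (x, y) in R}

    R = {(1, "a"), (2, "b")}
    S = {("a", True), ("b", False)}
    assert compose(R, S) == {(1, True), (2, False)}
    assert inverse(R) == {("a", 1), ("b", 2)}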
reflexive and transitive closure of relations

injective, surjective, and bijective functions
Some important properties of functions:
• given f:A→B, f is injective when for all x, y ∈ A, if f(x) = f(y), then x = y (i.e., f never sends different inputs to the same output)
• given f:A→B, f is surjective when for all y ∈ B, there is some x ∈ A such that f(x) = y (i.e., f sends some input to every possible output)
• a function f is bijective if it is both injective and surjective (i.e., f sets up a one-to-one correspondence between the elements of A and those of B)
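For small finite sets we can simply check these properties, again with a function represented as a set of pairs (a Python sketch):

    # Injectivity: no repeated outputs; surjectivity: every element of B is hit.
    def is_injective(f):
        outputs = [y for (x, y) in f]
        return len(outputs) == len(set(outputs))

    def is_surjective(f, B):
        return {y for (x, y) in f} == B

    f = {(1, "a"), (2, "b")}
    assert is_injective(f) and is_surjective(f, {"a", "b"})  # a bijection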
cardinalities and infinite sets

equipollence and cardinality via bijections

countable sets (bijection with ℕ)

uncountable sets (ℝ; power sets)

Order theory (special kinds of relations)

order relations and some properties

partial & total orderings
For a relation R on a set S (writing xRy for (x,y) ∈ R):
• R is reflexive when xRx holds for every x ∈ S
• R is symmetric when xRy implies yRx
• R is anti-symmetric when xRy and yRx together imply x = y
• R is transitive when xRy and yRz together imply xRz
• a partial ordering is a relation which is reflexive, transitive and anti-symmetric
• a total ordering is a relation which is transitive, anti-symmetric, and total (i.e., defined one way or the other between every pair of elements)
Partial orderings can be drawn as Hasse diagrams (see class for example); total orderings can be drawn as a single “line” running through all the elements.
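As a concrete sketch (in Python), divisibility on {1, …, 12}—a standard Hasse-diagram example—passes all three tests:

    # The divisibility relation “x divides y” is a partial ordering.
    S = range(1, 13)
    R = {(x, y) for x in S for y in S if y % x == 0}

    assert all((x, x) in R for x in S)                                 # reflexive
    assert all(x == y for (x, y) in R if (y, x) in R)                  # anti-symmetric
    assert all((x, z) in R for (x, y) in R for (w, z) in R if y == w)  # transitive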
equivalence relations, eq-classes, and partitions
An equivalence relation is one which is reflexive, symmetric and transitive. Each such relation over a set S determines a partition of S, i.e., a division of S into disjoint (non-overlapping) subsets which “jointly cover” the original set (i.e., their union is all of S).
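For a concrete Python sketch, take “same remainder mod 3” on {0, …, 8} and compute its classes:

    # The three equivalence classes are disjoint and jointly cover the set.
    S = range(9)
    classes = {frozenset(x for x in S if x % 3 == r) for r in range(3)}
    assert sum(len(c) for c in classes) == len(set().union(*classes)) == 9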
Logic

general idea of logic (rigorous/symbolic form of argument)

emphases on validity of arguments (versus truth of propositions)

truth values, boolean operators, and boolean algebra

conjunction, disjunction, etc.

propositions and propositional variables

classifying propositions

tautology (always true)

contradiction (always false)

contingent (it depends …)

some basic tautologies

predicates, relations, functions in logic

variables and quantifiers

the nature and structure of proofs

proof methods

hypothetical proof: make an assumption, prove something; thus the implication holds

proof by contradiction: assume something, prove a contradiction; thus assumption is false

proof by induction: prove base case, prove that one step preserves truth; thus true for all n (or structures)