V
LLC 2019 Midterm Review Notes
V
Using these notes
*
click on the triangles to the left to expand sections
*
use the text size feature in your browser to make this text larger and more readable
*
these notes will be updated throughout the review sessions
V
About the exam
*
The exam will be held on Wednesday 3 April at the usual lecture time (1:50-2:50) in the usual lecture room (Ford 204).
V
The exam will consist of several kinds of questions; typically:
*
10-15 True-or-False questions at 1-2 points each
*
8-10 short answer questions at 4-6 points each
*
2-4 longer answer questions at 8-15 points each
*
remember to review your lecture notes, materials on the course home page, homeworks, and the labs!
*
you won’t have to write longer programs, but 2-3 lines of (easier) Haskell is possible
V
NEW! Links to labs & homeworks
V
Course introduction
*
this is a non-traditional CS theory course, of my own design, evolving
*
traditional courses “climb a ladder” of formal languages & machines
*
this course will cover some traditional topics (but less thoroughly)
*
emphasis here is on interpreted formal languages, programming language theory
V
influences from functional programming, type theory, denotational semantics, category theory
*
(also structuralism, intuitionism, homotopy type theory, linear logic & ludics, Montague semantics, categorial grammar,…)
*
this semester deploying major new intro section on FAST type system
*
early emphasis on “soft” topics (history, philosophy, cultural aspects, etc.), later more technical
V
Language and formalism
*
history of language
V
purpose(s) of language
*
language presumably developed primarily for interpersonal communication, but it also plays a role in consciousness (internal language), memory (e.g., externalizing memory), and conceptual clarification
*
science, language, law, games, conventions, customs, protocols (patterns? rules?)
*
formal language: math, logic, CS, linguistics
V
nature / “stuff” of language (syntax)
*
the form or structure of language has dual nature: both a linear surface structure (strings), but also a recursive structure built from hierarchical phrases (terms or trees)
V
kinds of meanings (semantics)
*
meanings can be abstract (based on behavior, like interfaces), denotational (in terms of pre-understood “objects”), or operational (roughly, an “implementation” in terms of data structures)
V
language architecture
*
concrete syntax, abstract syntax, semantics (of various kinds), types & validity
*
object and meta languages
V
semiotics: the study of signs, with 3 kinds (or aspects):
*
icon: the sign resembles the meaning (in shape, form, etc.)
*
index: there is a causal connection between sign and meaning (smoke/fire, thermometer, …)
*
symbol: the relation between sign and meaning is arbitrary but conventional (stop signs, ‘:’ for cons)
V
The FAST type system
*
a simple, familiar system to introduce formal languages & types
*
builds on basic arithmetic, algebra, and some Haskell experience
V
the basic language of types is arithmetic: constants (n), plus (+), times (×), powers (↑ or →)
*
but we say: finite “sets” / enumerated types, sums, products, exponentials
*
there are (usually) many values associated with a given type
V
different interpretations: numbers, but also types, grids, logic, …
*
different variations (for n, +, ×): boolean/2, numeric/n, symbolic
*
when viewed as abstract types, values are things like tagged choices (constructed), pairs, functions
*
when viewed as grids, the values are merely locations within the (shaped) grid of the type
*
when viewed as numbers, or rather numeric expressions, types equivalent to n have all values k<n
*
in all cases / interpretations, the number of values of a (numeric expression) type is just the value of the expression
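*
To make the counting concrete, a small Haskell sketch (not from the notes; it uses Either and pairs as sum and product): the type Either Bool (Bool, Bool) plays the role of the expression 2 + 2×2, and it has exactly 6 values.

```haskell
-- all values of Either Bool (Bool, Bool), i.e., the type 2 + 2*2
values :: [Either Bool (Bool, Bool)]
values = [Left b | b <- [False, True]]
      ++ [Right (x, y) | x <- [False, True], y <- [False, True]]

count :: Int
count = length values  -- 6, the value of the expression 2 + 2*2
```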
V
types can be “flattened” down to equivalent (but not equal!) types with the same number of values
*
this is like evaluating the numeric expressions
V
equivalence of types is captured by
V
Sums (+), symbols and coding
*
sums represent the concept of choice
V
constants might start out binary (void type 0 and unit type 1), but we can “upgrade” to numeric or symbolic later
*
the void type 0 has no values; the unit type 1 has only one, say “•”
*
as numbers (i.e., flattened by evaluation), sums are just addition
*
as grids, we just have a two-element grid (or n-element, or symbol-tagged) with nested types = nested grids
*
to write (give, construct, express) a choice, we use a tag (followed by a value of the chosen type)
*
we can upgrade from binary (A+B), to numeric (A+B+…+X+Y), or symbolic (a:A+b:B+ …) “tags”
*
simple sums over the (void and) unit type 1 reduce to coded finite choices—we can add numeric “constant” types n, with values being just k < n (i.e., {0, …, n-1})
*
we can also use symbolic alphabets as types and values: types are just strings, values just “characters”
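*
A hedged Haskell sketch of symbolic tags (the type and names are invented for illustration): a sum with three named constructors, one of which carries a Bool, corresponds to the expression 1 + 1 + 2.

```haskell
-- a symbolic sum: tags are constructor names; as a number it is 1 + 1 + 2 = 4
data Shade = Black | White | Gray Bool
  deriving (Show, Eq)

allShades :: [Shade]
allShades = [Black, White, Gray False, Gray True]  -- the 4 values
```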
V
visually, we can view successive binary choices as a binary tree with values at the leaves
*
the code is just the path to the value
*
if the tree is complete, there are 2^h values, where h is the height of the tree = length of the path to the leaves
*
if the tree is not complete, some binary codes of length h will be invalid as codes
V
see these diagrams:
V
Products (×), tuples and records
*
products represent the concept of (independent) combination
*
as numbers (i.e., flattened by evaluation), products are just multiplication
*
to write (give, construct, express) a combination, we write both parts (possibly with punctuation)
V
as grids, we have (e.g.) a 2-dimensional grid as the product of two 1-dimensional grids
*
similar but more complex for higher dimensions—see homework examples
*
products of sums are very useful—example: cards as products of ranks and suits
V
products can be ordered lexicographically: we order the “slots” in the tuple, then order the values first by the major slot, then by the minor slot
*
when generalized to strings, this yields alphabetical order
*
the derived Haskell Ord instance for tuples does this, taking the leftmost slot as major, then the next slot to the right, and so on
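*
A sketch of the major/minor comparison written out explicitly (Haskell's derived Ord for pairs behaves the same way; `<>` on Ordering keeps the first non-EQ result):

```haskell
-- lexicographic comparison on pairs: left (major) slot first, then right (minor)
lexPair :: (Ord a, Ord b) => (a, b) -> (a, b) -> Ordering
lexPair (x1, y1) (x2, y2) = compare x1 x2 <> compare y1 y2
```

For example, lexPair (1, 9) (2, 0) is LT, agreeing with the built-in compare (1, 9) (2, 0).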
V
Exponentials (finite functions: ↑ or →)
*
exponentials represent the concept of connection
V
as numbers (i.e., flattened by evaluation), exponentials are just powers
*
but note that A→B becomes B^A (base and exponent swap order)
*
we can also usefully think of function values as grids: f: A→B is an A-shaped grid of B-values
*
as a grid … well, it is hard to visualize (esp. at higher-orders), but it is something like a multi-dimensional product (because exponential = iterated product)
*
even though they are hard to visualize in higher dimensions, we can see various flattenings of function-space grids via function tables (like truth tables) or listings (see immediately below)
*
to write (give, construct, express) a function f: A→B, we can write out all the result values (of type B) in the proper order of argument values (of type A)
V
a function f: A→Bool (i.e., the type 2, roughly) is like a subset of the set of values of type A
*
the power-set constructor ℘(S) is often written as 2^S
*
think: pizza topping specification = list of booleans in topping-order
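*
The pizza idea as a hedged Haskell sketch (the Topping type is invented for illustration): a subset of toppings is a characteristic function Topping -> Bool, and “flattening” it lists the Bools in topping order.

```haskell
-- an enumerated type of toppings (illustrative)
data Topping = Cheese | Pepperoni | Mushroom
  deriving (Show, Eq, Enum, Bounded)

-- flatten f : Topping -> Bool to its listing, in topping order
listing :: (Topping -> Bool) -> [Bool]
listing f = map f [minBound .. maxBound]

myPizza :: Topping -> Bool
myPizza Cheese    = True
myPizza Pepperoni = False
myPizza Mushroom  = True
```

Here listing myPizza is [True, False, True]: one Bool per topping, one point in a 2^3 = 8-element function space.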
V
Numbers and numerals
*
numbers are the abstract meanings (semantics); numerals are the names or symbols (syntax)
*
natural numbers are whole numbers, starting from 0 (usually) and going up by (+1), as high as you like (i.e., without limit)
*
according to Peano, natural numbers are either zero, or the successor of some natural number
V
we can represent these as simple terms (or constructed values) built from Zero or Succ (say in Haskell)
*
data Nat = Zero | Succ Nat
V
the fold function for the Nat type replaces Succ and Zero with a function call and a value
*
foldn s z Zero = z
foldn s z (Succ n) = s (foldn s z n)
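*
To see the fold in action, a self-contained sketch (repeating the definitions, then choosing s and z to convert or to add):

```haskell
data Nat = Zero | Succ Nat

foldn :: (b -> b) -> b -> Nat -> b
foldn s z Zero     = z
foldn s z (Succ n) = s (foldn s z n)

-- replace Succ with (+1) and Zero with 0: Nat to Integer
toInt :: Nat -> Integer
toInt = foldn (+ 1) 0

-- replace n's Zero with m (and keep Succ): addition
add :: Nat -> Nat -> Nat
add m = foldn Succ m
```

So toInt (add (Succ Zero) (Succ (Succ Zero))) is 3.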
V
we can write out numerals (symbolic codes for numbers) as tallies (using “base 1”)
*
tallies almost physically mimic the “piles of stones” used for early counting
*
we can also write out numerals for “structured numbers” using a mixed radix form, based on products-of-sums (see time format example) and lexicographic order
V
finally, we can write unstructured numbers as numerals using a fixed base (e.g., 2, 10, or 16)
*
but we normally put the most-significant (= major) digits on the left, for cultural reasons
V
we can use Horner’s technique (written as a fold) or its reverse (written as an unfold) to convert numerals to numbers (and vice versa)
*
we have to use left folds and unfolds (foldl and unfoldl) due to the cultural ordering of digits
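*
A sketch in Haskell (standard Haskell provides unfoldr rather than an “unfoldl”, so the unfold direction here peels off least-significant digits and then reverses):

```haskell
import Data.List (unfoldr)

-- Horner's technique as a left fold: numeral (most-significant digit first) to number
fromDigits :: Integer -> [Integer] -> Integer
fromDigits base = foldl (\acc d -> acc * base + d) 0

-- the reverse direction: number to numeral
toDigits :: Integer -> Integer -> [Integer]
toDigits base = reverse . unfoldr step
  where step 0 = Nothing
        step n = Just (n `mod` base, n `div` base)
```

For example, fromDigits 10 [1, 9, 8, 4] is 1984, and toDigits 2 11 is [1, 0, 1, 1]. (toDigits gives [] for 0, a common edge case.)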
V
non-standard numeral systems are also possible
*
… or Fritz’s recursive prime-decomposition form (code and examples)
V
Algebraic terms
*
once we have names for specific values (constants or literals) and functions (unary or binary, etc.) on a domain, we can make terms, tree-like structures that express patterns of application
*
we can represent the tree-like structures directly with Haskell data types
*
there are natural ways to write out or “pretty-print” these trees as strings
V
… but perhaps with different orders (prefix, infix, postfix) and punctuation (parens needed for infix)
*
“PEMDAS” allows us to eliminate some parentheses in favor of “order of operations” conventions
*
terms are most easily “evaluated” using folds—this tends to focus us on the changes between applications, due to re-parameterization of the fold for different purposes
*
we think of terms abstractly/informally as somewhere in-between the tree-like structures and the strings
V
Polynomial functions
*
if we allow just sum and product (possibly also their opposites), but add a single distinguished variable (x), terms have meanings as polynomial functions
*
we can add and multiply polynomials using generalizations of the grade-school algorithms for numerals
*
coefficients of a polynomial are like the digits of a numeral (in this case, the variable x represents the base)
*
we can use Horner’s technique to evaluate polynomials
*
we can also perform (e.g.) differentiation (derivatives) directly on polynomials, rather than on terms-with-variables
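*
A hedged Haskell sketch of these ideas (coefficients stored least-significant first, like digits read right-to-left; the function names are illustrative):

```haskell
-- evaluate by Horner's technique: c0 + x*(c1 + x*(c2 + ...))
evalPoly :: Num a => a -> [a] -> a
evalPoly x = foldr (\c acc -> c + x * acc) 0

-- add polynomials coefficient-wise, like column addition of numerals
addPoly :: Num a => [a] -> [a] -> [a]
addPoly as     []     = as
addPoly []     bs     = bs
addPoly (a:as) (b:bs) = (a + b) : addPoly as bs

-- multiply: scale by each coefficient and shift, like grade-school multiplication
mulPoly :: Num a => [a] -> [a] -> [a]
mulPoly []     _  = []
mulPoly (a:as) bs = addPoly (map (a *) bs) (0 : mulPoly as bs)
```

For example, mulPoly [1, 1] [1, 1] is [1, 2, 1] (that is, (x+1)^2 = 1 + 2x + x^2), and evalPoly 3 [-2, 0, 1] is 7 (x^2 - 2 at x = 3).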
V
Regular languages
*
a regular language is a set of strings with a relatively simple “global” structure
V
regular languages are an abstract notion (like an interface) that can be “implemented” in several ways
*
we will see them here via regular expressions and (deterministic) finite automata
*
later on, we will see a hierarchy where regular languages are the lowest = simplest kind
V
think of the definition mechanisms below as defining a “cut” between strings in and out of the language
*
how complex is the cut? here, for regular languages, it is quite simple (linear repetitions)
V
Regular expressions
V
regular expressions (REs) are important in both theory and practice
*
(we study mostly the theory here, but you should learn the practice, too!)
V
REs have six forms: null (Ø), epsilon (ε), symbol (a), sum (|), product (·), Kleene star (*)
V
denotational meanings of REs are sets of strings, also called languages
*
null means empty set
*
epsilon means singleton set of empty string
*
a character literal means the singleton set with the one-element string (of that character)
*
sum means union (of sets of strings)
*
product means every combination (concatenation) of strings, one from each set
*
star means indefinite (concatenated) repetition … not of one specific string, but of any strings (plural) from the set
*
essentially, regular expressions are patterns, which individual strings either match or don’t
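*
One way to make the pattern view executable is Brzozowski's derivative method (not covered in these notes; a hedged sketch): nullable asks whether an RE matches the empty string, and deriv c r is the RE matching whatever remains after consuming the symbol c.

```haskell
-- the six defining RE forms
data RE = Null | Eps | Sym Char | Sum RE RE | Prod RE RE | Star RE

-- does r match the empty string?
nullable :: RE -> Bool
nullable Null       = False
nullable Eps        = True
nullable (Sym _)    = False
nullable (Sum r s)  = nullable r || nullable s
nullable (Prod r s) = nullable r && nullable s
nullable (Star _)   = True

-- the RE matching the rest of a string after consuming c
deriv :: Char -> RE -> RE
deriv _ Null      = Null
deriv _ Eps       = Null
deriv c (Sym a)   = if a == c then Eps else Null
deriv c (Sum r s) = Sum (deriv c r) (deriv c s)
deriv c (Prod r s)
  | nullable r    = Sum (Prod (deriv c r) s) (deriv c s)
  | otherwise     = Prod (deriv c r) s
deriv c (Star r)  = Prod (deriv c r) (Star r)

-- a string matches if, after consuming every symbol, the result is nullable
matches :: RE -> String -> Bool
matches r = nullable . foldl (flip deriv) r
```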
V
in addition to the standard (defining) RE forms, we can define new ones by “abbreviation”
*
plus (R+) means: R · R* (i.e., “one or more”)
*
option (R?) means: R | ε (i.e., “one or none”)
*
many more such definitions make practical usage … well, practical!
V
(Deterministic) finite automata
*
a more operational approach defines simple “machines” (automata) to process strings
*
a machine “runs through” the symbols in a string, transitioning from state to state, and either accepting or rejecting the whole string
V
DFAs have an alphabet, states, a transition function, an initial state and some final states
*
we start at the initial state, move between states based on current state and next symbol, and accept if we end in a final state
*
we can write out DFAs as diagrams (with circles, arrows, etc.) or as extended versions of their transition function tables
*
DFAs recognize exactly the same class of regular languages as REs do
V
in fact, we can construct a DFA for any RE (using “machine-pasting” techniques) or an RE for any DFA (using “state-ripping” techniques)
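*
Running a DFA is just a left fold of the transition function over the string; a hedged Haskell sketch (the record fields and the example machine are invented for illustration):

```haskell
-- a DFA: transition function, start state, accepting states (alphabet implicit)
data DFA q s = DFA
  { delta  :: q -> s -> q
  , start  :: q
  , finals :: [q]
  }

-- fold the transition function over the input; accept if we end in a final state
accepts :: Eq q => DFA q s -> [s] -> Bool
accepts m w = foldl (delta m) (start m) w `elem` finals m

-- example: strings over {'a','b'} with an even number of 'a's
evenAs :: DFA Bool Char
evenAs = DFA step True [True]
  where step q 'a' = not q
        step q _   = q
```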
V
Non-deterministic finite automata (NFAs and GNFAs)
*
deterministic automata always go through a specific, definite sequence of states
*
non-deterministic automata can backtrack, or have “multiple futures”
*
… and they “win” (accept the string) if any accepting run is possible
*
plain NFAs add two features to DFAs: transition to a set of states, and allow “epsilon” transitions (consumes no input)
*
generalized NFAs (or GNFAs) allow transition between states on any string matching a specific regular expression
*
despite the extra features, NFAs are equivalent to DFAs: we can convert back and forth between them
*
we use NFAs and GNFAs to describe ways to convert between REs and DFAs
*
no details of NFAs or GNFAs are on the midterm! (just general points as in this outline)