V CS 353: Architecture and Compilers—Final Exam Review
V About the exam
* The final exam will be held on Tuesday, December 11, 2018 from 2-5 pm in the normal lecture room (Ford 204)
* study resources: lecture notes, on-line materials (check the homepage), quizzes, lab assignments
* the exam will be three hours long, closed book, closed notes, and closed computer!
* the exam will be comprehensive (covering the whole semester), but with some emphasis on material since the midterm
* typical format: some T/F, some short answer, a few longer “conceptual” questions
* Look at the course homepage for handout archive, diagrams, etc.
V Topics from the midterm review
* [see the midterm review at this link for more details]
* Numeral systems: binary, decimal, hexadecimal, etc.
* Data Representation
* Boolean Logic
* Gates and Circuits
* Computer Organization
* Instruction Set Architecture and Assembly Programming
* Machine code and Assembly Language Programming Techniques
V Overview of language translation
* translation takes place in stages, with distinct data structures associated with each stage (see chart from lecture)
* several appropriate theories associated with the various stages can help manage complexity and suggest the "right" way to do things (but we concentrate on practical issues and a small, simple language)
* separation into phases makes the process much easier to understand, but they can be combined in practice (making only one or two passes over the code, for example)
* the meaning of a program can either be implemented dynamically, as we process the code (interpretation or evaluation) or statically, by way of translation to another form (compilation)
V Context-free grammars and parsing techniques
* before (or during) parsing comes lexical analysis or scanning: a phase in which the input string is broken into "chunks" called tokens
* tokens matter during parsing mainly for their classification (e.g., literal or variable), but during code generation also for their content (e.g., 147 or x)
* some tokens, such as variables or literals, might be entered into a symbol table during this phase of processing or the next
* parsing: this phase involves the recognition of hierarchical phrase structure in the language (phrases and sub-phrases, e.g., statements, expressions, etc.)
* we describe the hierarchical structure of possible forms using a context-free grammar
* each context-free grammar describes a language, or set of strings, based on possible expansions starting from its start symbol and ending in a string of terminal symbols
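The scanning phase described above can be sketched as a small tokenizer. The token categories (NUM, ID, OP, parentheses) and the regular expressions here are illustrative assumptions, not the course's official lexer:

```python
import re

# Assumed token categories for a toy expression language.
TOKEN_SPEC = [
    ("NUM", r"\d+"),            # literals, e.g. 147
    ("ID",  r"[A-Za-z_]\w*"),   # variables, e.g. x
    ("OP",  r"[+\-*/]"),        # infix operators
    ("LP",  r"\("),
    ("RP",  r"\)"),
    ("WS",  r"\s+"),            # whitespace, discarded below
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Break the input string into (classification, content) token pairs."""
    tokens = []
    for m in MASTER.finditer(text):
        if m.lastgroup != "WS":     # skip whitespace chunks
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

Note that each token carries both its classification (used by the parser) and its content (used later by code generation), as described above.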
V parser generator programs take a grammar (usually modified) as input and provide a parsing program as output
* pros: easier than writing a parser by hand, and more likely to correspond exactly to the grammar
* cons: steep learning curve, and the grammar often has to be "massaged" to fit the technique used (reducing confidence in correctness)
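For contrast with generated parsers, here is a minimal hand-written recursive-descent sketch for the assumed toy grammar E → T ('+' T)*, T → NUM, evaluating as it parses; the token format (a list of strings) is an illustrative assumption:

```python
def parse_expr(tokens, pos=0):
    """E -> T ('+' T)* ; returns (value, next position)."""
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        rhs, pos = parse_term(tokens, pos + 1)   # consume '+' then a T
        value += rhs
    return value, pos

def parse_term(tokens, pos):
    """T -> NUM ; each token is already a string chunk from the scanner."""
    return int(tokens[pos]), pos + 1
```

Each grammar rule becomes one function, so the correspondence to the grammar is visible by inspection.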
V Shunting-yard algorithm
V the shunting yard algorithm is a simple parsing technique which works well for languages with just atomic tokens and infix operators
* (it can be extended to include other features in an ad-hoc fashion)
* the algorithm uses two stacks, one for operands and one for operators, plus a boolean to track whether an operand or an operator is expected next
* operators may be shifted from the input onto the op stack, or an operator may be reduced along with its arguments from the arg stack: the resulting term is pushed back on the arg stack
* a fundamental aspect of the algorithm is the use of operator precedences to decide whether to shift or reduce
* the shunting yard can transform infix to postfix using just one stack; it can also build trees, or even compute values directly if they are statically determinable
* see the longer description in the handout archive on the class homepage
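The two-stack, value-building version described above can be sketched as follows; the precedence table and the four-operator set are assumptions for a minimal arithmetic language:

```python
# Assumed precedence levels: * and / bind tighter than + and -.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def shunting_yard(tokens):
    """Evaluate an infix token list using an arg stack and an op stack."""
    args, ops = [], []

    def reduce_top():
        # Reduce: pop one operator and its two arguments,
        # then push the resulting term back on the arg stack.
        op = ops.pop()
        b, a = args.pop(), args.pop()
        args.append({"+": a + b, "-": a - b,
                     "*": a * b, "/": a / b}[op])

    for tok in tokens:
        if tok in PREC:
            # Precedence decides whether to shift or reduce:
            # reduce while the stacked operator binds at least as tightly.
            while ops and PREC[ops[-1]] >= PREC[tok]:
                reduce_top()
            ops.append(tok)          # shift the operator onto the op stack
        else:
            args.append(int(tok))    # operands go straight to the arg stack
    while ops:
        reduce_top()
    return args[0]
```

Replacing the arithmetic in `reduce_top` with tree construction yields the tree-building variant mentioned above.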
V Term representation and interpretation (evaluation)
* terms (syntactic phrases) are naturally represented as trees with each node identifying a specific form of expression, and its children representing its immediate sub-phrases
* in OO languages, it is natural to use a class/sub-class hierarchy to organize phrases by type
* interpretation proceeds as a pass over the tree, in some order determined by the language semantics, with appropriate actions being taken during the traversal (i.e., dynamically)
* be careful to distinguish the syntactic issues of precedence and associativity from the semantic issue of order of evaluation
* an abstract syntax tree represents the phrase and sub-phrase structure of the program overtly
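A minimal sketch of the class-per-phrase-form idea, with interpretation as a tree traversal; the class names and the one-class-per-node-type layout are illustrative:

```python
class Num:
    """Leaf node: a literal."""
    def __init__(self, value):
        self.value = value
    def eval(self):
        return self.value

class Add:
    """Interior node: the children are its immediate sub-phrases."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def eval(self):
        # Evaluating left before right is a semantic choice (order of
        # evaluation), distinct from the syntactic precedence that
        # determined the tree's shape in the first place.
        return self.left.eval() + self.right.eval()

class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def eval(self):
        return self.left.eval() * self.right.eval()

# 1 + 2 * 3, with precedence already resolved by the parser:
tree = Add(Num(1), Mul(Num(2), Num(3)))
```

The tree's shape records precedence overtly: `Mul` sits below `Add` because `*` bound tighter during parsing.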
V Abstract machines and intermediate representations
* in order to make code generation easier, we often choose an intermediate representation corresponding to some abstract machine which has features set in between our language and the actual target machine
* typical abstract machine features include stacks for evaluating expressions or method calls and environments for variable look-up
* an abstract machine can either be implemented with an interpreter itself (as in Java's JVM) or can be used as the basis for further processing
* abstract machines and intermediate representations allow for a more flexible back-end to the compiler, since it is easier to re-target their implementation for different actual machines
* the main distinguishing feature of an abstract machine used for these purposes is that its code is likely to have a more linear (non-hierarchical) form
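A sketch of a tiny stack-based abstract machine whose code has the linear, non-hierarchical form described above; the instruction names (PUSH, ADD, MUL) are assumed for illustration:

```python
def run(code):
    """Interpret a linear instruction list on an operand stack."""
    stack = []
    for instr in code:
        op, *arg = instr
        if op == "PUSH":
            stack.append(arg[0])          # push a literal operand
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[0]

# Linear code for 1 + 2 * 3 -- a flat sequence, not a tree:
program = [("PUSH", 1), ("PUSH", 2), ("PUSH", 3), ("MUL",), ("ADD",)]
```

Implementing `run` for a different real machine (or compiling its instruction list further) is what makes such an intermediate representation easy to re-target.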
V Code generation and optimization
* code generation follows the same plan as interpretation (an ordered tree traversal), but now generating pieces of code rather than actually performing semantic actions dynamically
V when is it compilation (versus interpretation)?:
* eliminate tree-like phrase structures in favor of linear object code (with jumps)
* eliminate names in favor of addresses
* typical problems involve keeping track of run-time resources (e.g., registers and RAM locations), mapping names to their numeric equivalents, and determining proper sequencing of events
* we usually depend on support from a run-time system for such services as dynamic memory allocation, communication with I/O devices, etc.
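Code generation can follow the interpreter's traversal exactly, appending stack-machine instructions rather than computing values; a minimal sketch, assuming nested-tuple trees and the PUSH/ADD/MUL instruction names:

```python
def compile_expr(tree, code):
    """Post-order traversal: the same order an interpreter would use,
    but emitting instructions instead of performing the actions."""
    if isinstance(tree, int):
        code.append(("PUSH", tree))       # literal -> push its value
    else:
        op, left, right = tree
        compile_expr(left, code)          # code for the sub-phrases first...
        compile_expr(right, code)
        code.append(("ADD",) if op == "+" else ("MUL",))  # ...then the operator

# (1 + 2) * 3 as a nested-tuple tree:
code = []
compile_expr(("*", ("+", 1, 2), 3), code)
```

The tree-like phrase structure is gone from the output: `code` is a flat sequence suitable for a stack machine, with the proper sequencing of events determined by the traversal order.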