CS 353: Architecture and Compilers—Final Exam Review
About the exam
* the final exam will be held on Saturday, December 14, 2019, from 8-11 am in the normal lecture room (Ford 204)
* study resources: lecture notes, online materials (check the homepage), quizzes, lab assignments
* the exam will be three hours long, closed book, closed notes, and closed computer!
* the exam will be comprehensive (covering the whole semester), but with some emphasis on material since the midterm
* typical format: some T/F, some short answer, a few longer “conceptual” questions
* see the course homepage for the handout archive, diagrams, etc.
Topics from the midterm review
* [see the midterm review at this link for more details]
* Numeral systems: binary, decimal, hexadecimal, etc.
* Data Representation
* Boolean Logic
* Gates and Circuits
* Computer Organization
* Instruction Set Architecture and Assembly Programming
* Machine Code and Assembly Language Programming Techniques
Overview of language translation
* translation takes place in stages, with distinct data structures associated with each stage (see the chart from lecture)
* well-developed theories associated with the various stages can help manage complexity and suggest the "right" way to do things (but we concentrate on practical issues and a small, simple language)
* separation into phases makes the process much easier to understand, but phases can be combined in practice (making only one or two passes over the code, for example)
* the meaning of a program can be implemented either dynamically, as we process the code (interpretation or evaluation), or statically, by way of translation to another form (compilation)
Context-free grammars and parsing techniques
* before (or during) parsing comes lexical analysis, or scanning: a phase in which the input string is broken into "chunks" called tokens
* tokens matter during parsing mainly for their classification (e.g., literal or variable), but during code generation also for their content (e.g., 147 or x)
* some tokens, such as variables or literals, might be entered into a symbol table during this phase of processing or the next
* parsing: this phase recognizes the hierarchical phrase structure of the language (phrases and sub-phrases, e.g., statements, expressions, etc.)
* we describe the hierarchical structure of possible forms using a context-free grammar
* each context-free grammar describes a language (a set of strings) based on possible expansions starting from its start symbol and ending in a string of terminal symbols
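As a concrete illustration of scanning followed by parsing, here is a minimal sketch. The grammar, token names, and function names are illustrative assumptions (not taken from the course materials): the toy grammar is expr → term { '+' term }, term → NUMBER | '(' expr ')'.

```python
import re

# Illustrative toy grammar (an assumption for this sketch):
#   expr -> term { '+' term }
#   term -> NUMBER | '(' expr ')'

def tokenize(src):
    """Scanning: break the input string into (kind, text) token chunks."""
    tokens = []
    for m in re.finditer(r"\d+|[()+]|\s+", src):
        text = m.group()
        if text.isspace():
            continue                      # whitespace is not a token
        kind = "NUMBER" if text.isdigit() else text
        tokens.append((kind, text))
    return tokens

def parse(tokens):
    """Recursive-descent recognizer: True iff the tokens derive from expr."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def eat(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind} at token {pos}")
        pos += 1

    def expr():                           # expr -> term { '+' term }
        term()
        while peek() == "+":
            eat("+")
            term()

    def term():                           # term -> NUMBER | '(' expr ')'
        if peek() == "NUMBER":
            eat("NUMBER")
        else:
            eat("(")
            expr()
            eat(")")

    expr()
    return pos == len(tokens)

print(parse(tokenize("(1 + 2) + 3")))     # True: the string is in the language
```

Note how each grammar rule becomes one function, and the tree of calls mirrors the phrase/sub-phrase structure the grammar describes.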
* parser generator programs take a grammar (usually modified) as input and produce a parsing program as output
  - pros: easier than a hand-written parser, and more likely to correspond exactly to the grammar
  - cons: steep learning curve, and the grammar often has to be "massaged" to fit the technique used (reducing confidence in correctness)
Shunting-yard algorithm
* the shunting-yard algorithm is a simple parsing technique which works well for languages with just atomic tokens and infix operators (it can be extended to include other features in an ad hoc fashion)
* the algorithm uses two stacks, one for operands and one for operators, plus a boolean to track whether an operand or an operator is expected next
* operators may be shifted from the input onto the op stack, or an operator may be reduced along with its arguments from the arg stack; the resulting term is pushed back onto the arg stack
* a fundamental aspect of the algorithm is the use of operator precedences to decide whether to shift or reduce
* the shunting yard can transform infix to postfix using just one stack, but it can also build trees, or even values if they are statically determinable
* see the longer description in the handout archive on the class homepage
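The two stacks and the operand/operator boolean described above can be sketched as follows. This version reduces all the way to values (rather than building trees), and the precedence table and names are illustrative assumptions; parentheses are omitted to stay with atomic tokens and infix operators only.

```python
# Shunting-yard evaluation: an arg (operand) stack, an op (operator) stack,
# and a boolean tracking whether an operand or operator is expected next.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}   # illustrative precedence levels

def reduce_top(args, ops):
    """Pop one operator plus its two arguments; push the result back."""
    op = ops.pop()
    b, a = args.pop(), args.pop()
    args.append({"+": a + b, "-": a - b,
                 "*": a * b, "/": a / b}[op])

def shunt_eval(tokens):
    args, ops = [], []
    expect_operand = True
    for tok in tokens:
        if expect_operand:
            args.append(float(tok))       # shift an operand onto the arg stack
            expect_operand = False
        else:
            # reduce while the stacked operator binds at least as tightly
            # (>= gives left associativity)
            while ops and PREC[ops[-1]] >= PREC[tok]:
                reduce_top(args, ops)
            ops.append(tok)               # then shift the new operator
            expect_operand = True
    while ops:                            # end of input: reduce what remains
        reduce_top(args, ops)
    return args[0]

print(shunt_eval("2 + 3 * 4".split()))    # 14.0, not 20.0: * binds tighter
```

Replacing the arithmetic in `reduce_top` with tree-node construction turns this evaluator into a parser that builds abstract syntax trees instead.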
Term representation and interpretation (evaluation)
* terms (syntactic phrases) are naturally represented as trees, with each node identifying a specific form of expression and its children representing its immediate sub-phrases
* in OO languages, it is natural to use a class/sub-class hierarchy to organize phrases by type
* interpretation proceeds as a pass over the tree, in some order determined by the language semantics, with appropriate actions being taken during the traversal (i.e., dynamically)
* be careful to distinguish the syntactic issues of precedence and associativity from the semantic issue of order of evaluation
* an abstract syntax tree represents the phrase and sub-phrase structure of the program overtly
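A minimal sketch of this idea, with one class per form of expression and evaluation as a traversal of the tree; the class and method names are illustrative assumptions:

```python
# AST nodes organized as a small class hierarchy (one class per phrase form);
# evaluation is a post-order pass over the tree.
class Num:
    def __init__(self, value):
        self.value = value
    def eval(self):
        return self.value

class BinOp:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def eval(self):
        # the sub-phrases (children) are evaluated first, then this node acts
        a, b = self.left.eval(), self.right.eval()
        return a + b if self.op == "+" else a * b

# the tree for 2 + 3 * 4 -- precedence was already resolved by the parser,
# so the tree shape alone determines what is evaluated first:
tree = BinOp("+", Num(2), BinOp("*", Num(3), Num(4)))
print(tree.eval())   # 14
```

Note the separation of concerns: precedence shaped the tree at parse time (syntax), while `eval` only fixes the order in which children are visited (semantics).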
Abstract machines and intermediate representations
* to make code generation easier, we often choose an intermediate representation corresponding to some abstract machine whose features sit between our language and the actual target machine
* typical abstract machine features include stacks for evaluating expressions or method calls and environments for variable look-up
* an abstract machine can itself be implemented with an interpreter (as in Java's JVM) or can be used as the basis for further processing
* abstract machines and intermediate representations allow for a more flexible compiler back end, since it is easier to re-target their implementation to different actual machines
* the main distinguishing feature of an abstract machine used for these purposes is that its code is likely to have a more linear (non-hierarchical) form
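To make the "linear code" point concrete, here is a sketch of a toy stack machine interpreter; the instruction set and names are illustrative assumptions, not any particular real machine:

```python
# A toy abstract stack machine: the code is a flat (non-hierarchical) list of
# instructions, executed one after another against an operand stack.
def run(code):
    stack = []
    for instr in code:
        op, *arg = instr
        if op == "PUSH":
            stack.append(arg[0])          # push a literal operand
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()                    # result is left on top of the stack

# 2 + 3 * 4, flattened from a tree into linear stack code:
program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 4), ("MUL",), ("ADD",)]
print(run(program))   # 14
```

The same `program` list could instead be translated further (to real machine code), which is the re-targeting flexibility the bullet above describes.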
Code generation and optimization
* code generation follows the same plan as interpretation (an ordered tree traversal), but now generating pieces of code rather than performing semantic actions dynamically
* when is it compilation (versus interpretation)?
  - tree-like phrase structures are eliminated in favor of linear object code (with jumps)
  - names are eliminated in favor of addresses
* typical problems involve keeping track of run-time resources (e.g., registers and RAM locations), mapping names to their numeric equivalents, and determining the proper sequencing of events
* we usually depend on support from a run-time system for services such as dynamic memory allocation, communication with I/O devices, etc.
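A minimal sketch of code generation as the same tree traversal as interpretation, but emitting linear stack-machine instructions instead of computing values; the tuple-based tree representation and instruction names are illustrative assumptions:

```python
# Code generation: a post-order walk over the tree that emits flat code.
# Trees are nested tuples like ("+", left, right) or ("num", 2);
# output instructions are tuples like ("PUSH", 2) or ("ADD",).
def gen(node, out):
    if node[0] == "num":
        out.append(("PUSH", node[1]))     # a leaf becomes a push
    else:
        gen(node[1], out)                 # code for the left sub-phrase...
        gen(node[2], out)                 # ...then the right sub-phrase...
        out.append(("ADD",) if node[0] == "+" else ("MUL",))   # ...then the op

tree = ("+", ("num", 2), ("*", ("num", 3), ("num", 4)))
code = []
gen(tree, code)
print(code)
# [('PUSH', 2), ('PUSH', 3), ('PUSH', 4), ('MUL',), ('ADD',)]
```

Compare this with direct interpretation: the traversal order is identical, but the "actions" have moved from run time to compile time, leaving behind linear code with no trace of the original tree.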