CS 353: Architecture and Compilers—Final Exam Review
About the exam
The final exam will be held on Tuesday, December 11, 2018 from 2-5 pm in the normal lecture room (Ford 204)
study resources: lecture notes, on-line materials (check the homepage), quizzes, lab assignments
the exam will be three hours long, closed book, closed notes, and closed computer!
the exam will be comprehensive (covering the whole semester), but with some emphasis on material since the midterm
typical format: some T/F, some short answer, a few longer “conceptual” questions
Topics from the midterm review
[see the midterm review at this link for more details]
Numeral systems: binary, decimal, hexadecimal, etc.
Data Representation
Boolean Logic
Gates and Circuits
Computer Organization
Instruction Set Architecture and Assembly Programming
Machine Code and Assembly Language Programming Techniques
Overview of language translation
translation takes place in stages, with distinct data structures associated with each stage (see chart from lecture)
several theories associated with the various stages (e.g., context-free grammars for parsing) can help manage complexity and suggest the "right" way to do things (but we concentrate on practical issues and a small, simple language)
separation into phases makes the process much easier to understand, but they can be combined in practice (making only one or two passes over the code, for example)
the meaning of a program can be implemented either dynamically, as we process the code (interpretation or evaluation), or statically, by translation to another form (compilation)
Context-free grammars and parsing techniques
before (or during) parsing comes lexical analysis or scanning: a phase in which the input string is broken into "chunks" called tokens
tokens matter during parsing mainly for their classification (e.g., literal or variable), but during code generation also for their content (e.g., 147 or x)
some tokens, such as variables or literals, might be entered into a symbol table during this phase of processing or the next
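
To make this concrete, here is a minimal scanner sketch in Java for a tiny expression language (the token categories and class names are invented for illustration, not taken from the labs); note how each token records both its classification and its content:

    import java.util.ArrayList;
    import java.util.List;

    // Token categories for a tiny expression language (names invented
    // for this sketch; the lab language differs in detail).
    enum TokenType { NUMBER, IDENT, OPERATOR }

    class Token {
        final TokenType type;  // classification: what the parser cares about
        final String text;     // content: what code generation cares about
        Token(TokenType type, String text) { this.type = type; this.text = text; }
        public String toString() { return type + "(" + text + ")"; }
    }

    class Lexer {
        // Break the input string into "chunks" (tokens), skipping whitespace.
        static List<Token> scan(String input) {
            List<Token> tokens = new ArrayList<>();
            int i = 0;
            while (i < input.length()) {
                char c = input.charAt(i);
                if (Character.isWhitespace(c)) {
                    i++;
                } else if (Character.isDigit(c)) {
                    int start = i;
                    while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                    tokens.add(new Token(TokenType.NUMBER, input.substring(start, i)));
                } else if (Character.isLetter(c)) {
                    int start = i;
                    while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                    tokens.add(new Token(TokenType.IDENT, input.substring(start, i)));
                } else {
                    tokens.add(new Token(TokenType.OPERATOR, String.valueOf(c)));
                    i++;
                }
            }
            return tokens;
        }

        public static void main(String[] args) {
            // prints [IDENT(x), OPERATOR(+), NUMBER(147)]
            System.out.println(scan("x + 147"));
        }
    }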
parsing: this phase involves the recognition of hierarchical phrase structure in the language (phrases and sub-phrases, e.g., statements, expressions, etc.)
we describe the hierarchical structure of possible forms using a context-free grammar
each context-free grammar describes a language, or set of strings, based on possible expansions starting from its start symbol and ending in a string of terminal symbols
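
For example, here is one standard textbook grammar for arithmetic expressions (an assumed example, not necessarily the grammar from lecture), with start symbol Expr and terminal symbols number, identifier, +, *, and parentheses:

    Expr   -> Expr + Term   | Term
    Term   -> Term * Factor | Factor
    Factor -> number | identifier | ( Expr )

One possible expansion from the start symbol down to a string of terminals:

    Expr => Expr + Term
         => Term + Term
         => Factor + Term
         => number + Term
         => number + Term * Factor
         => number + Factor * Factor
         => number + number * number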
parser generator programs take a grammar (usually in a somewhat modified form) as input and produce a parsing program as output
pros: easier than a hand-written parser, and more likely to correspond exactly to the grammar
cons: steep learning curve, and the grammar often has to be "massaged" to fit the technique used (reducing confidence in correctness)
Shunting-yard algorithm
the shunting-yard algorithm is a simple parsing technique that works well for languages with just atomic tokens and infix operators
(it can be extended to include other features in an ad hoc fashion)
the algorithm uses two stacks, one for operands (the arg stack) and one for operators (the op stack), plus a boolean to track whether an operand or an operator is expected next
operators may be shifted from the input onto the op stack, or an operator may be reduced along with its arguments from the arg stack: the resulting term is pushed back on the arg stack
a fundamental aspect of the algorithm is the use of operator precedences to decide whether to shift or reduce
the shunting yard can transform infix to postfix using just one stack; alternatively, it can build trees, or even compute values directly when they are statically determinable
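
Here is a minimal Java sketch of the one-stack, postfix-producing variant (assumptions for brevity: single-character operands, left-associative binary operators, no parentheses; the precedence table is invented for the example):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.Map;

    class ShuntingYard {
        // Larger number = binds tighter; precedences assumed for this sketch.
        static final Map<Character, Integer> PREC = Map.of('+', 1, '-', 1, '*', 2, '/', 2);

        // Convert an infix expression (single-char operands and operators,
        // no parentheses) to postfix using one operator stack.
        static String toPostfix(String infix) {
            StringBuilder out = new StringBuilder();
            Deque<Character> ops = new ArrayDeque<>();
            for (char c : infix.toCharArray()) {
                if (Character.isLetterOrDigit(c)) {
                    out.append(c);                  // operands pass straight through
                } else if (PREC.containsKey(c)) {
                    // reduce: pop operators of equal or higher precedence first
                    while (!ops.isEmpty() && PREC.get(ops.peek()) >= PREC.get(c)) {
                        out.append(ops.pop());
                    }
                    ops.push(c);                    // shift the new operator
                }
            }
            while (!ops.isEmpty()) out.append(ops.pop());
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(toPostfix("a+b*c-d"));  // prints abc*+d-
        }
    }

The shift/reduce decision shows up in the inner while loop: any operator on the stack with equal or higher precedence is reduced (sent to the output) before the new operator is shifted onto the stack.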
Term representation and interpretation (evaluation)
terms (syntactic phrases) are naturally represented as trees, with each node identifying a specific form of expression and its children representing its immediate sub-phrases
in OO languages, it is natural to use a class/sub-class hierarchy to help organize phrases by type
interpretation proceeds as a pass over the tree, in some order determined by the language semantics, with appropriate actions being taken during the traversal (i.e., dynamically)
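
A minimal Java sketch of this organization (the class names are invented for illustration): each subclass of Expr identifies one form of expression, its fields are the immediate sub-phrases, and eval() performs the dynamic traversal:

    // Each node class identifies one form of expression; its children
    // (fields) are the immediate sub-phrases.
    abstract class Expr {
        abstract int eval();   // interpretation: a recursive pass over the tree
    }

    class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        int eval() { return value; }
    }

    class Add extends Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        // The semantics fix the order: evaluate left, then right, then combine.
        int eval() { return left.eval() + right.eval(); }
    }

    class Mul extends Expr {
        final Expr left, right;
        Mul(Expr left, Expr right) { this.left = left; this.right = right; }
        int eval() { return left.eval() * right.eval(); }
    }

    class EvalDemo {
        public static void main(String[] args) {
            // The tree for 2 + 3 * 4: precedence was resolved at parse time,
            // so the tree shape, not the evaluator, records that * binds tighter.
            Expr e = new Add(new Num(2), new Mul(new Num(3), new Num(4)));
            System.out.println(e.eval());  // 14
        }
    }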
be careful to distinguish the syntactic issues of precedence and associativity from the semantic issue of evaluation order
an abstract syntax tree represents the phrase and sub-phrase structure of the program explicitly
Abstract machines and intermediate representations
in order to make code generation easier, we often choose an intermediate representation corresponding to some abstract machine whose features sit in between our language and the actual target machine
typical abstract machine features include stacks for evaluating expressions or method calls and environments for variable look-up
an abstract machine can be implemented directly by an interpreter (as with Java's JVM) or used as the basis for further processing
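
As a toy illustration (the three-instruction set here is invented and far simpler than real JVM bytecode): a stack machine whose linear code is executed by an interpreter loop:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class StackMachine {
        // Run a linear instruction sequence; each instruction is
        // "PUSH n", "ADD", or "MUL" (a made-up IR for illustration).
        static int run(String[] program) {
            Deque<Integer> stack = new ArrayDeque<>();
            for (String instr : program) {
                String[] parts = instr.split(" ");
                switch (parts[0]) {
                    case "PUSH": stack.push(Integer.parseInt(parts[1])); break;
                    case "ADD":  { int b = stack.pop(), a = stack.pop(); stack.push(a + b); break; }
                    case "MUL":  { int b = stack.pop(), a = stack.pop(); stack.push(a * b); break; }
                    default: throw new IllegalArgumentException(instr);
                }
            }
            return stack.pop();
        }

        public static void main(String[] args) {
            // Linear code for 2 + 3 * 4 -- note the hierarchy is gone.
            String[] code = { "PUSH 2", "PUSH 3", "PUSH 4", "MUL", "ADD" };
            System.out.println(run(code));  // 14
        }
    }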
abstract machines and intermediate representations allow for a more flexible compiler back-end, since it is easier to re-target their implementation to different actual machines
the main distinguishing feature of an abstract machine used for these purposes is that its code is likely to have a more linear (non-hierarchical) form
Code generation and optimization
code generation follows the same plan as interpretation (an ordered tree traversal), but now generating pieces of code rather than actually performing semantic actions dynamically
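
Continuing the sketches above, the same kind of expression tree can emit stack-machine code instead of computing values, using the same post-order traversal (class names again invented; only two node types shown):

    import java.util.ArrayList;
    import java.util.List;

    // Same tree shapes as in the interpreter sketch, but now each node
    // emits stack-machine instructions instead of computing a value.
    abstract class CExpr { abstract void compile(List<String> out); }

    class CNum extends CExpr {
        final int value;
        CNum(int value) { this.value = value; }
        void compile(List<String> out) { out.add("PUSH " + value); }
    }

    class CAdd extends CExpr {
        final CExpr left, right;
        CAdd(CExpr l, CExpr r) { left = l; right = r; }
        void compile(List<String> out) {
            left.compile(out);   // same post-order traversal as eval(),
            right.compile(out);  // but generating code rather than acting
            out.add("ADD");
        }
    }

    class CodegenDemo {
        public static void main(String[] args) {
            List<String> code = new ArrayList<>();
            new CAdd(new CNum(2), new CAdd(new CNum(3), new CNum(4))).compile(code);
            System.out.println(code);  // [PUSH 2, PUSH 3, PUSH 4, ADD, ADD]
        }
    }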
when is it compilation (versus interpretation)? roughly, when we:
eliminate tree-like phrase structure in favor of linear object code (with jumps);
eliminate names in favor of numeric addresses;
typical problems involve keeping track of run-time resources (e.g., registers and RAM locations), mapping names to their numeric equivalents, and determining proper sequencing of events
we usually depend on support from a run-time system for such services as dynamic memory allocation, communication with I/O devices, etc.