Since we haven’t seen much in lecture yet, this lab will be a simple set of exercises both to get you back in the habit of Haskell and also to get you to start thinking about language and meaning, syntax and semantics. Having seen some list and tree folds in your Functional Programming class, you should be able to write a few short programs using folds to complete the lab … but remember that short does not necessarily mean quick in Haskell! In other words, you should expect to puzzle a bit over these until that “Aha!” moment when you see a nice, concise solution. Along the way, of course, you should try things out in Hugs/ghci, play around, and especially pay attention to the types (they will become more important in this class).
The “language” that we will be concerned with here will be just the language of arithmetic, with just numeric constants and three operations: addition, multiplication, and exponentiation. But, since we will want to generalize this soon, to allow other types of constants and other operators, we will first define a general data type for terms (“tree-like pieces of language”) and then instantiate this general definition to more specific types. In particular, we will call these more specific numeric terms (numeric) expressions.
data Term c o = Con c | App o (Term c o) (Term c o)
type Expr = Term Integer Op
data Op = Add | Mul | Exp deriving (Eq, Ord)
(Note: you should copy this and the other definitions into your program file for the lab.)
The type Op
here is just a little enumerated type of plain, atomic
“symbols”. That is to say, it is an algebraic data type with several
constructors, but each of which is essentially a constant—they take no
arguments, but are “complete” as-is. Enumerated types are like a
set of constants or, in more linguistic terms, like an alphabet of symbols.
In Haskell they are a special, simple case of the more general algebraic data
types. But they are also available in many other languages, including Java and C,
albeit in slightly different forms.
They are usually provided as a type-safe alternative to the kinds of little
“tag” values that programmers sometimes use when they need to indicate
or store one of a small number of alternatives
(e.g., a status number or an error code). The use of raw integers
for tags like this is dangerous: if there are other such tag-sets or
other integers around, these tags or integers could get confused
by a simple mistake. In a type-safe language, a compiler can check to see that the
tags are used appropriately and not confused with each other or other integers.
(Note here that we instantiate the Term
data type to
Expr
essions using Haskell Integer
s. We should probably
use String
s instead, but the syntax and semantics of simple numbers
are not so far apart … besides, we will get more honest about it for one
of the exercises below 😀 .)
You should think of these Term
s, Expr
essions, and
Op
erators as an internal form of abstract syntax, as
opposed to the external form of concrete syntax, which would normally
be represented by a String
, or the forms of meanings,
which in this case would likely be just Integer
s or Int
s.
(Technically, we should distinguish the abstract mathematical concepts of
strings, trees, symbols, and natural numbers from their Haskell-specific
implementations, but it is OK to leave these lines a little blurry for now.)
Now that we have some algebraic datatypes to represent or numeric expressions and symbols, we can start to think about their meanings, or rather about their semantics, i.e. the way they get their meanings.
The style of semantics we will implement in this lab is called
denotational or mathematical semantics. The idea is that
the meanings are given as some set of (mathematical) values, in this case
the natural numbers (whole numbers, but with no negatives, just 0 and up).
A semantics in this style will then supply a natural number meaning for each
possible internal form by way of a meaning function. In Haskell, therefore,
an Integer
or Int
will be provided, as a function result,
for each Expr
ession argument. Traditionally, denotational semantics
are defined in a way that is called compositional: the meaning of the whole
is defined in terms of the meanings of its parts. In this context, with the
recursive-but-ultimately-symbolic tree forms we call Term
s, it means
that we should define the meaning function by cases and recursion over the
Term
type … i.e., using the
“template style” from the Functional Programming course.
But, as we learned there, template-style functions are most concisely expressed as folds. And in fact, one of our themes will be that
folds over algebraic types correspond to (denotational) semantics,at least with regard to Haskell implementations of simple algebraic languages.
What types do we need folds for, and what do those folds look like?
Recall from Functional Programming that a fold is a parameterized, usually recursive function, defined over some (usually) recursive data type. It gets one parameter for each constructor of the data type, and it “replaces” the constructor in the result by the corresponding parameter. In the usual case, the constructor has arguments, and so the parameter to replace it should be a function, applied to the constructor’s arguments … unless one of those arguments represents a recursive sub-structure, i.e., an occurrence in the data type’s definition of the same type being defined. In this case, the fold function we are defining should be called recursively in just the same way—we end up converting the whole structure to some specified result type, corresponding here to the semantic domain, by combining recursive sub-results using the supplied parameter functions.
And, by the way, if a constructor has no structurally-recursive arguments, then the parameter function is just applied to the constructor’s arguments, transforming them into a value of the result type. Finally, if it has no arguments at all, then the corresponding value is not a function, but just some fixed value of the result type, used to replace the fixed value (or constant) of the structural type which is that particular constructor.
(For a somewhat more graphical perspective, you might want to review the recipe for folds from the Functional Programming course.)
So, given all that, what should the folds for our types look like? For the
Term
s themselves, we get the following definition:
fold f g (Con c) = f c
fold f g (App o l r) = g o (rec l) (rec r)
where rec = fold f g
(Can you guess the type of this fold
function, based on
its definition? Try to do so, but if you can’t, remember that
Haskell (well, Hugs
or ghci
) can do it for you:
just type “:t fold
” once the function is defined!)
For example, here are two little functions (perhaps familiar from
Functional Programming) which compute the size and height of
a Term
, thought of here more overtly as a tree (note the
use of the const
function and lambda notation):
size = fold (const 1) (\_ n m -> 1 + (n + m))
hite = fold (const 1) (\_ n m -> 1 + (n `max` m))
But we can also define a fold for our “alphabetic” enumerated type of arithmetic operators: since they take no arguments, these constructors will be “eliminated” (deconstructed?) by simply replacing each with the appropriate parameter value:
foldOp a m e Add = a
foldOp a m e Mul = m
foldOp a m e Exp = e
For example, a function to replace each of the abstract operators by their usual ASCII string equivalents might look like this:
showOp :: Op -> String
showOp = foldOp "+" "*" "^"
Speaking of examples, here is a quick definition of a list of some sample values you can use during lab to test your definitions and check your results:
samples = [ Con 17, -- 17
App Add (Con 2) (App Mul (Con 3) (Con 5)), -- 2 + 3 * 5
App Add (square (Con 7)) (square (Con 9)), -- 7^2 + 9^2
App Mul (samples!!1) (samples!!2) -- (2+3*5) * (7^2 + 9^2)
]
where square x = App Exp x (Con 2)
Note how we can use “helper functions” like square
to make the definitions a little easier to write. And note that we can define
larger samples in terms of smaller ones, even using a bit of recursion, with the
help of Haskell’s indexing function (!!)
.
OK, now it’s time for some actual lab exercises!
Add 3 sample Expr
essions to the list of samples above.
Write them out first on paper or in a text editor the way you would
write them as math formulas, or Haskell expressions. Then try to
translate them into the internal form we use here, as values
of the Haskell type Expr
. And feel free to use
helper functions or other tricks to make it easier.
Write a function eval :: Expr -> Integer
that will
evaluate a numeric expression and return its integer value
(or use Haskell’s Int
type instead).
You should do this using the fold
function for
Term
s … as well as the foldOp
function
for Op
erators. You will need to supply two function
arguments to fold
, one each to “eliminate”
the Con
and App
constructors. In the context
of Expr
essions, the first function will act on the integer
arguments of Con
and the scond will act on the Op
and sub-term arguments of App
. (Hint: in other words,
you will need the meanings of Op
erators for that second part.)
For example, we might have:
> map eval samples
[17,17,130,2210] -- of course, you will have 3 more samples!
Write a function pprt :: Expr -> String
(for
“pretty-print”) which will show what an
Expr
ession would normally look like printed out
in infix format, with the operator between its two arguments.
You will want to use the Haskell show
function to
convert Integer
s into String
s. And
to make things easier, just put in all the parentheses
—don’t worry about deleting them based on the order of
operations (PEMDAS).
For example, we might have:
> pprt (samples !! 3)
"((2 + (3 * 5)) * ((7 ^ 2) + (9 ^ 2)))"
Write another pretty-printing function, but this time format the
operators and their arguments in prefix format; in other
words, the operators come before their arguments, not
between them. (In this case, parentheses are not even necessary!)
Ideally, this variation on pprt
should be a simple
modification of the original.
For example, we would have:
> pprt' (samples !! 3)
"* + 2 * 3 5 + ^ 7 2 ^ 9 2"
Make a type definition similar to that for Expr
, but
this time use Haskell String
s in place of Integers
.
(You can call the new type Expr'
, read “exper prime”.)
Now write a revised evaluation function eval' :: Expr' -> Integer
:
it should work for these new string-based expressions. You will want to
use the read
function to convert strings of digits to numbers.
If you have trouble with it, try declaring and using this instead:
readInt = read :: String -> Integer
For example, you should get:
> eval' (App Mul (Con "23") (Con "2"))
46
Finally, write a function count :: Expr -> (Integer,Integer)
which separately counts the number of constants and operators in a
general Term
; the first number is the constant count,
the second the operator count. You should do this using the
fold
function—the solution should be similar
to size
and hite
above, but of course the
result type is now a pair of numbers. (Hint: and therefore two of the
arguments to the App
-eliminating argument to the
fold
will also be pairs!)
For example, you should have:
> count $ samples !! 3
(7,6)
(For class discussion: can you see a relationship that always holds
between the two count
results? Why does it hold?
could you prove it?)