Regular Languages Through a Haskell Lens

This lab does not involve too much programming, but is rather an opportunity for you to read some Haskell code and use it to explore some representations, constructions, and examples of regular languages. The “work” part of the lab involves a few simple exercises to make sure you understand parts of the code, a few written answers to problems (which you may nevertheless test and develop in a Haskell environment), and a couple of short coding problems.

What’s included in the module

You can find two versions of the code file on the course homepage, one a plain text file, which you can download onto your own machine and then load into hugs or ghci, and a PDF version which might be easier to read (it fits nicely on two sides of a page if you want to print it out). New! Check out the color-syntax HTML version of the Haskell source code, also below.

The two files are available as:

Once you have the files loaded in, you should try a few things as suggested below to get a feel for what it can do: a handful of examples are included, as well as some functions for generating test inputs. The code includes two algebraic data types, one for regular expressions (REs) and one for deterministic finite automata (DFAs).

A tour of the code (and some possibly obscure Haskell features)

Below is a quick overview of the code, in roughly top-down order. You may want to follow along as you read this section and see if you can understand what is going on. To be fair, many of you have relatively recent or relatively “stale” experience with Haskell, so I have included some tips on things that you might not remember or might not have been exposed to yet.

Some things to try out

So, you’ve seen the code, and I hope you’ve tried out a few things based on the comments above, or even just exploring on your own. This section briefly describes some broader exercises you might try, followed below by some that you should be able to demo in lab. But don’t forget to just “play” with the code—you can even modify it if you like, since you can always fetch a fresh copy if you somehow manage to bollix it all up.

First off, if it wasn’t obvious from the above, you should be able to test out the match and acc functions on single and multiple examples as follows:

RegLang> match swd "abba"
True

RegLang> match swd "abbaabba"
True

RegLang> baa
(((b.a).(a.(a*)))*)

RegLang> match baa "baaa"
True

RegLang> match baa "baaaaaaa"
True

RegLang> match bch "baaaaaaa"
False

RegLang> win (match baa) (abs 6)
["","baa","baaa","baaaa","baaaaa","baabaa"]

RegLang> acc bchm "bababa"
True

RegLang> acc baam "baaaaab"
False

RegLang> win (acc bchm) (abs 8)
["ba","baba","bababa","babababa"]

Note how you use the match and acc functions just with one (other) string argument, whereas they can be partially applied and then combined with win and one of the generating functions to test many strings at once.

You can also use the constructions neg and prod to build more complex machines:

RegLang> cut (acc baam) (abs 3)
(["","baa"],["a","b","aa","ab","ba","bb","aaa","aab","aba","abb","bab","bba","bbb"])

RegLang> cut (acc (neg baam)) (abs 3)
(["a","b","aa","ab","ba","bb","aaa","aab","aba","abb","bab","bba","bbb"],["","baa"])

RegLang> win (acc baam) (abs 6)
["","baa","baaa","baaaa","baaaaa","baabaa"]

RegLang> win (acc bchm) (abs 6)
["ba","baba","bababa"]

RegLang> win (acc (prod (||) baam bchm)) (abs 6)
["","ba","baa","baaa","baba","baaaa","baaaaa","baabaa","bababa"]

Of course, you can also try out some of the utility and helper functions to make sure you understand how they work:

RegLang> part "abc"
[("","abc"),("a","bc"),("ab","c"),("abc","")]

RegLang> cut even [1..10]
([2,4,6,8,10],[1,3,5,7,9])

RegLang> get 10
["","a","b","aa","ab","ba","bb","aaa","aab","aba"]

Some things to demo!

OK, now is the time we finally get down to brass tacks: what do you need to do for credit for the lab? You should work on the following questions, producing either sample runs, definitions of examples, or even hand-written notes, for each of them. (I would in any case recommend recording your answers, whether in a file or on paper, as suitable, so that you have access during your demo.)

  1. How many strings are in the list abs k as a function of k? Why?
  2. Write down a regular expression which matches this informal description: strings of as and bs which have have the same letter at the beginning, somewhere in the middle, and at the end of the string (i.e., either an a or b in all 3 places), with these three occurrences separated by non-empty runs of the opposite letter (of any length > 0). Implement your design as Haskell RE called bme (for “beginning-middle-end”) and demonstrate that it works.

    As examples, all the following strings seem to meet this criterion:
    ababa   ababba abbaba   ababbba abbabba abbbaba
    babab   baabab babaab   baaabab baabaab babaaab
    
  3. Design a DFA which corresponds to the previous exercise’s RE; then implement it and demonstrate that it works as intended.
  4. Design a DFA which corresponds to the “Swedish rock group” example RE, i.e., the example swd in the file. (Which strings exactly does this match? How can you design the DFA? I recommend working on paper first.) Once you have the design, implement it as a Haskell DFA and show that it corresponds to the RE version, at least for {a,b} strings of length up to 8.
  5. Consider the following definition of a construction which attempts to convert a machine into one which accepts the “{a,b}-opposite” by reversing the roles of the symbols in the alphabet. In other words, this opp function is trying to modify or “tweak” its argument DFA so that the new, tweaked version does exactly what the original DFA did, but with the roles of a and b swapped.
    opp (DFA qs s d q f) = DFA qs (reverse s) d q f
    
    Show that this fails to do the intended job, and explain why … but program a construction which does the job right. (Note that this {a,b}-opposite effect should in general not affect other symbols in the alphabet.)