Programming Project #2: Simulating an NFA


Assigned: Wed 2 Oct 2002

Due: Fri 18 Oct 2002


Description

For this project you will write a program similar to that which you wrote for the last lab, except now it should simulate the computation performed by a non-deterministic finite automaton. As before, the program should be parameterized (or modular) with respect to the actual NFA simulated; i.e., it should be possible for the user to configure the program with a specific NFA and then run that NFA on several inputs.

In particular, your program should be able to:

  • simulate runs of the NFA on several successive input strings (without re-reading the NFA description or re-configuring the simulator);
  • reject ill-formed NFA descriptions (see below);
  • determine and report whether or not the NFA accepts a given input string;
  • optionally, output a trace of at least one of the machine's accepting computations (e.g., a list of the states entered, in order).

As usual, you may use any language or platform you wish for your implementation; the only restriction is that you be able to demonstrate your program for the instructor, either in lab during regularly scheduled hours, or by advance appointment in some other easily accessible place (e.g., the student research lab, WITS labs, instructor's office, etc.).

For more details, see the various sections below.

Input and configuration

For your demo, it should be possible for you to quickly enter in an NFA of reasonable size (say, about 5 or 6 states) on the basis of a state diagram or other description of the NFA.

For most of you, the configuration portion will probably be done graphically, although it may be useful to have some aspects desribed textually. In any case, Let's suspend disbelief and pretend for the moment that all your machine configuration input is textual.

Let's agree to reserve the "%" sign as standing for the epsilon transition (i.e., in the description of the NFA, we will use "%" this way; in fact, this probably should not preclude it's use in the alphabet, but you may assume that it won't occur there). Under these conventions, we might use a syntax like the following to define a simple NFA, assuming the usual order for the five elements of the tuple:

	q0, q1
	"abc"
	q0:a->q1+q0, b->q1;  q1:a->q0+q1, %->q1, c->q0
	q0
	q0, q1
		
In this syntax, newlines are used to separate parts of the tuple, commas to separate states in sets of states, a traditional string syntax is used for the alphabet, semi-colons are used to separate parts of the specification of the transition function, and plus signs are used to separate multiple target states.

Of course, we could just assume that all states are named "qi" for some i and that the alphabet string is implicitly quoted. These ideas would yield a an alternate configuration description more like the following.

	2
	abc
	q0:a->q1+q0, b->q1;  q1:a->q0+q1, %->q1, c->q0
	q0
	q0, q1
		

Note that this machine (however described) lacks a transition from state q0 on a "c" input and from state q1 on a "b" input. In each state, an "a" input may lead to either state next, and from state q1, an epsilon transition may be taken to q1.

Of course, these are only sample syntaxes (syntaces??): for your program, you might use something completely different, and typically more graphical. But the same issues about epsilon transitions and multiple state targets still apply. The point is, find a nice combination of graphical and textual notations that is easy for you to work with, then build your implementation around that.

Some possible approaches

As discussed in class, there are at least two immediately appealing ways in which you might implement the NFA simulation. The first is by a more or less direct technique in which you keep track of the multiple possible runs explicitly and simultaneously. Perhaps the easiest way to visualize this is as a tree of possible paths, with branches out of a given node or state corresponding to multiple possible transitions from that state on a given input. Under this approach, finding a single accepting trace is just finding a single branch in the tree which ends in an accepting leaf node. In order to fairly consider all possibilities, you could use a queue to traverse the tree in breadth-first or level order. You might also want some extra structure to help you to re-construct the computation trace, if you plan to offer that information to the user (i.e., you will need to keep track of not only the "current node", but also the whole path that led to that node).

Another approach would be to write a method or function which would convert an NFA input by the user into the corresponding DFA, as per the construction described in lecture and the textbook. Of course, under this approach it will be harder to give a computation trace which would correspond to the original NFA described, but it would be easier to perform the simulation. Note in particular that you are not required to print out a computation trace, only a "yes/no" determination of whether the machine accepts the input. This makes the conversion-based approach much easier, though it may be harder to debug.

Assessment

As usual, your project will be assessed on the basis of a "live" demonstration. In addition to demonstrating your implementation, you should be prepared to show your code (on-screen is OK) and to provide a short introduction describing your approach (direct implementation, conversion; what data structures were used; etc.).

Enhancements

Part of the goal of the programming projects in this course is to give you the opportunity to develop some significant programs on your own, in a group setting, with relatively few constraints. This should allow you to exercise both your design and coding skills. In order that you may take best advantage of this opportunity, I will require only that you meet the minimum standards described above, but I am also encouraging you to go beyond the basic problem description and do something more creative, individual and interesting with the general assignment.

For this assignment, one enhancement might be to include the capability to interact with the NFA. For example, you might allow the user to guide the choice of the next state visited when several are possible, and then to back-track if no accepting trace can be found down that "branch".

Another possible enhancement would be to list out all possible traces which give acceptance of the string. This could even be done in a tree-like or outline style format in order to give a sense of how several possible accepting computations share the same trace up to some point of divergence.

If you are writing a graphically-based simulator, or using any approach in which you attempt to give "real-time" feedback on the course of the computation, it will be challenging to convey a good sense of multiple simultaneous traces; it might be better under these circumstances to allow for interaction or back-tracking.

In any case, none of these enhancements are required and you will have the opportunity to earn an "A" in the course even if you only meet the minimum assignment requirements.