Artificial Intelligence (W4701)

Homework 3:

Forward-Chaining Logical Inference System

Due Date: Tuesday November 23. This will leave (only) 2 week for the final (shorter) homework assignment.

Introduction. For this homework, you will implement an entire forward-chaining logical inference system from scratch. This project can be very engaging and enjoyable -- hopefully once you start you'll get involved and won't want to stop until you get it working!

In any case, I highly recommend you read through this entire assignment at least once immediately. Then, before our next class, you should spend at least a few hours starting the implementation. In general, plan to spend at least 5-7 hours a week for the next couple weeks so you do not have to cram it all in at the last moment. This will be the largest implemented project of the semester.

If you program it in a high-level language like Java or LISP I think you'll find progress pretty smooth. C will be more difficult. Since I have broken down the design of the system for you, below, you are in a good position to do a nice, systematic job at this relatively large project. By the way, the next (and last) homework is likely to make use of the system you are implementing for this one, so this makes it even more important that your system is clean (e.g., you may need to do further debugging when you do the next assignment, although probably no expansion).

Reading:

Russell and Norvig, Chapter 9 (Last two subsections of 9.6 are optional, and Section 9.7 is optional), and Sections 10.1, 10.6 (only the first 4 pages are required, e.g., you don't need to know about inheritance exceptions) and 10.7 (general concepts only).

Part I: Complete Logical Inference System

Implement the Forward-Chain algorithm in Figure 9.1 of the text in a language of your choice. This is a complete inference algorithm over Horn sentences. It simply applies Generalized Modus Ponens (GMP) as much as possible. For this homework project, we will use the term rule to refer to an implication in Horn form, each of which has a left-hand side (LHS - a conjunction of non-negated atoms) and a right-hand side (RHS - one non-negated atom). Also, we will use fact to refer to a single atom (which is also legitimate in Horn form). We will say applying or firing a rule to refer to a legal application of GMP to that rule, thus deriving a new fact. Note that in general, an implication could also be considered a "fact", so this lingo-convention applies only to these pages.

For the purposes of this assignment, there are several simplifying assumptions to make your job easier:

Facts must be fully instantiated atoms -- no variables. Also, they cannot be negated, but that is part of the definition of Horn form.
All variables appearing in a rule's RHS must also appear on the LHS. This ensures that when a rule fires, it results in a rule that has no variables. (Your code can assume this without checking for it.)
No use of functions; only predicates, variables and constants.

The first thing you should do is understand the algorithm in Figure 9.1 to a great degree of detail, so you see what is is doing and why it makes sense.

When you implement the algorithm, pay heed to the following items, which are addendum to the the text's Back-Chain algorithm in Figure 9.1, and are due to the above assumptions and to my making a few things more explicit for your convenience:

Argument p is a fact.
Since facts will have no variables, the first line of the algorithm does not have to consider renamings, but rather should simply check whether the fact is already in the KB.
The KB variable should be considered a pointer (or a global variable) so each addition to it effects more than a local variable. In fact, you don't need it as an explicit argument to the functions, depending on how you implement things. For example, the algorithm could be two methods of a KB class.
The first loop is actually a nested loop: For each rule in the KB, for each LHS atom p unifies with...
Make sure your code for this algorithm looks pretty darn similar to the functions in Figure 9.1. That is, the level of detail should be about the same. This means that if something is taking up much more space, it should be separated into another function. In this way, you are forced to be modular to a reasonable extent.
Note that after each call to Unify, you check if it failed (this is only implicit there in the book).
COMPOSE can simply concatenate the two sets of bindings together (since facts have no variables).
Make sure that when you invoke Unify you give the two arguments in the correct order (as shown below, the second argument must have no variables). Specifically, the second call shown in Figure 9.1 must have the arguments reversed.
Subst(substitutions, atom) should be a separate function. It should be straight-forward to write.

In general, there is a set of rules, and a separate set of facts. Each time a new fact comes in, the Forward-Chain algorithm is invoked to set off all possible rule firing, like the "domino effect", adding new facts to the database. In addition to implementing Forward-Chain, you must write code to invoke it for each new incoming fact.

Unification algorithm. Furthermore, due to the above simplifying assumptions, the unification algorithm can be much simpler than that given in Figure 10.3 (part of the optional reading). Here is the unification algorithm you should implement:

Unify(f1, f2) takes two atoms as input, the second of which has no
              variables (a fact).  If they can be matched, it returns a list 
	      of variable bindings.  Otherwise, it returns "fail".

   0. Initialize substitutions (a local variable) to the empty list.
   1. If f1 and f2 have different predicate names or different
      numbers of arguments, Then return "fail".
   2. For each argument, a1, in f1, with corresponding argument a2 in f2:
         If a1 is a variable
         Then If a1 is bound to something other than a2 in substitutions
	     Then return "fail"
             Else add the binding a1/a2 to substitutions
         Else If (constant) a1 doesn't equal (constant) a2
	     Then return "fail"
   3. return substitutions

Text-based I/O. Your system should be able to handle (piped as standard input) input such as that shown below in Part II. As in the text, variables are lower-case, and constants are capitalized. You need not check for errors in the input -- you can assume the input is syntactically correct. You may expect the input to first list all the rules, and then all the facts.

Implementation guideline.

You should make separate data structures (classes, if OOP, e.g., Java) for each of the following (these are general guidelines -- you can stray from them as you like):

Symbol table. This keeps a set of character strings (symbols), and allows you to associate each string with an integer. You can check if a symbol exists, get its number, or get a number's symbol. You'll have separate tables for predicate names and constant names.
Fact. Made of a predicate and a list of arguments, each of which is either a constant or a variable.
Rule. LHS is list of atoms. RHS is one atom. Here you need a symbol table of variable names -- a separate such table for each rule.
A substitution list of variable bindings, i.e., a list of pairs, each pair consisting of a variable name (index to a symbol table) and a constant (index to a symbol table).
etcetera, according to how you are comfortable implementing the algorithms mentioned above.

Parsing the input. You can assume no spaces within each predicate-argument structure, but spaces everywhere else. If you are using Java, you can use that stringbuffer(?) class (I can't remember what it's called), which allows you to get each token one at a time.

Debugging hints: Get a little to work at a time. For example, first, get it to successfully input rules (only), including the use of the symbol table, and then print them back out. In general, have a "trace" mode that can be turned on or off easily (for debugging only -- turn it off when capturing your system's performance for your homework submission).

Part II: Testing Your System

You must show your system working on all of the following examples:

The example from the book, typed here for your convenience so you can cut-and-paste.

American(x) ^ Weapon(y) ^ Nation(z) ^ Hostile(z) ^ Sells(x,z,y) -> Criminal(x)
Owns(Nono,x) ^ Missile(x) -> Sells(West,Nono,x)
Missile(x) -> Weapon(x)
Enemy(x,America) -> Hostile(x)
American(West)
Nation(Nono)
Enemy(Nono,America)
Owns(Nono,M1)
Missile(M1)
Nation(America)

From this, it should derive Criminal(West).

The example I showed in class.

TooBig(x) ^ GoodSize(y) -> BetterPet(y,x)
Giraffe(x) -> TooBig(x)
Dog(x) -> GoodSize(x)
Barks(x) ^ WagsTail(x) -> Dog(x)
Giraffe(Bob)
Barks(Sally)
WagsTail(Sally)

From this, it should derive BetterPet(Sally,Bob)

The example from the midterm exam.

Instrument(y) ^ Musician(x) -> Plays(x,y)
Instrument(y) ^ Plays(x,y) -> NotToneDeaf(x)
Musician(Grace)
Instrument(I1)

From this, it should derive that Grace is not tonedeaf.

Family relationships.
```
Parent(x,y) ^ Parent(x,z) ^ Distinct(y,z) -> Sibling(y,z)
Parent(x,y) ^ Sibling(x,z) ^ Parent(z,w) -> Cousin(y,w)
Distinct(x,y) -> Distinct(y,x)
Parent(Lisa,Eric)
Parent(Lisa,Rachel)
Parent(Speed,Lisa)
Parent(Speed,Jay)
Parent(Jay,Frances)
Distinct(Eric,Rachel)
Distinct(Lisa,Jay)
```
After testing this input, leave the two rules as is, but replace the rest with your own family or a made-up family. Only assert parental relationships, and allow it to ascertain sibling and cousin relationships. Your example family input need not be any larger than the example shown above. Make sure to clarify that everyone is distinct from one another (unless two names both refer to the same person). Read (but don't do) exercise 9.6 to understand why we need "Distinct".
You are, of course, welcome and encouraged to optionally devise your own example input for your system. For example, how would you implement Ascendent and Descendent predicates for the family relationships?

By the way, you should realize you have essentially written an interpreter for a programming language here. Note that Prolog is different -- it does backward chaining instead of forward chaining.

Part III: Exercises

Questions about your system, implemented in Parts I and II. Although getting it to work may be the hard part, these theoretical questions are just as important.
1. If you allowed NOT to be used in the system, would this increase its expressive or inference abilities in any way? That is, would it then be able to do anything it can't already do? Why or why not?
2. Exercise 9.3
3. Does it hurt expressiveness or inference abilities to not allow variables in the facts? Under what assumption does it definitely not hurt?
4. What if we wanted a set of conjoined atoms on a rule's RHS? If we allowed that, would the system be more expressive or capable of a greater range of inferences?
5. What if we wanted to allow disjunctions on a rule's LHS? If we allowed that, would the system be more expressive or capable of a greater range of inferences?
9.7, part b
Exercise 9.9, parts a, b, c, d, e (not f or g)

email: evs at cs dot columbia dot edu