Now, don’t get all excited. The theme changed from one that came out in 2010 to one from 2016. It’s progress, but we’ve still got a ways to go to get “modern.” Don’t hold your breath!
It Might Be Alive After All!
After a l-o-n-g break away from this project, it seems like there may be some new activity forthcoming. I recently acquired a Raspberry Pi, and I’ve been looking for a “project” for the little beastie. Perhaps this is the right device for the task.
The Pi is one of the newer ones with a bit more memory and such. I have it running Raspbian a Debian Linux distribution, which has Python already installed. I suspect much of my old NLTK code will run fine, but s-l-o-w-l-y.
I’ve decided to use Twitter as the main interface to the program. Eventually, it will check Twitter every 5-10 minutes for “mentions” of it’s handle, parse the message, and generate a suitable response. Python Twitter Tools seems to include most of the functionality I need and is tailor-made for the job. I have the Twitter account (@braynebuddy) ready, I’ve generated the required credentials, and am (not very) busily coding.
Stay tuned for more…
Maybe… just maybe?
What would prompt new activity on a dead blog…? Well, I listened to a Python411 podcast recently (something I highly recommend, by the way) and learned about a project called Open Allure. It’s focused on building systems for teaching, but the things that caught my interest were that it used Python as the development language, and that it was able to have a “conversation with your computer: it talks, it listens, it watches, it responds.”
It seems to me this is a promising development. I’m inspired to try and dig up my old code out of the trash bin and start this whole project again! In six months these two (way-smarter-than-me) guys have made some amazing progress. Maybe there’s hope…?
Stalled and Distracted
Not much visible progress in the last several weeks. I’ve been off wandering in the world of Python software development with forays into the Twitter API, using sockets for inter-process communication, and a dozen other fascinating areas. Lots of new code written. Almost all the old code seems broken in some way or other.
So, it’s time to take stock of where we are on this journey. To that end, I’m going to try and document the bits that I have, and the challenges as I see them now. The bits will be in the order I think of them, so probably won’t make sense.
This bit started life as code I found here that “defined several simple classes for building and using semantic networks.” It defined three classes: Entity, Relation, and Fact. It allows statements like these:
>>> animal = Entity("animal")
>>> fish = Entity("fish")
>>> trout = Entity("trout")
>>> isa = Relation("is-a",True)
>>> Fact(fish, isa, animal)
>>> Fact(trout, isa, fish)
Having defined these variables, statements like this were easy:
>>> print "trout is a fish?", isa(trout,fish)
trout is a fish? True
>>> print "trout is an animal?", isa(trout,animal)
trout is an animal? True
The trouble came when I tried to persist the relations to a file. It turns out that the “Entity” object is storing the “isa” object in its list of known facts (actor-relation-object). Of course when the entity object is reloaded from disk, the relation object is a different object than the signature that was stored, so things come unraveled.
That has lead me into trying to figure out how to persist and recreate objects without losing the network relationships. I suspect there is an easy pattern for this sort of thing, but I’m not a skilled enough programmer to see it immediately.
More adventures will follow…
It’s easy to get out of control on a project like this, and I think that’s where I’ve been for the last several weeks.
I’ve read more about AI and natural language processing than I ever knew existed a few months ago. This is a large and active field, and I’m in awe of the amazing work that’s going on all over the world. Places like the University of Rochester are full of smart people working long hours for years to get PhDs in this area. I have read several of the papers available on their CS department web site, and it sounds like they have already accomplished much of what I set out to do and have moved on.
Other, non-academics have made significant inroads into the problem. They are beginning to move beyond simple script-driven AIML bots to more sophisticated programs. I had added a link to one of them (Jeeney) to this post so you could see what I mean, but it’s not working. It’s not quite human-level interaction, but it’s not bad. I really like this approach and intend to follow a similar route if I can.
Which brings me back to the scope question… What am I really going to attempt here? I have asked this question several times already, and will likely do so again. Here’s my answer for today. This is heavily influenced by Homer [S. Vere and T. Bickmore, “A basic agent” (Computational Intelligence 1990)].
- Parser/Interpreter – This will tear apart the incoming text stream and convert it into some kind of standard internal form. I don’t expect this to be a flawless English parser because that’s both too difficult and not necessary. I need to extract the meaning from sentences at the level of a 3rd-grader at most. Vocabulary is pretty easy given all the work the Princeton folks have put into WordNet.
- Episodic Memory – This is what we might call our short-term memory. It’s surprisingly important in our ability to make sense of language. Basically I think of it as a list of the recent communication/event stream. I’m not sure how to “compress” this short-term memory into general memory, but it will need to be done.
- General Memory – All the stuff the computer “knows” stored in some kind of formal ontology. I have spent a lot of time worrying about how to do this, and have decided it probably doesn’t matter too much right now. If the thing ever gets really big I may have scaling problems. I don’t know how to solve or even pose them at this point.
- Planner/Reasoner – Figure out what to say, or perhaps do. This is a new idea for me, but it makes sense. Once the computer has understood the meaning of an input stream, what does it do? That’s the planner’s job. It will identify goals and come up with the steps needed to achieve the goal. I don’t think I’m too clear on how to do this.
- Executer – Figures out whether a plan step can and should be executed right now. Executes the step.
- Learner – Adds new knowledge to General Memory. No idea how to do this well. Jeeney claims to be reading through Wikipedia!
- Text generator – The opposite of the Parser/Interpreter. Converts from the standard internal form generated elsewhere in the program to simple 3rd-grade English.
I have several useful-looking snippets of Python code spread across these areas and have been trying to figure out where they go. I don’t know whether I’ll be able to tell if any of them will work before I get a basic shell constructed for each function.
Scope control… How much can one person get done in a few hours a week? Probably not as much as I’d like to. Oh well!
Understanding is the Important Bit
I’ve spent the last few weeks building and deleting little snippets of code in the general vicinity of language parsing. Mostly it has been an exercise of exploring some of the things that have been learned by the serious folks in the NLP field over the past 10-20 years. Some of what I’ve tried has worked out, and some has not. Most of it will never see the light of day.
Much of what I’ve been tinkering with is code for parsing English grammar. There has been a lot of very complex work done on solving this problem, so there are a lot of things to read and understand. I’ve learned a lot about top-down, bottom-up, shift-reduce, and left-corner parsers. I’ve tinkered with taggers and chunkers and sense and valency in an effort to deal with the complexity that we build into our language.
One of the “big” things I’ve concluded from all this is that I’m not as interested as I thought in building a good parser (i.e., a parser that includes all valid sentences while excluding all invalid ones). What the machine really needs is to do is extract the “meaning” from the word-stream coming in. Meaning can be gleaned from all sorts of ungrammatical constructions. For example, “In the park running the dog is” and “girl like boy” are easily understood in spite of the fact they’re nonsense grammatically. Even things like “Tihs snceente is esay to raed eevn wtih meixd up lteetrs” can be understood without much effort (try it here, and read more about this effect here). There is a lot of redundancy in our language, so the grammar and spelling doesn’t matter as much as our teachers implied, especially since the average letters/word runs around 4 or so.
This brings me back to an earlier question, “What does it mean to understand?” I’m going to start with the idea that if the machine can convert the incoming word-stream into a set of entities, the relationships among those entities, and answer questions about them requiring inference or deduction, then it has understood the statements.
I’m currently grappling with understanding propositional logic, first-order logic, and lambda abstraction (or λ-calculus) because I think these ideas might lead to a way of systematically encoding meaning in a form that the machine can use easily.
I’ve got such a long way to go.
These guys are really good! Their stuff actually does what I’m struggling mightily to conceive might be possible. Why isn’t this stuff already embedded in the products we use?
These guys are really smart! They’re a special interest group of the big group of smart people working on Computational Semantics.
What have I got myself into…
It has occurred to me that I have drifted from building an intelligent machine to building a working language parser. I suppose this is a necessary first step, but I must not lose sight of the real goal. There are so many fascinating distractions along the way!
In any case, the next step in this journey was to connect the parse-WordNet-concept pieces together to generate a symbolic representation of the meaning in a normal English sentence. As with much of this project, it’s easy to get a simple system going and much harder to make it work generally.
I feel like I’m “standing on the shoulders of giants” to steal a phrase from Stephen Hawking (who borrowed it from others as well). There has been an enormous amount of work done by the Natural Language Toolkit (nltk) folks to implement NLP algorithms in Python. Virtually everything I’m doing uses the software they have written. When I say, “I have built”, or “I have written” you must understand that what I really mean is that I have labored mightilly to stick a couple of lines of glue code between calls to the nltk functions.
So, I have successfully connected the WordNet lexical database to a recursive-descent parser. The parser is running a simple context-free grammar (CFG) that covers a small fraction of the English language. Even so, it does surprisingly well. For example, it correctly parses “the old man the ship” as a noun phrase (the old), a verb (man), and a noun phrase (the ship):
(S (NP (Det the) (Nom (N old))) (VP (V man) (NP (Det the) (Nom (N ship)))))
This is a sentence that would not be obvious to a person but is easy for the machine because it’s not confused (yet) by the fact that “man” is generally a noun not a verb.
What it doesn’t do yet is handle punctuation or capitalization. For example, it fails to parse “The” as “the” at the moment and chokes on commas, semicolon, quotes, etc. Some of these are easy to fix, others might require more low-level coding to replace the functions already part of NLTK.
The other thing that seems to be facing me is that a CFG is unlikely to be flexible enough for general language parsing. There are simply too many special cases. That’s why I need the semantic concepts data. I’m going to use a simpler, non-generic parser even though it can’t weed out nonsense sentences like, “the dog flew water.” I’ll use a semantic filter to get at the meaning of the sentence, if any. I think that will still not be enough to get rid of the ambiguity, but it might make a CFG good enough to use as a parser.
word, parse, and concept
I’m slowly making progress and have a word-finder that looks up a word and gets its hypernyms from WordNet, a parser that takes a sentence and gets all the parts of speech for the words, and a concept structure for the conceptual dependency information. All three with simple web interfaces.
Now, to wire it all together!
A swamp of details
Well, not much progress on the main task has occurred in recent days. I’m slowly sinking into a swamp of details around how to implement ideas in code.
There was a long trek through C#, mono and MySQL that turned out to mostly be a dead end. It is possible to put the WordNet data into MySQL, and to access the database from C#/mono on Linux, but it’s not easy. I think the Linux tools are not really ready to be used quite that way by people with my limited skills, and I’m not willing to lock myself into a Windows-only platform.
The current frontrunner seems to be Python with the NLTK toolkit, which offers a lot of high quality AI code that promises to be useful. It seems fairly straightforward to get the whole thing running with a web front end. Unfortunately, NLTK really wants to be running on Python 2.5+ and my server is on Ubuntu 6.06 LTS (with Python 2.4). I guess it’s time to upgrade the server to Ubuntu 8.04 LTS anyway.
Once all this is done maybe the main task can resume. I have run across a bunch of work that has been done on the Semantic Web that seems very close to what I’m trying to do. Things like the Resource Description Framework (RDF) and Web Ontology Language (OWL) look like exactly what I’m after and will likely be among the first “meaning storage” schemes I try.