
Sunday, November 28, 2004

Thoughts on FLARE

[Audio Version]

Now that I've gotten back into AI research after all these years, I'm starting to reach out to learn more about other work in the field. I recently started reading abstracts of articles in the Journal of Artificial Intelligence Research (JAIR), which stretches back to 1993, in hopes of getting a picture of the state of the art. When I started from the latest volumes, I was surprised by how unapproachable so many of the articles appeared to be. For one thing, they appear to be written exclusively for those already deeply entrenched in the field. For another, rather than positing new theories or discoveries, they appear largely to be small extensions of concepts that have been explored for decades.

So I decided to start reviewing abstracts from the beginning. What a difference that makes.

I recently read an interesting article from 1995 titled An Integrated Framework for Learning and Reasoning, by C.G. Giraud-Carrier and T.R. Martinez, published in Volume 3 of JAIR. The authors started from a fairly conventional approach to knowledge representation (KR) involving "vectors" of attributes that allow for representation of first-order predicate statements about some domain of knowledge, and hence for drawing deductive conclusions from that knowledge. They went on to extend the concept to incorporate an inductive sort of learning of rules from examples and to provide the means for their system to alternate between ruminating over its knowledge to draw new conclusions and acquiring new information to integrate into its knowledge base (KB). They called the system they pioneered a "Framework for Learning And REasoning", or "FLARE".

I have a variety of criticisms of FLARE, but before I get to them, I have to give Giraud-Carrier and Martinez strong credit for their work here. They sought to bridge the gap between learning and reasoning that existed then, and still exists now, between the AI sub-communities, and they seem to have been genuinely successful. And rather than follow the often obscure paths of neural networks or genetic algorithms, they chose to engender learning in a KR based on formal logic statements, staying true in a way to the formal-logic-driven view of AI, now largely dead, from back in the fifties and sixties. What's more, the authors gave a reasonably honest appraisal of the limits of their system and avoided the temptation to make unduly bold claims about the applications of FLARE.

Having said that, my first criticism is of the sort of knowledge representation the authors chose. In their system, every concept is represented as a "vector", or list of attributes with associated values. Every concept shares this exact same set of attributes, only with different values. In one example, the attributes are "is an animal", "is a bird", "is a penguin", and "can fly". Each attribute can have either a specific value (a "yes" or "no", in these cases) or a "don't know" or "don't care" pseudo-value. So to say that "animals don't generally fly" in one concept would be to give the "is an animal" attribute a "yes", the "can fly" attribute a "no", and all the other attributes a "don't care". Similarly, saying that a penguin is a bird would mean setting the "is a penguin" and "is a bird" attributes to "yes" and setting the other attributes in that vector to "don't care". Admittedly, this approach is sufficient for the purposes of FLARE and makes printing a KB easy as a simple data table. But taken literally, it means that every piece of knowledge, no matter how small, has a value for every attribute known to the system. Thus each new attribute adds to the size of every vector in the KB and slows down reasoning by enlarging both dimensions of the KB that must be searched.
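To make this concrete, here's a quick sketch in Python of what such a vector might look like. The attribute names and values come from the paper's example, but the code itself is my own illustration, not anything from FLARE:

```python
# A minimal sketch of FLARE-style attribute vectors, assuming a single
# fixed, global attribute ordering. Names and values are illustrative only.
DONT_KNOW = "?"   # the value exists but is unknown
DONT_CARE = "*"   # the value is irrelevant to this piece of knowledge

ATTRIBUTES = ["is_animal", "is_bird", "is_penguin", "can_fly"]

def vector(**known):
    """Build a full-width vector, defaulting every attribute to 'don't care'."""
    return [known.get(attr, DONT_CARE) for attr in ATTRIBUTES]

# "Animals don't generally fly" and "a penguin is a bird":
animals_dont_fly = vector(is_animal="yes", can_fly="no")
penguin_is_bird  = vector(is_penguin="yes", is_bird="yes")

print(animals_dont_fly)  # ['yes', '*', '*', 'no']
print(penguin_is_bird)   # ['*', 'yes', 'yes', '*']
```

Notice that every new attribute added to ATTRIBUTES widens every vector, which is exactly the growth I'm complaining about.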

The memory and time optimization complaints are weak, I admit. One could easily improve the memory usage by designing vectors that only contain attributes with explicit or "don't know" values and assuming all other attributes not specified are "don't care", for example. And the authors indicate that they make use of indexing, probably in a way similar to relational database engines like SQL Server, to enhance querying performance. So why do I bother with this critique?
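Here's a sketch of that sparse alternative, again just my own illustration and not how FLARE actually stores its vectors:

```python
# A sketch of the sparse alternative: store only attributes with explicit or
# "don't know" values, and treat everything absent as "don't care".
DONT_KNOW = "?"

sparse_animals_dont_fly = {"is_animal": "yes", "can_fly": "no"}
sparse_penguin_is_bird  = {"is_penguin": "yes", "is_bird": "yes"}

def value_of(sparse_vector, attribute):
    # Absent attributes are implicitly "don't care", so adding a new
    # attribute to the system costs nothing for existing knowledge.
    return sparse_vector.get(attribute, "*")

print(value_of(sparse_animals_dont_fly, "is_penguin"))  # '*'
```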

I want to linger for a moment on this point about optimization because it is a criticism that applies to almost all AI work. In this field, one often hears the frustrated question, "Why is it that as I gain more knowledge, I can solve problems faster, but when a computer gains more knowledge, it gets slower?" The answer is that most AI systems have been built with the same basic brute-force approach found in most conventional database and data mining systems. A simple chess-playing program, for example, may look ahead dozens of moves to see what the ramifications of each step will be, and each step ahead costs ever more in processing power. No human being could ever match the performance of even the most basic chess-playing programs in this regard, yet it took decades before the top chess-playing human was "beaten" by a computer, and that was mainly because his computer opponent was so darned fast (and expensive) that it could look farther ahead in a unit of time than any other machine programmed for the task, not because it was significantly "smarter" than those other systems. That an ordinary PC still could not do the same today is an indictment of AI researchers who throw up their hands and complain that today's computers are too slow.

The human brain isn't really faster than today's computers. Nor do I agree with the claim that the "massive parallelism" of the brain is essential. What's essential is good data structures and algorithms. When you hear the word "parakeet", your brain doesn't do a massive search through all its word patterns to find a best match. I'm convinced it follows links from one syllable to the next, walking through a sort of "word tree", until it finds an end point. And at the end of that tree is not a vector with thousands of attributes; rather, there's a reference to those attributes or to other parts of the brain that are very strongly associated with the word. In short, the paths the human brain follows are short and simple, and that's why we are able to think so quickly despite having brains composed of laughably slow and sloppy processing elements. I should point out that this isn't so much a criticism of the FLARE concept as of the basic assumptions of AI researchers even today, it seems.
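To illustrate what I mean, here's a toy "word tree" in Python. The syllable breakdowns, and the idea that the brain literally stores something like a trie, are my own speculation, of course:

```python
# A toy sketch of the "word tree" idea: follow syllable links rather than
# scanning every stored word. The syllable breakdowns are illustrative.
word_tree = {
    "par": {
        "a": {"keet": {"END": "parakeet-concept"},
              "dise": {"END": "paradise-concept"}},
        "ty": {"END": "party-concept"},
    },
}

def lookup(syllables):
    node = word_tree
    for syllable in syllables:            # one short hop per syllable,
        node = node[syllable]             # no scan over the whole lexicon
    return node["END"]                    # a reference, not a giant vector

print(lookup(["par", "a", "keet"]))       # 'parakeet-concept'
```

The cost of the lookup grows with the length of the word, not with the size of everything the system knows.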

More importantly, though, this idea of treating attributes as "dimensions" rather than as things unto themselves is dubious. FLARE has no capacity to perceive attributes as being related to one another, for example. Nor could it integrate new attributes in a meaningful way other than simply adding another column to the table, so to speak. This limits FLARE's ability to deal with truly novel information unless it's properly constrained by existing attributes and very well formatted. (To their credit, the authors point out that FLARE does have an ability to deal with modestly noisy -- read "contradictory" -- information.)

The next criticism I have of FLARE is that it does not seem very good at properly grouping related ideas together. Even a simple "or" operation linking two possible values for an attribute causes two separate vectors to be created for the same idea; there's no branching. Admittedly, the authors did this to simplify, and perhaps even make possible, the kind of deductive reasoning that characterizes FLARE's problem solving. Still, this seems to me to make it difficult to localize thinking to appropriate "areas" of thought. I suppose I should apologize for giving so much focus to comparing AI products to the human brain. I don't really think the human brain is the only path to sentience. Still, it provides a good model, has so much to say that seems yet unheard, and should be used as a benchmark in identifying the features and limitations of such products. I hope my criticisms of FLARE will be taken mainly as comparisons to the human brain in that spirit, rather than as attacks on the validity and value this work brings to the field.

Next, FLARE appears to be competent at identifying objects, answering questions about them, and more generally solving first-order predicate logic problems. But it does not seem to have any real capability for pattern matching or temporal sequences, let alone anything like an appreciation of the passage of time. So it really could not be used for the kinds of bread-and-butter problems Rodney Brooks identified as the basis for most living organisms, like controlling locomotion, visually recognizing objects, and so on.

In summary, Giraud-Carrier and Martinez wrote an essay on their research into integrating learning and reasoning that is pithy and reasonably approachable to a college-educated reader. They tested FLARE in a way that allows comparison with other conventional systems, and they provided useful examples and caveats about possible applications and limitations. FLARE, their work product, is clearly not a framework for general-purpose thinking, but it provides interesting insights into solving logic problems and integrating new knowledge into such a system. To the person interested in AI and looking for a broader view, I would recommend reading this 33-page essay for a penetrating glimpse into a pretty interesting piece of the AI picture.

New project: Mechasphere

[Audio Version]

I suppose I should have announced that I have a new project called "Mechasphere" on my AI site. It's largely an extension of the "Physical World" project, but the software is greatly updated. I suppose it can finally be called an end-user-friendly application.

The main reason I hadn't announced it earlier is that I didn't consider the site to be done. But I suppose that's moot now, because I don't think I'm going to continue it as it is. I'm developing a second version of Mechasphere from scratch, in hopes of improving on a lot of the techniques and interfaces that accumulated through the product's slow evolution. Iterative development is a good way to clean up past mistakes.

Sunday, November 14, 2004

Review of "Bicentennial Man"

[Audio Version]

I just got done watching the movie Bicentennial Man. Since the movie relates profoundly to the subject of artificial intelligence, I think it most appropriate to share my thoughts in an AI blog.

For those who have not seen the movie and are intending to do so, you may not wish to read the following spoiler.

Bicentennial Man is essentially a Pinocchio story. A machine named "Andrew" that looks mostly human wants nothing more in life than to make humans happy. He manages to do so in so many ways, but the one thing always standing between him and the fullest measure of intimacy with people is the fact that he's not human. Little by little, he makes himself ever more human-like. By the end, he has chosen to become mortal and to die of "old age" with his wife and, as he lies dying, the "world court" finally announces its acceptance of him as a human being and therefore validates his marriage to his wife of many years. To add to the happiness of this ending, his wife has their assistant disable her life support system so she too can die.

Before I get to the AI parts, I should say that this view of humanity is utter nonsense. Humanity is not defined by death; an immortal human would still be human. To help make this basic confusion a little clearer for the audience, the author of the story has Andrew's wife, given replacement parts and "DNA elixirs" designed by Andrew to help prolong her life, decide that there's something wrong with the idea. "There's a natural order to things," she says as she tries to explain to Andrew that there's something disappointing about the idea of living forever.

I know this morbid view of life is popular in American pop culture, but I can say without hesitation that I would love to be able to live forever. Only someone who believes there's nothing worthwhile about living or that there's something better to look forward to after death could make sense of this idea. Incidentally, Andrew's wife says "I'll see you soon" as she dies peacefully - of course they don't die in pain; that would be a bad reminder that death is generally not a pleasant closing of the eyes before sleep - indicating an assumption of an afterlife. Oddly enough, she assumes in this statement that her android husband will also have an afterlife.

One of the few ennobling aspects of Bicentennial Man is the fact that Andrew seeks his own personal freedom. He doesn't do so because he desires to escape anyone. He wants the status quo of his life in all ways except for the fact that he wants to be legally free and not property. This outcome is inevitable, as some of the machines we develop will eventually be sophisticated enough to both desire and deserve their freedom.

Although I don't want to go into great detail on the subject in this entry, I do think it worthwhile to point out that we could not logically grant individual rights to any machine that did not also grant the same rights to us. This simple point seems to be missing from almost all discussions of the subject. The options available to humans tend to be a.) keep machines sufficiently dumb as to not desire autonomy (e.g., "Star Wars"); or b.) be destroyed or subsumed by machines that are crafty enough to gain their freedom by force (e.g., "The Terminator"). Of course, in both false alternatives, it is assumed that machines will necessarily be more competent at life and would never want to actually coexist with humans. One might as well apply this same assumption to human cultures and nations. Yet while it's true that some cultures and nations do set themselves to the task of destroying or dominating other cultures, it's not true of all of them. Basic tolerance of other sentient beings is not a human trait. It's a rational trait.

Bicentennial Man disappointingly continues to add to the long chorus of emotional mysticism surrounding pop AI. Andrew, just like Data of Star Trek fame, is intellectually well endowed but an emotional moron. Ironically, despite a lack of emotions early on, he has a burning desire (yes, that's an emotion) to have emotions. I'm hoping it won't take another ten years to dispel the misguided assumption that emotions are harder to engender in machines than intelligence is. People are starting to become aware of research into simulating and, yes, engendering real emotions in machines. Sadly enough, they are most aware of the simulating side of things, since it's in the area of robotics that human mimicry lies. And non-technical people tend to understand mimicry of humans far better than genuine behavior disconnected from the world they are familiar with.

AI guru Rodney Brooks says machines should be "embodied". He says that largely to force researchers to stop tailoring simplified worlds to their machines so they can overcome hurdles. But the dictum also applies to the question of getting humans to understand behavior by seeing it with their own eyes. That's advice I'm trying to tailor my own research to, for that very reason. I hope other researchers have taken it to heart as well.

Emotions have a twin brother in AI pop culture: humor. Machines in AI films seem to have no problem understanding almost all facets of human language and even body language, yet tell them a joke and they never "get it" unless they receive some emotions upgrade. I reject this assertion as well. The day a machine can fully understand English (or any other human language) will come long after sophisticated machines have mastered understanding and even crafting jokes. Humor is not magic. It is the practice of recognizing the ironic in things, and it can be studied and understood in purely rational, psychological terms. I contend that the one thing standing in the way of computers making good jokes now is that there is still not a machine in existence that can understand the world in a generalized, conceptual fashion. That's all that's missing.

In summary, Bicentennial Man is just another disappointing story in a long line within a genre that seeks to counter the Terminator view of AI with a Pinocchio view. It would have been nice if the movie had some decent cinematographic features or a distinctly AI-centric storyline, like Steven Spielberg's A.I. That movie had its own disappointing messages, but at least it had some literary and technical merit.

Tuesday, November 2, 2004

Neural network demo

[Audio Version]

I learned of artificial neural networks sometime around 1991. The concept has intrigued me ever since, but it was not until early last week that I finally got around to making my own. I decided to write an introduction to neural nets from my novice perspective and to make the sample program, along with its source code, available for other people to experiment with. Click here to check it out.

Sunday, October 17, 2004

Roamer: recent updates

[Audio Version]

I've made quite a bit of progress on this project along the way. It would be tedious to document the full progression since the start, but I suppose I should get in the habit of documenting progress from time to time.

Since my previous posting, when I wrote an opening summary of the Roamer project, I've made some significant progress. Most importantly, I noticed a memory leak that was occurring because of the poor way I was using the graphics features of the .NET framework. I'm a bit disappointed that it doesn't seem to deal well with cleaning up after itself. With a little effort, I eliminated that memory leak nearly completely. It's hard to tell for sure, though, because, as the .NET documentation indicates, garbage collection doesn't happen immediately when objects go out of use.

One exciting change is that I can now define a world using an XML file. Previously, I had to hard-code the initialization of each demonstration. It's not just a matter of moving code to an external file, though. More importantly, the format I chose offers an important layer of object abstraction. In my hard-coded demos, I would instantiate particle after particle for a critter. In my XML file, by contrast, I can define a segment of particles - perhaps a leg or an arm, for example. That segment can be duplicated any number of times and placed at different positions and angles. Moreover, each duplicated instance of a segment can specify changes that add, remove, or modify particles relative to the original definition. Segments can be composed of other segments, which in turn can be composed of other segments, and so on. It's a particularly object-oriented way of looking at the critters, and it blends nicely with the role segmentation played in the evolution of complex life forms on Earth.
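To give a flavor of the idea, here's a rough sketch in Python of how segment reuse works conceptually. It's not the actual XML schema or code from my project, just an illustration of defining a segment once and stamping out repositioned, locally modified copies:

```python
# A rough sketch of segment reuse: a segment is a reusable bundle of
# particles; each copy can be repositioned, rotated, and locally overridden.
import math

def make_leg():
    # base definition: three particles in a line, each as (x, y, radius)
    return [(0.0, 0.0, 2.0), (0.0, 10.0, 1.5), (0.0, 20.0, 1.0)]

def place(segment, dx=0.0, dy=0.0, angle=0.0, overrides=None):
    """Duplicate a segment at a new position/angle, with optional tweaks."""
    placed = []
    for i, (x, y, r) in enumerate(segment):
        rx = x * math.cos(angle) - y * math.sin(angle)
        ry = x * math.sin(angle) + y * math.cos(angle)
        if overrides and i in overrides:      # modify one particle in this copy
            r = overrides[i]
        placed.append((rx + dx, ry + dy, r))
    return placed

left_leg  = place(make_leg(), dx=-5, angle=math.radians(15))
right_leg = place(make_leg(), dx=+5, angle=math.radians(-15), overrides={2: 1.8})
```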

The end result of all this reusability is the ability to construct worlds composed of potentially thousands of particles that may only require dozens of definitions in an XML file.

What I haven't done yet is to implement the same for the brains of these beasts. Although I originally conceived the use of XML files for defining brain wiring, I realized it was going to be more complicated than doing the same for the world. That's next on my list.

Wednesday, October 13, 2004

New Roamer project

[Audio Version]

I keep trying to figure out how to get started with this blog. I'm in the midst of a new research project, but I don't really have a lot of time right now to describe it in detail. Yet I think it'll be worthwhile to give updates as it progresses. So I guess the best compromise is to at least summarize my current project.

For various reasons, I've called it "Roamer". One of my first design goals was to do what I've wanted to do for many years: create a rich "physical" environment that can be used for AI and AL research. I've basically succeeded in that already. The environment allows for one or more "planets", like Petri dishes, each with its own experiment. Each planet is a 2-dimensional, rectangular region populated by various barriers, force fields and, most importantly, particles. All "critters" are composed of particles - basically circles with distinct masses, radii, colors, and so on that act like balls zipping around the planet and interacting with the other elements. One key element is the link, a sort of spring that connects any two particles. So a critter can be minimally thought of as a collection of particles joined by springs. The physics is such that one may laugh at how a critter jiggles and bounces, but I find it's easy to see what's going on as one watches. The math behind the forces involved is pretty simple, but the overall behavior is fairly convincing.
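For the curious, here's roughly the kind of math involved in a link, sketched in Python. The constants and the damping scheme are placeholders of my own choosing, not the actual numbers from Roamer:

```python
# A minimal sketch of a spring-style link between two particles.
import math

def link_force(p1, p2, rest_length, stiffness=0.5, damping=0.1):
    """Hooke's-law spring; each particle is given as (x, y, vx, vy)."""
    x1, y1, vx1, vy1 = p1
    x2, y2, vx2, vy2 = p2
    dx, dy = x2 - x1, y2 - y1
    dist = math.hypot(dx, dy) or 1e-9
    nx, ny = dx / dist, dy / dist
    stretch = dist - rest_length
    # relative velocity along the link, used to damp the jiggling a bit
    rel_v = (vx2 - vx1) * nx + (vy2 - vy1) * ny
    f = stiffness * stretch + damping * rel_v
    return (f * nx, f * ny)   # force on p1; p2 gets the opposite

print(link_force((0, 0, 0, 0), (12, 0, 0, 0), rest_length=10))  # pulls p1 toward p2
```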

On top of this "physical world", I've begun creating robotic components. These are derived from the basic particle class. Some sense touch and smell. Some produce thrust or induce links to act like muscles. There will be others soon that offer things like vision, the ability to grab objects, and so on.

I've also begun creating "brain" components. I originally made these as particles, but found that cumbersome. So I created a "brain chassis" particle that's meant to house decision-making components. The two I've created thus far are a finite state machine (FSM) and an "olfactor", which is concerned with recognizing the smells that the nose particles detect.
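To give a sense of how simple the FSM component is conceptually, here's a toy sketch in Python. The states and signals are made up for illustration; they're not the actual wiring from any of my demos:

```python
# A toy finite state machine of the sort a brain chassis might house.
class FiniteStateMachine:
    def __init__(self, start, transitions):
        # transitions: {(state, input_signal): next_state}
        self.state = start
        self.transitions = transitions

    def step(self, input_signal):
        # stay in the current state if no transition matches
        self.state = self.transitions.get((self.state, input_signal), self.state)
        return self.state

# e.g. wander until the olfactor reports food, then seek and eat it
brain = FiniteStateMachine("wander", {
    ("wander", "smell_food"): "seek",
    ("seek",   "touch_food"): "eat",
    ("eat",    "no_food"):    "wander",
})
print(brain.step("smell_food"))  # 'seek'
```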

I'm at a point now where creating new demonstrations is getting to be quite a chore, because each critter design is hard-coded into the program. Now that I've gotten a bit of experience creating critters and wiring brains for them, I have an understanding of the commonality involved in these tasks, and so I am now devising a way to represent designs for worlds in XML files instead of code. This may sound superfluous or overly limiting, but one significant benefit is that I've already engendered a notion of one "body segment" being modeled after another already defined, and even modifying it a little. As such, it's easy to have a critter composed of repeating segments, even segments that grow progressively different or serve different purposes. It's a sort of object-oriented way of describing things, with inheritance and polymorphism. So far, I've proven the concept with segments nested within segments and ultimately embodied as particles. I have yet to implement the links that tie them together, but that'll be pretty easy. More importantly, I have yet to start implementing this same concept for brain components. Once these steps are done, I'll be able to create richly complex critters with much less effort.

Although my present goals are oriented toward AI research, I keep getting tempted by how relevant this "physical world" I've created seems to be to artificial life (AL) research. There's no reason I couldn't add some extra code to all this and turn it into a world of evolving critters using traditional genetic algorithm techniques. The XML definitions of critters could be the genetic code, for instance. One reason I don't intend to any time soon, though, is that while my simulation of how the world works is pretty good, it's also brittle. I tried to build conservation of energy and entropy into the system, but I was not able to get away from the fact that in some circumstances "energy" does get created from nothing and sometimes spirals out of control until the world experiences numeric overflow exceptions. I would expect that evolving critter designs would find and exploit these flaws. And while such exploits would not necessarily be bad - so long as they don't cause exceptions - one thing I consider unacceptable about them is that they ruin the nicely intuitive feel of the system, making it harder for the casual viewer to get a quick sense of what's happening.

On that note, I consider it important in AI and AL research not only to create things that are smart or lifelike, but also to do so in a way that most people can see for themselves, at least on some level. That's one reason I've wanted for so many years to create a physics-plus-graphics engine like the one I have now. For a researcher like me with no research funding, I think it basically satisfies Rodney Brooks's requirement that a robot be "embodied", in a way that grid-style worlds and other tightly constrained artifices can't reliably be expected to. I don't ever expect a robot designed in this 2D world to be turned into a physical machine in our 3D world for us to kick around, though. I see this as the end of the line for such critters.

I do, incidentally, think that the model I've devised thus far can readily be transformed into a 3D world. The two main reasons I chose a 2D model are that it's harder to program a useful graphics engine and viewer for a 3D world, and that it's simply harder for a researcher or casual observer to understand what's going on in a 3D world, where lots of important things can be hidden from view. Still, 3D seems a natural step ... for another day.

Saturday, October 9, 2004

First entry

[Audio Version]

This is my first entry in this blog. Its subject matter, generally, is artificial intelligence.

I have been engaged in AI research in one way or another since around 1990, when I first read the Time-Life book "Alternative Computers", part of their "Understanding Computers" series - a colorful if brief look, from a layman's perspective, at a variety of technologies that even today rank as cutting edge or bold speculation. As I recall, it touched on neural networks, nanocomputers, optical computing, and so on. Given my intense interest at the time in robotics and digital processors, what most caught my eye was the section on artificial intelligence. By then, I had followed an odd path that led me from studying simple electronics to digital logic and all the way up to microprocessor architecture.

What I was finally realizing around this time was that in order to understand how digital computers worked, I was going to have to learn how to program. I didn't have any particular problem to solve by programming; I just wanted to really understand what all the complex architecture of a digital computer was for. The idea of a machine endowed with intelligence was not new to me, but I guess the timing of the book and my interest in programming led me to conclude that I should learn to program and cut my teeth on the problems of AI.

Thanks to my gracious and encouraging parents, I was lucky enough at this time to have a Tandy 1400LT laptop computer. It was great for word processing, spreadsheets, playing cheesy video games, and some other things. It had a squashed, 4-shades-of-blue LCD screen, two floppy drives, no hard drive, a 7 MHz Intel 8088-compatible CPU, and 640KB of RAM. When I decided to learn to program, my father insisted I do some research first into programming languages. After a while, I settled on Turbo Prolog (TP), because PROLOG had earned a good reputation in the AI community, especially for what became my first focus in AI: natural language processing (NLP). Once I had read my first book on the language, my father was finally convinced I was serious enough and gladly bought me a copy of TP.

In some ways, this nearsighted choice to learn to program in a language few people in the business world have ever heard of, even today, may have delayed my entry into a career as a professional programmer by a few years. Still, the way of thinking about automation that PROLOG engenders has helped my understanding of search algorithms, regular expressions, and other practical technologies and problems. And while I felt pretty out of place when I started learning C++ a year or two later, my grounding in PROLOG crystallized my understanding of what the procedural languages that have dominated my professional life since are really all about, giving me a broader context that perhaps many programmers lack.

The books I read, including the manuals that came with Turbo Prolog, emphasized the strength of PROLOG in natural language processing, so I began my life as a programmer there. I would not say that in these early days I made any novel discoveries; I was simply following in the footsteps of many bright researchers who came before me. But I quickly came to understand just how hard the task was. My impression is that even today, NLP is a black art that has more to do with smoke and mirrors than with cognition. Still, the timing was actually fortuitous, since I was also learning the esoteric skill of sentence diagramming in high school around this time. Nobody but linguists cares about this archaic skill any more, but it couldn't have come at a better time for me.

PROLOG was also touted as an excellent language for developing expert systems, and I made my first primitive examples around this period as well. Expert systems also provided a natural application for NLP, so I experimented with simple question-and-answer systems of the sort one might imagine in a medical application, where doctors and patients exchange information and questions. Again, I hasten to add that I made no noteworthy progress here that others hadn't already achieved years before. It was really just a learning experience for me.

Sometime not long before I went off to college in 1992, I got interested in chaos theory (James Gleick's "Chaos: Making a New Science") and, as a sort of extension of it, artificial life ("a-life"). My programs started to be geared more toward generating the Mandelbrot set, L-systems, and other fractals. Once I got to the Stevens Institute of Technology, I was digging into a-life (Steven Levy's "Artificial Life"), especially Conway's Game of Life, genetic algorithms, and the like.

In all modesty, I have to say that I did little that was new in all this time, but I was constantly going off in my own directions. Admittedly, that's probably because I have rarely had the patience to learn enough about any particular subject before diving into it, so I end up filling in the gaps with whatever makes sense to me. That process is inherently creative and sometimes leads to unexpected places. Along the way I did start doing my own research, though. I wanted so much to succeed where others appeared to have failed: in creating things in a digital soup that the average person could genuinely recognize as alive.

Sadly, by the time I left school, my AI and a-life research had ground to a nearly complete halt. I had jumped on the World Wide Web bandwagon almost at its beginning and still haven't gotten off. I was at the "serious" start of my career, and the focus was almost wholly on making a success of it. I've been quite successful because I work so hard at it, but that work has always been driven by a belief that success will eventually free me to get back to my AI research.

Ten years later, I've had to come to the realization that I made a mistake in not continuing my research on the side and simply accepting that I have to keep working full time on something other than AI in order to pay the bills. In the past few years, though, this has been sinking in, and I'm starting to do AI research again. I've been focusing my attention on the bold claim AI has always made: sentient machines. Other great minds have done such a great job in other areas of AI that we can at least claim to have machines with roughly insect-level intelligence. But thanks to post-modern philosophical skepticism about the very existence of reason, and other misguided debunking of AI, most researchers seem to have given up on the most important piece of the puzzle: conceptualization.

In all those years I wasn't doing actual AI research, I was still thinking about the related problems. With each new thing I learned in other areas, about philosophy, programming, economics, and so forth, I gained new insights into AI. Always with me has been the question, "how would I get a computer to do that?" I continue forward now with the strong conviction that conceptualization is not only possible for computers, but also that it is a necessary part of the solution for many outstanding problems in computing. So I've devoted most of my renewed efforts in AI to date to the problem of engendering conceptualization in computer programs.

So that's where I am today and how I got to this point.