Thoughts on FLARE

Now that I've gotten back into AI research after all these years, I'm starting to reach out to find out more about other research in the AI field. I recently started reading abstracts of articles in the Journal of Artificial Intelligence Research (JAIR), which stretches back to 1993, in hopes of finding out more about the state of the art. When I started from the latest volumes, I was surprised by how unapproachable so many of the articles appeared to be. For starters, they appear to be written exclusively for those already deeply entrenched in the field. For another, rather than positing new theories or discoveries, they appear largely to be small extensions of concepts that have been explored for decades.

So I decided to start reviewing abstracts from the beginning. What a difference that makes.

I recently read an interesting article from 1995 titled "An Integrated Framework for Learning and Reasoning", by C.G. Giraud-Carrier and T.R. Martinez, published in Volume 3 of JAIR. The authors started from a fairly conventional approach to knowledge representation (KR) involving "vectors" of attributes that allow for representation of "first order predicate" statements about some domain of knowledge and hence for drawing deductive conclusions based on such knowledge. They went on to extend the concept to incorporate an inductive sort of learning of rules from examples and to provide the means for their system to alternate between ruminating over its knowledge to draw new conclusions and acquiring new information to integrate into its knowledge base (KB). They called the system they pioneered a "Framework for Learning and REasoning", or "FLARE".

I have a variety of criticisms of FLARE, but before I start with them, I have to give Giraud-Carrier and Martinez strong credit for their work here. They sought to bridge the gap between learning and reasoning that existed then and even now in the AI sub-communities and indeed seem to have been successful. And rather than follow the often obscure paths of neural networks or genetic algorithms, they chose to try to engender learning with a KR based on formal logic statements, staying true in a way to the formal logic-driven view of AI, now largely dead, from back in the fifties and sixties. What's more, the authors gave a reasonably honest appraisal of the limits of their system and avoided the temptation to make unduly bold claims about the applications of FLARE.

Having said that, the first criticism I have is of the sort of knowledge representation the authors of this article chose. In their system, every concept is represented in a "vector", or list of attributes with associated values. Every other concept has these exact same attributes, but with different values. In one example, the attributes are "is an animal", "is a bird", "is a penguin", and "can fly". Each attribute can have either a specific value ("true" or "false", in these cases) or a "don't know" or "don't care" pseudo-value. So to say that "animals don't generally fly" in one concept would be to give the "is an animal" attribute a "true", the "can fly" attribute a "false", and all the other attributes a "don't care". Similarly, saying that a penguin is a bird would mean setting the "is a penguin" and "is a bird" attributes to "true" and setting the other attributes in that vector to "don't care". Admittedly, this approach is sufficient for the purposes of FLARE and makes printing a KB representation easy using a simple data table. But taken literally, it means that every piece of knowledge, no matter how small, has values for every one of the attributes known to the system. Thus each new attribute widens every vector in the KB, and each new vector lengthens the KB, so the reasoning process slows down as both dimensions of the table to be searched grow.
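To make the representation concrete, here is a minimal Python sketch of the "taken literally" reading described above. It is not FLARE's actual implementation; the names ATTRIBUTES, DONT_CARE, DONT_KNOW, make_vector, and kb_cells are all my own inventions for illustration.

```python
# A rough sketch of dense attribute vectors. All names are hypothetical,
# not taken from the FLARE paper.
ATTRIBUTES = ["is_animal", "is_bird", "is_penguin", "can_fly"]
DONT_CARE = "*"   # the attribute is irrelevant to this piece of knowledge
DONT_KNOW = "?"   # the attribute is relevant but its value is unknown


def make_vector(**known):
    """Build a dense vector: every attribute gets a slot, defaulting to don't-care."""
    unexpected = set(known) - set(ATTRIBUTES)
    if unexpected:
        raise KeyError(f"unknown attributes: {sorted(unexpected)}")
    return [known.get(name, DONT_CARE) for name in ATTRIBUTES]


def kb_cells(kb):
    """Count the total number of slots stored across the whole knowledge base."""
    return sum(len(vector) for vector in kb)


# "Animals don't generally fly."
animals_dont_fly = make_vector(is_animal=True, can_fly=False)
# "A penguin is a bird."
penguin_is_bird = make_vector(is_penguin=True, is_bird=True)

kb = [animals_dont_fly, penguin_is_bird]
print(animals_dont_fly)  # [True, '*', '*', False]
print(kb_cells(kb))      # 8 slots for only 4 explicit values
```

Half the slots here are "don't care" padding, and adding a fifth attribute would grow every vector in the KB, which is the cost I'm complaining about.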

The memory and time optimization complaints are weak, I admit. One could easily improve the memory usage by designing vectors that only contain attributes with explicit or "don't know" values and assume all other attributes not specified are "don't care", for example. And the authors indicate that they make use of indexing in a way probably similar to relational database engines like SQL Server to enhance querying performance. So why do I bother with this critique?
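The memory improvement I have in mind might look something like the following Python sketch. This is my own suggestion, not something the authors describe; SparseVector and its methods are hypothetical names.

```python
# A rough sketch of a sparse vector: only explicit and "don't know" values
# are stored, and anything absent is treated as "don't care".
# All names here are hypothetical.
DONT_CARE = "*"
DONT_KNOW = "?"


class SparseVector:
    def __init__(self, **explicit):
        # Store only the attributes that were actually specified.
        self._values = dict(explicit)

    def get(self, attribute):
        """Return the stored value, or don't-care if the attribute was never set."""
        return self._values.get(attribute, DONT_CARE)

    def __len__(self):
        """Number of slots actually stored, regardless of how many attributes exist."""
        return len(self._values)


animals_dont_fly = SparseVector(is_animal=True, can_fly=False)
mystery_bird = SparseVector(is_bird=True, can_fly=DONT_KNOW)

print(animals_dont_fly.get("can_fly"))     # False
print(animals_dont_fly.get("is_penguin"))  # *
print(len(animals_dont_fly))               # 2
```

With this layout, introducing a new attribute anywhere in the system costs nothing for the vectors that never mention it.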

I want to linger for a moment on this point about optimization because it is a common criticism of almost all AI work. In this field, one can often hear the frustrated question, "why is it that as I gain more knowledge, I can solve problems faster, but when a computer gains more knowledge, it gets slower?" The answer is that most AI systems have been built with the same basic brute-force approach that can be found in most conventional database and data mining systems. A simple chess-playing program, for example, may look ahead dozens of moves to see what the ramifications of each step will be. Each step ahead costs ever more in processing power.

No human being could ever match the performance of even the most basic chess-playing programs in this regard, yet it took decades before the top chess-playing human was "beaten" by a computer, and it was mainly because its computer opponent was so darned fast (and expensive) that it could look farther ahead than any other machine programmed for the task could in a unit of time, not because it was significantly "smarter" than those other systems. That an ordinary PC could still not do the same today is an indictment of AI researchers throwing up their hands and complaining that today's computers are too slow.

The human brain isn't really faster than today's computers. Nor do I agree with the claim that the "massive parallelism" of the brain is essential. What's essential is good data structures and algorithms. When you hear the word "parakeet", your brain doesn't do a massive search through all its word patterns to find a best match. I'm convinced it follows links from one syllable to the next, walking through a sort of "word tree", until it finds an end point. And at the end of that tree is not a vector with thousands of attributes. Rather, there's a reference to those attributes or other parts of the brain that are very strongly associated with the word. In short, the paths that the human brain follows are short and simple, and that's why we are able to think so quickly, despite having brains composed of laughably slow and sloppy processing elements. I should point out this isn't so much a criticism of the FLARE concept as it is of the basic assumptions of AI researchers even today, it seems.
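The "word tree" I describe is essentially what programmers call a trie. Here is a minimal Python sketch of the idea, keyed by syllable. It is only an illustration of the data structure, not a claim about how the brain or FLARE works; WordTree, the syllable splits, and the "concept:..." labels are all made up for the example.

```python
# A rough sketch of a syllable-keyed word tree (a trie). Lookup cost depends
# on the number of syllables in the word, not on how many words are stored.
# All names and the syllable splits are hypothetical.
class WordTree:
    def __init__(self):
        self.children = {}   # syllable -> child node
        self.concept = None  # reference to associated knowledge, if a word ends here

    def insert(self, syllables, concept):
        """Walk (and create) one node per syllable, then attach the reference."""
        node = self
        for syllable in syllables:
            node = node.children.setdefault(syllable, WordTree())
        node.concept = concept

    def lookup(self, syllables):
        """Follow links syllable by syllable; return None if the path breaks."""
        node = self
        for syllable in syllables:
            node = node.children.get(syllable)
            if node is None:
                return None
        return node.concept


tree = WordTree()
tree.insert(["par", "a", "keet"], "concept:parakeet")
tree.insert(["par", "a", "chute"], "concept:parachute")
tree.insert(["par", "rot"], "concept:parrot")

print(tree.lookup(["par", "a", "keet"]))  # concept:parakeet
print(tree.lookup(["par", "a"]))          # None (a path, but no word ends here)
```

Note that the end point holds only a reference to the associated knowledge rather than a full attribute vector, and that finding "parakeet" takes three steps no matter how many other words the tree contains.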

More importantly, though, this idea of having attributes be a "dimension" like this instead of being things unto themselves is dubious. FLARE has no capacity to perceive attributes as being related to one another, for example. Nor could it integrate new attributes in a meaningful way other than to simply add another column to the table, so to speak. This limits FLARE's ability to truly deal with novel information, unless it's properly constrained by existing attributes and very well formatted. (To their credit, the authors point out that FLARE does have an ability to deal with modestly noisy -- read "contradictory" -- information.)

The next criticism I have of FLARE is that it does not seem very good at properly congregating ideas together. Even a simple "or" operation to link two possible values for an attribute will cause two separate vectors to be created for the same idea. There's no branching. Admittedly, they did this to help simplify and perhaps even make possible the kinds of deductive reasoning that characterize FLARE's problem solving capability. Still, this seems to me to make it difficult to localize thinking to appropriate "areas" of thought. I suppose I should apologize for giving too much focus to comparing AI products to the human brain. I don't really think the human brain presents the only way to achieve sentience. Still, it provides a good model, has so much to say that seems yet unheard, and should be used as a benchmark in identifying features and limitations of such products. I hope my criticisms of FLARE will be taken mainly as comparisons to the human brain as such, rather than attacks on the validity and value this work brings to the field.

Next, FLARE appears to be competent at dealing with identifying objects, answering questions about objects, and more generally solving first-order predicate logic questions. But it does not seem to have any real capability to deal with pattern matching or temporal sequences, let alone have anything like an appreciation of the passage of time. So it really could not be used to deal with the kinds of bread and butter problems Rodney Brooks identified as the basis for most living organisms, like controlling locomotion, visual recognition of objects, and so on.

In summary, Giraud-Carrier and Martinez wrote an essay on their research into integrating learning and reasoning functions that is pithy and reasonably approachable to a college-educated reader. They tested in a way so as to compare their results to other conventional systems and provided useful examples and caveats in the article about possible applications and limitations. FLARE, their work product, is clearly not a framework for general-purpose thinking, but provides interesting insights into solving logic problems and integrating new knowledge into such a system. To the person interested in AI and looking for a broader view, I would recommend reading this 33-page essay for a penetrating glimpse into a pretty interesting piece of the AI picture.

