Thursday, November 10, 2005

Neuron banks and learning

[Audio Version]

I've been thinking more about perceptual-level thinking and how to implement it in software. In doing so, I've started formulating a model of how cortical neural networks might work, at least in part. I'm sure it's not an entirely new idea, but I haven't run across it in quite this form, so far.

One of the key questions I ask myself is: how does human neural tissue learn? And, building on Jeff Hawkins' memory-prediction model, I came up with at least one plausible answer. First, however, let me say that I use the term "neuron" here loosely. The mechanisms I ascribe to individual neurons may turn out to be more a function of groups of them working in concert.

Let me start with the notion of a group of neurons in a "neural bank". A bank is simply a group of neurons that are all looking at the same inputs, as illustrated in the following figure:

Figure: Schematic view of neuron bank.

Perhaps it's a region of the input coming from the auditory nerves. Or perhaps it's looking at more refined input from several different senses. Or perhaps even a more abstract set of concepts at a still higher level. It may not be that there are large numbers of neurons that all look at the same chunk of inputs -- it may be more messy than that -- but this is a helpful idea, as we'll soon see. Further, while I'll speak of neural banks as though they all fall into a single "layer" in the sense that traditional artificial neural networks are arranged, it's more likely that this neural bank idea applies to an entire patch of 6-layered cortical tissue in one's brain. Still, I don't want to get mired in such details in this discussion.

Each neuron in a bank is hungry to contribute to the whole process. In a naive state, they might all simply fire, but such a cacophony would probably be counterproductive. In fact, our neural banks could be hard-wired to favor having a minimal number of neurons in a bank firing at any given time -- ideally, zero or one. So each neuron is eager to fire, but the bank, as a whole, doesn't want them to fire all at once.

These two forces act in tension to balance things out. How? Imagine that each neuron in a bank is such that when it fires, its signal tends to suppress the other neurons in the bank. Suppress how? In two ways: firing and learning. When a neuron is highly sure that it is perceiving a pattern it has learned, it fires very strongly. Other neurons that may be firing because they have weak matches would silence themselves upon hearing these louder neurons, on the assumption that the louder neurons must have more reason to be sure of the patterns they perceive. Consider the following figure, modified from the one above to show this feedback:

Figure: Neuron bank with feedback from neighbors.
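As a rough illustration of this suppression, here is a minimal sketch of a bank that lets only its strongest neuron fire. The function name and the confidence threshold are my own assumptions, not part of the model above:

```python
def bank_output(match_strengths, threshold=0.5):
    """Given each neuron's match strength, let only the strongest
    neuron fire, and only if it clears a minimum confidence threshold."""
    best = max(range(len(match_strengths)), key=lambda i: match_strengths[i])
    if match_strengths[best] < threshold:
        return None  # no neuron is confident enough; the bank stays silent
    return best      # all weaker neurons are suppressed by the winner

print(bank_output([0.2, 0.9, 0.4]))  # → 1
print(bank_output([0.1, 0.2, 0.3]))  # → None
```

A real bank would presumably do this with continuous inhibitory signals rather than a hard winner-take-all rule, but the effect is the same: zero or one neurons speak at a time.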

But what about learning? What does a neuron learn and why would we want other neurons to suppress it? First, what is learned by a neuron is one or more patterns. For simplicity, let's say it's a simple, binary pattern. For each dendritic synapse looking at input from outside axons that a neuron has, we'll say it either cares or doesn't care and, if it does, it prefers either a firing or not-firing value. The following figure illustrates this, schematically:

Figure: Detail of a synapse.

Following is a logical behavior table for a synapse that cares about its input. A match occurs exactly when the actual input equals the preferred input, which is a logical equivalence (XNOR) operation:

Preferred Input    Actual Input    Matches
0                  0               Yes
0                  1               No
1                  0               No
1                  1               Yes

Let's describe the desired input pattern in terms of a string of zeros (not firing), ones (firing), and exes (don't care). For example, a neuron might prefer to see "x x 0 x 1 0 x 1 0 0 x 0 x x 1". When it sees this exact pattern, it fires strongly. But maybe it sees an input in which all but one of the values it cares about fit. It still fires, but not as strongly. If another neuron is firing more strongly, this one shuts up.
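To make this concrete, here is a hypothetical sketch of computing a neuron's match strength against such a string, using the example pattern above. The scoring rule, a simple fraction of the cared-about inputs that fit, is my own assumption:

```python
def match_strength(pattern, inputs):
    """Fraction of cared-about synapses whose input matches the
    preference. 'x' (don't care) synapses are ignored entirely."""
    cared = [(p, i) for p, i in zip(pattern, inputs) if p != 'x']
    if not cared:
        return 0.0
    hits = sum(1 for p, i in cared if p == i)
    return hits / len(cared)

pattern = "xx0x10x100x0xx1"  # the example pattern, spaces removed

print(match_strength(pattern, "000010010000001"))  # → 1.0 (perfect match)
print(match_strength(pattern, "000010010000000"))  # → 0.875 (one miss)
```

The neuron with the highest such strength would be the one that fires loudest and silences its weaker-matching neighbors.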

That's what's learned, but not how it's learned. Let's consider that more directly. A neuron that fires on a regular basis is "happy" with what it knows. It's useful. It doesn't need to learn anything else, it seems. But what about a neuron that never gets a chance to fire because its pattern doesn't match much of anything? I argue that this "unhappy" neuron wants very much to be useful. It searches for novel patterns. What does this mean? There are many possible mechanisms, but let's consider just one. We'll assume all the neurons started out with random synaptic settings (0, 1, or x). Now let's say that there is a certain combination of inputs for which no neuron in the bank shouts out to say "I've got this one". Some of the unhappy neurons see that some of the inputs do match. These are inclined to believe that this input is probably a pattern that can be learned, so they change some of their "wrong" settings to better match the current input. The stronger a given unhappy neuron's existing match, the more changes that neuron is likely to make to conform to this new input.

Now let's say this particular combination of input values (0s and 1s) continues to appear. At least one neuron will grow ever more biased towards matching that pattern until eventually it starts shouting out like the other "happy" neurons do.
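One possible sketch of this adaptation rule follows. Here, flipping a mismatched synapse toward the novel input happens with probability proportional to the existing match strength, so stronger partial matches adapt more aggressively. The names and the flip-probability formula are my own assumptions:

```python
import random

def adapt(pattern, inputs, rng=random.random):
    """When no neuron claims this input, an unhappy neuron flips each
    mismatched (non-'x') synapse toward the input with probability
    equal to its current match strength."""
    cared = [(p, i) for p, i in zip(pattern, inputs) if p != 'x']
    strength = sum(p == i for p, i in cared) / max(len(cared), 1)
    new = []
    for p, i in zip(pattern, inputs):
        if p != 'x' and p != i and rng() < strength:
            new.append(i)  # conform this synapse to the novel input
        else:
            new.append(p)  # keep the existing setting
    return ''.join(new)

random.seed(1)
print(adapt("10x", "001"))  # may flip the mismatched first synapse
```

Repeated exposure to the same input would keep nudging the pattern closer until the neuron fires strongly on it.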

This does seem to satisfy a basic definition of learning. But it leaves many questions unanswered. One is: how does a neuron decide whether or not to care about an input? I don't know the answer, but here's one plausible possibility. A neuron -- whether "happy" or "unhappy" with what it knows -- can allow its synaptic settings to change over time. Consider a happy one. It continues to see its favored pattern and fires whenever it does. Seeing no other neurons contending to be the best at matching its pattern, it is free to continue learning in a new way. In particular, it looks for patterns at the individual synapse level. If one synaptic input is almost always the same value whenever this neuron fires, it favors setting that synapse to "do care". If, conversely, that input changes with some regularity, the neuron will favor setting it to "don't care".
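This synapse-level drift could be sketched roughly as follows: across the moments when the neuron fired, inputs that stayed constant become "do care" settings and inputs that varied become "don't care". The care threshold is an arbitrary choice of mine:

```python
def refine_pattern(firing_inputs, care_threshold=0.9):
    """firing_inputs: list of input strings seen at moments when this
    neuron fired. Returns a refined '0'/'1'/'x' pattern per synapse."""
    n = len(firing_inputs[0])
    pattern = []
    for k in range(n):
        ones = sum(s[k] == '1' for s in firing_inputs) / len(firing_inputs)
        if ones >= care_threshold:
            pattern.append('1')   # consistently firing: do care, prefer 1
        elif ones <= 1 - care_threshold:
            pattern.append('0')   # consistently silent: do care, prefer 0
        else:
            pattern.append('x')   # varies too much: don't care
    return ''.join(pattern)

print(refine_pattern(["101", "100", "101", "100"]))  # → "10x"
```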

Interestingly, this leads to a new set of possible contentions and opportunities for new knowledge. One key problem in conceptualization is learning when to recognize that two concepts should be merged and when one concept should be subdivided into other narrower ones. When do you learn to recognize two different dogs are actually part of the same group of objects called "dogs"? And why do you decide that a chimpanzee, which looks like a person, is really a wholly new kind of thing that deserves its own concept?

Imagine that there is one neuron in a bank of them that has mastered the art of recognizing a basset hound dog. And let's say that's the only kind of dog this brain has ever seen before. It has seen many different bassets, but no other breed. This neuron's pattern recognition is greedy, seeing all the particular facets of bassets as essential to what dogs are all about. Then, one day, this brain sees a Doberman pinscher for the first time. To this neuron, it seems very like a basset, but there are enough features to be doubtful. Still, nobody else is firing strongly, so this one might as well, considering itself to have the best guess. This neuron is strongly invested in a specific kind of dog, though. It would be worthwhile to have another neuron devoted to recognizing this other kind of dog. What's more, it would be valuable to have yet another neuron that recognizes dogs more generally. How would that come about?

In theory, there are other neurons in this bank that are hungry to learn new patterns. One of them could see the lack of a strong response from any other neuron as an opportunity to learn either the more specific Dobie pattern or the more general dog pattern.

One potential problem is that the neurons that detect more specific features -- bassets versus all dogs, for example -- might tend to crowd out more general concepts like "dog". There must be some incentive to keep the general concept around. One explanation could be frequency. The dog neuron might not have as many matching features to consider as the basset neuron does, but if this brain sees lots of different dogs and only occasionally sees bassets, the dog neuron would get exercised more frequently, even if it doesn't shout the loudest when a basset is seen. So perhaps both frequency and strength of matching are strong signals to a neuron that it has learned well.

I have no doubt that there's much more to learning and the neocortex, more generally. Still, this seems a plausible model for how learning could happen there.

Thursday, November 3, 2005

A standardized test of perceptual capability

[Audio Version]

I've been getting too lost in the idiosyncrasies of machine vision of late and missing my more important focus on intelligence per se. I'm changing direction now.

My recent experiences have shown me that one thing we haven't really done well is perceptual-level intelligence. We have great sensors and cool algorithms for generating interesting but primitive information about the world. Edge detection, for example, can be used to generate a series of lines in a visual scene. But so what? Lines are just about as disconnected from intelligence as the raw pixel colors are.

Where do primitive visual features become percepts? Naturally, we have plenty of systems designed to instantly translate visual (or other sensory) information into known percepts. Put little red dots around a room, for instance, and a visual system can easily cue in on them as being key markers for a controlled-environment system. This is the sort of thinking that is used in vision-based quality control systems, too.

But what we don't have yet is a way for a machine to learn to recognize new percepts and learn to characterize and predict their behavior. I've spent many years thinking about this problem. While I can't say I have a complete answer yet, I do have some ideas I want to try out. Recently, while thinking about the problem, I formulated an interesting way to test a perceptual-level machine's ability to learn and make predictions. I think it can be readily reproduced on many other systems and extended for ever more capable systems.

The test involves a very simplified, visual world composed of a black rectangular "planet" and populated by a white "ball". The ball, a small circle whose size never changes, moves around this 2D world in a variety of ways that, for the most part, are formulaic. One way, for example, might be thought of as a ball in a box in space. Another can be thought of as a ball in a box standing upright on Earth, meaning it bounces around in parabolic paths as though in the presence of a gravitational field. Other variants might involve random adjustments to velocity, just to make prediction more difficult.
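As one concrete example, the "ball in a box standing upright on Earth" behavior could be sketched like this. All the constants, names, and the simple reflect-and-clamp bounce rule are arbitrary choices of mine, not part of any specification yet:

```python
def step(x, y, vx, vy, w=100.0, h=100.0, g=-0.5):
    """Advance the ball one moment: apply gravity, move, and bounce
    off the walls of a w-by-h box by reflecting velocity."""
    vy += g                      # gravity pulls the ball down each moment
    x, y = x + vx, y + vy
    if not 0 <= x <= w:          # bounce off the side walls
        vx = -vx
        x = max(0.0, min(x, w))
    if not 0 <= y <= h:          # bounce off floor and ceiling
        vy = -vy
        y = max(0.0, min(y, h))
    return x, y, vx, vy

print(step(50.0, 50.0, 2.0, 0.0))  # → (52.0, 49.5, 2.0, -0.5)
```

Between bounces, the ball traces the parabolic arcs described above; the random-adjustment variants would simply perturb vx and vy each moment.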

The test "organism" would be able to see the whole of this world. It would have a "pointer". Its goal would be to move this pointer to wherever it believes the ball will be in the next moment. It would be able to tell where the pointer currently points using a direct sense separate from its vision.

Predicting where the ball will be in the future is a very interesting test of an organism's ability to learn to understand the nature of a percept. Measuring the competency of a test organism would be very easy, too. For each moment, there is a prediction, in the form of the pointer pointing to where the organism believes the ball will be in the next moment. When that moment comes, the distance between the predicted and actual positions of the ball is calculated. For any given series of moments, the average distance would be the organism's score in that context.
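The scoring rule is just a mean Euclidean distance over the test period; a minimal sketch, with names of my own choosing:

```python
import math

def score(predictions, actuals):
    """Mean Euclidean distance between each moment's predicted and
    actual ball positions. 0 means every prediction was exact."""
    dists = [math.dist(p, a) for p, a in zip(predictions, actuals)]
    return sum(dists) / len(dists)

print(score([(0, 0), (3, 4)], [(0, 0), (0, 0)]))  # → 2.5
```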

It would be easy for different researchers to compare their test organisms against others, but it would require a little bit of care to put each test in a clear context. The context would be defined by a few variables. First is the ball behavior algorithm that is used. Each such behavior should be given a unique name and a formal description that can be easily implemented in code in just about any programming language. Second is the number of moments used to "warm up", which we'll call the "warm-up period". That is, it should take a while for an organism to learn about the ball's behavior before it can be any good at making predictions. Third is the "test period"; i.e., the number of moments after the warm-up period is done in which test measurements are taken. The final score in this context, then, would be the average of all the distances measured between predicted and actual positions.

There would be two standard figures that should be disclosed with any given test results. One is that the best possible score is 0, which means the predictions are always correct. The second is the best possible score for a "lazy" organism. In this case, a lazy organism is one that always guesses that the ball will be in the same place in the next moment that it is now. Naturally, a naive organism would do worse than this cheap approximation, but a competent organism should do better. The "lazy score" for a specific test run would be calculated as the average of all distances from each moment's ball position to its next moment's position. A weighted score for the organism could then be calculated as a ratio of actual score to lazy score. A value of zero would be the best possible. A value of one would indicate that the predictions are no better than the lazy score. A value greater than one would indicate that the predictions are actually worse than the lazy algorithm.
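The lazy baseline and the weighted score could be sketched as follows. The lazy organism always predicts "same place as now", so its score is just the average moment-to-moment displacement of the ball; function names are my own:

```python
import math

def lazy_score(positions):
    """Mean distance from each moment's ball position to the next
    moment's position: the score of the always-guess-here organism."""
    dists = [math.dist(positions[t], positions[t + 1])
             for t in range(len(positions) - 1)]
    return sum(dists) / len(dists)

def weighted_score(actual_score, positions):
    """0 is perfect; 1 matches the lazy baseline; >1 is worse than lazy."""
    return actual_score / lazy_score(positions)

positions = [(0, 0), (3, 4), (3, 4)]
print(lazy_score(positions))            # → 2.5
print(weighted_score(1.25, positions))  # → 0.5
```

Reporting the weighted ratio rather than the raw score would make results comparable across ball behaviors of very different speeds.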

Some might quip that I'm just proposing a "blocks world" type experiment and that an "organism" competent to play this game wouldn't have to be very smart. I disagree. Yes, a programmer could preprogram an organism with all the knowledge it needs to solve the problem and even get a perfect score. A proper disclosure of the algorithm used would let fellow researchers quickly disqualify such trickery. So would testing that single program against novel ball behaviors. What's more, I think a sincere attempt to develop organisms that can solve this sort of problem in a generalizable way will result in algorithms that can be generalized to more sophisticated problems like vision in natural settings.

Naturally, this test can also be extended in sophistication. Perhaps there could be a series of levels defined for the test. This might be Level I. Level II might involve multiple balls of different colors. And so on.

I probably will draft a formal specification for this test soon. I welcome input from others interested in the idea.