Best before yesterday: learning

27 November 2009

Brembs (2006) Brains as output/input devices

I just finished reading an excellent blog post (paper?) by Björn Brembs entitled Brains as output/input devices. I admit that I too have tended to think of brains as stimulus-response machines, paying little attention to spontaneous behaviour and operant learning. Indeed, the old iPlant programming section on my website used to begin "Like a digital computer, the brain generates output from input". On the contrary, Brembs argues, the brain generates input from output. That is, the main function of brains is to control the environment they're in - and thus the sensory input and rewards/punishments they receive - by figuring out the right motor output for a given situation.

"input/output transformations may only account for a small fraction of what brains are doing. Maybe a much more significant portion of the brain is occupied with the ongoing modelling of the world and how it might react to our actions?"

Furthermore, Brembs argues, the variability we observe in spontaneous behaviour is a feature of operant learning: it is a way for the brain to find and develop patterns of behaviour that give it optimal control over its environment. "Faced with novel situations, humans and most animals spontaneously increase their behavioural variability", presumably in order to figure out how this particular environment responds to behaviour. It's the environment that responds, not the animal. Perhaps even the subtle variability we see in the invertebrate feeding system is an expression of the molluscan brain trying to figure out the best way to eat this particular seaweed. If so, such variability should be selectively enhanced by reward learning. Is it? Other questions:

How is behavioural/neuronal variability generated in small and large brains?
What % of behavioural/neuronal variability is really subject to learning/operant control in different networks?
What features of neuronal activity are most likely to be subject to learning/operant control? In other words, where do we look? Spike rate of individual neurons? Network patterns? Duration of the different phases of motor programs?
How is reward conditioning/operant control of spontaneous variability instantiated in small and large brains?
How can we incorporate output-input functions in artificial neural networks and robotics? That is, what kind of tasks could such networks realistically perform today or in the future?
What is the cultural effect (within the neuroscience community and generally) of treating brains as output/input devices rather than input/output devices?
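
To make the output-input idea a little more concrete, here is a minimal toy sketch in Python. It is not from Brembs's post and not a model of any real nervous system. The function name `operant_learner`, the parameters, the 5% floor on variability and the reward values are all my own assumptions, chosen only for illustration. The learner acts first, lets the environment respond, and keeps its behaviour variable for as long as the environment's responses remain surprising.

```python
import random


def operant_learner(reward_of, n_actions, trials=2000, lr=0.1, floor=0.05, seed=0):
    """Toy output-first learner: act, observe the consequence, update.

    reward_of: hypothetical environment, maps an action index to a reward.
    floor:     assumed minimum level of spontaneous variability.
    """
    rng = random.Random(seed)
    value = [0.0] * n_actions  # learned estimate of reward for each action
    surprise = 1.0             # running mean of |prediction error|; high in a novel situation
    for _ in range(trials):
        # Behavioural variability tracks surprise: vary more while the
        # environment's responses are still poorly predicted.
        variability = max(floor, min(1.0, surprise))
        if rng.random() < variability:
            action = rng.randrange(n_actions)  # spontaneous variation
        else:
            action = max(range(n_actions), key=value.__getitem__)  # best known output
        reward = reward_of(action)             # the environment responds
        error = reward - value[action]         # how wrong was the prediction?
        value[action] += lr * error            # reward learning
        surprise += lr * (abs(error) - surprise)
    return value, surprise


if __name__ == "__main__":
    # Hypothetical environment: action 1 pays best.
    rewards = [0.2, 1.0, 0.5]
    value, surprise = operant_learner(lambda a: rewards[a], n_actions=3)
    best = max(range(3), key=value.__getitem__)
    print("preferred action:", best, "remaining surprise:", round(surprise, 3))
```

Tying variability to prediction error is only one possible design choice. It is meant to mirror the quoted observation that animals increase their behavioural variability in novel situations, and it says nothing about how real brains generate that variability.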

P.S. It was particularly stupid of me to emphasise the input-output side of things on the iPlant website, as the whole point of conditional rewarding brain stimulation is to modify output-input learning by rewarding beneficial but endogenously under-rewarded behavioural variations (rigorous exercise in morbidly obese patients etc.) with electrical pulses to the reward system.

14 December 2008

Riding a bike

I was 5 when I learned to ride a bike. I remember the place - a thin strip of asphalt surrounded by very green grass, just beside a small patch of forest - where I first managed to ride it for a stretch without dad holding it steady and without falling over. Imagine my brain at the time. Imagine two pulsating groups of active neurons in each prefrontal cortex, slowly circling each other. One, oscillating slowly, projecting to the motor cortex right at the top of my head and down between the lobes, driving the oscillating contractions of leg and foot on the pedal. The other, pulsating a seemingly patternless pattern to the motor cortex below the upper sides of the skull, constantly adjusting the handlebar with a cramped grip, keeping the whole circus upright. Both groups, closely connected to each other and to their mirror images in the opposite lobe, receiving a constant barrage of input: sight from the back of the brain and the colliculus, balance from the inner ear, kinetics from the spinal cord. Both groups constantly adjusting their output to maximize the flow of reinforcing dopamine from their respective midbrains. And that's why I remember it so well, that moment when the groups finally got the output right, for a time, and were showered with dopamine as all regions of the brain reported success. The dopamine reinforced them, and with them every other process that was active in my frontal lobes at the time - the location, the weather, the color of the grass.

After that, of course, I've kept on biking, for years and years and years, milking those two groups for all the dopamine they were worth, until they were neat and trimmed and refined to a point where almost all the oscillation and rotation and complex feedback loops have been moved over to small, dedicated central pattern generators in my motor cortex and spinal cord.

04 December 2008

Thoughts on forks

Whenever there is purposeful, goal-directed behavior going on, dopamine is involved. Without dopamine, there is no purposeful behavior. At 1% of normal dopamine concentrations even feeding and drinking stop. In Parkinson's disease the parts of the brain most intimately involved with purposeful movement (as opposed to thinking) are deprived of dopamine, and at 20% of normal concentrations it becomes hard to move. Dopamine is key to organized behavior and motivation: it reinforces activity in brain tissue and strengthens synapses.

So whenever you see purposeful behavior going on you might ask yourself "where is the dopamine coming from?" Only rarely do we see humans engaged in behavior that directly activates dopamine neurons. Eating comes to mind: sensory neurons in the mouth detect the presence of good food and immediately activate dopamine neurons in an attempt, if you will, to reinforce whatever behavior brought the food there. But that behavior, the movement of the fork or the chopsticks, was learnt at some point. Suckling is the only form of eating we know from the start. We suckled the breast and the milk was warm and sweet and we stopped crying. We suckled the twig and it was cold and hard and we cried. When suckling brought warmth and sugar, dopamine was released and the neuronal ensembles generating the behavior and remembering the context were reinforced. Later we learned a bottle works almost as well. We learned to hold the bottle. And a few years and millions of dopaminergic learning experiences later we learned to use the fork.

08 September 2008

DA-5HT in 2D

07 June 2008

Live and learn

It should be clear by now that I've got a learning problem, but unfortunately I've got a learning problem and this salient fact still hasn't registered properly. Take the pizza I had today for example. It wasn't even very good and I still gorged on it, with three servings of oily pizza salad and a sugary soda. I wasn't exactly surprised when sickening lethargy was squeezing the life-blood out of my mood and stomach fifteen minutes later, but if I had anticipated it - if I had properly imagined it at the time I sat down at the restaurant - maybe I would have had something with more fiber and less saturated fat for dinner. Something less tasty.

Forgive me this humorless rant but if I can't learn from past mistakes like a normal mammal the least I can do is write them down. Failure to re-evaluate established reward-contingencies is a hallmark of attention deficit disorders and an underactive dopamine system. The natural extinction of behaviors that are no longer reinforcing, or that have become associated with additional, negative outcomes, is impaired. Perhaps the free will gland is disrupted. Appetite, in one form or another, overrides reason.

Anyway, note to self: huge portions of pizza and pizza salad are a waste of time.