An Animal Trainer's Introduction To Operant and Classical Conditioning
Part One
Stacy Braslau-Schneck, MA
Learning Theory and Learning Theory
"Learning Theory" is a discipline of psychology that attempts
to explain how an organism learns. It consists of
many different theories of learning, including instincts,
social facilitation, observation, formal teaching,
memory, mimicry, and classical and operant conditioning.
It is these last two that are of most interest to
animal trainers.
Why should
animal trainers be bothered with learning the theory
behind how their animals learn? Many excellent trainers
have no formal schooling or organized understanding
of how their training is effective or how their charges
work. But training is both an art and a science. More
and more trainers - pet owners, show competitors,
horseback riders, show-business trainers, zookeepers,
aquarium trainers and more - are finding that an understanding
of learning theory helps them understand their animals'
behaviors better, and plan their training accordingly.
So trainers are learning the theory of learning theory!
Classical or "Pavlovian" Conditioning
Theory
Classical
Conditioning is the type of learning made famous by
Pavlov's experiments with dogs. The gist of the experiment
is this: Pavlov presented dogs with food, and measured
their salivary response (how much they drooled). Then
he began ringing a bell just before presenting the
food. At first, the dogs did not begin salivating
until the food was presented. After a while, however,
the dogs began to salivate when the sound of the bell
was presented. They learned to associate the sound
of the bell with the presentation of the food. As
far as their immediate physiological responses were
concerned, the sound of the bell became equivalent
to the presentation of the food.
Classical conditioning is used by trainers for two purposes:
to condition (train) autonomic responses, such as
drooling, producing adrenaline, or reducing adrenaline
(calming), without using the stimuli that would naturally
create such a response; and to create an association
between a stimulus that normally would not have any
effect on the animal and a stimulus that would.
Stimuli that
animals react to without training are called primary
or unconditioned stimuli (US). They include
food, pain, and other "hard-wired" or "instinctive"
stimuli. Animals do not have to learn to react to
an electric shock, for example. Pavlov's dogs did
not need to learn about food.
Stimuli that
animals react to only after learning about them are
called secondary or conditioned stimuli
(CS). These are stimuli that have been associated
with a primary stimulus. In Pavlov's experiment, the
sound of the bell meant nothing to the dogs at first.
After its sound was associated with the presentation
of food, it became a conditioned stimulus. If a warning
buzzer is associated with the shock, the animals will
learn to fear it.
Secondary
stimuli are things that the trainee has to learn to
like or dislike. Examples include school grades and
money. A slip of paper with an "A" or an
"F" written on it has no meaning to a person
who has never learned the meaning of the grade. Yet
students work hard to gain "A's" and avoid
"F's". A coin or piece of paper money has
no meaning to a person who doesn't use that sort of
system. Yet people have been known to work hard to
gain this secondary reinforcer.
Application
Classical
conditioning is very important to animal trainers,
because it is difficult to supply an animal with one
of the things it naturally likes (or dislikes) in
time for it to be an important consequence of the
behavior. In other words, it's hard to toss a fish
to a dolphin while it's in the middle of a jump or
finding a piece of equipment on the ocean floor a
hundred meters below. So trainers will associate something
that's easier to "deliver" with something
the animal wants through classical conditioning. Some
trainers call this a bridge
(because it bridges the time between when
the animal performs a desired behavior and when it
gets its reward). Marine mammal trainers use a whistle.
Many other trainers use a clicker, a cricket-like box with a metal tongue
that makes a click-click sound when you press it.
You
can classically condition a clicker by clicking it
and delivering some desirable treat, many times in
a row. Simply click the clicker, pause a moment, and
give the dog (or other animal) the treat. After you've
done this a few times, you may see the animal visibly
startle, look towards the treat, or look to you at the
sound of the click. This indicates that she's starting
to form the association.
Some clicker trainers call this "charging up
the clicker". It's also called "creating
a conditioned reinforcer". The click sound becomes
a signal for an upcoming reinforcement. As a shorthand,
some clicker trainers will say that the click = the
treat.
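The "charging up" process above can be pictured with a toy simulation. This is only a sketch: the update rule is borrowed from the Rescorla-Wagner model of association (which this article does not cover), and the function name `charge_clicker`, the `learning_rate` value, and the 0-to-1 "strength" scale are all assumptions made for illustration, not measurements of any real animal.

```python
from typing import List


def charge_clicker(pairings: int, learning_rate: float = 0.3) -> List[float]:
    """Return the click-treat association strength after each pairing.

    Toy model (an assumption, not the article's method): each click
    followed by a treat moves the strength a fixed fraction of the
    way from its current value towards a maximum of 1.0.
    """
    strength = 0.0  # the click means nothing before training
    history = []
    for _ in range(pairings):
        # One click, a short pause, then the treat: one pairing.
        strength += learning_rate * (1.0 - strength)
        history.append(strength)
    return history


if __name__ == "__main__":
    for trial, value in enumerate(charge_clicker(10), start=1):
        print(f"pairing {trial}: strength {value:.2f}")
```

In this sketch the strength rises quickly over the first few pairings and then levels off, which matches the informal advice to click and treat many times in a row before relying on the clicker.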
Operant Conditioning
Classical
conditioning forms an association between two
stimuli. Operant conditioning forms an association
between a behavior and a consequence. (It is also
called response-stimulus or RS conditioning
because it forms an association between the animal's
response [behavior] and the stimulus that follows
[consequence].)
Four Possible Consequences
There are four possible consequences to any behavior. They are:
Something Good can start or be presented;
Something Good can end or be taken away;
Something Bad can start or be presented;
Something Bad can end or be taken away.
Consequences
have to be immediate, or clearly linked to the behavior.
With verbal humans, we can explain the connection
between the consequence and the behavior, even if
they are separated in time. For example, you might
tell a friend that you'll buy dinner for them since
they helped you move, or a parent might explain that
the child can't go to summer camp because of her bad
grades. With very young children, humans who don't
have verbal skills, and animals, you can't explain
the connection between the consequence and the behavior.
For the animal, the consequence has to be immediate.
The way to work around this is to use a bridge (see
above).
Technical Terms
The technical term for "start or be presented" is positive,
since it's something that's added to the
animal's environment.
The technical term for "end or be taken away" is negative,
since it's something that's subtracted from
the animal's environment.
Anything that
increases a behavior - makes it occur more
frequently, makes it stronger, or makes it more likely
to occur - is termed a reinforcer. Often, an animal (or person)
will perceive "starting Something Good"
or "ending Something Bad" as something worth
pursuing, and they will repeat the behaviors that
seem to cause these consequences. These consequences
will increase the behaviors that lead to them; they
are reinforcers.
These are consequences the animal will work to attain,
so they strengthen the behavior.
Anything that
decreases a behavior - makes it occur less
frequently, makes it weaker, or makes it less likely
to occur - is termed a punisher. Often, an animal (or person)
will perceive "ending Something Good" or
"starting Something Bad" as something worth
avoiding, and they will not repeat the behaviors that
seem to cause these consequences. These consequences
will decrease the behaviors that lead to them; they
are punishers.
Applying these terms to the Four Possible Consequences, you get:
Something Good can start or be presented = Positive Reinforcement (R+)
Something Good can end or be taken away = Negative Punishment (P-)
Something Bad can start or be presented = Positive Punishment (P+)
Something Bad can end or be taken away = Negative Reinforcement (R-)
| | Reinforcement (behavior increases) | Punishment (behavior decreases) |
| Positive (something added) | Positive Reinforcement: something added increases behavior | Positive Punishment: something added decreases behavior |
| Negative (something removed) | Negative Reinforcement: something removed increases behavior | Negative Punishment: something removed decreases behavior |
Remember that these definitions are based on the consequence's
actual effect on the behavior in question: it must
weaken or strengthen the behavior to be defined
as a punishment or a reinforcement.
Pleasures meant as rewards but that do not strengthen
a behavior are indulgences, not reinforcement; aversives
meant as a behavior weakener but which do not weaken
a behavior are abuse, not punishment.
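The four-way grid and the "actual effect" rule can be written down as a small lookup. This is a minimal sketch for illustration only; the names `Change`, `Effect`, and `classify` are hypothetical and not part of any training vocabulary or library.

```python
from enum import Enum


class Change(Enum):
    """What happened in the animal's environment (hypothetical names)."""
    ADDED = "positive"    # something starts or is presented
    REMOVED = "negative"  # something ends or is taken away


class Effect(Enum):
    """What actually happened to the behavior afterwards."""
    INCREASES = "reinforcement"
    DECREASES = "punishment"
    NO_CHANGE = "none"


def classify(change: Change, effect: Effect) -> str:
    """Name the consequence by its observed effect, not its intent."""
    if effect is Effect.NO_CHANGE:
        # A treat or aversive that does not change the behavior is
        # neither a reinforcer nor a punisher under these definitions.
        return "neither reinforcement nor punishment"
    return f"{change.value} {effect.value}"


if __name__ == "__main__":
    for change in Change:
        for effect in Effect:
            print(f"{change.name:8} {effect.name:10} {classify(change, effect)}")
```

Note that the function takes the observed effect as an input: the trainer's intention never appears, which is the point of the definitions above.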
--------------------------------------------------------------------------------------------------
Copyright Stacy Braslau-Schneck, 1998. Please
feel free to print out and distribute for non-commercial
use with my name and this webpage address on your
print-out www.wagntrain.com.
Reproduced with permission of author on www.Southwestk9services.com
--------------------------------------------------------------------------------------------------