What is Operant Conditioning?

A look at how operant conditioning works and what this means for us when we train our horses.

"Behavior is a difficult subject matter, not because it is inaccessible, but because it is extremely complex. Since it is a process, rather than a thing, it cannot easily be held still for observation. It is changing, fluid, evanescent, and for this reason it makes great technical demands upon the ingenuity and energy of the scientist."

B. F. Skinner
Source: Cooper, J. O., Heron, T. E., Heward, W. L. (2007). Applied Behavior Analysis. Second Edition.

We all train a bit differently. We have different backgrounds and skillsets, but the general consensus is that good horsemanship is good horsemanship. We consider the horse, what their needs are and what makes them tick, and we do the best we can to work with them. But even the most naturally talented trainers need to understand the basics of behavioral science, starting with the four quadrants of operant conditioning (i.e., the 4Qs). We live and breathe horse behavior, and creating a clearer picture of HOW a horse is learning is paramount.

And although behavior isn’t everything—we understand that our horses have emotions, and an inner world—the more that we understand behavior, the better we can create a framework within which we can effectively communicate with both our horses and our clients. 

So let me get a bit nerdy on you for a moment here. 

You can’t talk about operant conditioning without bringing up B. F. Skinner. He was one of the most influential scientists in the field of behavior, and his research brought into perspective two kinds of behavior: respondent and operant.

Respondent behaviors are reflexive and can be brought out by a stimulus that precedes them—think of a pupil constricting in response to a bright light or Ivan Pavlov’s dogs salivating in response to the sound of a bell.

Operant behaviors, on the other hand, are not elicited or reflexive but are shaped by the consequences of past behavior. The behaviors you see have been selected, shaped, and maintained by what followed them in the past, which makes it more likely that you will see the same behavior (or “behavior group”…I won’t get technical, I won’t get technical, gah!) in the future. A horse who is in an operant state is a thinking horse who thoughtfully responds to cues and aids.

But Skinner did something incredible in the field of behavior: he put the focus on the consequence. Before Skinner, it was all about stimulus and response; after Skinner, the focus shifted to stimulus, response, and consequence.

Reinforcers and Punishers – The 4Qs

What do we mean when we talk about consequences?  Operant conditioning works through reinforcers and punishers.  The proverbial carrot and stick.  The basic concept is simple: the horse will seek out things they do want and avoid things they don’t want.   

Reinforcement has occurred if you see more of the behavior in the future.  Reinforcement can happen in two ways, either something the horse wants is added or increased (food and scratches)—positive reinforcement (R+)—or something they don’t want is removed or decreased (tapping with a whip or a leg aid)—negative reinforcement (R–).

I am required to inform you that “negative” and “positive” in this sense do not denote “good” or “bad” but rather adding or subtracting (or increasing or reducing) a stimulus. Do not, I repeat, do not confuse these with their emotional meanings in our everyday language! Negative reinforcement is often mistaken for something undesirable simply because of the word “negative,” but again, this simply means something was removed in order to strengthen a behavior.

Punishment, on the other hand, occurs when a behavior is followed by a stimulus change that decreases the future frequency of that behavior in similar conditions. As with reinforcement, the modifiers can be positive or negative (i.e., a stimulus is added or removed) and have nothing to do with how desirable or ethical the behavior change procedure is. Positive punishment (P+) refers to the addition of a stimulus to reduce the future frequency of a behavior. Negative punishment (P–) refers to the removal of a stimulus to reduce the future frequency of a behavior.
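
For the fellow nerds, the four quadrants boil down to two yes-or-no questions: was a stimulus added or removed, and did the behavior become more or less frequent afterward? Here is a minimal sketch of that logic in Python. The names Quadrant and classify are made up for this illustration and do not come from any behavior analysis library.

```python
from enum import Enum


class Quadrant(Enum):
    """The four quadrants of operant conditioning (illustrative labels)."""
    POSITIVE_REINFORCEMENT = "R+"
    NEGATIVE_REINFORCEMENT = "R-"
    POSITIVE_PUNISHMENT = "P+"
    NEGATIVE_PUNISHMENT = "P-"


def classify(stimulus_added: bool, behavior_increases: bool) -> Quadrant:
    """Name the quadrant from the two questions in the text.

    stimulus_added: True if a stimulus was added or increased,
                    False if it was removed or decreased.
    behavior_increases: True if the behavior shows up more often in the
                        future, False if it shows up less often.
    """
    if behavior_increases:
        # More of the behavior in the future means reinforcement.
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    # Less of the behavior in the future means punishment.
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# Scratches are added after a halt, and the horse halts more readily next time.
scratches_example = classify(stimulus_added=True, behavior_increases=True)

# Leg pressure is removed when the horse steps forward, and stepping forward increases.
leg_release_example = classify(stimulus_added=False, behavior_increases=True)
```

Notice that the function never asks whether the stimulus is pleasant or whether the method is kind. The quadrant is decided only by what was added or removed and by what the behavior does afterward.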

For a visual, just replace the dogs in the diagram below with horses.

Image Source: sproutsschools.com

So what’s more desirable when we train? Generally, it is better to have a reinforcement mindset rather than a punishment mindset. Focus on what you DO want instead of preventing what you don’t. Does this mean we will never use punishment? No, but we can limit its use to the smallest amount possible.

In general, animals feel better when they are being reinforced. 

Once a behavior has been established with reinforcement, it does not require continuous reinforcement following every repetition.  Behaviors can (and should) be maintained through intermittent reinforcement.  Contrary to what you may have been told, you cannot eliminate all reinforcement and expect the behavior to maintain itself.

If all reinforcement is withheld for a previously reinforced behavior, the principle of extinction will apply. Extinction is sometimes referred to as the fifth quadrant, so if you hear someone say 5Qs, this is what they mean. A horse experiencing extinction will often go through an extinction burst: they will exhibit more of the behavior or show signs of anger and frustration before the behavior goes back to its pre-reinforcement level or goes away altogether. When I suddenly decided my toddler needed to forgo fruit snacks while waiting in the car line with her (yes, I offered a healthy alternative), she was NOT happy and things got far worse before they got better. But extinction is something we generally try to avoid. If we don’t continually move the goalposts, we can avoid setting precedents. It’s just something to be aware of because, no matter what our intentions, it does happen.

Conditioned vs. Unconditioned Reinforcers and Punishers

Primary (unconditioned) reinforcers such as food, water, and sexual stimulation are reinforcers that support the biological maintenance of the animal and survival of the species. For unconditioned reinforcers to be effective, certain conditions need to be in place. For example, not that you should ever starve your horse, but a certain level of food deprivation is necessary for food to function as a reinforcer. (Pro tip: make sure you feed before you train so that you don’t have a horse kite in your training session.) For us humans, if you’ve just eaten a full meal, food may not be that reinforcing to you. However, if it’s just before dinner time, food would be highly reinforcing in that moment. Similarly, an unconditioned punisher (such as pain, bright light, loud noises, or temperature extremes) is a stimulus change that can decrease the future frequency of any behavior that precedes it without any prior pairing with another form of punishment.

Conditioned reinforcers and punishers, on the other hand, function as reinforcers or punishers only because of their prior pairing with unconditioned reinforcers or punishers. Pavlov’s dogs salivated to the sound of the bell, not because the bell itself meant anything to them, but because the sound of the bell had been repeatedly paired with the presentation of food. In Pavlov’s classical conditioning terms, the food was the unconditioned stimulus and the bell became the conditioned stimulus. The same kind of pairing is what turns a neutral signal into a conditioned reinforcer, and this is how clicker training works. The sound of the clicker is not reinforcing in and of itself. Only after it has been repeatedly paired with the presentation of food does the click become a conditioned reinforcer.

It’s important to note that reinforcers and punishers are not fixed objects but are defined in the context of how they change behavior. One thing may be reinforcing in one moment and situation but not in the next. We need to be constantly aware of what is reinforcing or punishing to our horses in each encounter.

Behavior is complex, and often more than one variable will control a response. Sometimes, when we block one behavior, another may appear. This is why emotions are so important in understanding the underlying cause of behavior. The four quadrants of operant conditioning give us the language we need to describe the outward expression of behavior, but we need to take the whole of our environment and the complexity of our horses into account.

The next time you go out to spend time with your horse, ask yourself why they are doing what they are doing.  Instead of getting frustrated with them kicking out at the canter, evading the bit, or diving in on the circle, examine the environment and develop a hypothesis on what might be reinforcing that behavior.  Remember, if the behavior is persisting, something is reinforcing it.
