4.3 Operant Conditioning

5 min read · June 18, 2024

John Mohl

Haseung Jun

Dalia Savy

Operant conditioning is a type of learning in which a behavior is followed by a consequence from the environment, and that consequence changes the likelihood of the behavior happening again.

Image Courtesy of Verywell Mind.

One of the earliest contributors to this aspect of learning was E.L. Thorndike, who found that behaviors that had a favorable outcome became stronger. In contrast, behaviors that had an unfavorable outcome became weaker. He referred to this as his Law of Effect.  

B.F. Skinner took this principle further and described the different types of consequences that can occur and how they affect the behavior. He created what is called the "Skinner box," in which an animal receives food only if it presses a lever or pecks a disk. The food serves as a reinforcer: a consequence that makes the lever press or peck more likely to happen again.

🎥 Watch: AP Psychology - Operant Conditioning with Pigeons

Reinforcement and Punishment

Reinforcing a behavior makes it more likely to occur again. Conversely, punishing a behavior makes it less likely to happen again.

Image Courtesy of Verywell Mind.

Calling a consequence "positive" does not mean it is "good," and calling it "negative" does not mean it is "bad." Instead, "positive" means that a stimulus is added or presented, whereas "negative" means that a stimulus is removed or taken away.

Thus, when a behavior is positively reinforced, something (usually pleasant) is presented to increase the likelihood of the behavior happening again. When a behavior is negatively reinforced, something (usually unpleasant) is taken away to encourage the behavior to happen again.

When a behavior is positively punished, something (usually unpleasant) is presented, making the behavior happen less often. In contrast, when a behavior is negatively punished, something (usually pleasant) is taken away to make the behavior happen less often.

| Operant Conditioning Term | Description | Outcome | Example |
| --- | --- | --- | --- |
| Positive Reinforcement | Add or increase a pleasant stimulus | Behavior is strengthened | You get a cookie for an "A." 🍪 |
| Negative Reinforcement | Reduce or remove an unpleasant stimulus | Behavior is strengthened | Taking painkillers (removes pain); the behavior of taking painkillers is strengthened. |
| Positive Punishment | Add an unpleasant stimulus | Behavior is weakened | Give more homework for misbehavior ✍️ |
| Negative Punishment | Reduce or remove a pleasant stimulus | Behavior is weakened | No phone 📱 after breaking curfew |

Table adapted from Open Source Textbook.
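If it helps to see the four terms as a decision rule, here is a minimal Python sketch. The function name `classify_consequence` and the string labels ("added", "removed", "pleasant", "unpleasant") are made up for illustration and are not standard terminology. The sketch only encodes the two questions described above: was a stimulus added or removed, and was that stimulus pleasant or unpleasant?

```python
from typing import Tuple


def classify_consequence(change: str, quality: str) -> Tuple[str, str]:
    """Name the operant conditioning term and its effect on the behavior.

    change:  "added" or "removed"       (hypothetical labels)
    quality: "pleasant" or "unpleasant" (hypothetical labels)
    """
    if change not in ("added", "removed"):
        raise ValueError("change must be 'added' or 'removed'")
    if quality not in ("pleasant", "unpleasant"):
        raise ValueError("quality must be 'pleasant' or 'unpleasant'")

    # "Positive" = a stimulus is presented; "negative" = a stimulus is taken away.
    sign = "Positive" if change == "added" else "Negative"

    # Behavior is strengthened when something pleasant is added
    # or something unpleasant is removed; otherwise it is weakened.
    strengthens = (change == "added") == (quality == "pleasant")
    kind = "Reinforcement" if strengthens else "Punishment"
    outcome = "strengthened" if strengthens else "weakened"
    return f"{sign} {kind}", outcome


print(classify_consequence("removed", "unpleasant"))
```

The painkiller example maps to `classify_consequence("removed", "unpleasant")`: the pain is removed, so taking painkillers is negatively reinforced and becomes more likely.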

Skinner used positive and negative reinforcement and punishment to train the rats inside the Skinner box. But sometimes, we can't wait forever for the rat to learn. To speed up the process, we can use shaping. Shaping reinforces successive approximations, the small steps that lead toward the end goal. For example, the rat could first be reinforced for touching the lever with any part of its body. Because we're rewarding any behavior close to the behavior we want, we have a greater chance of getting the rat to stumble upon the behavior we want.
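The idea behind shaping can be sketched as a loop that rewards "close enough" attempts and then raises the bar. This is only an illustration, not a model of Skinner's actual procedure: the function `shape`, the use of distance from the lever as the measure of closeness, and all of the numbers are assumptions made up for the example.

```python
from typing import List, Tuple


def shape(attempt_distances: List[float], start_criterion: float,
          target: float, step: float) -> Tuple[List[bool], float]:
    """Reinforce attempts that fall within the current criterion.

    attempt_distances: hypothetical distance (cm) from the lever on each attempt
    start_criterion:   how close counts as "good enough" at the start
    target:            the final criterion (0 means touching the lever)
    step:              how much the criterion tightens after each reward
    """
    criterion = start_criterion
    reinforced_log = []
    for distance in attempt_distances:
        reinforced = distance <= criterion
        reinforced_log.append(reinforced)
        if reinforced:
            # After each reward, demand a closer approximation next time,
            # but never tighten past the target behavior.
            criterion = max(target, criterion - step)
    return reinforced_log, criterion


log, final_criterion = shape([50, 30, 25, 25, 10, 0],
                             start_criterion=40, target=0, step=15)
print(log, final_criterion)
```

In this made-up run, an attempt 25 cm from the lever is rewarded early on but not later, because by then the criterion has tightened.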

There's also something called chaining. In chaining, a well-trained animal learns to perform several tasks in sequence, with the reward coming only at the end. A good example of this would be going through an obstacle course to get the final reward.

The basic conditioning phenomena also describe the process of operant conditioning.

Acquisition: The rat learns to press the lever for food.
Extinction: When pressing the lever no longer produces food, the rat unlearns the connection between the lever and food and stops pressing the lever.
Spontaneous Recovery: After a period of rest following extinction, the rat presses the lever again.
Generalization: The rat presses anything that looks like a lever, thinking it will give it food.
Discrimination: The rat learns to press only a certain lever.

🎥 Watch: AP Psychology - Positive and Negative Punishments

Limitations to Operant Conditioning

Despite strict behaviorists' claims, there are limitations to operant conditioning. When presented with a puzzle 🧩, some organisms can discover the solution without any reinforcement to guide them to it. This phenomenon is known as insight learning. Insight learning is sometimes referred to as the "a-ha moment," when one suddenly realizes the solution to a problem 💡.

Edward Tolman found that rats did not show any noticeable improvement in getting through a maze in the absence of reinforcement. However, once reinforcement was provided, he found a marked decrease in the time needed to finish the maze. This suggests that the rats already knew the solution to the maze but had not expressed it behaviorally, meaning that they had formed a cognitive map of the maze. Tolman called this latent learning ⏳: learning that takes place without reinforcement and stays hidden until there is a reason to show it.

Image Courtesy of Wikimedia Commons.

Not all types of stimuli will necessarily be conditioned with all types of responses. John Garcia found that organisms are predisposed to associate a taste with an internal response more readily than with an external one. For example, nausea 🤢 is more likely to be conditioned to a taste stimulus than to an external stimulus, such as a sound 🔊.

Other research has shown that cognitive interpretations of conditioning also play a role. If a person believes that some other stimulus, rather than the intended one, is responsible for the outcome, then conditioning to the intended stimulus may not occur.

Reinforcement Schedules

The probability of successful operant conditioning depends upon how the reinforcements are presented.

When reinforcement is delivered on a fixed schedule, it occurs in a predictable (but not continuous) pattern. One knows when the next reinforcement will be given, assuming the behavior is performed. When reinforcement is delivered on a variable schedule, it is not predictable, and it is not apparent exactly when the next reinforcement will occur.

When reinforcement is given on an interval schedule, a certain amount of time must pass before reinforcement is given, assuming the behavior is performed. When reinforcement is given on a ratio schedule, a certain number of behaviors must be performed before the reinforcement is provided.

Combining these two distinctions (fixed or variable, interval or ratio) gives four different schedules of reinforcement.

| Reinforcement Schedule | Explanation | Real World Example |
| --- | --- | --- |
| Fixed Ratio | Rewarded after a specific number of responses #️⃣ | You get paid $100 after writing two columns. |
| Variable Ratio | Rewarded after an average but unpredictable number of responses | Put money in a slot machine. It pays out after several plays, but the player is uncertain of the number because it varies. |
| Fixed Interval | Rewarded after a set amount of time has elapsed 📅 | People who earn a monthly salary |
| Variable Interval | Rewarded after an average but unpredictable amount of time has elapsed | Person checks email messages and is rewarded with a message at varying times. |

Table adapted from Open Source Textbook.
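One way to keep the four schedules straight is to simulate them. The Python sketch below is a rough illustration under stated assumptions, not a standard model: the function `simulate_schedule` and its parameters are hypothetical, variable requirements are drawn from simple random ranges chosen so that they average out to `value`, and an interval schedule is assumed to reward the first response made after the required time has passed.

```python
import random
from typing import List, Optional


def simulate_schedule(kind: str, basis: str, value: float,
                      response_times: List[float],
                      rng: Optional[random.Random] = None) -> List[bool]:
    """Return one bool per response: True if that response was reinforced.

    kind:  "fixed" or "variable"
    basis: "ratio" (count responses) or "interval" (measure elapsed time)
    value: responses (ratio, whole number >= 1) or time units (interval)
    """
    if kind not in ("fixed", "variable") or basis not in ("ratio", "interval"):
        raise ValueError("kind must be fixed/variable, basis ratio/interval")
    if value <= 0:
        raise ValueError("value must be positive")
    rng = rng or random.Random(0)  # seeded so runs are repeatable

    def next_requirement() -> float:
        if kind == "fixed":
            return value
        if basis == "ratio":
            # Assumed range: whole numbers averaging `value`.
            return rng.randint(1, 2 * int(value) - 1)
        # Assumed range: waiting times averaging `value`.
        return rng.uniform(0, 2 * value)

    requirement = next_requirement()
    responses_since_reward = 0
    last_reward_time = 0.0
    outcomes = []
    for t in response_times:
        responses_since_reward += 1
        if basis == "ratio":
            earned = responses_since_reward >= requirement
        else:
            earned = (t - last_reward_time) >= requirement
        if earned:
            responses_since_reward = 0
            last_reward_time = t
            requirement = next_requirement()
        outcomes.append(earned)
    return outcomes


# Fixed ratio of 2: every second response is reinforced.
print(simulate_schedule("fixed", "ratio", 2, [1, 2, 3, 4, 5]))
```

On the fixed schedules the pattern of rewards is the same every time, while on the variable schedules it depends on the random draws, which is the unpredictability the slot machine example relies on.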

🎥 Watch: AP Psychology - Operant Conditioning
