Operant Conditioning, Reinforcement, Reward Punishment, Skinner Box


Operant conditioning, sometimes referred to as instrumental conditioning, is a method of learning that employs rewards and punishments for behavior. It creates an association between a behavior and its consequence.
History
The operant conditioning theory of learning was first described by behaviorist B.F. Skinner. He believed that it was not really necessary to look at internal thoughts and motivations in order to explain behavior. Instead, we should look only at the external, observable causes of human behavior.
Skinner used the term operant to refer to any "active behavior that operates upon the environment to generate consequences." Skinner's theory explained how we acquire the range of learned behaviors we exhibit every day.
His theory was heavily influenced by the work of psychologist Edward Thorndike, who had proposed what he called the law of effect. According to this principle, actions that are followed by desirable outcomes are more likely to be repeated, while those followed by undesirable outcomes are less likely to be repeated.
Operant conditioning relies on a fairly simple premise: actions that are followed by reinforcement will be strengthened and more likely to occur again in the future. If you tell a funny story in class and everybody laughs, you are more likely to tell that story again.
Conversely, actions that result in punishment or undesirable consequences will be weakened and less likely to occur again in the future. If you tell the same story in another class but nobody laughs this time, you will be less likely to repeat it.
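The premise above can be illustrated with a toy calculation. The sketch below is only an illustration: the linear update rule, the `step` size and the name `update_tendency` are assumptions made for this example, not a formula given by Skinner or Thorndike.

```python
def update_tendency(tendency, outcome, step=0.2):
    """Return a new response tendency (a value between 0 and 1).

    A desirable outcome moves the tendency toward 1 (behavior strengthened);
    an undesirable outcome moves it toward 0 (behavior weakened).
    The linear rule and the step size are illustrative assumptions.
    """
    if outcome == "desirable":
        return tendency + step * (1.0 - tendency)
    if outcome == "undesirable":
        return tendency - step * tendency
    raise ValueError("outcome must be 'desirable' or 'undesirable'")


# The funny-story example: the first class laughs, the second does not.
start = 0.5
after_laughter = update_tendency(start, "desirable")           # about 0.6
after_silence = update_tendency(after_laughter, "undesirable")  # about 0.48
print(after_laughter, after_silence)
```

The point of the sketch is only the direction of change: the tendency to tell the story rises after laughter and falls after silence.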
Two Types of Behaviors
Respondent behaviors are those that occur automatically as a reflex, such as pulling your hand back from a hot stove or jerking your leg when the doctor taps on your knee. You don't have to learn them.
Operant behaviors, on the other hand, are those under our conscious control. Some may occur spontaneously and others purposely, but in both cases their consequences influence whether or not they occur again in the future.
Our actions in the environment and the consequences of these actions make up an important part of the learning process.
Classical conditioning accounts only for respondent behaviors, so it leaves a great deal of learning unexplained.
Devices
Skinner invented various devices during his boyhood, and he put these skills to work during his studies on operant conditioning. He created a device known as an operant conditioning chamber, often referred to today as a Skinner box. The chamber could hold a small animal, such as a rat or pigeon. The box also contained a bar or key that the animal could press in order to receive a reward.
In order to track responses, Skinner also developed a device known as a cumulative recorder. The device recorded responses as an upward movement of a line so that response rates could be read by looking at the slope of the line.
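The idea of reading a response rate from the slope of the line can be sketched in a few lines of Python. The per-minute counts and the function names below are hypothetical and chosen only for illustration; they are not data from Skinner's experiments.

```python
from itertools import accumulate


def cumulative_record(responses_per_minute):
    """Turn per-minute response counts into a running total (the recorded line)."""
    return list(accumulate(responses_per_minute))


def response_rate(record, start, end):
    """Average slope of the record between two minute indices (responses per minute)."""
    return (record[end] - record[start]) / (end - start)


# Hypothetical session: slow responding at first, then faster responding.
counts = [2, 2, 2, 8, 8, 8]
record = cumulative_record(counts)           # [2, 4, 6, 14, 22, 30]
slow_rate = response_rate(record, 0, 2)      # 2.0 responses per minute
fast_rate = response_rate(record, 2, 5)      # 8.0 responses per minute
print(record, slow_rate, fast_rate)
```

A steeper stretch of the line corresponds to a higher response rate, and a flat stretch means the animal has stopped responding.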
Components of Operant Conditioning
There are five key concepts in operant conditioning.
1. Positive reinforcers are favorable events or outcomes that are presented after the behavior.
2. Negative reinforcers involve the removal of an unfavorable event or outcome after the display of a behavior.
3. Positive punishment, also known as punishment by application, presents an unfavorable event or outcome in order to weaken the response it follows.
4. Negative punishment, also known as punishment by removal, occurs when a favorable event or outcome is removed after a behavior occurs.
5. Extinction is the fading and disappearance of a behavior once it is no longer reinforced.
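The first four concepts differ along two dimensions: whether the event is favorable or unfavorable, and whether it is presented or removed. A minimal sketch of that two-by-two classification follows; the function name and the string labels are assumptions made for this example.

```python
def classify_consequence(event, change):
    """Name the kind of consequence.

    event:  'favorable' or 'unfavorable'
    change: 'presented' or 'removed' (what happens to the event after the behavior)
    """
    table = {
        ("favorable", "presented"): "positive reinforcement",
        ("unfavorable", "removed"): "negative reinforcement",
        ("unfavorable", "presented"): "positive punishment",
        ("favorable", "removed"): "negative punishment",
    }
    return table[(event, change)]


# A reward is given after the behavior.
print(classify_consequence("favorable", "presented"))
# Something pleasant is taken away after the behavior.
print(classify_consequence("favorable", "removed"))
```

In this scheme "positive" and "negative" describe whether something is added or taken away, not whether the outcome is good or bad; reinforcement strengthens a behavior and punishment weakens it.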
Reinforcement Schedules
Continuous reinforcement involves delivering reinforcement every time a response occurs. Learning tends to occur relatively quickly, yet the response rate is quite low. Extinction also occurs very quickly once reinforcement is halted.
Fixed-ratio schedules are a type of partial reinforcement. Responses are reinforced only after a specific number of occurrences.
Fixed-interval schedules are another form of partial reinforcement. Reinforcement occurs only after a certain interval of time has elapsed. Response rates remain fairly steady and start to increase as the reinforcement time draws near, but slow immediately after the reinforcement has been delivered.
Variable-ratio schedules are also a type of partial reinforcement; they involve reinforcing behavior after a varied number of responses. This leads to both a high response rate and a slow extinction rate.
Variable-interval schedules are the final form of partial reinforcement that Skinner described. This schedule involves delivering reinforcement after a variable amount of time has elapsed. It tends to produce a steady, moderate response rate and a slow extinction rate.
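The four partial reinforcement schedules described above differ only in the rule that decides whether a given response is reinforced. The sketch below expresses each rule as a small Python function. All names, the uniform random ranges used for the variable schedules, and the time units are assumptions made for illustration; this is not a model of how animals actually respond.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after a random number of responses that averages mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)
    def respond(_time):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond


def fixed_interval(seconds):
    """Reinforce the first response made after `seconds` have elapsed."""
    last = 0.0
    def respond(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False
    return respond


def variable_interval(mean_seconds, rng):
    """Reinforce the first response made after a random wait averaging mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)
    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return respond


# One response per second; each call returns True if that response is reinforced.
fr3 = fixed_ratio(3)
fr_outcomes = [fr3(t) for t in range(1, 10)]    # every third response is reinforced
fi5 = fixed_interval(5)
fi_outcomes = [fi5(t) for t in range(1, 11)]    # reinforced at t=5 and t=10

vr4 = variable_ratio(4, random.Random(0))
vr_total = sum(vr4(t) for t in range(1, 1001))
vi5 = variable_interval(5, random.Random(0))
vi_total = sum(vi5(t) for t in range(1, 1001))
print(fr_outcomes, fi_outcomes, vr_total, vi_total)
```

Under the ratio schedules the learner controls how soon the next reinforcement arrives by responding faster, whereas under the interval schedules extra responses before the interval has elapsed earn nothing.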