2. Conditioning and Learning
Conditioning is the process by which an activity
originates or is changed through reacting to an
encountered situation, provided that the change in
activity cannot be explained on the basis of native
tendencies, maturation, or temporary states. Hilgard
(1956)
Conditioning is the learning of relations
among events so as to allow the
organism to represent its environment.
Rescorla (1988)
S-R
S-S
3. WHAT PRODUCES
CONDITIONING: CONTIGUITY OR
CONTINGENCY?
Power of reinforcement to shape and
sustain operant behavior is pervasive.
So, too, is its potential as a tool in a
wide range of practical applications.
But, what is it about response-reinforcer
relation that produces conditioning?
5. WHAT PRODUCES
CONDITIONING: CONTIGUITY OR
CONTINGENCY?
Two answers have been offered:
Contiguity
– Temporal proximity between response and
reinforcer
Contingency
– Probabilistic relation between reinforcement and
responding versus not responding
Temporal relationship: temporal contiguity refers
to the delivery of the reinforcer immediately after the
response.
Causal relationship: response-reinforcer contingency
refers to the extent to which the response is necessary
and sufficient for the occurrence of the reinforcer.
6. Evidence for Contiguity
Generally, as delay between response
and reinforcer increases, rate of
operant responding decreases.
Temporal contiguity is thus necessary
for conditioning to occur.
But, is temporal contiguity also
sufficient?
Can’t tell because most studies require
responses to produce reinforcers.
9. Evidence for Contiguity
Skinner’s 1948
superstition project:
Studied 8 hungry
pigeons.
Food given every 15 sec
regardless of pigeon’s
behavior.
6/8 pigeons performed
idiosyncratic patterns of
unnecessary behavior.
Responding rose as
time to food
approached.
Why did all of this
happen?
10. Evidence for Contiguity
Food happened to follow something
each pigeon was doing.
Different behaviors were strengthened
for different pigeons.
The higher the rate of response, the more
likely food would again follow the response.
Responding rose as time to food
neared because the probability of a response-food
pairing rose the longer the time since
last food.
11. Evidence for Contiguity
Skinner thus concluded that necessary
and sufficient condition for operant
conditioning was that a reinforcer
closely follow a response.
Why is response-reinforcer contingency
so effective?
It guarantees response-reinforcer
contiguity.
12. Evidence for Contiguity
Skinner’s results and conclusions have
been questioned.
His results may be difficult to replicate.
His conclusions may not be general.
But, beyond superstition experiment,
there may be good evidence to support
importance of response-reinforcer
contiguity.
13. What leads to conditioning?
Contiguity
– Stimuli that are close
to one another in
time and in space
become associated
Co-occurrence
– Proximity critical
Contingency
– When one stimulus
depends on the
other, they will
become associated
Information
– Predictive value
critical
14. Evidence for Contiguity
Thomas (1981) study on contiguity-
promoting schedules.
P(Food|Press) = P(Food|No Press)
15. [Figure: timeline of two 20 s trials (Trial 1, Trial 2). No response: reward (S*) is delivered in each trial. Subject responds: each bar press (R) is followed by reward (S*).]
17. Evidence for Contiguity
Thomas (1981) study on contiguity-
promoting schedules.
P(Food|Press) = P(Food|No Press)
No press-food contingency.
But, response-food contiguity was
promoted by novel schedule.
So too was lever pressing: rats
increased lever pressing.
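A minimal sketch of how a schedule can keep P(Food|Press) equal to P(Food|No Press) while still promoting contiguity. It assumes, as one reading of the schedule, that every 20 s trial delivers exactly one reward: at the end of the trial if the rat does not press, and immediately after the first press if it does. The function name and this timing rule are illustrative assumptions, not details stated in the slides.

```python
from typing import Optional

TRIAL_LENGTH_S = 20.0  # trial length taken from the slide's "20s" label


def food_time(first_press_s: Optional[float]) -> float:
    """Return when food arrives within one trial (seconds from trial start).

    Assumed rule (hypothetical): a press brings the single reward forward
    to the moment of the press; otherwise it arrives at the end of the trial.
    """
    if first_press_s is None:
        return TRIAL_LENGTH_S
    if not 0.0 <= first_press_s <= TRIAL_LENGTH_S:
        raise ValueError("press must fall inside the trial")
    return first_press_s


# One reward per trial either way, so P(Food|Press) = P(Food|No Press) = 1.
# Only the response-reward delay differs: it is 0 s after a press.
no_press = food_time(None)
press_at_5 = food_time(5.0)
print(no_press, press_at_5)
```

Under these assumptions pressing cannot change how much food is earned, only how closely food follows the response.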
18. Evidence for Contiguity
Extra wrinkle of Thomas (1981) study:
P(Food|Press) < P(Food|No Press)
Thus, negative press-food contingency.
19. [Figure: timeline of four 20 s trials (Trial 1 to Trial 4). No response: reward (S*) is delivered. Subject responds: a bar press (R) is followed by reward (S*), but a rewarded response causes the next 20 s trial to be unrewarded (NO S*).]
21. Evidence for Contiguity
Extra wrinkle of Thomas (1981) study:
P(Food|Press) < P(Food|No Press)
Thus, negative press-food contingency.
Response-food contiguity was still promoted
by second schedule.
So too was lever pressing: rats increased
lever pressing.
The power of contiguity is very strong; it can even
override the effects of contingency.
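A rough sketch of why the second schedule is a negative contingency. It assumes every trial is rewarded unless the previous trial contained a rewarded press, and that a press on an unrewarded trial has no further effect. The function name and these rules are a hypothetical reading of the slide, not a description of the published procedure.

```python
from typing import List


def food_by_trial(pressed: List[bool]) -> List[bool]:
    """Return, for each trial, whether food was delivered.

    Assumed rule (hypothetical): a trial is rewarded unless the previous
    trial contained a rewarded press, in which case it is unrewarded
    whatever the rat does.
    """
    food = []
    cancel_next = False
    for press in pressed:
        if cancel_next:
            food.append(False)
            cancel_next = False  # a press on an unrewarded trial cancels nothing
        else:
            food.append(True)
            cancel_next = press  # a rewarded press cancels the next reward
    return food


never = food_by_trial([False, False, False, False])
always = food_by_trial([True, True, True, True])
print(sum(never), sum(always))  # rewards earned over four trials
```

Under these assumptions, pressing on every trial earns 2 rewards over four trials against 4 for never pressing, even though each rewarded press is followed immediately by food.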
24. CONTINGENCY LEARNING
Attempts to assess contingency
learning in operant conditioning parallel
studies in Pavlovian conditioning.
Operant studies suggest that organisms
can distinguish dependence from
independence between response and
reinforcer.
Cause and effect
28. Contiguity without Contingency
                      airplane (S2)   no plane (no S2)
bird (S1)             a = 10          b = 20
no bird (no S1)       c = 20          d = 40
Cell a: bird and plane are paired.
A quick test for contingency:
a·d > c·b: positive contingency
a·d = c·b: zero contingency
a·d < c·b: negative contingency
prob.(bird | plane) = 10/30 = .33
prob.(bird | no plane) = 20/60 = .33
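The quick test can be written out as a short sketch. The cell layout is an assumption read off the slide: a = bird and plane, b = bird and no plane, c = no bird and plane, d = no bird and no plane. The function name is illustrative.

```python
def contingency_sign(a: int, b: int, c: int, d: int) -> str:
    """Classify a 2x2 table by comparing the cross-products a*d and c*b."""
    if a * d > c * b:
        return "positive"
    if a * d < c * b:
        return "negative"
    return "zero"


# Bird/plane counts from the slide: a=10, b=20, c=20, d=40.
a, b, c, d = 10, 20, 20, 40
print(contingency_sign(a, b, c, d))

# Conditional probabilities: P(bird | plane) and P(bird | no plane).
print(round(a / (a + c), 2), round(b / (b + d), 2))
```

Both cross-products equal 400 here, so birds and planes are paired 10 times yet the contingency is zero.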
29. You can have a positive contingency even when
pairing is the least frequent possibility
Example: can you learn that [picture] and “cat” are associated?
                      hear “cat”   no “cat”   total
see [picture]         100          900        1,000
no [picture]          200          9,800      10,000
p(“cat” | see [picture]) = .10
p(“cat” | no [picture]) = .02
positive contingency
Learning:
Seeking cause
and effect
relationships
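As a worked check of the “cat” example, the sketch below recomputes the two conditional probabilities from the slide's counts and takes their difference. That difference is often written ΔP; the slides do not use the name, and the helper name is illustrative.

```python
def p_hear_given(hear: int, no_hear: int) -> float:
    """Probability of hearing "cat" within one row of the table."""
    return hear / (hear + no_hear)


# Counts from the slide: (hear "cat", no "cat") for each row.
see_row = (100, 900)
no_see_row = (200, 9_800)

p_see = p_hear_given(*see_row)        # p("cat" | see)
p_no_see = p_hear_given(*no_see_row)  # p("cat" | no see)
delta_p = p_see - p_no_see

print(round(p_see, 2), round(p_no_see, 2), round(delta_p, 2))
```

The paired cell (100) is the smallest of the four counts, yet the difference is positive because hearing “cat” is five times more likely when the picture is seen than when it is not.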
30. CONTINGENCY LEARNING
Infants given a positive contingency procedure, in which
head turning moved a mobile (Watson, 1967):
– Infants’ head turning increased, plus they
smiled when mobile moved
Infants put on zero contingency
procedure:
– Infants’ head turning did not increase, plus
they stopped smiling when mobile moved
32. CONTINGENCY LEARNING
Infants discriminate response-
dependent from response-independent
reinforcement: shown by head turning.
Infants differentially enjoy response-
dependent and response-independent
reinforcers: shown by smiling and
cooing.
Both cognition and affect may be
changed by control by consequence.
Cause and effect
34. Learned Helplessness (Seligman)
Phase I - Learning to Escape
[Figure: Control Dogs and Yoked Dogs; labels: shock, panel]
• A long-lasting shock is given to both groups
every once in a while
• Control dogs can turn shock off by pushing a
panel
• Yoked dogs’ shock turns off too, when control
dog pushes panel
• Yoked dogs can do nothing themselves to escape
shock
35. Contiguity or Contingency?
Spot
Periodically shocked
Can terminate shock
by pressing lever
with his nose
Lassie
Periodically shocked
Has no control over
shocks, but when
Spot’s shock is
terminated, so is
Lassie’s
36. Phase 2 - Avoidance Learning
• Shock delivered to one side of box
• If dog jumps hurdle to other side
there is no shock
Control dogs learn to avoid shock
Yoked dogs don’t
Yoked dogs have learned that they can’t stop shock
They have learned to be helpless
[Figure label: hurdle]
37. Learned Helplessness
Yoked dog seems to have learned that
its behavior does not matter:
– It not only fails to learn
– It stops reacting to shock
Phenomenon of learned helplessness
strongly suggests that organisms can
discriminate response-dependent from
response-independent events.
38. Learned Helplessness
Animals must learn to jump
barrier to avoid shock
Results
– Spot learns, Lassie yelps
but eventually becomes
passive and accepts shocks
Contingency
– Spot learns his actions
matter
– Lassie learned that it was
helpless
Contiguity
– Spot learned to press lever
– Lassie learned to act
passively
39. Seligman’s Learned Helplessness Study
Two groups of dogs are exposed to
shock
– control group could escape shock
– “no escape” group could NOT escape
shock
Later, when escape was possible, “no
escape” dogs didn’t even try
Learned that they had NO CONTROL
40. OPERANT CONDITIONING:
WHAT IS LEARNED?
In any operant conditioning study, three
events need to be considered:
– Response (R)
– Reinforcer or punisher it produces (O*, also written S*)
– Stimulus situation in which response
occurs (S)
The three occur in an S-R-S* sequence.
What associations among the three
elements are formed when an animal
learns to make an operant response?
41. OPERANT CONDITIONING:
WHAT IS LEARNED?
R-O* association
Seems to require foresight: acting in
accord with future consequences.
Thorndike famously denied that
animals know what the consequences of
their behavior will be.
The law of effect thus emphasized past
consequences.
43. OPERANT CONDITIONING:
WHAT IS LEARNED?
S+-R association
Thorndike’s idea
Situation evokes behavior (S-R).
Reinforcers strengthen S-R bond.
S+ becomes more likely to evoke R.
45. OPERANT CONDITIONING:
WHAT IS LEARNED?
Two-process theory:
– S-R association (operant)
– S-O* association (Pavlovian)
Sight of lever not only triggers lever
pressing, but it also makes animal
“think” about upcoming food.
Anticipation of reinforcer motivates
operant response.
46. OPERANT CONDITIONING:
R-S* Learning
Strongest evidence comes from studies
using devaluation procedure (Colwill
& Rescorla, 1985).
Chain Pull→Sugar Water
Lever Press→Food Pellet
Food Pellet→Illness (Devaluation)
Choice: chain pull versus lever press
Rats pull chain much more than press
lever.
47. R-O* association
Colwill & Rescorla (1985)
Training     Devaluation     Test
R1 → O1      O1 → LiCl       R1 and R2
R2 → O2      O2 → nothing
[Figure: mean responses per minute (axis ticks 1 to 7) over time for R1 (outcome was devalued) and R2 (outcome not devalued).]
48. OPERANT CONDITIONING:
R-O* Learning
Association of food with illness does not
change stimulus aspects of situation
that might generate responses.
Lever press does not occur because it
is associated with chamber (S-R), but
because it is associated with reinforcer
(R-S*).
When value of reinforcer is eliminated,
so too is impetus for response.
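The logic of the devaluation test can be sketched as two competing predictions. This is a toy illustration, not a model from the study: the numeric values, the equal habit strength, and the function name are all assumptions chosen to show the contrast.

```python
# Response-outcome pairings from the training phase described in the slides.
RESPONSE_OUTCOME = {"lever press": "food pellet", "chain pull": "sugar water"}


def predicted_tendency(account: str, outcome_value: dict) -> dict:
    """Hypothetical response tendencies (arbitrary units) at test.

    "S-R": responding depends only on the trained habit; outcome value is ignored.
    "R-O": responding scales with the current value of the outcome it produced.
    """
    habit_strength = 1.0  # assumed equal for both responses
    if account == "S-R":
        return {r: habit_strength for r in RESPONSE_OUTCOME}
    if account == "R-O":
        return {r: habit_strength * outcome_value[o]
                for r, o in RESPONSE_OUTCOME.items()}
    raise ValueError(f"unknown account: {account}")


# Devaluation: food pellets were paired with illness; sugar water was not.
value_after_devaluation = {"food pellet": 0.0, "sugar water": 1.0}
for account in ("S-R", "R-O"):
    print(account, predicted_tendency(account, value_after_devaluation))
```

Only the R-O account predicts more chain pulling than lever pressing after food pellets are devalued, which is the pattern the slides report.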
49. OPERANT CONDITIONING:
R-O* Learning
Operant conditioning involves learning
to expect responses to produce reward.
Rats not only expect reward, but a
particular kind of reward.
Devaluation procedure could not work
unless rats had specifically
remembered that one response
produced food pellets and other
produced sugar water.
51. OPERANT CONDITIONING:
S-O* Learning
Rats trained to panel press.
A light or noise was always present.
S1 signaled sugar water and S2 signaled food pellets.
Lever press produced sugar water and chain
pull produced food pellets.
S1 increased lever pressing, but not
chain pulling.
S2 increased chain pulling, but not lever
pressing.
53. OPERANT CONDITIONING:
S-O* Learning
For rat to show these selective
increases in responding, it must have
learned which stimulus was associated
with which reward.
Therefore, this study provides evidence
of S-O* associations in operant
conditioning (Colwill & Rescorla, 1988).
54. S-O association
Colwill & Rescorla (1988)
Sd training      Response training   Test
S1: R1 → O1      R3 → O1             S1: R3 vs R4
S2: R2 → O2      R4 → O2             S2: R3 vs R4
[Figure: mean responses per minute (axis ticks 2 to 10) across trials for same-outcome and different-outcome responses.]
55. OPERANT CONDITIONING:
S-R Learning
Devaluation studies find a reduction in
response that leads to devalued
reward.
But, response is rarely eliminated.
Residual responding may represent
behavior triggered by stimulus situation
in which responding was rewarded.
56. OPERANT CONDITIONING:
WHAT IS LEARNED?
Summary statement:
Research suggests organisms learn
associations between response and
reinforcer (R-O*), environmental stimuli
and reinforcer (S-O*), and stimuli and
response (S-R).
The “simple” process of operant
conditioning is not so simple after all.
Editor's Notes
Definition 1: the animal is passive and learning equals behavior; it is reflexive (S-R learning comes later in class). Do you learn automatically? Definition 2: does not mention behavior. Can you learn something that does not show up in behavior? It views the animal as an information processor. S-S (Tolman) vs. S-R (Hull).
E.g., Colwill & Rescorla (1985): pressing a lever allowed rats to obtain food pellets; pulling a chain gave access to sugar water. One group was given free access to food pellets and then made ill. When both the lever and chain were present (without reinforcement), this group made few lever presses but continued to pull the chain. Illness after drinking sugar water had the opposite effect. Operant responses are produced because they are associated with their consequences.