BARKING UP THE WRONG TREE — FOR 110 YEARS?

Dr. Ian Dunbar and Dune the American Bulldog

Edward Lee Thorndike showed that behavior is modified by its consequences and in 1905, he published his Law of Effect, basically stating: Any behavior followed by pleasant consequences will increase in frequency and be more likely to occur in the future, whereas any behavior followed by unpleasant consequences will decrease in frequency and be less likely to occur in the future. The notion of binary feedback is the quintessence of learning theory. The Law of Effect was a wonderful start but as theory was put to practice in education and training, something went very wrong along the way. Over the years, dog training has become overly complicated, time-consuming, technical, mechanical and impersonal — lacking in communication, interaction and relationship. I feel that dog training has lost its way, its voice and its soul. We simply have to get things back on track before dog training … (dare I say it?) … goes to the dogs.

 

I first introduced science-based training techniques to dog trainers way back in 1971. However, over the next decade, I realized that much of laboratory learning theory was either irrelevant or unworkable when people trained animals in real-life situations. Few people have the exquisite timing or tireless consistency of a computer, both of which are beyond essential when using punishment, and even fewer people can compute the variable reinforcement schedules that were most effective for maintaining rates of responding and instilling a stellar work ethic. Consequently, in 1982, I proposed a much simplified practical learning theory that was first published in the original SIRIUS® Manual and later in my second book, How To Teach A New Dog Old Tricks.

In SIRIUS puppy classes, training was conducted off-leash within the play session in order to prevent physical prompts (especially leash tugging and jerking) from becoming a crutch, and an extremely difficult-to-dispense-with crutch at that. The basic training sequence is very simple and comprises: 1. Request - 2. Lure - 3. Response - 4. Reward. The sequence is progressively modified in three stages.

Stage 1. Teaching Dogs WHAT We Want Them To Do — by using food lures to entice the dog to watch the owner’s hand movements (signals) and then phasing out food lures during the very first training session, often on the 9th trial. The foodless handsignal is then used as a lure. Since the verbal request predicts the handsignal, the dog eventually responds after the verbal request but before the handsignal, i.e., the dog has learnt the meaning of the verbal request. Using food lures makes training lightning fast by accelerating the process of teaching handsignals and verbal cues. The sequence becomes: 1. Request - 3. Response - 4. Reward.

Stage 2. Teaching Dogs To WANT To Do What We Want Them To Do: by phasing out food rewards and replacing them with more powerful life rewards, interactive games and cued behavior “problems”. The reintroduction of food rewards was a boon for many novice trainers who had difficulty praising their dogs like they really meant it. Moreover, as an added bonus, Lure/Reward Training does not require consistency, good timing or super-human computing power. Even utterly random reinforcement is highly effective at maintaining high levels of responding. And when owners are a tad tardy in rewarding dogs for a prompt sit, down, or sit-up, they end up effectively reinforcing mini-stays. In fact, teaching random-length stay-delays when randomizing body position changes is an effective training technique in itself. Even so, although classical conditioning food rewards should never be phased out, food rewards for teaching manners were only intended as a temporary training tool and were quickly phased out by asking “more for less”. For example, when all goes according to plan during the first session, the first eight trials produce 38 responses for just 10 food rewards, i.e., already a Response:Reward Ratio of 3.8:1! And of course, using the more effective “life rewards” is the real fun part of training. Once the dog is sufficiently motivated, external rewards are no longer necessary because the dog is self-motivated and internally reinforced and the sequence becomes: 1. Request - 3. Response.

Stage 3. Insisting on Compliance with Instructive Reprimands — only of course, once the dog understands the task at hand (tested by assessing Response Reliability Percentages for each cue) and has been motivated to want to comply in most circumstances. The instructive nature of aversive punishment depends almost entirely on split-second timing and the effectiveness of punishment-training requires 100% consistency. Punishment worked extremely well in laboratory learning theory experiments but not very well in real life, especially with human trainers. Ill-timed or inconsistent aversive feedback makes it difficult for dogs (and horses, students, employees, spouses and children) to learn anything but a dislike for “training” and the “trainer”. Even when aversive punishment does work, the result is woefully insufficient. When dogs are non-compliant, or otherwise misbehave, in addition to 1. Inhibiting undesirable behavior, we also want to 2. Get the dog on track as quickly as possible and 3. Inform the dog of the potential danger of non-compliance. A single spoken word — an instructive reprimand — conveys all three pieces of information, producing high levels of compliance.

Back in the 80s, I thought it was necessary to instructively reprimand in a louder voice. However, we have since discovered that raising the voice is unnecessary to achieve on-demand, high levels of compliance. Exciting stuff! When dogs err, the key is clear instruction and calm insistence. Thus, the Request now becomes a Warning that signals to the dog that you will follow up and gently insist on compliance: 1. Request/Warning - 3. Response - 4. Insistence. No shouting, no fear and no pain. It’s all really so very simple.

Throughout the 90s, dog training was progressing in leaps and bounds. Nearly all puppy training classes, workshops and events, such as the K9 Games, were held off-leash and so the dogs developed brilliant bite inhibition, were well socialized with people and other dogs and were under off-leash verbal control (without the continued need for training aids, such as leashes, collars, halters and harnesses). Food lures and rewards had been used to great effect and phased out so that the dog’s compliance was not dependent on the owner having food in their hand or pocket. All in all, things looked good and augured well for the future of pet dog training. And so what went wrong?

 

  • So many puppy classes and workshops are conducted on leash and in my opinion, today’s dogs are much more reactive around other dogs, more fearful of people and the acquisition of bite inhibition has suffered. (In my Reliability and Games workshops over the past two years, at least a dozen dogs were attacked and four people were bitten.) Also, on-leash training does little to provide owners with off-leash verbal control for dog parks, or at home even.
  • Food lures and food rewards are not being phased out and so food becomes bribes as soon as the dog develops competing doggy interests and is no longer willing to comply.
  • Far too many food rewards are dispensed indiscriminately via rich and relatively ineffective schedules, e.g., continuous reinforcement, that reinforce just as many below-average responses as above-average responses — rates of responding progressively decrease as food loses its reinforcing power and the quality of responses seldom improves.
  • Reward-training has become overly complicated, time-consuming, technical and beyond the capabilities of many owners, who out of frustration turn to different techniques, e.g., on-leash and shock training, which give the illusion of rapid resolution — provided the dog remains within arm’s reach, on-leash or wears a shock collar, of course.
  • As pet dog training became a separate field from obedience/working training, we moved away from ongoing quantification and without a doubt, standards have dropped considerably. Multi-minute stays and off-leash heeling have become somewhat of a rarity in puppy training classes. Lack of quantification and solid data have fueled numerous needless arguments that have virtually cleaved the dog training profession.
  • Perhaps the single biggest detrimental change to dog training is the disappearance of verbal instruction prior to task and verbal feedback.

 

Punishment

I should mention that Thorndike is one of my heroes and his Law of Effect came so very close: behavior is changed by consequences and consequential feedback should be binary. The reward aspect of the Law of Effect is pure in its simplicity and so very effective. Just wait for, or better yet, encourage, the trainee to do something “right” and then immediately praise and reward and the “task” is virtually 95% complete. However, the use of aversive punishment was just so very, absolutely and completely wrong from the outset.

Obviously, it is essential to teach owners how to inhibit and eliminate undesirable and potentially dangerous behaviors, otherwise owners will become frustrated and likely seek help elsewhere. It would be too silly to ignore non-compliance and misbehavior hoping that they will “go away” (extinguish). Nonetheless, is it really necessary to make failure to learn a more unpleasant experience than it already is? Maybe the teacher or the teaching contributed to the slow learning but regardless, who on earth would want to frighten or hurt children, pupils, employees and animals for failing to learn?

Food and shock were used in laboratory training experiments because computers could not explain the task, praise, reprimand, or reinstruct. Food and shock were highly effective in the laboratory because both were administered consistently, with exquisite timing and according to computed reinforcement and punishment schedules. Some reinforcement schedules worked well to increase rates of responding and punishment was extremely effective at inhibiting and eliminating unwanted behavior. However, a computer training caged rats and pigeons is very different from people training animals in real life. For example, shock punishments worked in the laboratory because they were administered immediately and consistently and the animals were caged and could not escape. Consequently, it was assumed that painful punishment was the best solution for unwanted behavior. However, few people have the consistency or timing of a computer to administer punishment effectively, which creates many additional problems. And in real life, the dog may run away, or force the trainer to run away.

Administering aversive punishment without exacerbating problems, or causing others, requires considerable experience and expertise. Therefore, aversive punishment is not a good choice for when people train animals or other people. Most people are pretty inconsistent and don’t have brilliant timing. Whereas inconsistency and lousy timing can actually work quite well with reinforcement, either one destroys any punishment-training program. People simply cannot punish effectively — advertised by the fact that leashes and shock collars often become lifetime management tools. Punishments should decrease in frequency and eventually be eliminated as undesirable behavior is inhibited and eliminated. Continued “punishment” is evidence that it is not working and therefore, by definition, the aversive feedback cannot be defined as punishment.

It is actually a surprisingly simple endeavor to deal effectively with behavior problems and non-compliance without causing fear or pain. Without a doubt, if, a hundred years ago, Thorndike had consulted kindergarten teachers and grandparents rather than dog trainers with regard to how to deal with misbehavior, instead of a Food Reward vs. Physical Punishment binary feedback, we would probably have Praise vs. Specific Redirection, or Reinstruction, i.e., gentle guidance and insistence.

Rather than treating dogs as adversaries in the training arena, we should consider dog training to be on par with teaching a child to read, learning how to tango, or being taught golf. This is education, not war. Without a doubt, adopting non-aversive means for inhibiting undesirable behavior and non-compliance is paramount. Beyond overdue. I am completely serious when I say that dog trainers (and parents) can learn so much from grandparents and kindergarten teachers.

 

Food Lures and Rewards

Bringing back food lures and rewards certainly did wonders for dog training. Food should be mandatory for classical conditioning. (There is no practical alternative to teach dogs to like children, men and strangers.) Food lures make teaching verbal cues lightning fast and food rewards are extremely effective, especially for trainers without great affect.

We have three wonderful reward-based training techniques at our disposal. They have pros and cons in terms of ease and efficiency (speed) but all three are effective and enjoyable for dogs and their owners:

Lure/Reward Training is most certainly, by far, the quickest way to put behaviors on cue and the trainer may (must) teach several behaviors concurrently. Since the behaviors are predictably lured, the trainer may cue the dog beforehand on the very first trial and employ differential reinforcement as early as the second trial. Timing between the verbal cue and the lure is critical; however, consistent reinforcement is not necessary. Food (or toy) lures and rewards are the easiest of training tools to phase out. As with many training techniques, perhaps the biggest drawback of Lure/Reward Training is that it is seldom executed correctly: few owners are instructed to phase out the food lure in the first session and so, food becomes a bribe, which the dog eventually “blows off” and the owners get frustrated.

Shaping (Clicker Training) is the method of choice for fine-tuning behavior and for teaching behaviors that cannot be lured, for example, handstand pirouettes. Shaping is complicated and time-consuming and requires a considerable skill set. Consistency and exquisite timing are absolutely essential. Trainers may only shape one behavior at a time and must first employ a continuous reinforcement schedule to “capture” each successive approximation before eventually putting the final performance on cue. Many trainers experience difficulties phasing out the clicker and food rewards. Many younger trainers may be unaware that shaping took a lot of hurt out of dog training. Shaping provided a fairly fast means (usually quicker than luring) for teaching non-retrieving dogs to reliably retrieve and thus, abolished the barbaric ear-pinch (negative reinforcement) procedure. Additionally, since behaviors are not always lured or prompted, owners have no expectations regarding their dogs’ performance and are simply happy when their dogs do something worth rewarding. Since the dog can never be “wrong”, owners seldom become frustrated. In fact, the hallmark of a Clicker class is exiting owners with smiley faces and dogs with waggy tails.

All-or-None Reward Training is the method of choice for training out-of-control, inattentive, hyperactive adolescent dogs that blow off food bribes — an absolutely brilliant technique for teaching dogs to calm down, sit-stay and focus. Once the dog calms down and focuses, it is easy to revert to Lure/Reward Training but this time, phase out the food lures and rewards. All-or-None Reward Training is time-consuming but quicker than Shaping because when the dog eventually “gets it right”, he gets it right all at once. For example, either the dog is not-sitting, or sitting, either the dog is barking, or quiet, either the dog is looking away, or looking at the trainer. All-or-None Reward Training is by far the easiest of training techniques because consistency, good timing, or a doctorate in learning theory are all unnecessary.

All the above techniques are effective and thoroughly enjoyable for owners and dogs but in terms of ease and efficiency, my Lure/Reward Training is always my Plan A for teaching novice owners how to train puppies and adult dogs. For hyperactive, inattentive dogs that are blowing off food bribes, I would use Plan B: All-or-None Reward Training to gain focus and calmness and then revert back to Plan A. Once the dogs have acquired a few easy-to-lure basics: Come, Sit, Down, Stand, Roll Over, Stay, Follow, Heel, Walk-on-Leash, Quickly, Steady, Speak, Shush, Off, Take it, Thank You, Hug, Fetch (differential retrieves), Go To…(people and places), Bed, Beg, Bow, Bang, Back-up, Creep, Twirl, Dance, etc., I would introduce Plan C: Shaping.

Yes, we should absolutely use food lures and rewards in dog training but phase them out as soon as possible. It has been well over 40 years since I reintroduced the use of food (and toy) lures and rewards and fun and games to dog training and initially, it was a hard sell trying to explain to obedience trainers that luring and rewarding with food was very different from bribing. Ironically, nowadays, I would say that far too many trainers are indiscriminately doling out far too many food treats and not phasing out food lures within the first session and so, as soon as the dog develops competing interests and becomes unwilling to comply with the owner’s wishes, food becomes a bribe, which the dog blows off. And then, many trainers forget the prime principle of training, Thorndike’s Law of Effect, that behavior is changed by its consequences, and instead try to alter the frequency of behaviors by changing antecedents. For example, by increasing the value of food lures, being animated and speaking in a soft squeaky voice to entice dogs to approach, or speaking in a loud, stern voice to make dogs cease and desist. Of course, altering antecedents may cause temporary changes in behavior. The dog may respond now but will be unlikely to do so in the future, especially without the improved food lure, or altered tone/volume of voice. Consequential feedback, however, effects permanent changes in behavior.

Most learning theory experiments focused on maintaining rates of responding, but nearly all of the schedules (continuous, fixed interval, fixed ratio, variable interval, variable ratio and random reinforcement) reward animals irrespective of the quality of the behavior, i.e., the animal is rewarded for just as many below-average responses as above-average responses, which is really pretty silly.

 

Quantification

Reinforcement should reflect the quantitative and qualitative aspects of behavior. The quantitative aspects of behavior are easily measured, for example, speed of recalls and length of stays. Differential reinforcement is the only way to go, i.e., the dog should only be rewarded for responses that meet minimal criteria, better responses earn better rewards and the best responses win jackpots. As a rule of thumb, the dog should be rewarded for no more than a third of all responses. With at least a 3:1 Response:Reward Ratio, not being rewarded is actually as informative and as motivating as being rewarded. (What happens in the four years after coming fourth in the Olympics?) Therefore, trainers need to calculate average responses, e.g., average recall time. Thus, with a DR10.20.30, the dog is only rewarded for recalls at least 10% quicker than average. For recalls at least 20% quicker than average, the dog receives better or more rewards and for recalls at least 30% quicker than average, the dog receives a celebration!
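To make the DR10.20.30 idea concrete, here is a minimal Python sketch of how one recall might be scored against the dog's average. The function name, the tier labels and the reading of "10% quicker" as a recall time at least 10% below the average are assumptions made for illustration only.

```python
from statistics import mean


def reward_tier(recall_time, average_time, thresholds=(0.10, 0.20, 0.30)):
    """Classify one recall under a DR10.20.30-style scheme (illustrative only).

    recall_time and average_time are in seconds. The thresholds are the
    fractions by which a recall must beat the average to earn each tier.
    """
    # Fraction by which this recall was quicker than average (negative if slower).
    improvement = (average_time - recall_time) / average_time
    tiers = ("reward", "better reward", "celebration")
    earned = "no reward"
    # Thresholds ascend, so the last one met decides the tier.
    for threshold, tier in zip(thresholds, tiers):
        if improvement >= threshold:
            earned = tier
    return earned


# Hypothetical recall times (seconds) used to establish the average.
past_recalls = [9.0, 10.0, 11.0]
average = mean(past_recalls)

for time in (9.5, 8.9, 7.9, 6.9):
    print(f"{time:.1f} s -> {reward_tier(time, average)}")
```

With thresholds like these, an average or only slightly better recall earns nothing, which keeps the dog well within the rule of thumb of rewarding no more than a third of responses.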

Quantification is motivational for dogs, owners and trainers. Quantification allows owners and trainers to detect “baby steps”, i.e., early improvement. For example, when a sit-stay improves from 0.2 to 1.2 seconds after just three trials, that’s a colossal sixfold improvement. Just three more equivalent sixfold improvements and the Jack Russell will be sit-staying for four minutes and 19.2 seconds. Not too shabby! And each time the owner and dog surpass a personal best, the trainer may praise the owner and encourage them to praise the dog for a good job well done. Moreover, at the end of each session, the trainer may remind the owner, “Well, this is a lot better than when we started”, meaning, “Aren’t you the lucky one to be training your dog with me rather than another “trainer” who doesn’t fully appreciate a little science”.
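Compounding gains of this kind are easy to tabulate. The sketch below assumes, purely for illustration, that each further improvement multiplies the stay by the same factor of six seen in the jump from 0.2 to 1.2 seconds. The function names are hypothetical.

```python
from fractions import Fraction


def project_stay(start_seconds, factor, steps):
    """Return the stay length (seconds) after each of `steps` further improvements.

    start_seconds is passed as a string so Fraction can hold it exactly.
    """
    current = Fraction(start_seconds)
    lengths = []
    for _ in range(steps):
        current *= factor
        lengths.append(float(current))
    return lengths


def as_minutes(seconds):
    """Format a duration in seconds as minutes and seconds."""
    minutes, remainder = divmod(seconds, 60)
    return f"{int(minutes)} min {remainder:.1f} s"


for step, seconds in enumerate(project_stay("1.2", 6, 4), start=1):
    print(f"after {step} more: {seconds} s ({as_minutes(seconds)})")
# after 1 more: 7.2 s (0 min 7.2 s)
# after 2 more: 43.2 s (0 min 43.2 s)
# after 3 more: 259.2 s (4 min 19.2 s)
# after 4 more: 1555.2 s (25 min 55.2 s)
```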

Quantification prevents needless argument. Debating theoretical issues often consumes valuable training time and at best, much of the argument centers on non-issues and at worst, the rigmarole is akin to a straw man beating a dead dog. For example, the Quadrant. (I do so wish that we hadn’t come up with the Quadrant. The Quadrant was only intended as a memory aid to help decipher the needlessly complicated and ambiguous terminology of positive/negative reinforcement/punishment.) A trainer might insist that they are totally “positive” when they time-out (negatively punish) a “bully” from a play session but interview the dog, and he might say, “But this is positive punishment, she put me in prison.”

Rather than wasting time with banter, badinage, disagreement, argument, or personal slurs about what we think is the best way to train dogs, we should first prove that we are, in fact, training dogs. We must separate what we think or believe from what we know. Behavior and behavior-change (training) is observable and quantifiable and therefore, the facts are irrefutable.

Trainers must not be afraid of numbers. Indeed, trainers need to calculate Time & Trials to Criterion, Response Reliability Percentages, Response:Reward Ratios and especially Response:Punishment Ratios (extra-especially if using aversive punishments). If the number of Responses does not appreciably increase relative to the number of Rewards, then the dog is not learning much but at least he’s having a great time. However, if the number of Punishments does not decrease relative to the number of Responses, then the dog is not learning and he is having an unpleasant time.
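None of these figures needs more than simple division. The following Python sketch shows one way to compute them. The function names are hypothetical, and the definition of Response Reliability Percentage as the share of requests followed by the correct response is an assumption for illustration. The 38 responses for 10 rewards are the first-session figures quoted earlier.

```python
import math


def response_reward_ratio(responses, rewards):
    """Responses obtained per reward given; infinite once rewards are phased out."""
    if rewards == 0:
        return math.inf
    return responses / rewards


def reliability_percentage(correct_responses, requests):
    """Assumed definition: percentage of requests followed by the correct response."""
    return 100.0 * correct_responses / requests


def ratio_is_improving(ratios):
    """True if each session's ratio is higher than the one before it."""
    return all(later > earlier for earlier, later in zip(ratios, ratios[1:]))


# First-session example from the text: 38 responses for 10 food rewards.
print(response_reward_ratio(38, 10))  # 3.8

# Hypothetical session log: (responses, rewards) per session.
sessions = [(38, 10), (45, 9), (60, 8)]
ratios = [response_reward_ratio(r, w) for r, w in sessions]
print(ratio_is_improving(ratios))  # True

# Hypothetical reliability check: 17 correct responses to 20 requests.
print(reliability_percentage(17, 20))  # 85.0
```

The same ratio function can be applied to Responses and Punishments. If that ratio is not rising from session to session, the aversive feedback is not doing its job.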

People may not be as consistent, or have such great timing, or be able to compute like computers BUT people are smarter in different ways. Didn’t we invent computers? People can easily assess qualitative aspects of behavior. People can quantify quality (style, panache, pizazz and cuteness) in an eye-blink and we can give the most sophisticated, differential, consequential, binary feedback, i.e., verbal feedback.

 

Verbal Feedback

In an attempt to emulate computer-generated learning theory, verbal feedback has all but disappeared from dog training and there’s hardly a “thank you”, a celebration, a redirection, or a reinstruction to be heard. Rather than trying to teach owners to emulate computer reinforcement and punishment schedules using quantum rewards and/or punishments (clicks, treats, shouts, jerks and shocks), for which the instructive value absolutely depends on consistency and precision timing (which, of course, most people don’t have), we should be teaching people how to praise and reprimand (gently insist).

The goal of dog training is to produce an internally-reinforced, self-motivated dog that is under reliable verbal control when off-leash, at a distance and distracted, and without the continued need of any training aid. Many of the above constraints on learning would be moot, if we went back to training with verbal feedback and reintroduced some feeling to the science of dog training.

Food and shock were used in laboratory training experiments because computers could not explain the task, praise, reprimand, or reinstruct … BUT WE CAN! Verbal feedback is binary, analogue and instructive — precise, rich, efficient and effective, especially for eliminating misbehavior and lack of compliance.

Dogs need to know whether they got it “right” or “wrong” (binary feedback). When a dog gets it “right”, in addition to learning that she got it “right”, she needs to know how well she did (differential reinforcement). Verbal feedback is effortlessly and naturally analogue: the degree of praise differs to reflect the quality of the behavior.

Moreover, verbal feedback allows us to transcend the laws of laboratory learning theory that dictate: when a dog misbehaves, the consequences must be unpleasant. Instead, when a dog gets it “wrong”, a single spoken word effectively conveys three essential pieces of information: 1. What you’re doing is “wrong”, 2. What you’re doing is potentially dangerous and 3. This is what you should be doing, i.e., verbal feedback is instructive and offers by far the best way to get the dog back on track as quickly as possible without causing fear or pain, in fact, without even raising our voice.

 

This may well be my last year on the seminar circuit. I know I’ll never completely retire from the world of dogs but for 2016, I have only planned two seminars. This year, as a potential last hurrah, I have scheduled three seminar tours for the East Coast (June), Great Britain (July) and the Midwest (September).

 “Barking Up The Wrong Tree” seminars in Ft. Lauderdale FL (6th June), Atlanta GA (13th June), Reading UK (11th July), Iowa City IA (10th September), Minneapolis MN (12th September) and Kansas City MO (16th September)

“Pros & Cons of Five Reward-based Training Techniques” seminars in Reading UK (12th July) and Cardiff UK (26th July)

“Dr. Ian Dunbar UNLEASHED!” seminars in Orlando FL (11th June), Washington DC (20th June), East Hanover NJ (27th June) and Madison WI (26th September). In “UNLEASHED!”, I plan to candidly express what I think is wrong with the world of dogs and suggest how we can best put things right. 

 

 

 
