Puppy Testing


Many breeders and trainers employ aptitude/temperament tests for the evaluation of litters, or to determine the tractability and trainability of individual puppy prospects. The several tests in existence are basically variations on the same theme, and the prototypical test comprises evaluations of: eagerness to approach and follow (both assessments of social attraction towards people); response to handling and restraint; activity level; and reactivity to sound, sudden movement and other physical and social stimuli. However, the maximal benefits of testing are seldom realized because test scores are not validated, test results are frequently misinterpreted, and tests are rarely used in the intended manner. Understanding the limitations of aptitude tests considerably enhances their usefulness for the evaluation and prediction of behavior, temperament and above all, training.

Validity of Test Scores  

It is highly unlikely that the results of a one-time, standardized test offer a valid representation of a puppy's overall behavior and temperament. A single test indicates how the pup fared on a certain day, with a particular tester, under strictly monitored conditions. This may, or may not, tell us how the pup might fare on the same test with the same tester later the same day. Similarly, it is unknown how the pup might fare with different testers and different tests under a variety of different test conditions.

Just like people, dogs have likes and dislikes, and they have good days and bad days. Perhaps on the test day the dog was just too tired, or too hot, or perhaps it had just had a disagreement with another animal, or perhaps some silly person had recently stepped on its tail, or frightened the pup by dropping a pile of feeding pans, or perhaps the dog was just having a bad-hair day. On the other hand, perhaps the dog had been isolated or confined for a lengthy period immediately prior to testing and so appeared friendlier and more sociable than usual. Or, the dog may have been isolated for several weeks and therefore appeared to be more asocial than it might have acted if given the benefit of an enriched (i.e., normal) social environment. Perhaps the dog did not like the tester. Perhaps the puppies had never before encountered a man with a beard wearing dark glasses. Perhaps the tester didn't like the dog, or was not overly enamored of the breed. There are so many unknown variables with a one-time test. Retesting the dog is the only way to determine which variables affected the dog's performance in the test.

To establish the validity of test scores, the pup must be retested a number of times. By all means test at 49 days of age, but also retest later the same day and on subsequent days to ascertain inter-test reliability. Similar scores over successive tests with different testers in a variety of locations and under a number of different test conditions demonstrate that the test results are valid, and that it is unlikely that the tester, the time, or the place has confounded the results. The greater the number of tests and the better the correlation between scores, the more reliable the evaluation.
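For a breeder who records scores on paper or in a spreadsheet, the correlation between sessions can be worked out with a few lines of code. The following is only a minimal sketch: it assumes each test session yields one numeric score per pup, uses the ordinary Pearson correlation (one reasonable choice, not one prescribed by any particular test), and the function names and the sample numbers are invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each session")
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(sessions):
    """Average Pearson r over every pair of test sessions.

    sessions: list of lists; each inner list holds one score per pup,
    with the pups always listed in the same order.
    """
    if len(sessions) < 2:
        raise ValueError("need at least two sessions to compare")
    rs = [
        pearson(sessions[i], sessions[j])
        for i in range(len(sessions))
        for j in range(i + 1, len(sessions))
    ]
    return sum(rs) / len(rs)


if __name__ == "__main__":
    # Placeholder scores for four pups over three sessions (not real data).
    sessions = [
        [3, 5, 2, 4],
        [3, 4, 2, 5],
        [4, 5, 1, 4],
    ]
    print(round(mean_pairwise_correlation(sessions), 2))
```

A value near 1 suggests the pups score consistently from session to session; a value near 0 suggests the tester, the time, or the place may be driving the scores.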

Interpretation of Test Scores

Even provided good inter-test reliability and a high validity of scores, test results are frequently misinterpreted, such that the tester's conclusions present an erroneous evaluation of the puppy's temperament. To illustrate, consider a visitation test in which three pups responded as follows: Pup#1 charged out of his cage, jumped-up and munched the tester's hand; Pup#2 approached enthusiastically with tail-a-wag, licked the tester's hand and then sat and gazed into his eyes; and Pup#3 remained cowering in his kennel. Usually, Pup#2 is described as highly socialized and trainable — the ideal pet, but Pups #1 and #3 are often deemed unsuitable as pets; Pup#1 because he is overly aggressive and difficult to train, and Pup#3 because he is under-socialized and fearful.

It is presumptuous to assume Pup#3 would make an unsuitable pet. As is, the shy-little (or shy-big) critter may make an ideal companion for an elderly person living alone. But why not just eliminate the shyness altogether? There is no reason why the pup's sensitivity should be allowed to develop into a fulminating fearfulness. Sensitivity is an extremely desirable trait, especially for obedience and working dogs. By all means maintain the pup's sensitivity, which makes for easy training but also, build up the pup's confidence before the pup's potential timidity and fearfulness effectively destroy his worth as a competition dog, or companion animal. Routine socialization and commonsense canine husbandry easily prevent the otherwise predictable course of development of sensitivity into shyness and timidity, into apprehensiveness and fearfulness and ultimately, into aggression.

To presume Pup#1 would be difficult to train is utterly unfounded, especially considering he demonstrated the speediest recall of the entire litter. With a little education, simply instructing him to sit before impacting mid-chest (the dog cannot jump up and bite and remain sitting at the same time), the dog's rambunctious exuberance may be re-channeled into eager obedience. To automatically condemn Pup#1 as being overly aggressive is equally erroneous. This interpretation confuses behavior with temperament: it confuses puppy-biting behavior (a normal and natural behavior of all group-living carnivores) with aggressiveness (an undesirable trait). Rather, it is the puppy that does not mouth and bite as a youngster that augurs ill for the future. Play-biting is absolutely essential for the puppy to develop bite inhibition: first, to learn to inhibit the force of his bites, and second, to learn the situations in which it is inappropriate to mouth at all.

In reality, many so-called aptitude/temperament tests are, in fact, simple behavior tests. Simple one-time observations of a puppy's behavior are used to make sweeping generalizations about the pup's future temperament. Results of a simple visitation test have been grossly extrapolated to draw quite complicated conclusions about trainability and temperament. If the intention is to evaluate these various attributes, we would do much better to specifically appraise the pup's learning speed, attention span, recognition, recall and memory, or to investigate reactiveness, bounce-back (forgiveness), specific sensitivities and fears.

Therapeutic Value of Tests

The purpose of aptitude testing is not merely to laud the winners and discard the losers of a one-time test but rather, to evaluate individual qualities, both good and bad, that may have predictive value for the dog's adult behavior and temperament, so that it is possible to capitalize further on the dog's good points, and, to judiciously intervene and prevent the development of potential or anticipated bad qualities.

Aptitude testing is actually a misnomer, since it implies the evaluation of inborn or innate abilities. In reality, the pup's competence has also been radically affected by early experience, including the experience of being tested. In this sense, a well-designed aptitude test becomes the necessary therapeutic intervention for potential or incipient problems. Furthermore, therapeutic procedures represent the delayed equivalent of the socialization and training that the pup should have received during early development, i.e., the smart breeder would have intervened at three and four weeks of age to prevent these problems from developing at all, by ensuring the puppies had the benefit of an enriched physical and social environment.

Experiment I: To Demonstrate The Effect of Repeated Tests

Starting with shy Pup#3 (or an adult dog) in his crate or kennel, open the door, step back ten feet and time how long it takes for the pup to establish contact with an experimenter (contact latency). Record the time, give the pup a food treat, put him back in his crate and retest. (For rambunctious Pup#1, before offering the treat, use it as a lure to induce the pup to sit and then give the treat as a reward for sitting.) Repeat the above ten times in a row. Then, change the experimenter and retest the dog ten more times and then, similarly test with eight more experimenters (i.e., strangers).

Plot the results (100 scores in total) on a graph with Contact Latency on the vertical axis and Test # along the horizontal. For a shy pup (e.g., Pup#3), the graph will reveal a marked scalloping effect, with a progressive reduction in contact latency over successive trials with each tester, but with an increase in contact latency following the first few changes in experimenter. In addition, both the increase in contact latency following each change of experimenter and the scalloping effect become less pronounced because they are masked by a progressive overall reduction in contact latencies as the pup is tested with more and more experimenters, i.e., as the pup learns how to cope with strangers. After many trials with a number of different testers, the pup will approach a complete stranger in a fraction of the time that it took in the first test. In addition to the quantitative improvement, a shy pup appears to be more eager and confident about approaching the experimenters as he becomes more familiar and at ease with the test procedure, i.e., greeting strangers.
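Before plotting, it can help to tabulate the recorded latencies by experimenter. The sketch below assumes the latencies were written down in test order, as a flat list of seconds with the same number of trials per experimenter (ten in Experiment I). The function names are invented for illustration, and the short list at the end is placeholder input, not measured data.

```python
def split_by_experimenter(latencies, trials_per_experimenter=10):
    """Cut the flat list of latencies into one block per experimenter."""
    k = trials_per_experimenter
    if k < 1 or len(latencies) % k != 0:
        raise ValueError("latencies must divide evenly into blocks of trials")
    return [latencies[i:i + k] for i in range(0, len(latencies), k)]


def summarize(latencies, trials_per_experimenter=10):
    """First, last and mean contact latency for each experimenter."""
    blocks = split_by_experimenter(latencies, trials_per_experimenter)
    return [
        {
            "experimenter": number,
            "first_trial": block[0],
            "last_trial": block[-1],
            "mean": sum(block) / len(block),
        }
        for number, block in enumerate(blocks, start=1)
    ]


def jumps_at_changeover(latencies, trials_per_experimenter=10):
    """Change in latency from the last trial with one experimenter
    to the first trial with the next (positive means slower)."""
    blocks = split_by_experimenter(latencies, trials_per_experimenter)
    return [nxt[0] - prev[-1] for prev, nxt in zip(blocks, blocks[1:])]


if __name__ == "__main__":
    # Placeholder latencies in seconds: two experimenters, three trials each.
    recorded = [30, 18, 12, 20, 11, 8]
    for row in summarize(recorded, trials_per_experimenter=3):
        print(row)
    print(jumps_at_changeover(recorded, trials_per_experimenter=3))
```

In this tabulation, the scalloping described above would show up as a last-trial latency lower than the first-trial latency within each block, with positive jumps at the early changeovers. The overall improvement would show up as block means that shrink from one experimenter to the next.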

For a rambunctious pup (e.g., Pup#1), it would be difficult to improve further on the already asymptotic speedy approach and minimal contact latencies. Instead the improvement is qualitative rather than quantitative. Over repeated tests, the pup learns to combine a lightning approach with an equally fast and eager sit, followed by a rock-solid sit-stay — what a wonderful way to greet people.

As regards perfect Pup#2... he was perfect. Well, at least he was perfect in his first test. Of course, without continued practice, the previously perfect puppy performance (whether the legacy of aptitude or experience) most probably will progressively deteriorate over time, as the puppy grows up and learns that there are far more enjoyable things to do than visit boring old strangers: there's urine to sniff, squirrels to tree and dogs to play with. Consequently, simply maintaining a good performance throughout adolescence may be seen as improvement. In addition, no matter how good a dog's response, there is always room for improvement. The pup could learn to sit straight and to sit with pizzazz and panache, and not to lick, goose, or nuzzle unless requested.

Some pups' responses in initial tests are often marred by over-exuberance, shyness, or simply, not knowing what is expected. With repeated trials, however, the pups learn how to succeed in the test, i.e., temperament testing becomes temperament training, which forms the necessary foundation for obedience training — maintaining the dog's good performance throughout adolescence and adulthood.

Predictive Value of Tests

Although it is widely assumed that it is possible to accurately predict a dog's adult temperament from early testing, this has never actually been satisfactorily demonstrated in any scientific study. An interested breeder or trainer can endeavor to answer this question.

Experiment II: To Ascertain the Predictive Value of Early Tests

In addition to performing a battery of at least five repeated tests on pups at six to eight weeks of age, re-evaluate the entire litter with the same series of tests at three months, four months, six months, one year and two years of age. Average the scores for each dog for each age group and plot the mean scores on a graph with Test Score on the vertical axis and Age of Dog along the horizontal. Connect the points for each puppy and examine the graph to see whether each dog's absolute performance (contact latencies) or relative performance (rank of performance compared with other littermates) varies over time.
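The averaging and ranking in Experiment II can be sketched in code as well. This example assumes each observation is recorded as a (dog, age, score) triple, with score being a contact latency so that a lower value is better. The function names and sample records are hypothetical, and ties in rank are broken alphabetically by the dog's name, which is an arbitrary choice made only to keep the output stable.

```python
from collections import defaultdict


def mean_scores(records):
    """Average the repeated scores for each dog within each age group.

    records: iterable of (dog, age, score) triples.
    Returns {age: {dog: mean score}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for dog, age, score in records:
        grouped[age][dog].append(score)
    return {
        age: {dog: sum(scores) / len(scores) for dog, scores in dogs.items()}
        for age, dogs in grouped.items()
    }


def ranks_by_age(means):
    """Rank littermates within each age group (1 = shortest mean latency)."""
    ranks = {}
    for age, dogs in means.items():
        # Sort by mean score, then by name so ties resolve predictably.
        ordered = sorted(dogs, key=lambda dog: (dogs[dog], dog))
        ranks[age] = {dog: place for place, dog in enumerate(ordered, start=1)}
    return ranks


if __name__ == "__main__":
    # Placeholder records, not real measurements.
    records = [
        ("Pup1", "7 weeks", 4.0), ("Pup1", "7 weeks", 6.0),
        ("Pup3", "7 weeks", 40.0), ("Pup3", "7 weeks", 20.0),
        ("Pup1", "1 year", 5.0), ("Pup3", "1 year", 6.0),
    ]
    means = mean_scores(records)
    print(means)
    print(ranks_by_age(means))
```

Comparing the mean scores across ages addresses absolute performance; comparing the ranks across ages addresses relative performance within the litter.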

Basically, the predictive value of early testing will depend on a number of factors: 1. the inter-test reliability of each series of tests, 2. the age of the pup at the time of testing, and 3. human intervention in the dog's development.

With a large amount of variation between the test results in any series, the predictive value is minimal. If anything, it predicts a lack of prediction — producing no reliable assessment of how the dog acted in the past, it offers little help in forecasting the dog's actions in the future.

The older the dog at the time of each test, the greater is the predictive value of the results. Young animals have a high degree of plasticity in terms of their temperament and behavior repertoire, which are continually modified with each new experience. This is reflected by gross changes in each pup's absolute performance (average contact latencies) and by changes in the pups' relative performance (vis-à-vis other littermates) prior to six months of age. With older dogs, the rank order of approach times remains more or less the same. Adolescent and adult dogs are much more resistant to change, even with human intervention. A six-month-old dog would require hundreds of trials to significantly improve its contact latency with strangers.

Early testing only predicts how the puppies might develop if left to their own devices and if all treated equally. But pups and adolescents are never treated equally. Some grow up in great homes, others in good homes and yet others in bad homes — some poor dogs grow up with owners who should not be allowed to keep a rock, let alone a dog. The owner-variable far overshadows potential puppy predispositions. And without a doubt, a poor puppy prospect in a good home almost always becomes a better canine companion than a good puppy prospect in a poor home.

Human intervention is certainly the most important factor determining predictability: whether or not the new owners capitalized on the dog's potential good characteristics and/or resolved expected or incipient bad characteristics that were revealed in earlier testing. It would be naive to expect a dog to cure his own faults. Behavior and temperament are in a state of constant flux, and without human guidance, faults generally tend to get worse rather than better. It would be tantamount to stupidity to test a pup and discover that he is fearful, rambunctious, or aggressive, yet leave him to develop in this expected fashion. Surely a major reason for early testing is to locate potential or incipient problems and solve them before they become full-blown. Similarly, it would be utter folly to assume that a dog's naturally good temperament will necessarily remain that way indefinitely. As soon as owners become presumptively audacious about their pet paragon with the perfect personality, the dog's demeanor will predictably begin to deteriorate.  

Don't let this happen to your dogs. Monitor their progress. A breeder or trainer may learn a lot from retesting a litter and observing its development. And the dogs’ owners will continue to learn from repeated contacts with a conscientious breeder or trainer. As with obedience training, behavior modification and temperament training are never complete. You should train your dog throughout his sunset years. And when your dog dies, it’s time to say, “I could have done better, but I did a really good job with that wonderful dog. I’ll remember him well. The King is dead. Long live the King!”

This article is based on Dr. Dunbar's Behavior column in the February 1989 issue of the American Kennel Gazette. Reprinted with the permission of the author and the American Kennel Club.
