When you think about the use of rewards in dog training, what is the first thing that pops into your mind? For some, it is a food treat, for others a ball or toy, and for others a reward equals verbal praise which may or may not be coupled with physical praise (petting, scratching, etc.). Whatever it is that you equate with the word “reward,” chances are good that you may be limiting the power of your reward system. In living with and working with dogs, rewards as well as bribes and lures have a distinct place and value at certain points. What exactly is a reward, a bribe or a lure?
The dictionary definition of reward: “something that is given in return for good or evil done or received; and especially that is offered or given for some service…”
A bribe, on the other hand, is something “that serves to induce or influence.”
A lure (from the Latin for “to invite”) is defined as “to tempt with a promise of pleasure or gain; implies a drawing into . . . through attracting and deceiving.”
To my mind, the most important difference between a lure and a bribe is the intent behind the offer. A bribe is a deceitful attempt to gain or regain control, while a lure is a more pure hearted, genuine attempt to ease the way and make the learning of a lesson a little more pleasant.
A lure is extremely useful when teaching new tasks, overcoming uncertainty or fear on the animal’s part, and as a means of magnifying the interest and importance of you and/or your actions. An animal who is uncertain about a given task or working on a piece of equipment or unusual flooring can often be lured successfully. Because it makes you and your actions of greater interest to the animal, a lure can be a quick way to establish a relationship and gain cooperation from animals you do not know well. A lure is offered before a behavior is elicited and either directly assists in guiding/shaping the behavior or minimizes/eliminates the stumbling blocks of confusion or fear.
A bribe is an offer made in an attempt to get a dog to do something he chooses not to. This offer most often occurs just prior to (“if you do that, you can have this”) or concurrent with the command (“I’ll make it worthwhile to comply”). I do use bribes – sparingly, and do so to quickly lower the value of a particular object or activity while increasing the attractiveness of what I am asking from the dog. This is NOT training – simply an effective means of temporarily solving a particularly dangerous or frustrating situation. In that setting, bribes can be powerful tools. A dog who is delightedly charging around the house with a chicken carcass may not drop it on command (if so, teach that particular skill very thoroughly at some other, less critical moment) but may be quite willing to “trade it” for a bribe of cheese that is dramatically offered or the appearance of a favorite toy. Staying out in the yard and playing “catch me if you can” is a frustrating game loved by dogs (especially adolescents) and loathed by owners, but one whose fun can be offset by an unexpected shake of the liver treats can, or whatever floats the dog’s boat at that moment.
A reward is a chance to say, “Thanks – I really like it when you do that!” This can range from a quiet thanks, or pat on the head, to an exuberant dance of delight or a shower of treats. A reward is always unexpected, unseen and comes after the appropriate behavior or response. There are four major criteria for implementing successful systems of rewards: timing, intensity, variety and frequency.
Timing is perhaps the most obvious aspect of any reward system. To be effective, a reward should occur within 3 seconds of a desired behavior (and ideally, that behavior alone), or the dog may inadvertently perceive the reward as one given for another subsequent behavior or even a concurrent behavior. While teaching one of my dogs to bark at the back door to go out, I misjudged his response. I was so focused on getting a bark that I was ignoring his accompanying behavior of leaping around like a lunatic. When he finally did bark, I rewarded him instantly with praise and an open door. To my mind, mission accomplished. Unfortunately, he had been in mid-air when he barked and was rewarded, and was thus led to believe that the combination of leaping & barking was the behavior that earned a reward. It took me some time to change his mind!
Timing is an important part of your definition of the criteria for success. In other words, when teaching a puppy, I might allow as long as 15-30 seconds for her to process my request and comply. An older dog, or a puppy with more training, might only have 5-10 seconds in which to be successful. This time period is the criterion for success. If my pre-determined time period is exceeded, I then act on options already set in my mind. I could: extend the time period, change the request, assist with a lure or placement, or break off the exercise altogether. I am very precise about the criteria for success, and thus the dog is offered clear guidelines as to what constitutes successful performance, at least in terms of time allowed.
As the dog’s skill increases, the criterion for success is narrowed. When my youngest dog was a puppy, she was being taught to sit or lie down before being allowed out the door. Initially, she had as long as 30 seconds to figure it out and still receive the reward of the door being opened. Gradually, over time, that criterion for success was narrowed. Now 18 months old, Otter gets 3 seconds in which to comply or I simply “withdraw” the offer by walking away and ignoring her for a few seconds before asking again.
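For readers who like to see a rule written down, the narrowing time window can be sketched in a few lines of Python. This is only an illustration, not a training protocol: the class name, the halving factor and the "withdraw offer" label are assumptions made for the sketch, and only the 30-second and 3-second figures come from the example above.

```python
from dataclasses import dataclass


@dataclass
class SuccessCriterion:
    """Time allowed for the dog to comply, in seconds (hypothetical model)."""

    window_s: float

    def judge(self, response_time_s: float) -> str:
        # Inside the window the dog earns the reward; outside it the handler
        # falls back on an option decided in advance (here: withdraw the offer).
        if response_time_s <= self.window_s:
            return "reward"
        return "withdraw offer"

    def narrowed(self, factor: float = 0.5, floor_s: float = 3.0) -> "SuccessCriterion":
        # Assumed rule: shrink the window by a fixed factor as skill grows,
        # but never below a floor (3 seconds, as in the adult-dog example).
        return SuccessCriterion(max(floor_s, self.window_s * factor))


puppy = SuccessCriterion(30.0)
adult = puppy.narrowed().narrowed().narrowed().narrowed()  # 30 -> 15 -> 7.5 -> 3.75 -> 3.0
```

With these assumed numbers, a response at 20 seconds earns the puppy her reward, while the same dog as an adult would see the offer withdrawn after 5 seconds. The point is simply that the window is decided before the request is made, not improvised afterwards.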
Inconsistent performance has many roots, but before you blame your dog, carefully evaluate your timing. Excellent observers, dogs know more about your timing than you may expect! Extreme predictability (i.e., you always call the dog from a stay 7.4 seconds after turning around to face him) can lead to training problems as easily as extreme variance (sometimes you reinforce a command 3 seconds after it’s given, sometimes not until 9 seconds later).
Timing is everything when trying to communicate precise concepts. Think of it as driving down a highway, waiting for a friend to tell you which exit you need to take. If her timing is excellent, you will choose the correct exit. If her timing is poor, you may miss the exit or misinterpret her communication to mean that the next exit is the one she desires. Keep your timing sharp, and check by actual count or watch that you have not accidentally become predictable about when you will give a command, whether a recall, a send out or even a release.
To be effective, the intensity of the reward must match the action’s degree of difficulty. Difficulty can be physical, mental or emotional, as well as a combination of these three. A fearful dog who allows a stranger to examine her (high degree of emotional difficulty) should receive a reward of greater intensity than would a dog who found the entire exercise not particularly stressful or difficult. Learning a new task is far more mentally difficult than performing a learned, habituated response. Scaling a six-foot wall may rate higher on a physical difficulty scale than does hopping over an 8″ board, depending on your dog’s physical abilities.
The degree of difficulty of any exercise will also depend on your dog’s inherent breed characteristics, structural/functional abilities, temperament and desire to work on that particular task. Teaching heeling to a Border Collie might rank lower on the difficulty scale than teaching that same task at the same level of precision to a Scottish Terrier. But even among Border Collies, a dog with physical problems, poor temperament and less than ideal working drive might find heeling a far more difficult exercise. If you’re beginning to get the idea that a long list of variables makes it impossible to say what the difficulty level and thus appropriate reward intensity might be for any dog working on any given task, then you’re right! Each dog is an individual, and intensity of reward must be calibrated to each individual.
Over a period of time, the appropriate pairing of reward intensity with the degree of difficulty results in a sliding scale approach to rewards. As a task becomes less difficult for the dog, less reward intensity is required to maintain that level of performance. It is not appropriate or useful to offer a fully trained dog the same reward/intensity for sitting as you did when he was just a pup and learning it all for the first time. This would be as silly as making a big deal over an adult signing his name for the thousandth time that year, though such a fuss would be appropriate for a first grader trying to master the basics of penmanship.
A reward’s intensity is strictly dependent on the dog’s perception of its intensity. A dog who does not particularly enjoy playing fetch would find a tennis ball a very low intensity reward (and possibly rate it as no reward at all). For a retrieving fanatic, you might not find anything that had greater intensity. I know dogs that would disregard entire steaks if their favorite bumper or ball were offered, and others who will accept a toy but far prefer food. Still others will pass up food or toys in exchange for exuberant, highly physical praise from their handler, eating the liver or grabbing the ball only after the emotional peak has passed.
Intensity is also dependent on the frequency with which the reward is offered. A reward that the dog rates a very high intensity rarely loses its appeal, no matter how often it is used; lower intensity rewards can lose their appeal more quickly. Do you know what your dog’s top five rewards are, and how they would rank on your dog’s reward intensity scale? Even more importantly, are you yourself on your dog’s list of rewards?
A good rule of thumb is that the less intrinsically rewarding (naturally enjoyable to the dog or in line with his instinctual behavior) a task, the more reward intensity required. For example, a retriever will retrieve almost endlessly – this is a behavior he enjoys without the need for much, if any, reward other than the activity itself. But if you are trying to teach a Scottish Deerhound to retrieve, the reward intensity may need to be very high. (This helps to offset the reality that Deerhounds, as a rule, do not particularly enjoy or see a purpose in running after objects and returning them to the careless owner who threw them away in the first place!) To improve and then maintain this retrieving behavior in a Deerhound, reward intensity will have to remain relatively high even when the behavior is learned or, since it deviates so drastically from his inherent behaviors, this is a behavior that will rapidly deteriorate. Simply put, the more a dog enjoys an activity in and of itself, the less reward intensity will be required to teach, improve and maintain that behavior.
One day, feeling particularly generous, you perform an unexpected act of kindness for a friend. She is so surprised at your gesture & thoughtfulness that she sends you a thank you card and a small bouquet of flowers. You in turn are surprised and pleased at this unexpected “reward” for your actions. A few weeks later, you think of your friend while shopping and on a whim, pick up a jar of preserves you know she loves. You are expecting no reward, just wishing to express your affection. Once again, she sends a thank you card and a small bouquet of flowers. The same exact arrangement. The same thank you card.
You brush off any puzzlement about her response, but the next time you do her a favor, and the same card and same bouquet arrives, you begin to wonder. The pleasure and surprise you felt the first time you received that card and those flowers has begun to somehow dim into a vague annoyance and anticipation of the same damn thank you card and stupid flower arrangement. You begin to question the value of your gifts to her – whether picking up a quart of milk for her or driving an hour out of your way to pick up her mother-in-law at the airport, her response is always the same. That card and those flowers. There is not only a lack of appropriate response, but the grinding repetition begins to bore you. You might begin to lose your motivation to help her or bring her unexpected gifts. She’s so bloody predictable!
But what if tickets to a Broadway play arrived with her thanks after your airport run? What if in response to the milk pickup she stopped by with a warm cinnamon bun, or some fresh herbs? What if you arrived home from a weekend away to discover that she had weeded your vegetable garden and put a perfectly silly hat on your scarecrow as a way of saying, “Thanks for babysitting that afternoon”? The variety of her “rewards” to your “behaviors” would be highly motivational, and encourage an ongoing relationship of give and take. The rewards would also rank higher in intensity because they were novel and unpredictable.
Do you think a dog is any different? Variety and intensity are closely linked. My dogs will work for food, for praise/petting, for tennis balls, sticks or Frisbees, and seem to live for the thrill of attacking a running garden hose. They will work especially hard for certain privileges that allow them to be with me, such as an “only dog” ride in the truck or the privilege of being “barn dog” for evening chores. Most important of all, they all work happily (and to the extent I insist on it, precisely) because they do not know what the reward may be. I may call a younger dog from a play group and surprise her with some liver from my back pocket. Another time, I may call her and reward with generous verbal praise and a long hug before sending her off to play again. Another day it may be a toy I found, or the chance to play tag together or simply share an exuberant “good dog” dance. It’s never just the same old liver or ball or anything – it’s a variety of rewards.
Frequently, handlers tell me that their dog only works for balls, or food, or whatever. They are serious! Stretching the example a little, imagine a husband who tells you that only diamonds make his wife happy. Wouldn’t you question that relationship? Either the wife is very shallow and limited in her definitions of pleasurable experiences, or the husband offers no “rewards” but diamonds. In my experience, handlers whose dogs work for only one reward do so because they have taught the dog that a ball or a treat or whatever is the only reward. It is up to the handler to discover as many ways to please, excite, thrill and motivate the dog as possible and use them all as rewards when training. A dog must understand that this is a reward. My dogs all grow up learning that silly games are fun, and fun is always a useful reward. (Be cautious before deciding that you too will now reward your previously food/toy trained dog with silly games. You may end up leaping around like a fool while the dog stares at you in amazement, wondering what the hell happened to his liver treat or Frisbee!)
The more rewards you have at your disposal, the more training you can do in almost any situation. You don’t need to hunt up that special treat or toy or whatever – you can use as a reward nothing more than your own excitement, sincere pleasure and a willingness to entertain your dog with a silly game of tag. Handlers have a tendency to lean on just one or two rewards without realizing that may, in certain circumstances, seriously limit their ability to communicate to the dog how well he did. If, for example, you rely on tennis balls and verbal praise, what happens when your dog is 13, deaf and unable to chase a ball? What happens if you lose your voice or cannot throw a ball? If I were able to move nothing more than my facial muscles, I would still have some useful rewards left since my dogs have learned many silly games revolving around blinks and winks and changes of expression.
At one camp, we did a “silent” training – a standard obedience routine but handlers were completely silent. Half of the dogs actually became scared and very confused. Another 25% of the dogs, while anxious, could get sufficient clues from smiles and body posture to keep working, though the relief on their faces was palpable when the handlers “regained” their voices! The remaining dogs had no problem working as usual, since their handlers relied on many different forms of communicating and rewarding.
Look at it this way. Your employer may have only one way to reward employees – a raise on a predictable timetable. How motivational is that to you? If you are like many people, you know that unless you really screw it up, the raise will occur. Regardless of the year-end speech of how much your hard work has been appreciated, chances are the reward occurs far too long after the action to be really motivational to you.
But what if your employer wandered by one afternoon, looked over your shoulder, and thrilled with your work on a project, immediately handed you your coat and $100 with the best wishes for a pleasant afternoon off? What if, on a random basis, you received a 10% bonus in your weekly paycheck as a reward for a job well done in the last 7 days? Or simply a T-shirt proclaiming you employee of the day? Or a new mug or a small box of chocolates on your desk one morning with a note of thanks from the boss for working overtime? Chances are good that you might find yourself more motivated to work harder. Why? Because they were true rewards – unseen before they arrived, unexpected, and novel in their variety.
Get creative. Develop your own tool kit of rewards of varying intensities (everything from a simple “thanks” to a singing telegram of “WOW! What a dog!”) and varying modalities: touch, taste, smell, hearing, sight, tenderness, excitement, laughter, active, passive, freedom, intimacy, a “day off”, and all the many wondrous things that make your dog glad that he’s alive. This may be as simple as a good hug for your Basset and then a long walk where he’s allowed to sniff the world to his heart’s content with nary a “No sniff” to be heard. It may mean throwing that ball 20 times more than you really wanted to, because your dog loves that best of all. If you pay attention, you and your dog will discover the world is full of rewards, the greatest of which is simply being together.
FREQUENCY & SCHEDULING (Lassie Goes to Vegas)
It is in the application of rewards, i.e., the frequency or schedule on which the rewards are given, that many handlers lose their way and inadvertently create training problems. Although at first glance it seems to make no sense, a reward for every action is NOT motivational. A random reinforcement of a learned response is the most powerful reward of all. Why?
In a study with rats, researchers set up two groups of rats who were trained to press a lever to receive food. In the first group, every time a rat pressed a lever, he received a pellet of food. In the second group, a food pellet was delivered at random intervals. After an initial flurry of working the lever for food, the rats in the first group soon found the activity of no great interest, and would only press the lever as often as they needed to eat. In the second group, the rats began to act like Las Vegas gamblers at a slot machine – no response from the machine but they kept feeding in their nickels! Despite the fact that they needed to press the lever many times, and that the food pellet appeared only intermittently, these rats would work very hard. Ultimately, some of the rats in the second group expended more energy pressing the lever than was provided by the food (eliminating hunger/need as the motivation). To put a very human slant on the rats’ behavior, the lack of any guarantee of reward did not deter them, but increased their interest in pressing that lever one more time. I do wonder if some of them were muttering to themselves, “I know this time it will work – I’m feeling lucky now!”
The power of rewards given at intermittent intervals is what makes gambling (in any form, whether casino style or a daily lottery) so attractive. You may spend $100 in daily lottery tickets over a period of a few weeks, yet a $50 win is so exciting that you do not bother calculating that you’re actually operating at a $50 loss. Of course, hidden in that daily lottery gamble is a “lure” of a much bigger reward than a mere $50.
The scheduling of rewards moves progressively from reward for every effort to randomization as determined by the animal’s demonstration of consistent, reliable performance of any given task. You must be very clear in your own mind about the schedule you are working, or when problems arise, you cannot easily step back a notch or two if needed.
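One way to stay clear about the schedule you are working is to picture it as a ladder, with "reward every effort" on the bottom rung and sparse, random rewards at the top. The Python sketch below is a rough illustration of that idea only. The rung probabilities, the 80% reliability threshold and the function names are all assumptions chosen for the example, not figures from any study or training method.

```python
from __future__ import annotations

import random

# Hypothetical ladder: chance that a correct response is rewarded on each rung.
# Rung 0 is "reward for every effort"; higher rungs are increasingly random.
SCHEDULE_LADDER = [1.0, 0.5, 0.25, 0.1]


def next_step(step: int, recent_results: list[bool], threshold: float = 0.8) -> int:
    """Climb one rung when recent performance is reliable, otherwise step back one."""
    if not recent_results:
        return step  # nothing observed yet, so stay on the current rung
    success_rate = sum(recent_results) / len(recent_results)
    if success_rate >= threshold:
        return min(step + 1, len(SCHEDULE_LADDER) - 1)
    return max(step - 1, 0)


def should_reward(step: int, rng: random.Random) -> bool:
    """Decide at random whether this correct response earns a reward."""
    return rng.random() < SCHEDULE_LADDER[step]
```

For instance, `next_step(0, [True] * 10)` moves from rung 0 to rung 1, while `next_step(2, [True, False, False, False])` steps back to rung 1. Knowing which rung you are on is what lets you "step back a notch or two" deliberately when problems arise.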
Randomization allows you to: a) gain more correct repetitions of a behavior for the same reward (i.e., 10 sits for one cookie, rather than 1 sit for 1 cookie); b) chain a series of behaviors that must be performed before the reward is given (i.e., heel through an entire Novice heeling pattern before being praised). Keep in mind that when training, you are actually creating a long chain of behaviors. The recall, for example, is a chain of behaviors that simplified, might look like this:
- Behavior 1 – Sit/down stay plus
- Behavior 2 – Move towards owner plus
- Behavior 3 – Sit in front of owner
- All 3 Behaviors combined = RECALL
As the behaviors are chained together, you have already begun to randomize reward, whether you know it or not. Initially, teaching the stay, you rewarded that behavior. When asking the dog to come towards you, you rewarded that behavior. When teaching the front, you rewarded that. Soon, you are no longer rewarding the stay, but offering some reward for coming toward you and the biggest reward occurs when the dog sits in front of you. After a while, the only reward may occur after the sit. When you add a finish to the recall exercise, you’ve chained even more behaviors, and progressively, the reward is delayed until the dog is sitting at heel.
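The way the reward drifts toward the end of a chain can be sketched in Python. This is a deliberately simplified picture: the idea of numbered training "stages", and of dropping exactly one early reward per stage, is an assumption made for the illustration, and the link names are shorthand for the behaviors listed above.

```python
# Links of the recall chain, in the order the dog performs them.
RECALL_CHAIN = ["stay", "move toward owner", "sit in front"]


def rewarded_links(chain, stage):
    """Return the links that still earn a reward at a given training stage.

    Stage 0 rewards every link. Each later stage (a hypothetical unit of
    progress) drops the earliest remaining reward, until only the final
    link of the chain is rewarded.
    """
    first_rewarded = min(stage, len(chain) - 1)
    return chain[first_rewarded:]


# Adding a finish lengthens the chain, so the last reward moves to the new end.
RECALL_WITH_FINISH = RECALL_CHAIN + ["finish at heel"]
```

At stage 0, `rewarded_links(RECALL_CHAIN, 0)` returns all three links; by stage 2 only `"sit in front"` remains. With the finish added, `rewarded_links(RECALL_WITH_FINISH, 3)` returns only `"finish at heel"`, which mirrors the reward being delayed until the dog is sitting at heel.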