Back To The Future: How a Discredited Research Tool Discarded in the 1960s Has Become Popular Again
A decade ago we did a telephone tracking study for a major cable company to measure awareness of a new advertising campaign supported by about $50 million (in today’s dollars) in ten markets. The questionnaire included every conceivable measure of ad effectiveness, including unaided and aided brand and advertising awareness, unaided and aided slogan and message recall, proven and related recall, and something new.
The “new” was unaided and aided recognition of four television spots, each of which was described in two sentences. Nine hundred cable subscribers were read each description and asked to name the advertiser (that’s the unaided measure). Then they were read the description again, told who the advertiser was and asked if they had seen the spot in the past 30 days (that’s the aided measure).
Not surprisingly, at the end of the campaign, scores for the various measures were all over the place, from about 3% for proven recall to 31% for the best of the aided “new” television commercial recognition measures.
The client Ad Director was unhappy; the agency even more unhappy. Given the high level of media weight purchased in only 10 markets, this campaign was a disaster. Hardly anyone remembered it. The campaign did nothing to change perceptions of the advertised brand, nor did it improve loyalty among current customers or subscribing intentions among prospects.
As an aside, after the campaign was launched in New York, one of my friends, John Bernbach, telephoned. John is the former CEO of DDB Needham and currently COO of Engine USA, an integrated communications company. Most important, he is one of the biggest brains in the advertising industry. John said, “Kev, I hear you’re working with (the cable company) to measure the effects of their new ad campaign. I just wanted you to know that this may be the worst advertising I’ve seen in my career. I have no idea what the message is. The whole thing is a mystery.” We both laughed, and I responded that fortunately in this case we didn’t play a role in creating the advertising we were asked to evaluate.
No one in the client organization wanted to share the results with top management. They were afraid they’d be canned. Maybe they should have been.
The client ad execs had a solution: An even “newer” measure of effectiveness. They asked us to calculate the percentage of people who claimed that they saw any one of the four spots.
I argued that this was pretty crazy. Basically, we were stacking the deck in favor of a positive outcome. The client research director said not so subtly that this was exactly what they wanted to do: boost the scores.
So we did what the client demanded, and the score jumped: 57% of the sample said that they had seen at least one of the ads. Again we complained that this measure had little value, but no one paid any heed.
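To see why an “any one of the four” score is bound to come out higher, here is a minimal sketch in Python. It assumes, purely for illustration, that claims for each spot are independent of one another and that each spot draws a 20% “yes, I saw it” rate. Those rates and the function name are hypothetical; they are not figures from the cable study, which reported only the combined number.

```python
from math import prod


def at_least_one_rate(per_spot_rates):
    """Expected share of respondents who claim at least one spot.

    Assumes claims for each spot are independent of one another,
    which is an illustrative simplification, not a study finding.
    """
    return 1 - prod(1 - p for p in per_spot_rates)


# Hypothetical per-spot claimed-recognition rates (NOT from the study).
hypothetical_rates = [0.20, 0.20, 0.20, 0.20]

combined = at_least_one_rate(hypothetical_rates)
print(f"Best single spot: {max(hypothetical_rates):.0%}")  # 20%
print(f"Any of the four:  {combined:.0%}")                 # 59%
```

Even without the independence assumption, the combined figure can never be lower than the best single-spot score, so adding spots to the question can only push the number up.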
In sum, for the different measures of awareness and recognition we had scores of 3%, 6%, 7%, 10%, 13%, 15%, 22%, 31% and now 57%.
Which one do you think got shown to top management?
The campaign was allowed to continue because this score was so high—57% appears to be a great number—and because management had been fooled into thinking that the advertising was really working. We got fired for questioning the integrity of the research findings and their misleading conclusions.
I was surprised by the client’s enthusiasm for this discredited measure for a variety of reasons, mostly because the tool was popular in the 1950s and 1960s and was discarded because of demonstrated invalidity. There is an extensive academic literature on the failures associated with this methodology. Summing up the literature, aided television or print advertising awareness is a measure of recognition, not recall, and recognition scores are too problematic to be safely employed.
When a consumer walks down a supermarket aisle, the question is what does she know/remember about the advertised brands on the shelf? What does she know/remember from an ad for Brand X that would motivate her to buy Brand X over Brand Y? Is there a message lodged in the consumer’s memory based on this advertising that when activated would lead to brand purchase?
Now there are a number of terrific measures for capturing that kind of memory. I particularly like the measure of proven recall. Asking someone to tell you what they saw, heard or read in the advertising and comparing it to the ad itself is a great way to find out if they really saw and remembered the ad. Aided television commercial awareness is not one of these terrific measures. It’s a measure of recognition, not recall, and recognition scores have been debunked for at least five decades.
Researchers have discovered that the recognition method is not an indicator of memory because scores don’t decline much with the passage of time. Proven recall, for example, drops from 3% or 4% 24 hours after exposure to 0% at the end of a week. If a television spot or full page print ad yields a 20% or 30% recognition score 24 hours after exposure, it will show the same score (plus or minus sampling error) a week or even two weeks later. It’s a measure that keeps on giving.
Equally important, recognition measures as a class are subject to high levels of overstatement. Over a range of published studies, overstatement ranges on average from 15% to 50%, occasionally as much as 100%. In one published study, by our own Kevin Clancy and Lyman Ostlund, in The Journal of Advertising, respondents were shown 8 ads, 4 of which actually ran and 4 of which never ran (they were created just for the research test). Recognition scores for the two groups were comparable. That is to say, ads which never ran earned recognition scores comparable to those which did. This was a very disturbing finding but not radically different from what has been shown in the academic and trade literature for over 50 years. When someone says “Yes, I saw that ad,” researchers simply don’t know what that means. Maybe they did, maybe they didn’t. Maybe they’re just confused or trying to please the interviewer. Darrell Blaine Lucas, the late, great professor of advertising at New York University, famously said “recognition measures capture what consumers think they might have or should have watched and remembered, not what they actually watched and remembered.”
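One way to see what comparable scores for real and never-run ads imply is to net out the false claims. The sketch below uses a generic correction-for-guessing formula, which is my assumption here and not the method used in the Clancy and Ostlund study. The function name and the example percentages are hypothetical.

```python
def corrected_recognition(claimed_rate, false_claim_rate):
    """Discount a recognition score by the rate of false claims.

    claimed_rate:     share saying "yes" to ads that actually ran
    false_claim_rate: share saying "yes" to ads that never ran
    Uses the generic correction (claimed - false) / (1 - false),
    floored at zero. An illustrative assumption, not the study's method.
    """
    if not 0 <= false_claim_rate < 1:
        raise ValueError("false_claim_rate must be in [0, 1)")
    return max(0.0, (claimed_rate - false_claim_rate) / (1 - false_claim_rate))


# Hypothetical figures: real ads 30%, never-run ads 28%.
print(f"{corrected_recognition(0.30, 0.28):.1%}")  # 2.8%

# When bogus ads score the same as real ones, nothing is left.
print(f"{corrected_recognition(0.30, 0.30):.1%}")  # 0.0%
```

Under this simple correction, a recognition score that looks healthy on its own shrinks toward zero once the never-run ads score about as well as the real ones.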
Yet today an increasingly popular method for measuring advertising effectiveness over time is to expose respondents in an internet survey to a commercial (or commercials) and ask them whether they have seen it. This is a variant of the same method used in the cable company campaign. In the cable study, which was done by phone, people were read descriptions of the ads, while in an internet study they’re actually shown the ads.
Why is this problematic methodology increasingly popular? Consider the background: The essential problem is that advertising today is increasingly ineffective. Peter Krieg and I have written extensively about the negative ROI of contemporary advertising which, in our view, can be traced in large part to the fact that most ads today lack a selling message. If an ad does not give people a reason, tangible or emotional, to buy the advertised product, then chances are they won’t buy it.
Because most advertising is weak, recall scores are low and changes in brand perceptions are hard to find.
Marketers and agencies say “that number is too low. Our advertising is better than that; let’s look at softer indicators of effectiveness, such as the % of people who play back our slogan.” When those numbers are also low, the cry goes out, “focus on the recognition scores! They’ll be better. They’ll be higher.” They are. Unfortunately, they are also relatively useless.
Eventually we reach the point where we are showing people the 30-second ad twice in the same research study, and sometimes we are showing multiple commercials from the same campaign and asking “Did you see this? Did you see that?” The respondent feels stupid or uncooperative if he or she says anything other than “yes, yes, yes, of course I saw it.” That’s where the 57% came from.
So we’re “Back To The Future”: an advertising testing methodology discredited and discarded 50 years ago because of its invalidity is now routinely employed to mask the poor performance of contemporary ad campaigns.