Experimentation is widely used in marketing research. Marketing experiments have been conducted in such diverse activities as evaluating new products, selecting advertising copy themes, determining the frequency of salespeople's calls, and evaluating all aspects of a movie (including ending, pacing, music, and even the story line). For example, as shown in Exhibit 8-1, the ending of the very successful movie Fatal Attraction was changed because test audiences did not like the original ending.
This chapter discusses the objectives of experimentation and illustrates techniques for designing and analyzing marketing experiments, including:
· the nature of experimentation
· ingredients of a marketing experiment
· sources of invalidity
· models of experimental design
· panels and experimental design
· difficulties in field experiments in marketing
This chapter further discusses experimental designs and logic for advanced surveys using the capabilities of Qualtrics.com, including the following:
· piping of text, graphics and experimental treatments
· simple branching and compound branching logic, looping and piping of answers
· question and treatment blocks
· quota fulfillment
· randomization of answer choices, questions, treatment blocks, and alternative questionnaire forms.
The Nature of Experimentation
Two general types of experimental designs exist: natural and controlled. A natural experiment is one in which the investigator intervenes only to the extent required for measurement, and there is no deliberate manipulation of an assumed causal variable. “Nature” produces the changes. In contrast, in a controlled experiment two kinds of intervention are needed:
1. manipulation of at least one assumed causal variable
2. random assignment of subjects to experimental and control groups.
True experiments have both types of intervention, while quasi-experiments manipulate the variables but do not randomly assign the subjects. All true experiments have certain things in common: treatments (i.e., assumed causal variables), an outcome measure, units of assignment, and some comparison from which change can be inferred and, it is hoped, attributed to the treatment. Quasi-experiments, on the other hand, have treatments, outcome measures, and experimental units but do not randomly assign subjects to treatments. Rather, subjects already belong to groups that differ from each other in ways other than the presence of a treatment whose effects are being tested (Cook and Campbell, 1990).
Exhibit 8-1 Test Audiences have Profound Effect on Movies
What if E.T. hadn't made it home, or Richard Gere had failed to come back for his pretty woman, Julia Roberts? Believe it or not, in the original versions of these films, E.T. died on American soil and it was Roberts who rejected Gere at the end of "Pretty Woman."
So who changed these potential misses into hits? It wasn't the producers or the writers who prevailed for change – it was the moviegoers. In what may be Hollywood's last and most closely guarded secret, the test audience is having a profound effect on the movies you watch.
Scientific Findings?
In the original version of the hit "My Best Friend's Wedding," Rupert Everett had a minor role as Julia Roberts' gay best friend. But test audiences wanted more. So the ending was scrapped, the set rebuilt, and Everett's character came back for one final appearance. That's what can happen if test audiences love you. But what if they loathe you? In the 1987 thriller Fatal Attraction, test audiences so despised Glenn Close's character that they became responsible for having her killed off in the end.
Director Ron Howard and his partner, producer Brian Grazer, are responsible for hits like Far and Away, Ransom, Apollo 13, and A Beautiful Mind. Howard says, "What I would hate to do is put the movie out there, find out that the audience is confused about something or upset about something that you could have fixed and realize I had no idea they'd respond that way." Grazer and Howard were dealt one surprise when they tested the 1989 film Parenthood. "The audience told us there was too much vulgarity in the movie," Grazer says. "We took out a lot of the vulgarity. The scores went up, made us feel better, the movie played better. It didn't offend anybody."
Source: Adapted from Bay (1998)
Objectives
The term experimentation is used in a variety of ways and for a variety of objectives. In this chapter we use the term “experimentation” to describe studies that may be conducted for the purpose of identifying relevant variables as well as the functional form of the model that links these variables with the criterion variable under study. Perhaps the characteristic that best distinguishes experimentation from observational studies (which are also employed in measurement and estimation) is that experimentation denotes some researcher intervention and control over the factors affecting the response variable of interest to the researcher. Experimentation permits the establishment of causal relationships. In contrast, correlation analysis (a useful technique in observational studies) permits the analyst to measure only the degree to which changes in two or more variables are associated with each other.
An Industry Example
A national producer of packaged candies was interested in children’s preference for various formulations of one of its well-known candy bars. Type of chocolate, quantity of peanuts, and amount of caramel were independently varied in a factorial design (described later in the chapter) of 2 types of chocolate by 3 quantities of peanuts by 3 amounts of caramel. Paired comparisons (i.e., two combinations at a time) involving the 18 combinations were made up and evaluated by school children between 8 and 12 years of age. Interestingly enough, the company found that preferences for type of chocolate varied with the amount of caramel. In addition, while children preferred more peanuts to fewer peanuts, the intermediate level of caramel was the most preferred. The company modified its formulation to match the most preferred test combination.
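The 2 by 3 by 3 layout of the candy bar study can be enumerated directly. The following is a minimal sketch only: the level labels are hypothetical (the study reports only the number of levels), and the pair count assumes every formulation would be compared with every other, which the source does not state.

```python
from itertools import combinations, product

# Hypothetical level labels; the study reports only how many levels each factor had.
chocolate = ["chocolate_A", "chocolate_B"]
peanuts = ["low", "medium", "high"]
caramel = ["low", "medium", "high"]

# Full factorial design: every level of each factor crossed with every level of the others.
formulations = list(product(chocolate, peanuts, caramel))

# All possible paired comparisons (assumption: each formulation is paired with each other one).
pairs = list(combinations(formulations, 2))

print(len(formulations))  # 2 * 3 * 3 = 18 formulations
print(len(pairs))         # 18 * 17 / 2 = 153 possible pairs
```

The jump from 18 formulations to 153 possible pairs suggests why a study of this kind would typically present each child with only a subset of the pairs.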
Many other experiments have been carried out involving taste testing, package design, advertising type and quality, price sensitivity, and other marketing variables.
Ingredients of a Marketing Experiment
An experiment involves a series of interrelated steps, as shown in Figure 8-1 (note the similarity to Figure 2-1). Our concern in this chapter is primarily with defining variables, designing the experimental procedure, and conducting the experiment. The other steps are discussed elsewhere in this book, and for experimentation they do not differ from general concepts.
All experiments involve three types of variables. First, there is the “treatment” variable whose effect upon some other variable the experiment is designed to measure. This is the variable that is manipulated and presumed to be the cause. It is often referred to as the independent variable. Marketing experiments often involve more than one treatment variable. When this is the case, the researcher may be interested in observing the effects of combinations of treatment variables, in addition to the effect of each one individually. In short, there may be interaction effects. Interaction refers to a situation, like the candy bar example cited above, in which the response to changes in the levels of one treatment variable (type of chocolate) depends on the level of some other treatment variable(s) (amount of caramel) in the experiment.
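An interaction can be made concrete with a small table of cell means. The sketch below uses invented preference scores and invented level names (none of these figures come from the candy bar study) to show that the effect of caramel differs by type of chocolate.

```python
# Hypothetical mean preference scores for a 2 x 2 slice of a factorial design.
# These numbers are illustrative assumptions, not results from the study.
cell_means = {
    ("milk", "low_caramel"): 6.0,
    ("milk", "high_caramel"): 8.0,
    ("dark", "low_caramel"): 7.0,
    ("dark", "high_caramel"): 5.0,
}


def simple_effect(means, chocolate):
    """Effect of moving from low to high caramel for one type of chocolate."""
    return means[(chocolate, "high_caramel")] - means[(chocolate, "low_caramel")]


def interaction_contrast(means):
    """Difference between the two simple effects; zero means no interaction."""
    return simple_effect(means, "milk") - simple_effect(means, "dark")


print(simple_effect(cell_means, "milk"))   # caramel effect for milk chocolate
print(simple_effect(cell_means, "dark"))   # caramel effect for dark chocolate
print(interaction_contrast(cell_means))    # nonzero, so the factors interact
```

In this invented example more caramel raises preference for one chocolate and lowers it for the other, so neither factor can be evaluated in isolation.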
The second broad type of variable in an experiment is the outcome or dependent variable. In the preceding candy bar example, the dependent variable was product preference.
The last category of variables consists of those other than the manipulated independent variables that could influence the observed effects (i.e., dependent variable). These are known as extraneous variables, and unless controlled adequately they are the source of errors in an experiment. These will be discussed in a later section of this chapter.
Test Objects
The terms test units, test objects, and subjects are used interchangeably in discussing experimentation. All are used to refer to the units whose responses to the experimental treatment are being studied. In marketing research the experimenter has a choice of three possible universes of test units: people, stores, and market areas. Which is most appropriate depends on the problem forming the basis of the experiment (Banks, 1965, pp. 13-15).
The experimenter must contend with differences among the inherent properties of the test objects. For example, if a researcher is interested in the effect of shelf height on the sales of a packaged consumer product, it is to be expected that stores will vary in their amount of shopping traffic, placement of shelving units, and the like. If the experimenter is interested in the effect of various shelf heights on product sales over a variety of store sizes, several stores will have to be used in the analysis. If so, he or she may use a technique called covariance analysis, in which responses to the controlled variables (shelf height) are adjusted for inherent differences in the test objects (stores) through measurement of these characteristics before (or during) the experiment.
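The adjustment idea behind covariance analysis can be sketched as follows. This is a simplified illustration, not a full analysis of covariance: it assumes a single covariate (each store's pre-experiment weekly sales), a common within-group slope, and invented data; the function name and shelf-height labels are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Adjust each treatment mean for a covariate using the pooled within-group slope.

    groups maps a treatment label to a list of (covariate, response) pairs,
    one pair per store.
    """
    grand_x = mean([x for obs in groups.values() for x, _ in obs])
    sxx = 0.0
    sxy = 0.0
    stats = {}
    for name, obs in groups.items():
        xbar = mean([x for x, _ in obs])
        ybar = mean([y for _, y in obs])
        stats[name] = (xbar, ybar)
        sxx += sum((x - xbar) ** 2 for x, _ in obs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in obs)
    slope = sxy / sxx  # pooled within-group regression slope
    # Shift each group mean to what it would be at the grand covariate mean.
    return {name: ybar - slope * (xbar - grand_x) for name, (xbar, ybar) in stats.items()}


# Invented data: (pre-experiment weekly sales, sales during the experiment).
stores = {
    "eye_level": [(100, 115), (120, 133), (140, 151)],
    "knee_level": [(80, 87), (100, 105), (120, 123)],
}

raw = {name: mean([y for _, y in obs]) for name, obs in stores.items()}
adj = adjusted_means(stores)
print(raw["eye_level"] - raw["knee_level"])  # raw difference in mean sales
print(adj["eye_level"] - adj["knee_level"])  # difference after adjusting for store size
```

In these invented data the eye-level stores were larger to begin with, so part of the raw difference in sales reflects store size rather than shelf height; the adjusted difference is smaller.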
Measurement, Manipulation, and Experimental Procedures
A critical aspect of all experiments, indeed of all marketing research, is measurement. Our concern at this point is with the operational problems of measurement. The concepts, levels, and techniques of measurement and scaling are covered in Chapter 9.
In a marketing experiment, it is the outcome or dependent variable that is measured. Generally, the operational measures used can be classified as physiological or psychological measures. Physiological measures include those obtained from devices that measure such things as eye movements or electrical conductivity of the skin (psychogalvanometer) during advertising readership. Such devices are used in laboratory experiments. Psychological measures include verbal measures, such as spoken and written responses, including responses provided interactively with a personal computer. They may additionally include direct measures such as dollar amounts or units of a product that are sold or consumed, and actual behavior (or assumed actual behavior) of people under the conditions of the experiment. Such measures are obtained by self-report, observation, or electronic means.
Turning now to manipulation, an experimental treatment must be capable of variation. There are at least three ways in which variation in the independent variable can be achieved. First, the type of variable can be manipulated. For example, a company interested in the effect of image on sales could conduct an experiment by running a series of advertisements, each of which was designed to convey a different image of its product. Variation is generated in the type of image conveyed.
Second, there is the presence versus absence technique. For instance, one group of people could be shown a new advertisement and their responses to an attitude measurement could be compared with the responses from a group that did not see the advertisement. Finally, the amount of a variable can be manipulated; different amounts are administered to different groups. This technique is used in such experiments as those where different prices for a product are “tested” and the outcome measured as “units sold.”
As with any approach to marketing research, all phases of an experiment should be carefully planned in advance. After decisions have been made concerning measurement, research subjects, experimental design, control techniques, and manipulation, there is the need to plan everything that will take place in the actual experiment itself through to the end of data collection. This includes the setting of the experiment, physical arrangements, apparatus that will be used, data collection forms, instructions, recording of the dependent variable, and so forth.
In any experiment, control over all possible variables affecting the response is rarely possible. Even in the laboratory it is not possible to control all variables that could conceivably affect the outcome. But compared with the laboratory situation, the researcher working in the marketplace faces a far more difficult control problem. The marketing researcher must try to design the experiment so that the effects of uncontrolled variables do not obscure or bias the nature of the response to the treatment variables that are being controlled.
There are five general techniques for controlling extraneous variables:
1. Elimination of extraneous variables, including controlling the situation in which an experiment is being conducted so as to keep out extraneous forces. Used in this way, control is much easier to achieve in a laboratory environment than in a field setting.
2. Constancy of conditions, including the ability to determine which subjects or test units receive a particular treatment at a particular time. Control of the treatment variable helps separate out the effects attributable to irrelevancies that are correlated with a treatment.
3. Matching (sometimes called balancing). Marketing researchers most often match by equating subjects or by holding variables constant. In equating subjects, each group is controlled for all variables except the treatment variable. For example, if it were felt that gender and age of subject would influence taste test results, each group would have the same gender-and-age proportions. Matching by holding variables constant involves creating constancy for all groups. Again referring to a taste-testing experiment, the gender variable could be controlled by using only male or only female subjects. If time of day is an important extraneous variable that could affect one’s “taste buds,” then the experiment should be conducted at approximately the same time on successive days. Or, if time of day affects subjects’ purchasing behavior (purchase occurs after a certain time), then the experiment should not be conducted before that time; McDonald’s tested pizza for the dinner menu and required that tests be conducted after 4:00 pm. Similarly, using the same facilitator in all versions of the study may control experimenter effects.
4. Counterbalancing is a technique used to control confounding, or the tangling of the effects of two or more levels of a treatment variable (or of two or more treatment variables) used in experimentation. An illustration should make this point clearer. Suppose that a marketing researcher is interested in conducting a series of taste-testing experiments for a new soft drink. Subjective interpretations of, say, “sweetness” may well vary from subject to subject. A control procedure called counterbalancing would have each subject taste each of two drinks on the assumption that ratings will be expressed in terms of differences in sweetness within each subject. To avoid “ordering” effects on responses, the new and the control drinks would be presented in randomized order, or one-half of the group would follow the sequence “established-new” while the other one-half would use the sequence “new-established.” To reduce carryover tendencies, the subject would be asked to take a sip of water between testing trials.
5. Randomization provides assurance that known and unknown extraneous factors will not cause systematic bias. Statisticians have made a major contribution to experimental design in the development of statistical models that feature randomization over uncontrolled variables. Consequently, it is assumed that the effects of the extraneous variables will affect all groups in an experiment to the same extent. In theory, randomization is supposed to accomplish the conversion of all irrelevant sources of possibly systematic variability into unsystematic variability, i.e., into random error (Brown and Melamed, 1990, p. 3).
The techniques discussed above for controlling extraneous variables are a key part of all experiments. The fact remains, however, that confounding effects can never be entirely eliminated.
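Two of the control techniques listed above, randomization and counterbalancing, lend themselves to a short sketch. The function names, subject identifiers, and seed values below are assumptions made for illustration; the sketch shows one simple way to assign subjects at random to groups and to split a panel evenly between the two tasting sequences.

```python
import random


def assign_randomly(subjects, treatments, seed=None):
    """Shuffle subjects, then deal them out to treatment groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


def counterbalance(subjects, seed=None):
    """Give a random half of the subjects each of the two tasting orders."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for subject in shuffled[:half]:
        orders[subject] = ("established", "new")
    for subject in shuffled[half:]:
        orders[subject] = ("new", "established")
    return orders


# Hypothetical panel of 20 tasters.
panel = [f"subject_{i:02d}" for i in range(1, 21)]

groups = assign_randomly(panel, ["experimental", "control"], seed=7)
orders = counterbalance(panel, seed=7)

print({name: len(members) for name, members in groups.items()})
print(sum(1 for order in orders.values() if order[0] == "new"))
```

Because chance alone decides who lands in which group or sequence, no characteristic of the subjects, measured or unmeasured, can line up systematically with the treatment.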
Sources of Invalidity
In Chapter 3, we briefly introduced experimental errors as the extraneous forces that can affect the outcome of an experiment. Each extraneous force can bear on the validity of an experiment and, consequently, may threaten its results.
In the context of experimentation, the term validity refers to the extent to which we really observe (or measure) what we say we observe (or measure). Four distinct types of validity have been identified (Cook and Campbell, 1990, chap. 2):
· Statistical conclusion
· Internal
· Construct
· External
A necessary condition for inferring causation is that there be covariation between the independent and dependent variables. Statistical conclusion validity involves the specific question as to whether the presumed independent variable, X, and the presumed dependent variable, Y, are indeed related (Rosenthal and Rosnow, 1991, chap. 3). After it has been determined that the variables covary, the question arises as to whether they are causally related. This is the essence of internal validity. A given experiment is internally valid when the observed effect is due solely to the experimental treatments and not to some extraneous variables. In short, internal validity is concerned with how good the experiment is, as an experiment. The third type of validity, construct validity, is essentially a measurement issue. The issue revolves around the extent to which generalizations can be made about higher-order constructs from research operations and is applicable to causes and effects. Because construct validity is concerned with generalization, it is a special aspect of external validity. External validity, however, refers to the ability to generalize a relationship beyond the circumstances under which it is observed. That is, external validity is concerned with the degree to which the conclusions of an experiment can be applied to and across populations of persons, settings, times, and so on.
To a large extent, the four kinds of validity are not independent of each other. That is, ways of increasing one kind may decrease another kind. Consequently, in planning an experiment it is essential that validity types be prioritized, and this varies with the kind of research being done. For applied marketing research, it has been observed that the priority ordering is internal, external, construct of the effect, statistical conclusion, and construct of the cause (Cook and Campbell, 1990). Accordingly, we now examine those extraneous factors that affect internal and external validity. Construct validity is discussed in Chapter 9, and factors affecting statistical conclusion validity are covered throughout the later sections of this book.
Internal Validity
Internal validity is concerned with whether or not the observed effect is due solely to the experimental treatments or due to some other extraneous variables. The kind of evidence that is required to support the inference that independent variables other than the one(s) used in an experiment could have caused the observed effect(s) varies, depending on the independent variable(s) being investigated. However, there are some general classes of variables affecting designs that deserve mention. The following factors affect internal validity:
1. History. An extraneous event that takes place between the pre-measurement and post-measurement of the dependent variable has an impact on the results.
2. Maturation. The results of an experiment are contaminated by changes within the participants with the passage of time.
3. Testing. A prior measurement can have an effect on a later measurement.
4. Instrumentation. Changes over time in the measuring instrument or process, including interviewers’ instructions, affect the results.
5. Selection. Different kinds of research subjects have been selected for at least one experimental group than have been selected for other groups.
6. Mortality. Different types of persons drop out from experimental groups during the course of an experiment.
7. Statistical Regression. This error may arise when experimental groups have been selected on the basis of extreme pretest scores or correlates of pretest scores.
Example:
To illustrate each of these factors, suppose that a controlled experiment is set up for salesperson retraining. Assume one group of salespeople had taken a retraining course during a three-month period, while another group had not been retrained. The brand manager of a particular brand of detergent wants to determine if retraining is a producer (i.e., a cause) of sales performance. During the three-month period after retraining, sales of the detergent showed an unusually large increase.
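The statistical regression factor can be illustrated with a short simulation. The sketch assumes, hypothetically, that salespeople are chosen for retraining because their sales in a prior period were unusually low, and that retraining has no effect at all; the sales model, its parameters, and the function name are all invented for illustration.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=1000, share_selected=0.10, seed=1):
    """Return mean sales before and after for the lowest prior performers.

    Each salesperson has a stable ability plus independent period-to-period
    noise. No treatment effect is simulated.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(100, 10) for _ in range(n)]
    before = [a + rng.gauss(0, 10) for a in ability]
    after = [a + rng.gauss(0, 10) for a in ability]
    k = int(n * share_selected)
    # Select the k salespeople with the lowest sales in the "before" period.
    lowest = sorted(range(n), key=lambda i: before[i])[:k]
    return mean([before[i] for i in lowest]), mean([after[i] for i in lowest])


before_mean, after_mean = simulate_regression_to_mean()
print(round(before_mean, 1), round(after_mean, 1))
```

The selected group's average rises in the second period even though nothing was done to it, because part of what made its first-period sales extreme was chance. In a real study, an increase of this kind could be mistaken for an effect of retraining.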
Some of the foregoing error sources affecting internal validity can interact with selection to produce forces that might appear to be treatment effects. Selection-maturation results when experimental groups mature at different speeds. Selection-history can occur when the experimental groups come from different settings. For example, the salespeople receiving the retraining all come from one region of the country, whereas those not retrained come from some other region. In this situation, each group may have a unique local history that might affect outcome variables.
External Validity