1. In this course, we are focusing on the major conceptual issues regarding research design (i.e., how to frame a study, convert theories into hypotheses, reliably and validly operationalize concepts as variables, develop measurement instruments, design questionnaires, draw a sample, collect data, and contrast the value of different data collection approaches such as surveys, experiments, archival research and ethnographic studies). This week, we will consider issues regarding collecting data via an experimental design.
2. According to Rubin and Babbie, data collection premised on an experimental design is the most effective way to engage in controlled observation, isolate causality, and make a causal inference about how an independent (causal) factor affects a dependent (caused) factor. Experimental designs are the best way to meet the three basic criteria for establishing causality, which sound simple but can be difficult to meet. First, the cause should precede the effect. Second, the two variables should be empirically correlated, varying together such that when one changes in score so does the other, positively or negatively, as when height or weight is correlated with age among healthy young children or when self-esteem scores vary with income. Third, the empirical correlation between the two variables should not be spurious, i.e., it cannot be accounted for by other variables that cause the two to vary together, as when a correlation between income and self-esteem is found to be spurious because both are caused by job status.
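The third criterion can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are invented, job status is assumed to be the sole common cause of income and self-esteem, and the helper names (pearson, partial_corr) are hypothetical. It compares the raw correlation between income and self-esteem with the partial correlation that remains once job status is held constant.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented data: job status drives both income and self-esteem,
# and income has no direct effect on self-esteem.
rng = random.Random(0)
job_status = [rng.gauss(0, 1) for _ in range(5000)]
income = [s + rng.gauss(0, 1) for s in job_status]
self_esteem = [s + rng.gauss(0, 1) for s in job_status]

raw_r = pearson(income, self_esteem)
controlled_r = partial_corr(income, self_esteem, job_status)

print(f"raw correlation:            {raw_r:.2f}")
print(f"controlling for job status: {controlled_r:.2f}")
```

In this made-up scenario the raw correlation is clearly positive, while the correlation controlling for job status falls close to zero, which is what we would expect of a spurious relationship.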
3. In the classical experimental design, an experimental group is exposed to the causal or independent factor, be it a stimulus, treatment, intervention, service or what have you, while a control group is not given that stimulus or treatment. Most often, the people or subjects being studied are randomly assigned to the two groups. Sometimes matched pairs of similar individuals are randomly assigned, with one person from each pair going to the experimental group and the other going to the control group. It is important to note that randomization (random assignment to groups) is not the same as random selection (random sampling from a population). The former is emphasized in experiments to ensure internal validity, or a valid test of a cause and effect relationship, by trying to make sure that the groups compared are equivalent in all respects except for the experimental treatment. The latter is emphasized in cross-sectional studies like surveys to promote external validity, or generalizability to the larger population the sample represents.
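The two assignment procedures just described can be sketched in a few lines. This is a minimal sketch, not a prescribed procedure: the subject IDs are invented, the pairing of adjacent IDs stands in for whatever matching criteria a real study would use, and the function names (random_assign, matched_pair_assign) are hypothetical.

```python
import random


def random_assign(subjects, rng):
    """Shuffle the subjects and split them into two equal-sized groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def matched_pair_assign(pairs, rng):
    """For each matched pair, a coin flip decides who gets the treatment."""
    experimental, control = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        experimental.append(first)
        control.append(second)
    return experimental, control


rng = random.Random(42)
subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 invented subject IDs

# Simple random assignment.
experimental, control = random_assign(subjects, rng)

# Matched pairs: here adjacent IDs are assumed to be "similar" individuals.
pairs = [(subjects[i], subjects[i + 1]) for i in range(0, len(subjects), 2)]
pair_exp, pair_ctl = matched_pair_assign(pairs, rng)

print("experimental:", experimental)
print("control:     ", control)
```

Note that both functions only decide who goes into which group. Neither says anything about how the 20 subjects were chosen in the first place, which is the separate question of random selection.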
4. Experiments usually involve testing to see if the people in an experimental group are affected differently than the people in the control group. The ideal controlled experiment is not always feasible, especially in social work settings out in the real world of practice and service delivery. Therefore, we need to consider not just experimental designs but also quasi-experimental designs.
5. The main advantage of experimental designs is that they systematically account for threats to internal validity, i.e., pitfalls that may impede the valid documentation of a relationship between a causal variable and its effect on some other factor. The major threats to internal validity include:
history: extraneous events occur over the course of the time that elapses while an experiment is ongoing, and they, not the independent causal factor, affect the dependent or caused factor;
maturation: the subjects being studied change due to aging or the passage of time (without being affected by either the causal factor or extraneous events);
testing: the mere fact of being tested creates the effect that you are looking for; pre-tests or other testing prime the subjects so that they answer differently on a post-test or other later tests than they would otherwise, and subjects can improve their performance on a post-test simply by having taken the pre-test;
instrumentation: the testing is done in a way that affects how people are scored, especially where the test scoring method changes, as in asking different questions or using different standards of judgment, such that it looks like people changed but they really did not;
statistical regression: often called regression to the mean, where the people being studied are first tested when they are likely to have extreme scores that will lapse back toward their more normal or average scores later on when they are given a post-test;
selection bias: where people or subjects are not randomly assigned to groups or are not assigned in a way that ensures the groups are comparable;
experimental mortality: where selected subjects drop out of the study;
ambiguity about the direction of causal influence: where there remains confusion about the causal order such that we cannot be sure that the cause precedes the effect, as when those who completed a drug treatment program may have already chosen abstinence, such that giving up drugs preceded and in fact promoted successful completion of the treatment program rather than successful completion of treatment engendering abstinence;
diffusion or imitation of treatment: where subjects in the control group are not really treated that differently because the treatment reserved for the experimental group is spread to the control group or is imitated.
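Of the threats listed above, statistical regression is perhaps the least intuitive, so a small simulation may help. The sketch below rests on invented numbers and a simple assumed model in which each observed score is a stable true score plus random measurement noise. No treatment is given to anyone; we simply select the top 10 percent of pre-test scorers and look at their post-test scores.

```python
import random
import statistics

rng = random.Random(7)
n = 10000

# Assumed model: observed score = stable true score + random noise.
true_scores = [rng.gauss(50, 10) for _ in range(n)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]  # no treatment at all

# Select the subjects with the most extreme (top 10 percent) pre-test scores.
cutoff = sorted(pretest)[int(0.9 * n)]
selected = [i for i in range(n) if pretest[i] >= cutoff]

pre_mean = statistics.mean([pretest[i] for i in selected])
post_mean = statistics.mean([posttest[i] for i in selected])

print(f"selected subjects, pre-test mean:  {pre_mean:.1f}")
print(f"selected subjects, post-test mean: {post_mean:.1f}")
```

Under these assumptions the selected group scores noticeably lower on the post-test even though nothing was done to them. Without a comparison group chosen the same way, that drop could easily be mistaken for a treatment effect.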
6. Various designs, including the classic two-group pre/post-test design, the two-group post-test only design, and the Solomon four-group design, provide different ways to counter these threats to internal validity and increase the chances of successfully engaging in controlled observation and isolating causality. They are all based on random assignment, or matching with random assignment, in order to establish equivalent groups.
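To see how a control group counters threats such as history and maturation in the two-group pre/post-test design, consider the following sketch. Everything in it is an assumption made for illustration: the true treatment effect (5 points), the shared drift that affects both groups regardless of treatment (3 points), the group sizes, and the hypothetical function simulate_group.

```python
import random
import statistics

TRUE_EFFECT = 5.0   # assumed effect of the treatment
SHARED_DRIFT = 3.0  # assumed change from history/maturation, hits both groups


def simulate_group(n, treated, rng):
    """Return each subject's gain (post-test minus pre-test)."""
    gains = []
    for _ in range(n):
        pre = rng.gauss(50, 10)
        effect = TRUE_EFFECT if treated else 0.0
        post = pre + SHARED_DRIFT + effect + rng.gauss(0, 2)
        gains.append(post - pre)
    return gains


rng = random.Random(1)
experimental_gains = simulate_group(2000, True, rng)
control_gains = simulate_group(2000, False, rng)

# Looking only at the experimental group mixes the drift into the estimate.
naive_estimate = statistics.mean(experimental_gains)

# Subtracting the control group's gain removes what both groups share.
controlled_estimate = naive_estimate - statistics.mean(control_gains)

print(f"experimental gain alone:     {naive_estimate:.1f}")
print(f"gain relative to the control: {controlled_estimate:.1f}")
```

In this made-up case the experimental group's gain alone overstates the treatment effect, because it includes the drift. The difference between the two groups' gains comes close to the assumed effect, since the drift is common to equivalent groups and cancels out.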
7. Yet, even if the threats to internal validity are effectively countered, experiments tend to have a harder time ensuring external validity, or generalizability to a broader population. One major problem is that there tends to be a tradeoff between internal validity and external validity. The more the experiment is controlled, the more artificial the situation is and the less generalizable it is. Experiments also tend to have the problem of research reactivity, where subjects behave differently than they normally would outside of the experiment because they are reacting to being in an experiment. This is sometimes referred to as the "Hawthorne Effect."
8. To address this problem, sometimes a placebo design is used, where one group is given what seems to be the treatment but is not. This can provide data on the extent to which people are changing their behavior simply due to being in the experiment. Yet this is not always feasible, and sometimes it is hard to distinguish the placebo effect from the actual treatment effect, especially with counseling and other treatments whose effects resemble those of people sensing that someone is paying increased attention to them, as in an experiment.
9. Given the problems of generalizability, sometimes cross-sectional studies, like surveys of large probability samples, are preferred. Post-hoc statistical control through multivariate analysis of the survey data, drawing on extended survey questions about many different background factors, is used to try to approximate the conditions for making causal inferences.
10. Classic experiments comparing equivalent groups can be done in the field, but the field conditions of real-world service delivery can complicate the process of ensuring a true experiment with equivalent groups, random assignment, and controls for threats to internal validity. Field experiments present extra challenges.
11. Classic experiments are themselves not always feasible, especially where we want to study actual programs, services, treatments, etc., as they are being administered in the real world. This is especially the case when it is difficult or impossible to randomly assign clients and create equivalent groups, or where there are ethical issues regarding deciding who should or should not get access to a service. Here we turn to quasi-experimental designs. These studies can take various forms to try to create the conditions for controlled observation even where there cannot be random assignment and equivalent groups: (1) simple interrupted time-series, where effects are measured at multiple points in time before and after the intervention; (2) multiple interrupted time-series, where nonequivalent groups are both measured at multiple points in time; and (3) the basic nonequivalent control group design, where two nonequivalent groups, like two different schools or two different programs, are compared.
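The logic of a simple interrupted time-series can be sketched as follows. This is a deliberately simplified illustration, not a full analysis: the monthly scores are invented, the helper fit_line is hypothetical, and the approach shown (projecting the pre-intervention trend forward and comparing it with what was actually observed) is only one rough way to read such data.

```python
def fit_line(times, values):
    """Ordinary least-squares slope and intercept for a straight-line trend."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
             / sum((t - mean_t) ** 2 for t in times))
    return slope, mean_v - slope * mean_t


# Invented monthly scores before and after a program is introduced.
pre_scores = [20, 22, 21, 23, 24, 25]
post_scores = [21, 22, 22, 24]

pre_times = list(range(len(pre_scores)))
post_times = list(range(len(pre_scores), len(pre_scores) + len(post_scores)))

# Fit the trend that was already under way before the intervention.
slope, intercept = fit_line(pre_times, pre_scores)

# Project that trend forward: what we would expect had nothing changed.
projected = [intercept + slope * t for t in post_times]

# Average gap between observed and projected post-intervention scores.
level_change = (sum(obs - proj for obs, proj in zip(post_scores, projected))
                / len(post_scores))

print(f"pre-intervention trend per month: {slope:.2f}")
print(f"average shift after intervention: {level_change:.2f}")
```

Having several measurement points before the intervention is what makes this possible: a single pre-test could not tell us whether scores were already rising, falling, or flat. Even so, without a comparison group, history remains a plausible rival explanation for any shift.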
12. Given the limitations of experiments, triangulation, i.e., measuring the same effect in multiple ways, is advised. Triangulation in the sense of supplementing quantitative research with qualitative observation techniques is important as well. Otherwise, pitfalls in measurement and observation associated with experiments may lead you to mischaracterize the causal relationship you want to examine.