RESEARCH METHODS IN PSYCHIATRY


Confronting the Confounders:
The Meaning, Detection, and Treatment of
Confounders in Research

Anne E Rhodes, MSc1, Elizabeth Lin, PhD2, David L Streiner, PhD, CPsych3


When one variable is studied to try to explain another, the relationship between them may be biased by a third variable. This bias, known as “confounding,” is common and must be minimized in research. This description is deceptively simple, though. Identifying confounding is complex but can be reduced to a stepped procedure. By way of examples, this article describes confounding and how to recognize it.

(Can J Psychiatry 1999;44:175–179)

Key Words: confounding, bias, statistics, epidemiologic methods

What Is Confounding?

If you read enough about research, you will most likely have come across the term “confounding.” Although you can easily find definitions in standard epidemiology textbooks (1–4), the concept is by no means straightforward. To familiarize ourselves with confounding, let’s begin with some examples. Say that we are interested in testing whether caring for an ill family member “causes” depression (Figure 1).

Figure 1. Relationship between caregiving and depression

Our first step is to design a study. Ideally, we would conduct a randomized, controlled trial in which one-half of our participants are assigned to caregiving and one-half are not. But who would agree to participate in such a study? Because such a trial is not feasible, we need instead to conduct an observational study, such as a cohort, case-control, or cross-sectional study. For this paper, we will assume that the study has been well executed and that we have the data. Also, we will assume that our results show an association between being a caregiver and depression but that in reality there is no such relationship. Why would this happen? It is possible that another factor, such as gender, could produce a spurious association between being a caregiver and depression. Because we are not able to randomly allocate people to caregiving or not, we cannot prevent having unequal numbers of men and women among our caregivers. Since caregivers are often women (5), more caregivers will be women than men in our study. Since women are also more likely than men to suffer from depression (6), our data will show that caregiving is related to depression, when really it is not (Figure 2). That is, the “real” association is between gender and depression, and any other variable that is correlated with gender, such as wearing skirts or putting the toilet seat down, will also be correlated with depression.

Figure 2. Gender as a confounder, leading to a relationship
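
A small numerical sketch may make the spurious association concrete. The counts below are invented for illustration and do not come from any study: women make up 80% of the caregivers and are more often depressed than men, but within each gender, caregivers and noncaregivers are depressed at exactly the same rate. The names `counts` and `depression_rate` are hypothetical.

```python
# Hypothetical counts, invented for illustration only.
# (gender, is_caregiver) -> (number depressed, number of people)
counts = {
    ("women", True): (24, 80),
    ("women", False): (6, 20),
    ("men", True): (2, 20),
    ("men", False): (8, 80),
}


def depression_rate(is_caregiver, gender=None):
    """Proportion depressed among caregivers or noncaregivers.

    With gender=None the rate ignores gender (the crude rate);
    otherwise it is computed within that gender only.
    """
    cells = [
        cell
        for (g, c), cell in counts.items()
        if c == is_caregiver and gender in (None, g)
    ]
    depressed = sum(d for d, _ in cells)
    people = sum(n for _, n in cells)
    return depressed / people


crude_gap = depression_rate(True) - depression_rate(False)
gap_by_gender = {
    g: depression_rate(True, g) - depression_rate(False, g)
    for g in ("women", "men")
}

print(f"Crude gap: {crude_gap:.2f}")
for g, gap in gap_by_gender.items():
    print(f"Gap among {g}: {gap:.2f}")
```

With these counts, caregivers look 12 percentage points more depressed than noncaregivers overall, yet the gap is zero among women and zero among men.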

This example shows how a third variable can make 2 other variables appear to be associated when they are not. Conversely, sometimes a third variable can actually hide or mask a real association rather than produce a false one. For example, suppose in another study, we find that gender is not associated with going to see a therapist. We are automatically suspicious, since many other studies have shown that women typically seek treatment more often than do men (7). When we look at the data more carefully, we find that our sample has many low-income women in comparison to men. (Our study design did not attempt to control for income differences.) Since people who have high incomes are more likely to see a therapist, the women’s lower incomes offset their greater tendency to seek treatment, and the association between gender and going to see a therapist appears only when we control for income. In this example, income actually hides a real association between gender and going to see a therapist, until we control for it (Figure 3).

Figure 3. Income as a confounder, masking a relationship
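
The masking effect can be sketched the same way. In the invented counts below, women are more likely than men to see a therapist at each income level, but most of the women have low incomes, so the overall rates for men and women come out the same. All numbers and names are assumptions made for illustration.

```python
# Hypothetical counts, invented for illustration only.
# (gender, income) -> (number who saw a therapist, number of people)
counts = {
    ("women", "low"): (20, 100),
    ("women", "high"): (10, 20),
    ("men", "low"): (6, 60),
    ("men", "high"): (24, 60),
}


def therapy_rate(gender, income=None):
    """Proportion seeing a therapist, overall or within one income level."""
    cells = [
        cell
        for (g, i), cell in counts.items()
        if g == gender and income in (None, i)
    ]
    return sum(seen for seen, _ in cells) / sum(n for _, n in cells)


crude_gap = therapy_rate("women") - therapy_rate("men")
gap_by_income = {
    level: therapy_rate("women", level) - therapy_rate("men", level)
    for level in ("low", "high")
}

print(f"Crude gap (women minus men): {crude_gap:.2f}")
for level, gap in gap_by_income.items():
    print(f"Gap at {level} income: {gap:.2f}")
```

Here the crude gap is zero, while women lead by 10 percentage points within both the low-income and the high-income group.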

These examples show that sometimes there is a third variable, or a confounder, that hinders testing a hypothesis of interest. Let’s see how it happens and what can be done about it.

The Lurking Confounder

If we knew the literature well, we might suspect the confounders described in the above examples. But, as in real-life police work, capturing and disarming a confounder is more complicated than identifying a suspect from a lineup. We need to differentiate whether our suspect really is a confounder or merely an innocent bystander.

In the depression and caregiving example, our suspect, gender, is associated with depression and caregiving. So gender looks guilty because it keeps regular company with both depression and caregiving. In the gender and therapy-going example, our suspect, level of income, is linked with both gender and therapy-going. The case against our suspects is building, but we don’t yet have enough information to convict them. Determining whether a suspected variable is associated with both the independent and dependent variables of interest is the first step in assessing confounding.

If the suspect is associated only with the dependent variable or only with the independent variable but not both, then it is not a true confounder. More subtly, the confounder must be related to the independent variable regardless of the dependent variable, and the confounder must be related to the dependent variable regardless of the independent variable. So in the caregiving example, gender is related to being a caregiver in those who are depressed and in those who are not depressed. Also, gender is associated with depression in caregivers and noncaregivers.
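
The subtler version of this first step can be checked by tabulating the data both ways. The sketch below uses invented counts and a hypothetical helper, `proportion`, to ask whether gender is related to caregiving among depressed and nondepressed people separately, and whether gender is related to depression among caregivers and noncaregivers separately.

```python
# Hypothetical counts, invented for illustration only.
# (gender, caregiver, depressed) -> number of people
people = {
    ("women", True, True): 24,
    ("women", True, False): 56,
    ("women", False, True): 6,
    ("women", False, False): 14,
    ("men", True, True): 2,
    ("men", True, False): 18,
    ("men", False, True): 8,
    ("men", False, False): 72,
}
FIELDS = ("gender", "caregiver", "depressed")


def proportion(target, **fixed):
    """Share of people with `target` True among those matching `fixed`."""
    total = hits = 0
    for key, n in people.items():
        row = dict(zip(FIELDS, key))
        if all(row[name] == value for name, value in fixed.items()):
            total += n
            if row[target]:
                hits += n
    return hits / total


# Is gender related to caregiving regardless of depression?
for depressed in (True, False):
    women = proportion("caregiver", gender="women", depressed=depressed)
    men = proportion("caregiver", gender="men", depressed=depressed)
    print(f"depressed={depressed}: caregivers are {women:.0%} of women, {men:.0%} of men")

# Is gender related to depression regardless of caregiving?
for caregiver in (True, False):
    women = proportion("depressed", gender="women", caregiver=caregiver)
    men = proportion("depressed", gender="men", caregiver=caregiver)
    print(f"caregiver={caregiver}: depressed are {women:.0%} of women, {men:.0%} of men")
```

In these made-up data, women are more often caregivers than men whether or not they are depressed, and more often depressed than men whether or not they are caregivers, so gender passes the first step.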

Often in our analyses, we come upon variables that are related to one another, and we are not sure which ones are independent variables and which ones might be confounders. The study’s hypotheses and purpose are key to helping us decide. If our goal is simply to predict the dependent variable, then confounding is not an issue. We can happily proceed using stepwise regression (8) or some other technique to select the best predictors. However, if our goal is to explain what causes the dependent variable, then we need to focus on a specific independent variable and worry about confounding.

Let’s use an example to illustrate the difference between prediction and explanation. We have observed a strange neuropsychological phenomenon: on Monday nights, some men become loud and obnoxious. After some observation, we venture that this behaviour is connected to watching football. However, not all men who watch football act this way. After more study, we narrow our hypothesis to the association of 2 substances with this behaviour: potato chips and beer. We can stop here if we are not interested in what causes the behaviour; we just want to identify these men so that we can stay away from them! In this instance, whenever we see a man eating chips and/or drinking beer, we know to head in the opposite direction. However, if we decide that we want to know why some men act this way so as to stop it, then we have to do a bit more detective work. It occurs to us that men who just eat chips behave fine. However, if men eat chips and drink beer together, they are troublesome. Come to think of it, they’re a pain if they just drink beer and don’t eat any chips. We begin to realize that potato chips, on their own, are not associated with annoying behaviours. They only appear to be a problem because they often accompany beer drinking (Figure 4).

Figure 4. Effects on behaviour of eating chips and drinking beer

Alcohol is the likely suspect causing the behaviour. This fits with what we know about the effects of alcohol on the brain; after all, when were we last warned not to eat potato chips and drive? So to return to our study goal, to explain the dependent variable, annoying behaviour, we would focus on beer drinking rather than chip eating. However, we need to consider whether eating chips is a confounding variable. In this example, it is not a confounder, because we find that it is related to the independent variable (beer drinking) but, once beer drinking is taken into account, not to the dependent variable (raucousness), so we do not have to control for it. In fact, we would be wise not to control for it. For example, if we put both drinking beer and eating potato chips in an equation to explain annoying behaviours, eating potato chips would steal some of the thunder (variance) from drinking beer because they are highly associated, and beer drinking would not look as important as it should.
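
The difference between prediction and explanation in the chips-and-beer story can be sketched with a few invented counts. The numbers and the helper `obnoxious_rate` are assumptions for illustration; the sketch covers the tabulation only, not the regression point about shared variance.

```python
# Hypothetical counts of football watchers, invented for illustration only.
# (eats_chips, drinks_beer) -> (number who turn obnoxious, number of men)
counts = {
    (True, True): (48, 60),
    (False, True): (16, 20),
    (True, False): (2, 20),
    (False, False): (6, 60),
}


def obnoxious_rate(chips=None, beer=None):
    """Proportion obnoxious among men matching the given habits (None = any)."""
    cells = [
        cell
        for (c, b), cell in counts.items()
        if chips in (None, c) and beer in (None, b)
    ]
    return sum(k for k, _ in cells) / sum(n for _, n in cells)


# Prediction: chip eaters are worth avoiding.
print(f"Chips: {obnoxious_rate(chips=True):.3f}, no chips: {obnoxious_rate(chips=False):.3f}")

# Explanation: among beer drinkers (and among abstainers), chips change nothing.
for beer in (True, False):
    with_chips = obnoxious_rate(chips=True, beer=beer)
    without_chips = obnoxious_rate(chips=False, beer=beer)
    print(f"beer={beer}: chips {with_chips:.2f}, no chips {without_chips:.2f}")
```

Chips predict trouble (62.5% versus 27.5% overall) only because chip eaters are mostly beer drinkers; once the men are split by beer drinking, the rates with and without chips are identical.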

Let’s return to our caregiving and depression example. Because we are interested in explaining why people get depressed and not simply who gets depressed, we cannot shy away from the possibility that gender confounds the relationship between caregiving and depression. The next step in our detective work requires some knowledge or theory about the relationships between the variables under study. We are relatively certain that caregiving does not cause gender (Figure 2). But, putting gender aside for a moment, what if our suspected variable were social isolation? It is quite possible that being a caregiver causes one to become socially isolated and that social isolation causes depression. Here, being a caregiver is an indirect cause of depression (Figure 5).

Figure 5. Social isolation in the causal pathway between caregiving and depression

Although social isolation is related to both caregiving and depression in this example, social isolation is not a confounder, because it is in the causal pathway between caregiving and depression; that is, caregiving leads to isolation, which in turn leads to depression. We would not want to control for social isolation, because the association between caregiving and depression could disappear, leading us to think, incorrectly, that caregiving does not cause depression. We have just identified the second step in identifying a confounder. A confounder gets in the way of understanding relationships and must be controlled for; it is not part of a causal pathway (1).
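
To see why the causal story matters, consider the following sketch, again with invented counts and hypothetical names. Caregivers are more often isolated, and isolated people are more often depressed. Stratifying by isolation makes the caregiving gap vanish, even though, by assumption, caregiving is what produced the isolation.

```python
# Hypothetical counts, invented for illustration only.
# (is_caregiver, is_isolated) -> (number depressed, number of people)
counts = {
    (True, True): (28, 70),
    (True, False): (3, 30),
    (False, True): (8, 20),
    (False, False): (8, 80),
}


def rate(cells):
    """Proportion depressed across a list of (depressed, people) cells."""
    return sum(d for d, _ in cells) / sum(n for _, n in cells)


def caregiving_gap(isolated=None):
    """Caregiver minus noncaregiver depression rate.

    isolated=None gives the crude gap; True or False restricts the
    comparison to one level of social isolation.
    """
    levels = (True, False) if isolated is None else (isolated,)
    caregivers = [counts[(True, level)] for level in levels]
    noncaregivers = [counts[(False, level)] for level in levels]
    return rate(caregivers) - rate(noncaregivers)


crude_gap = caregiving_gap()
gap_if_isolated = caregiving_gap(isolated=True)
gap_if_not_isolated = caregiving_gap(isolated=False)

print(f"Crude gap: {crude_gap:.2f}")
print(f"Gap among the isolated: {gap_if_isolated:.2f}")
print(f"Gap among the nonisolated: {gap_if_not_isolated:.2f}")
```

The arithmetic looks just like the gender example: a crude gap of 15 percentage points shrinks to zero within strata. The numbers alone cannot tell us whether the third variable is a confounder or a link in the causal chain; that judgement comes from theory.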

Our detective work is almost done. In the caregiving and depression example, gender meets the first 2 criteria for being a confounder (that is, it is related to both the dependent and independent variables, and it is not part of the causal pathway). The last step is probably the most difficult to comprehend. In our caregiving study, assume that we have measured depression using the Center for Epidemiologic Studies-Depression (CES-D) scale (9), scores for which can range between 0 and 60. We find that the mean CES-D score for caregivers in our sample is 25, whereas the mean score for noncaregivers is only 5. We are initially encouraged because this large difference fits our hypothesis that caregivers are more depressed (Figure 6). Next, we compare the mean CES-D scores for caregivers and noncaregivers, but now broken down by gender (Figure 7). In this graph, we see that women have higher mean depression scores than do men but that the difference between caregivers and noncaregivers, whether they are men or women, is small and constant; that is, 2 points. In fact, the 2 lines are parallel. A constant relationship between the independent and dependent variables (here, a constant difference in depression scores between caregivers and noncaregivers) at all levels of the potential confounder is the final piece of evidence we need to identify a true confounder.

Figure 6. Relationship between caregiving and depression using a continuous measure

Figure 7. Caregiving and depression by gender

If the difference were not constant (that is, the lines were not parallel), then this would be evidence of an interaction rather than confounding. Here, the relationship between caregiving and depression would be different for men and women; that is, the distance between the lines for caregivers and noncaregivers would not be the same for men and women. Therefore, we would need to talk about the relationship between caregiving and depression separately for men and women. However, because the relationship is the same for men and women, we can remove these confounding effects (for more information about interactions see [1]). Since gender has met all 3 confounding criteria, we need to control for it to accurately estimate the true association between caregiving and depression.
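
The “parallel lines” check can be written down in a few lines. In the sketch below, only the 2-point gap comes from the text; the individual cell means, the second (interaction) pattern, the tolerance, and the function names are all assumptions made for illustration.

```python
# Hypothetical cell means (CES-D points). Only the 2-point gap comes from the
# text; the individual means are assumptions made for illustration.
confounding_only = {
    "women": {"caregiver": 26.0, "noncaregiver": 24.0},
    "men": {"caregiver": 6.0, "noncaregiver": 4.0},
}
# A second, purely invented pattern in which the gap differs by gender.
with_interaction = {
    "women": {"caregiver": 32.0, "noncaregiver": 24.0},
    "men": {"caregiver": 6.0, "noncaregiver": 4.0},
}


def gaps_by_stratum(cell_means):
    """Caregiver minus noncaregiver mean within each level of the third variable."""
    return {
        level: means["caregiver"] - means["noncaregiver"]
        for level, means in cell_means.items()
    }


def lines_are_parallel(cell_means, tolerance=0.5):
    """True when the gap is (nearly) the same in every stratum.

    The tolerance is an arbitrary illustrative threshold, not a statistical test.
    """
    gaps = list(gaps_by_stratum(cell_means).values())
    return max(gaps) - min(gaps) <= tolerance


print(gaps_by_stratum(confounding_only), lines_are_parallel(confounding_only))
print(gaps_by_stratum(with_interaction), lines_are_parallel(with_interaction))
```

In the first pattern the gap is 2 points for both women and men, so a single adjusted estimate makes sense. In the second, the gap is 8 points for women and 2 for men, and the relationship would have to be reported separately by gender.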

In this example, we looked at differences between the means, because our dependent variable, the CES-D, is a continuous variable. For categorical dependent variables, the same rules apply, but the detection method differs. Suppose we were dealing with a dichotomous dependent variable, such as whether the person is diagnosed as having a mood disorder or not. Instead of examining the differences between the means, we would examine differences between measures of risk, such as the odds ratio or the relative risk (10).
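
For readers who have not met these measures, here is a minimal sketch of how the relative risk and the odds ratio are computed from a 2 × 2 table. The counts are invented, and the function names are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases * unexposed_total) / (unexposed_cases * exposed_total)


def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Odds of the outcome in the exposed group divided by odds in the unexposed."""
    exposed_noncases = exposed_total - exposed_cases
    unexposed_noncases = unexposed_total - unexposed_cases
    return (exposed_cases * unexposed_noncases) / (unexposed_cases * exposed_noncases)


# Hypothetical counts: 30 of 100 caregivers and 10 of 100 noncaregivers
# are diagnosed with a mood disorder.
rr = relative_risk(30, 100, 10, 100)
oratio = odds_ratio(30, 100, 10, 100)

print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio: {oratio:.2f}")
```

With these counts the relative risk is 3.00 and the odds ratio is 3.86. To look for confounding, we would compute the same measure within each level of the suspected confounder and compare it with the overall value.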

It should be amply clear now why confounding can be a confusing concept to the uninitiated. It is not an intuitive process; it requires some understanding of statistics—the specific criteria that distinguish a confounder from other types of relationships occurring among independent, dependent, and “suspect” variables (1–4)—as well as substantive knowledge of the clinical area being studied. Most of us would probably like to avoid confounding altogether. This would be a shame, though, because we would either restrict ourselves to prediction types of questions or conduct explanatory studies that fall short by not controlling for confounding. So, read on.

Spotting a Confounder

Now that we’ve discussed what confounding is and when it matters, let’s learn to recognize it. First and foremost, confounding can (and should) be considered at the design phase of a study. For example, in many observational studies, investigators use matching, stratification, or restriction of the study sample to control for confounders (such as age and gender). In clinical trials, to avoid confounding, randomization attempts to make the groups as similar as possible on all variables that may be confounders. Multivariate analyses may also be used to control for confounding variables after the fact; that is, at the analysis stage (1–4).
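
Why randomization helps can be sketched with a quick simulation. The pool of 10,000 participants, the fixed seed, and the function name are assumptions for illustration; the point is only that chance allocation tends to give both arms about the same proportion of women, whatever that proportion is.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical pool of 10,000 participants, half of them women.
participants = ["woman"] * 5000 + ["man"] * 5000
random.shuffle(participants)

# Random allocation: the first half of the shuffled list forms one arm.
treatment, control = participants[:5000], participants[5000:]


def share_of_women(group):
    """Proportion of a group who are women."""
    return group.count("woman") / len(group)


imbalance = abs(share_of_women(treatment) - share_of_women(control))

print(f"Women in treatment arm: {share_of_women(treatment):.3f}")
print(f"Women in control arm:   {share_of_women(control):.3f}")
print(f"Imbalance:              {imbalance:.3f}")
```

The same balancing applies, on average, to variables we never thought to measure, which is what matching, stratification, and restriction cannot offer.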

During the analysis phase of the study, the assessment of confounding involves comparing the initial relationship between the independent variable and dependent variable (for example, caregiving and depression) with the relationship between caregiving and depression now controlled for the suspected confounder, gender. In Figure 6, the mean difference between caregivers and noncaregivers (crude relationship) was 20 points. In Figure 7, after controlling for gender, we found a constant difference of 2 points between caregivers and noncaregivers. The mean difference between caregivers and noncaregivers was reduced from 20 to 2 points once we controlled for gender. Since we feel that a distortion of 18 points with respect to the CES-D scale is quite important, we would need to control for gender.
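
The comparison of crude and adjusted differences can be sketched as follows. The cell means and group sizes are assumptions, chosen so that they reproduce the figures quoted above (crude means of 25 and 5, and a 2-point gap within each gender). Weighting the within-gender gaps by stratum size is one simple way of adjusting, not necessarily the one an analyst would choose.

```python
# Hypothetical cells chosen to reproduce the numbers quoted in the text
# (crude means of 25 and 5, and a 2-point gap within each gender).
# (gender, group) -> (mean CES-D score, number of people)
cells = {
    ("women", "caregiver"): (26, 95),
    ("women", "noncaregiver"): (24, 5),
    ("men", "caregiver"): (6, 5),
    ("men", "noncaregiver"): (4, 95),
}


def crude_mean(group):
    """Mean CES-D score for a group, ignoring gender."""
    total_score = sum(m * n for (_, g), (m, n) in cells.items() if g == group)
    people = sum(n for (_, g), (_, n) in cells.items() if g == group)
    return total_score / people


def adjusted_difference():
    """Average of the within-gender gaps, weighted by stratum size."""
    weighted_gap = 0
    total = 0
    for gender in ("women", "men"):
        mean_c, n_c = cells[(gender, "caregiver")]
        mean_n, n_n = cells[(gender, "noncaregiver")]
        weighted_gap += (mean_c - mean_n) * (n_c + n_n)
        total += n_c + n_n
    return weighted_gap / total


crude_difference = crude_mean("caregiver") - crude_mean("noncaregiver")
distortion = crude_difference - adjusted_difference()

print(f"Crude difference:    {crude_difference:.1f}")
print(f"Adjusted difference: {adjusted_difference():.1f}")
print(f"Distortion:          {distortion:.1f}")
```

The crude difference of 20 points falls to 2 once gender is held constant, so 18 of the 20 points were due to the uneven mix of men and women in the two groups.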

When we have a dichotomous dependent variable (for example, the presence or absence of a disorder), we assess confounding by comparing the initial risk ratio with the risk ratio controlled for the potential confounder. A general rule is to control for confounding if the difference between these risk ratios is 10% or more (11).
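
Here is a sketch of that rule of thumb. The text does not say how the controlled risk ratio is obtained; the Mantel-Haenszel pooled risk ratio used below is one common choice and is our assumption, as are the invented counts and the decision to express the change relative to the adjusted estimate.

```python
# Hypothetical strata, invented for illustration only.
# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "women": (24, 80, 6, 20),
    "men": (2, 20, 8, 80),
}


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into one table."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


def percent_change(crude, adjusted):
    """Change in the estimate, relative to the adjusted value (an assumed convention)."""
    return abs(crude - adjusted) / adjusted * 100


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
change = percent_change(crude, adjusted)
needs_control = change >= 10

print(f"Crude RR: {crude:.2f}, adjusted RR: {adjusted:.2f}")
print(f"Change: {change:.0f}%, control for the confounder: {needs_control}")
```

In this made-up example the crude risk ratio of 1.86 drops to 1.00 after adjustment, a change of about 86%, which is far beyond the 10% threshold.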

Note that we haven’t mentioned statistical significance. This is because confounding is about distortions in the magnitude of a difference, whereas statistical significance is affected by the magnitude of a difference and the sample size (1). We all know that if the sample is large, we can achieve statistical significance with puny differences. So, the moral of the story is to compare the initial estimates (crude estimates) and estimates controlled for the potential confounder (adjusted estimates), since these give a direct measure of the magnitude of the difference (12). Then a judgement can be made about whether the magnitude of the distortion is clinically relevant. Beware of studies that attempt to reassure the reader by saying, “We didn’t need to control for variables X, Y, and Z, because none of them was statistically significantly related to the dependent variable.” An association between an independent and a dependent variable can still be distorted by a third variable, even if none of these variables is related to each other at a statistically significant level.
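
The dependence of statistical significance on sample size is easy to demonstrate. The sketch below uses a simple z test with equal group sizes and a known common standard deviation; the half-point difference, the standard deviation of 10, and the sample sizes are all invented for illustration.

```python
import math


def two_sided_p_value(difference, sd, n_per_group):
    """Two-sided p value from a z test comparing two group means.

    Assumes equal group sizes and a known, common standard deviation.
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = abs(difference) / standard_error
    return math.erfc(z / math.sqrt(2))


# Hypothetical values: a half-point CES-D difference with an SD of 10.
p_small_sample = two_sided_p_value(0.5, 10, 50)
p_large_sample = two_sided_p_value(0.5, 10, 20000)

print(f"n = 50 per group:     p = {p_small_sample:.2f}")
print(f"n = 20,000 per group: p = {p_large_sample:.1e}")
```

The same puny half-point difference is nowhere near significant with 50 people per group and overwhelmingly significant with 20,000 per group, although its magnitude, and hence its clinical relevance, has not changed.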

The examples we have given in this paper are (relatively) straightforward. In real life, there are often many confounders operating simultaneously. Some of these may be known beforehand, others may be suspected, and still others would be unknown. Thus, it is often a more complex job to detect and correct for them than has been presented here. But after all, that’s what statisticians are paid for.


Summary

  • A third variable can either produce a spurious association between an independent variable and dependent variable or mask a real association between them.
  • Confounding is a hindrance when explaining why the dependent variable occurs but not when simply predicting it.
  • Confounding can be minimized in the design and analysis phases of a study. During the analysis phase, detecting a confounder is not intuitive but requires a specific process.

References

1. Rothman KJ. Modern epidemiology. Toronto: Little Brown and Company; 1986.

2. Kelsey JL, Whittemore AS, Evans AS, Thompson WD. Methods in observational epidemiology. New York: Oxford University Press; 1996.

3. Schlesselman JJ. Case-control studies: design, conduct, analysis. New York: Oxford University Press; 1982.

4. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. New York: Van Nostrand Reinhold; 1982.

5. Mohide EA, Pringle DM, Streiner DL, Gilbert JR, Muir G, Tew M. A randomized trial of family caregiver support in the home management of dementia. J Am Geriatr Soc 1990;38:446–54.

6. Bland RC. Epidemiology of affective disorders: a review. Can J Psychiatry 1997;42:367–77.

7. Lin E, Goering P, Offord DR, Campbell D, Boyle MH. The use of mental health services in Ontario: epidemiologic findings. Can J Psychiatry 1996;41:572–7.

8. Streiner DL. Regression in the service of the superego: the do’s and don’ts of stepwise multiple regression. Can J Psychiatry 1994;39:191–6.

9. Radloff L, Locke B. The community mental health assessment survey and the CES-D Scale. In: M Weissman, J Myers, C Ross, editors. Community surveys of psychiatric disorders. New Brunswick (NJ): Rutgers University Press; 1986. p 177–89.

10. Streiner DL. Risky business: making sense of estimates of risk. Can J Psychiatry 1998;43:411–5.

11. Maldonado G, Greenland S. Simulation study of confounder-selection strategies. Am J Epidemiol 1993;138:923–36.

12. Elwood JM. Causal relationships in medicine. New York: Oxford University Press; 1987.

Résumé

When one variable is studied to try to explain another, the relationship between them may be biased by a third variable. This bias, known as “confounding,” is common and can be minimized in research. This description is simple only in appearance, however. Recognizing confounding is complex but can be reduced to a stepwise procedure. Through examples, this article describes confounding and how to recognize it.


Manuscript received February 1998, revised, and accepted May 1998.

This is the 17th article in the series on Research Methods in Psychiatry. For previous articles please see Can J Psychiatry 1990;35:616–20, 1991;36:357–62, 1993;38:9–13, 1993;38:140–8, 1994;39:135–40, 1994;39:191–6, 1995;40:60–6, 1995;40:439–44, 1996;41:137–43, 1996;41:491–7, 1996;41:498–502, 1997;42:388–94, 1998;43:173–9, 1998;43:411–5, 1998;43:737–41, and 1998;43:837–42.

1Research Associate, Arthur Sommer Rotenberg Chair in Suicide Studies, St Michael’s Hospital; PhD Candidate, Epidemiology, Department of Public Health Sciences, Faculty of Medicine, University of Toronto, Toronto, Ontario.

2Research Scientist, Health Systems Research Unit; Assistant Professor, Department of Psychiatry, University of Toronto, Toronto, Ontario.

3Professor, Department of Psychiatry, University of Toronto, Toronto, Ontario; Assistant Vice President of Research and Director, Kunin-Lunenfeld Applied Research Unit, Baycrest Centre for Geriatric Care, North York, Ontario.

Address for correspondence: Dr DL Streiner, Kunin-Lunenfeld Applied Research Unit, Baycrest Centre for Geriatric Care, 3560 Bathurst St, North York, ON  M6A 2E1

email: dstreiner@rotman-baycrest.on.ca

Can J Psychiatry, Vol 44, March 1999