Canadian Psychiatric Association


Guest Editorial
Eating Disorders
Paul E. Garfinkel

In Review
Pharmacologic Treatment of Eating Disorders
April J Zhu, B Timothy Walsh

Psychological Treatments for Anorexia Nervosa: A Review of Published Studies and Promising New Directions
Allan S Kaplan


Original Research
Acute Psychiatric Inpatient Care for People With a Dual Diagnosis: Patient Profiles and Lengths of Stay

Philip Burge, Hélène Ouellette-Kuntz, Haider Saeed, Bruce McCreary, Dana Paquette, Franklin Sim


Canadian Geriatric Psychiatrists: Why Do They Do It? A Delphi Study
Susan Lieff, Diana Clarke


Relation of Blood Counts During Clozapine Treatment to Serum Concentrations of Clozapine and Nor-Clozapine
L Kola Oyewumi, Zack Z Cernovsky, David J Freeman, David L Streiner


Research Methods in Psychiatry
Breaking Up is Hard to Do: The Heartbreak of Dichotomizing Continuous Data
David L Streiner


Brief Communication
Treatment Resistance in Anorexia Nervosa and the Pervasiveness of Ethics in Clinical Decision Making
Chris MacDonald


Topiramate Use in Obese Patients With Binge Eating Disorder: An Open Study
Jose C Appolinario, Leonardo F Fontenelle, Marcelo Papelbaum, Joao R Bueno, Walmir Coutinho



Book Reviews

The Depressed Child and Adolescent. 2nd ed.

Clinical Assessment of Dangerousness: Empirical Contributions

The Feeling of What Happens: Body and Emotion in the Making of Consciousness

The Evolution of Psychoanalysis: Contemporary Theory and Practice

Psychiatrie gériatrique: esquisse d'une histoire médicale par l'élaboration de son langage

Démystifier les maladies mentales: les troubles de l'enfance et de l'adolescence


Books Received


Letters to the Editor

RE: Who Develops Severe or Fatal Adverse Drug Reactions to Selective Serotonin Reuptake Inhibitors?

RE: Canadian and American Psychiatrists' Attitudes Toward Dissociative Disorder Diagnoses

Acute Onset of Schizophrenia Following Autocastration

The World Trade Center Disaster

Selenium, Thyroid Hormones, Mood, and Behaviour

Research Methods in Psychiatry

Breaking Up is Hard to Do: The Heartbreak of Dichotomizing Continuous Data

David L Streiner, PhD¹

Researchers often take variables that are measured on a continuum and then break them into categories (for example, above or below some cut-point), either to place subjects into groups or as an outcome measure. In this article, we show that the rationales given for this practice are weak and that categorization results in lost information, reduced power of statistical tests, and increased probability of a Type II error. Dichotomizing a continuous variable is justified only when the distribution of that variable is highly skewed or its relation with another variable is nonlinear.

(Can J Psychiatry 2002;47:262–266)

Key Words: dichotomizing, data, power, variables

Résumé: La séparation est pénible : le malaise de la dichotomie des données continues


Those of you who are old enough may remember Neil Sedaka singing “Breaking Up is Hard to Do.” If only that were true when it comes to the variables we use in research! Many times (I would say far too many), a researcher uses a continuous measure, such as a depression inventory, as an outcome variable and then dichotomizes it—above or below some cut-point, for example, or the number of people who did and did not show a 50% reduction in their scores from baseline to follow-up (1). Less often, but again far too frequently, researchers may assign patients to different groups by dichotomizing or trichotomizing scores from a continuous scale.

Over the years, several arguments have tried to justify this practice. Perhaps the most common one runs something like this: “Clinicians have to make dichotomous decisions to treat or not to treat, so it makes sense to have a binary outcome.” Another rationale that is offered is, “Physicians find it easier to understand the results when they’re expressed as proportions or odds ratios. They have difficulty grasping the meaning of beta weights and other indices that emerge when we use continuous variables.” In this article, I’ll try to show that you pay a very stiff penalty in terms of power or sample size when continuous variables are broken up, with the consequent risk of a Type II error (that is, failing to detect real differences). But before we begin, let me assume the role of a marriage counsellor and see whether the arguments in favour of splitting up are really viable.

The rationale for dichotomizing outcomes because clinical decisions are binary fails on 3 grounds. The primary one is that it confuses measurement with decision making. The purpose of most research is to discover relations: relations between or among variables or between treatment interventions and outcomes. The more accurate the findings, the better the decisions that we can make; that is, the findings come first and the decision making follows. As we will see, findings come more readily and more accurately when we retain the scaling of continuous variables.

The second reason is that all the research using the old dichotomy becomes useless if the cut-point changes. For example, the definition of hypertension used to be 160/95 (2). If we defined the outcome of intervention trials dichotomously (with above 160/95 being hypertensive and below being normotensive), then those findings would become useless after the definition changed to 140/90 (3). If we expressed the outcome as a continuum, however, the values of beta coefficients and similar indices showing the effects of various risk and protective factors would not change at all, and if we wanted to use statistics such as odds ratios (ORs) or the percentage of patients who improved, it would be a trivial matter to recalculate the results.

We have a similar situation in psychiatry. The diagnosis of antisocial personality disorder (ASP), for example, is a binary one: the person either does or does not satisfy the diagnostic criteria (that is, a certain number of symptoms are present). However, Livesley and others maintain that ASP and many other disorders should actually be seen as a continuum: the more symptoms that are checked off, the more of the trait the person has (4). If the number of symptoms necessary to meet the criteria were to change, as occurred when DSM-IV replaced DSM-III-R, then much previous research using a dichotomous diagnosis would have to be discarded. If the diagnosis were expressed as the number of symptoms present, though, it would be relatively easy to reinterpret the findings using the new criteria.

Finally, whether to hospitalize a patient with suicidal ideation or to discharge a patient with symptoms of schizophrenia may be binary decisions, but many treatments (perhaps most) fall along a continuum involving the dosage or strength of a medication and the number and frequency of therapy sessions.
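To make the cut-point argument concrete, here is a minimal Python sketch. The readings are invented for illustration and come from no study, the function name `proportion_above` is hypothetical, and only the systolic value is used, which simplifies the 160/95 and 140/90 definitions. It shows that when the continuous values are kept, applying a revised definition is a one-line recalculation.

```python
# Hypothetical systolic blood pressure readings (mm Hg); invented for illustration.
systolic = [128, 135, 142, 148, 151, 155, 158, 162, 167, 171]

def proportion_above(values, cut_point):
    """Return the share of values at or above the cut-point."""
    return sum(v >= cut_point for v in values) / len(values)

# Because the raw readings were kept, either definition can be applied at will.
old_rate = proportion_above(systolic, 160)  # older systolic threshold
new_rate = proportion_above(systolic, 140)  # revised systolic threshold

print(f"Classed as hypertensive at 160: {old_rate:.0%}; at 140: {new_rate:.0%}")
```

Had only the yes/no classification at 160 been recorded, the second figure could not be recovered from the stored data.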

As for the argument that physicians are more comfortable with statistics based on categorical measures, we are likely dealing with both a base canard that they, like old dogs, cannot learn new tricks and a vicious circle. As long as the belief persists, studies will be designed, analyzed, and reported using proportions and ORs, meaning that physicians will not have the opportunity to become more comfortable with other approaches.

First, I’ll give some examples of how dichotomizing can lead us astray, and then I’ll use these examples to discuss why this is the case.


Example 1

Let’s look at the data in Table 1, which shows scores on a scale for 2 groups, each with 10 subjects. Let’s assume that, if we were to dichotomize the scale, we would use a criterion for “caseness” of 15/16: people with scores from 1 to 15 would be considered normal, and those with scores of 16 and over would be defined as cases. The mean for Group 1 is 11.70, and the mean for Group 2 is 16.80. There is slightly more than a 5-point difference between the groups, and the average of the first group is well below the cut-off of 15/16, while the average of the second group is above the cut-point. If we used a t-test to compare the groups, we’d find that t(18) = 2.16, P = 0.045. That is, there is a statistically significant difference between the means. Now, let’s dichotomize the results and count the number of people above and below the cut-point in each group. What we’d find is shown in Table 2. Because 2 of the cells have frequencies below 5, we’d use a Fisher’s exact test, rather than a chi-square test, and we’d find that the P level is 0.057. In other words, the difference is not statistically significant.
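The raw scores from Table 1 are not reproduced here, so the following Python sketch uses simulated data instead to illustrate the same comparison. It rests on several assumptions that are mine, not the article's: scores are drawn from normal distributions whose means equal the 2 group means quoted above, the common standard deviation is 5, and the 15/16 criterion is applied as a cut at 15.5. The helper names are hypothetical. The sketch estimates how often a t-test on the continuous scores and a Fisher's exact test on the dichotomized scores each reach P < 0.05.

```python
import math
import random

N_PER_GROUP = 10
T_CRIT_18_DF = 2.101  # two-sided 5% critical value of t with 18 degrees of freedom

def t_statistic(x, y):
    """Pooled-variance (Student) t statistic for two independent groups."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    pooled_var = (ssx + ssy) / (nx + ny - 2)
    return (my - mx) / math.sqrt(pooled_var * (1 / nx + 1 / ny))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(k) for k in range(lo, hi + 1)]
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

def estimate_power(n_sims=2000, mean1=11.7, mean2=16.8, sd=5.0, cut=15.5, seed=1):
    """Share of simulated studies in which each test gives P < 0.05."""
    rng = random.Random(seed)
    hits_t = hits_fisher = 0
    for _ in range(n_sims):
        g1 = [rng.gauss(mean1, sd) for _ in range(N_PER_GROUP)]
        g2 = [rng.gauss(mean2, sd) for _ in range(N_PER_GROUP)]
        if abs(t_statistic(g1, g2)) > T_CRIT_18_DF:
            hits_t += 1
        cases1 = sum(v > cut for v in g1)  # "cases" score above the cut-point
        cases2 = sum(v > cut for v in g2)
        p = fisher_exact_two_sided(cases1, N_PER_GROUP - cases1,
                                   cases2, N_PER_GROUP - cases2)
        if p < 0.05:
            hits_fisher += 1
    return hits_t / n_sims, hits_fisher / n_sims

power_t, power_fisher = estimate_power()
print(f"t-test on continuous scores: {power_t:.2f}")
print(f"Fisher's exact test on dichotomized scores: {power_fisher:.2f}")
```

Under these assumptions the t-test should detect the difference in a clearly larger share of simulated studies. Part of that gap reflects the conservatism of Fisher's exact test in small samples rather than the dichotomization alone, so the numbers are a rough illustration, not a formal power analysis.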


Example 2

In the second example, we have 40 subjects, measured on 4 variables, A through D. If we were to correlate these variables, we’d find the results shown in the upper triangle of Table 3. Of the 6 correlations, 5 are significant at the P < 0.01 level. Now, we’ll do a median split on each of these variables, so that roughly one-half of the subjects fall above, and one-half below, the cut-point. If we reran the correlations, we would find the results in the lower triangle of the same table. In every case, the correlations are lower, sometimes substantially so, and only 2 of the 6 correlations are significant at the P < 0.01 level.
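The shrinkage of correlations after a median split can be reproduced with simulated data. The Python sketch below does not use the values in Table 3; it assumes 2 bivariate-normal variables with a population correlation of 0.6 (my choice) and a large sample so that sampling noise is small. For that idealized case, a median split on both variables is expected to give a correlation of (2/π) arcsin(ρ), which the code uses as a check.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def median_split(values):
    """Code each value 1 if it lies above the sample median, else 0."""
    median = statistics.median(values)
    return [1 if v > median else 0 for v in values]

rng = random.Random(42)
rho = 0.6   # assumed population correlation (illustrative)
n = 20000   # large sample, so the estimates sit close to their expected values

x = [rng.gauss(0, 1) for _ in range(n)]
# Build y so that corr(x, y) equals rho in the population.
y = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x]

r_continuous = pearson(x, y)
r_split = pearson(median_split(x), median_split(y))
r_expected = (2 / math.pi) * math.asin(rho)  # bivariate-normal value after a double split

print(f"continuous r = {r_continuous:.3f}")
print(f"median-split r = {r_split:.3f} (expected about {r_expected:.3f})")
```

With only 40 subjects, as in the example above, individual correlations would bounce around these expected values, but the direction of the effect is the same: the dichotomized correlation is the smaller one.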

Taking this example a bit further, we can run a regression equation, with A as the dependent variable (DV) and B through D as the predictors. Keeping the variables as continua, we’d find the multiple R is 0.767 and R² = 0.588, which would lead to thoughts of publication and promotion for most people. If we dichotomized the variables, however, we’d find that the multiple R is 0.460, with an associated R² of 0.211, which might jeopardize that promotion by at least a year. (Purists might say that we should really use a logistic regression with a dichotomous DV. If we did, we’d find the Cox and Snell pseudo-R² to be an even more disappointing 0.20.)