Archive for the ‘Scientific Articles Related to DUI’ Category

The Myth of Field Sobriety Testing as an Accurate Means to Base DUI Arrest

April 21, 2007

Separating Myth from Fact: A Review of Research on the Field Sobriety Tests 

Spurgeon Cole & Ronald H. Nowaczyk, Clemson University, Clemson, SC 29634
 

For over a decade Marcelline Burns, senior author of an often-cited 1977 NHTSA (National Highway Traffic Safety Administration) report and co-author of a 1981 NHTSA study, has traveled across the country extolling the virtues of the new and improved Field Sobriety Test (FST) battery. The FST battery, as recommended by NHTSA, consists of three tests that are supposed to predict an individual’s blood alcohol concentration (BAC) level. The tests are the Horizontal Gaze Nystagmus (HGN) test, the Walk-and-turn test and the One-leg stand test. None of these tests was specifically developed to identify BAC level, but all have been used by law enforcement as indicators of driving impairment.

NHTSA claims that the new version of the FST battery is scientific and can differentiate between impaired and unimpaired drivers. Until recently Burns’ testimony has gone unchallenged because few attorneys have the prerequisite understanding of statistics and test development to critically evaluate the NHTSA reports and effectively cross-examine NHTSA’s witnesses. Judges who have recently heard the “rest of the story” are either not admitting the FST at all or declaring it unscientific and not allowing police to use such terms as “tests,” “results,” “passed” or “failure.” The prosecution in DUI trials has long held a decided advantage over the defense because of misconceptions about the effectiveness of the FST. Even defense attorneys have often accepted the premise that the FST has a measure of value in predicting driving impairment. In essence, NHTSA representatives have for over a decade enjoyed a free ride, but the road has recently developed some serious potholes.

Research (Cole & Cole, 1991; Cole & Nowaczyk, 1994) and expert testimony offered by Cole & Nowaczyk have enabled judges and attorneys to better understand the limitations of the FST. In the past, NHTSA representatives have made outlandish claims as to the effectiveness of the FST even though these claims are not supported by their own research data. Because of these sins of omission and an occasional sin of commission, many myths have developed concerning the validity and reliability of the FST battery. The present article attempts to separate the facts from the myths. 

Myth 1: The Field Sobriety Test (FST) battery predicts driving impairment.

Fact:  NHTSA never attempted to determine if the FST could predict driving impairment. There is not a single study linking the recommended FST battery directly to driving impairment. The fact is, there never will be a simple roadside coordination task that can predict driving impairment. In one of NHTSA’s own reports, the following statement is made: “… even valid, behavioral tests are likely to be poor predictors either of actual behind-the-wheel driving … or of accidents” (p. 2-7, Snapper, Seaver & Schwartz, 1981). The stated goal in the 1977 study was to determine the relationship between FST and intoxication and driving impairment. However, the researchers did not investigate the relationship between the FST and driving impairment.

While there is a relationship between BAC level and driving impairment, the relationship is not likely to be a simple linear one. Therefore, it is not appropriate to assume that because 1) FST performance and BAC are related and 2) BAC and driving impairment are related, it follows that 3) FST performance and driving impairment are related. The relationships among these factors are too complex to assume a simple relationship as NHTSA might like you to conclude. There are comments among NHTSA researchers themselves alluding to this conclusion. In the 1981 NHTSA study, the researchers conclude, “… Individuals vary in alcohol tolerance, and an infrequent drinker may be severely impaired at a BAC of 0.05, whereas a heavy drinker may show only minimal impairment at this level” (p. 19). Dr. Moskowitz, one of the co-authors of both the 1977 and 1981 NHTSA studies, co-authored a later review of research on driving and alcohol levels and concluded in a presentation at a scientific conference that “… studies of driving simulator and on-the-road testing varied widely in results. This is due to the wide range of behavioral demands required by diverse control and visual search requirements” (Moskowitz & Robinson, 1987, p. 85).
It is obvious that research is needed examining the relationship between FST and driving performance directly. That research has not yet been conducted. Dr. Burns herself indicated that the FST battery has its value in predicting BAC levels (Burns, 1984).


Myth 2: The FST battery is 80 percent accurate in differentiating between individuals with BAC levels above or below .10.

Fact:  The 1981 NHTSA study is the one cited by NHTSA as evidence of an 80 percent accuracy rate with the use of the FST battery. That study tested 296 subjects. Thirty-three percent of the subjects in the study had a BAC level of .00 and 34 percent were given dose levels calculated to raise BAC levels to .05. Another 11 percent of the subjects had BAC levels approximating .15, with some having BACs as high as .18. An officer should have no difficulty correctly identifying totally alcohol-free subjects as being unimpaired. Although slightly more difficult, one would expect officers to correctly classify subjects with BAC levels of .05 as being unimpaired. They should also have little difficulty correctly classifying subjects with BAC levels of .15. In effect, 78 percent of the subjects fall into these extreme categories. Only 22 percent of the subjects were in the critical BAC range around .10. When the tests must differentiate in this critical range, they fail miserably. The overall accuracy rate of .80 is misleading when over two-thirds of the decisions are “gimmies,” people with little or no alcohol or levels of .15.

For the remaining subjects, the officers have a 50/50 chance of being correct just on the basis of guessing. With the “easy” decisions and a guessing rate of .50, the reported 80 percent accuracy rate does not look exceptionally good. The question should not be how the FST helps officers correctly classify subjects 80 percent of the time. Instead, the question asked should be “Why doesn’t the FST do a better job helping the officers reach the correct decision?” In fact, the 1977 NHTSA report contains the following admonition, “Again, it should be pointed out that all the evidence from these data suggest it is unrealistic to attempt to use behavioral tests to discriminate BACs in a .02 margin around a given level” (p. 41).
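The weighting argument above can be made concrete with a small sketch. The 78/22 split comes from the figures quoted above; the per-group accuracies are hypothetical inputs chosen only to illustrate the point (perfect classification of the "easy" subjects, pure guessing in the critical range) and are not taken from any NHTSA report.

```python
def overall_accuracy(share_easy, acc_easy, acc_critical):
    """Weighted accuracy across 'easy' and critical-range subjects."""
    share_critical = 1.0 - share_easy
    return share_easy * acc_easy + share_critical * acc_critical


# From the article: 78 percent of subjects sat at the extremes (.00, .05, .15).
SHARE_EASY = 0.78

# Hypothetical scenario, NOT NHTSA data: officers classify every easy case
# correctly and simply guess (50/50) in the critical range around .10.
guessing_baseline = overall_accuracy(SHARE_EASY, acc_easy=1.0, acc_critical=0.5)

print(f"Baseline with guessing in the critical range: {guessing_baseline:.2f}")
```

Under these assumed inputs the baseline is 0.89, which is higher than the reported .80. That is the sense in which an 80 percent overall rate says little about performance near the legal limit.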

Myth 3: The FSTs are tests accepted by the scientific community. 


Fact:  Anastasi (1988) defines a test as being an objective and standardized measure of behavior. In the behavioral sciences, specific criteria must be met for a behavioral test to be accepted. The primary criteria include establishing the reliability, validity, and standardized administration of the test. Reliability and validity involve the consistency of test scores and the relationship of the score to the behavior it is designed to measure. Standardization includes uniformity of procedure in administering the test as well as the scoring of the test. For test scores to be meaningful, the conditions under which the tests are administered must not themselves cause differences in test scores. A test that has not been standardized or does not outline exact procedures for administration and scoring would not be considered a scientific test.

An important step in the standardization of a test is the development of norms. As the name suggests, a norm is the normal, average or typical score. Scores can only be interpreted by comparing them with scores obtained by others. There are no adequate norms for the FST battery. Common sense dictates and research supports the belief that motor skills decline with age. The FST, however, provides no basis for interpreting the results for individuals at various age levels. Although manuals for DWI training suggest that tests should not be given to individuals who are 60 years of age or older or to a person more than 50 pounds overweight, they provide no information on how to evaluate the performance of a 45 year old versus a 20 year old (NHTSA, 1992).

Examiners cannot adequately interpret a score unless they know the mean and the standard deviation of the distribution. NHTSA leads us to believe that the “norm” for a sober person would be a test score of 0; that is, no errors in performance. Yet, we know from the 1977 NHTSA study that all of the sober people in that study made at least one error. In fact, the mean number of error “cues” scored among the sober individuals was 10.56. Even if one accepts NHTSA’s claim that the FST is not a norm-referenced test but rather a criterion-referenced test (that is, that a certain score, the criterion point, indicates failure), there are no data indicating how this criterion score might vary as a function of age, gender, or motor coordination. Even if such norms were produced from the NHTSA 1977 and 1981 studies, they would be of limited value given that they are based on laboratory testing, not testing in the field.

 

Myth 4: The field sobriety tests are reliable. 

Fact:  Reliability refers to the consistency in test scores. Reliability scores can range from a low of .00, which indicates no consistency, to 1.00, which indicates perfect consistency. A test with a reliability value of .90 would indicate that 90 percent of the variability in the test scores is attributed to true differences in performance and 10 percent would be due to error. Most well-established tests (e.g., Wechsler scales for IQ, SAT, GRE) have reliability values greater than .90. The scientific community expects reliability coefficients to be in the upper .80s or .90s for a test to be scientifically reliable (Anastasi, 1988; Rosenthal & Rosnow, 1991).

The HGN, One-leg stand, and Walk-and-turn tests have test-retest reliabilities of .66, .72, and .61 respectively with a combined reliability of .77. This means that 34 percent of the HGN, 28 percent of the One-leg stand and 39 percent of the Walk-and-turn test scores can be attributed to errors in scoring. If 23 percent of the score on a breathalyzer depended on the manufacturer of the device, would it be allowed into evidence? Quite possibly the most telling lack of reliability of the FST battery is that when different officers tested the same subjects at the same dose level on different days, the reliability was only .59. This means that 41 percent of the score was due to error. These reliabilities are far too low to be useful in making important decisions. By contrast the reliability of the BAC machine readings was .96, indicating a high level of reliability.
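The error percentages quoted above follow the interpretation given in the Fact paragraph for Myth 4: the share of score variability attributed to error is one minus the reliability coefficient. A minimal sketch of that arithmetic, using only the coefficients reported above (the dictionary labels are illustrative names, not terms from the NHTSA reports):

```python
def error_share_percent(reliability):
    """Percent of score variability attributed to error: (1 - r) * 100."""
    return round((1.0 - reliability) * 100)


# Reliability coefficients as quoted in the article.
reported = {
    "HGN": 0.66,
    "One-leg stand": 0.72,
    "Walk-and-turn": 0.61,
    "Combined battery": 0.77,
    "Different officers, different days": 0.59,
    "BAC machine": 0.96,
}

for label, r in reported.items():
    print(f"{label}: reliability {r:.2f}, error share {error_share_percent(r)} percent")
```

The output reproduces the 34, 28, 39, 23 and 41 percent figures in the text, and shows a 4 percent error share for the BAC machine by the same rule.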

Myth 5: The field sobriety tests are scientifically valid. 


Fact:  The 1977 NHTSA study reported the results in terms of validity coefficients. The validity coefficients for the HGN, One-leg stand and Walk-and-turn tests were .67, .48, and .55 respectively, with a combined validity coefficient of .67. For example, if the officer used the individual FSTs, the accuracy in predicting the BAC levels would increase by only 26 percent with the HGN test, 12 percent with the one-leg stand test and 16 percent with the walk-and-turn test. If all three tests were administered, accuracy in predicting BAC levels would improve by only 26 percent. The error in predicting BAC levels using the HGN, the one-leg stand, and the walk-and-turn combined would be 74 percent as large as it would be by chance.
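The article does not name the formula behind these percentages. They are, however, numerically consistent with the classical index of forecasting efficiency, E = 1 - sqrt(1 - r^2), where sqrt(1 - r^2) is the coefficient of alienation (the size of the prediction error relative to chance). The sketch below assumes that is the calculation intended; the function names are illustrative.

```python
import math


def alienation(r):
    """Coefficient of alienation: prediction error relative to chance."""
    return math.sqrt(1.0 - r * r)


def forecasting_efficiency_percent(r):
    """Percent improvement in prediction over chance: (1 - sqrt(1 - r^2)) * 100."""
    return round((1.0 - alienation(r)) * 100)


# Validity coefficients as quoted in the article.
validity = {"HGN": 0.67, "One-leg stand": 0.48, "Walk-and-turn": 0.55, "Combined": 0.67}

for label, r in validity.items():
    print(f"{label}: r = {r:.2f}, improvement over chance "
          f"{forecasting_efficiency_percent(r)} percent")

print(f"Combined error relative to chance: {round(alienation(0.67) * 100)} percent")
```

Under this assumption the sketch reproduces the 26, 12 and 16 percent improvements and the 74 percent residual error stated above.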

For the FST battery to be a valid predictor of BAC, it must not only identify individuals above a BAC level of .10 as “failing,” but also identify individuals below .10 as “passing.” That is, the test must have discriminative power. In NHTSA’s own studies, a significant proportion of people who were below the .10 BAC standard in effect at that time were falsely viewed as being impaired. In the 1977 Burns and Moskowitz study, 46.5 percent of the “arrest” decisions by participating officers were incorrect. Of the 101 arrest decisions, 47 subjects had BAC levels less than .10. The authors themselves conclude, “Obviously, an error rate of 46.5 percent in making arrests is not acceptable” (p. 25).
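Note that the 46.5 percent figure is computed over arrest decisions, not over all subjects tested. A short sketch of the calculation, using the two counts quoted above (the function name is illustrative):

```python
def false_arrest_rate_percent(arrests_below_limit, total_arrests):
    """Share of arrest decisions that involved a subject below the legal limit."""
    return 100.0 * arrests_below_limit / total_arrests


# Counts quoted from the 1977 Burns and Moskowitz study.
rate_1977 = false_arrest_rate_percent(arrests_below_limit=47, total_arrests=101)

print(f"1977 false arrest rate: {rate_1977:.1f} percent")
```

The result rounds to 46.5 percent, matching the error rate the study's authors called unacceptable.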

In the follow-up study by Tharp et al. (1981), the false arrest rate was 32 percent. The decrease in false arrests from 46.5 percent in the ‘77 NHTSA study to 32 percent in the 1981 study was due not solely to the “new improved FST,” but partly to the distribution of subjects across the dose levels. In the ‘77 NHTSA study 27 percent of subjects were in the critical range (BAC in the middle range) and in the ‘81 NHTSA study only 22 percent of subjects were in the middle range. In other words, the distribution in the ‘81 NHTSA study made discriminations easier. If the ‘81 NHTSA study had used the same distribution of BAC levels that was employed in the ‘77 NHTSA study, the false arrest rate would have been higher than 32 percent and probably would have matched the “unacceptable” 46.5 percent level of the ’77 NHTSA study. These validity scores are quite low and suggest that the FST battery is of little benefit for an officer determining BAC levels.

Myth 6: NHTSA has validated the FST in a field setting. 

Fact:  The 1977 and 1981 NHTSA studies were conducted in a laboratory setting. Laboratory studies are very different from studies performed in a natural or field setting. For example, the influence of alcohol on the individual depends greatly on the social context, as well as the expectations of the person. Subjects in these NHTSA studies were told not to eat for eight hours prior to the testing. Test subjects were tested at 15-minute intervals, and the study began early in the morning. This would mean that many subjects had not eaten for as long as 12 hours before being tested. It is doubtful that a person drinking in a natural setting would fast for hours and then consume alcohol at unknown ethanol levels.

Laboratories are artificial by nature and only give an indication of what one might expect in a field setting. In the conclusions of the 1981 NHTSA study, the authors recommended that the field sobriety test should be validated in the field for 18 months and in various localities across the nation. The 1983 NHTSA study by Anderson et al., the purported “field validation” of the FST battery, did not meet those recommendations. A 3-month study was conducted in a limited number of locations on the east coast. Dr. Burns has testified on cross examination that the FST has never been adequately field tested. Most importantly, the FST has never been standardized or validated in a field setting.

Myth 7: The NHTSA studies have been published in Peer Reviewed Journals. 

Fact:  Neither the 1977 nor the 1981 NHTSA study has been published in a scientific peer-reviewed journal. The publications have been limited to technical reports issued by NHTSA. Dr. Burns has admitted on cross examination that the method and results sections were too lengthy to be published in a scientific journal. Based on this logic, lengthy but important studies would never be published. It is difficult to see how NHTSA could claim that the FST is accepted in the scientific community when results of studies on the validation of the FST have never appeared in a scientific peer-reviewed journal, which is a basic requirement for acceptance by the scientific community.

Myth 8: There is a consistent relationship between BAC levels and driving impairment. 

Fact:  The literature on the effects of alcohol is so diverse that one can only conclude that any demanding task may be impaired at almost any BAC level. Research indicates that there are substantial individual variations in the metabolism of alcohol, which would most likely affect performance. Performance is also affected by individual differences, and individuals with identical BAC levels may very well have different levels of impairment (Hurst and Bagley, 1972; Moskowitz, Daily and Henderson, 1974). Many studies involving the influence of alcohol on impairment find a rather significant number of subjects whose performance actually improves after the consumption of alcohol. In a study conducted under the auspices of the California Highway Patrol and various law enforcement agencies, Giguire (1985) found that 17 percent of his subjects with doses calculated to achieve BAC levels of .10 improved driving performance on a closed course. Mangarin & Standery (1989) also found no effects of alcohol dose on video driving performance despite an unusually high dose calculated to achieve a BAC of .16. These studies and others suggest a complex relationship between BAC levels and performance. They offer little support for setting specific BAC impairment levels and certainly do not support the assumption that BAC levels could be used as a substitute criterion for driving impairment.

Myth 9: People who are not impaired can “pass” the Field Sobriety Tests. 


Fact:  Cole and Nowaczyk (1991) had 21 adults who were completely alcohol free, as confirmed by breath tests, perform field sobriety tests. The subjects were given six tests including a heel-to-toe test and a one-leg stand test. None of the subjects was under the extreme pressure that is associated with a roadside detention situation. Two separate groups of law enforcement officers gathered at different times to judge the performance of the participants. These were actual police officers who had received standard training in the observation and identification of intoxicated drivers. The officers were then asked to identify individuals who had had too much to drink to drive. Of 147 responses by the police officers, 68 (46 percent) indicated that a completely sober person was too intoxicated to drive. The average police experience was 12 years. Interestingly, the officer with the least experience had the fewest wrong responses.

Compton (1985) found false positive rates for totally alcohol free participants to be as high as 54 percent for some police departments. In the 1981 NHTSA study 18 percent of alcohol-free subjects and 31 percent of subjects with BAC levels of .05 were judged to be impaired. Clearly, there is a strong tendency for certified alcohol-free participants to fail Field Sobriety Tests. 

Myth 10: The Horizontal Gaze Nystagmus (HGN) Test is the most sensitive test for measuring impairment.

Fact:  Because the HGN test is a physiological task, unlike the other Field Sobriety Tests, which are psychomotor, divided-attention tasks, it is sometimes viewed as being the most sensitive of the three tests. Also, some of NHTSA’s research indicates it has the strongest relationship with BAC (e.g., Burns & Moskowitz, 1977 [p. 17]; Anderson et al., 1983 [Table 2]). Yet, some of NHTSA’s own data raise questions about its ability to discriminate among individuals with different BAC levels.

In a report commissioned by NHTSA, Snapper, Seaver and Schwartz (1981) reviewed the Burns & Moskowitz study and concluded, “Nystagmus, on the other hand was not a highly-rated test. … First, Burns and Moskowitz evaluated tests with respect to the relationship between performance on the test and blood alcohol concentration (BAC). A close relationship between these two variables does not necessarily imply a close relationship between performance on the nystagmus test and driving performance, or between test performance and accidents. Specifically, it is not apparent that performance on the nystagmus test reflects any skills related to driving. In addition, examining a driver for nystagmus may be difficult operationally and somewhat unsafe. Scoring is quite subjective and would require careful training for the test administrator” (p. 4-4).

The difficulty in scoring is illustrated in the Tharp et al. study, where we find a weak relationship between an officer’s ability to judge the angle of nystagmus onset and the actual angle as measured by a machine. Officers are instructed that onset of nystagmus before 45 degrees of eye movement to the outside is an indication of a BAC above .10. Yet, we find that of the 10 officers who participated in the Tharp et al. study, 5 had correlation coefficients less than .44, with 2 in the .23 to .26 range. This indicates little relationship between what the officers judged the angle of onset to be and what the machine actually recorded as the angle of onset.

The 45 degree angle of onset itself is troubling. Based on NHTSA’s own research, a 45 degree angle corresponds to a BAC of approximately .05 or .06, not .10 (Tharp et al., 1981). A more appropriate angle, based on their findings, is 41 or 40 degrees, not 45 degrees. A BAC level of .08 would correspond to an angle of onset of approximately 43 degrees. The task for the officer to detect such small changes is quite daunting, if not impossible.

Follow-up research on impairment and performance with the HGN has shown it can lead officers to falsely conclude a person has a BAC above .10 when it is not. Compton, in a NHTSA study (1985), reported the findings of a study where individuals were stopped at simulated sobriety checkpoints. The subjects, dosed to different BAC levels, were encouraged to act as though they were not impaired. The officers gave “failing” scores (4 points or higher) to 15 percent of the sober individuals and 64 percent of those with BAC levels between .05 and .09 (the average BAC level in this condition was .07).

Giguire (1985) had 24 Navy personnel drive on a closed course under sober and intoxicated conditions. In addition to evaluating their driving performance, Giguire had officers administer the Field Sobriety Tests. Of the 13 subjects with BACs below .10 (between .064 and .099), 12 showed evidence of impairment based on the HGN. The HGN is not as accurate a test for determining BAC as NHTSA would like you to believe.

Conclusion 

Because of its widespread use, the FST battery has been assumed to be a reliable and valid predictor of driving impairment. NHTSA has done little to dispel that assumption. Law enforcement cannot be blamed for its use of the FST battery. Training documents refer to NHTSA reports and provide what appears to be supporting evidence for the validity of the FST battery. In addition, there is little doubt that individuals who have high BAC levels will have difficulty performing the FST battery. However, what the law enforcement community and courts fail to realize is that the FST battery may mislead the officer on the road into incorrectly judging individuals who are not impaired. To be valid, the FST battery must discriminate accurately between the impaired and non-impaired driver. NHTSA’s own research on that issue (Anderson et al., 1983; Burns & Moskowitz, 1977; Tharp et al., 1981) has not been subjected to peer review by the scientific community. In addition, a careful reading of the reports themselves provides support for the inadequacy of the FST battery. The reports include low reliability estimates for the tests, false arrest rates between 32 and 46.5 percent, and a field test of the FST battery that was flawed because officers in many cases had breathalyzer results at the time of the arrest. NHTSA clearly ignored the printed recommendations of its own researchers in conducting that field study.

 


What is needed is a careful examination of the complex relationships among motor coordination tasks, BAC level and driving impairment. Tests should be developed based on our understanding of these relationships. The current method of selecting the “best of what is out there” is not serving the public well.

References 

Anderson, I. E., Schweitz, R. M. & Snyder, M. B. (1983). Field evaluation of a behavioral battery for DWI. Final Report, DOT-HS-806-676.

Anastasi, A. (1988). Psychological Testing, Sixth edition. NY: Macmillan Press. 

Burns, M. & Moskowitz, H. (1977). Psychophysical tests for DWI arrest. Final Report, DOT-HS-802-424, NHTSA.

Coldwell, B. B., Penner, D. W., Smith, H. W., Lucas, G. H. W., Rodgers, R. F. & Darroch, F. (1958). Effect of ingestion of distilled spirits on automobile driving skill. Quarterly Journal of Studies on Alcohol, 19, 590-616.

Cole, R. M. & Cole, S. N. (1991). New proof that field sobriety tests are “failure designed.” DWI Journal, 6(2), 1-5.

Cole, S. & Nowaczyk, R. H. (1994). Field sobriety tests: are they designed for failure? Perceptual and Motor Skills, 79, 99-104.

Compton, R. P. (1985). Pilot test of selected DWI detection procedures for use at sobriety checkpoints. Final Report, DOT-HS-806-724.

Giguire, W. (1985). Impairment caused by moderate blood alcohol levels in a closed course: preliminary demonstration. In S. Kaye & G. Meier (Eds.), Alcohol, Drugs and Traffic Safety. Proceedings 9th International Conference. 

Hurst, P. M. & Bagley, S. K. (1972). Acute adaptation to the effects of alcohol. Quarterly Journal of Studies on Alcohol, 33, 358-378.

Moskowitz, H., Daily, J. & Henderson, A. (1974). Acute tolerance to behavioral impairment by alcohol in moderate and heavy drinkers. DOT-NHTSA, TM (L) – 4970/013/00, 64 pp.

Moskowitz, H. & Robinson, C. (1987). Driving-related skills impairment at low blood alcohol levels. In P. C. Noordzij & A. Roczbach (Eds.), Alcohol, drugs and traffic safety. Elsevier Science Publishers. pp. 79-86. 

Moskowitz, H. & Robinson, C. (1988). Effects of low doses of alcohol on driving-related skills: a review of the evidence. Final Report, DOT-HS-807-280. 

NHTSA, National Highway Traffic Safety Administration (1992). DWI Detection and Standardized Field Sobriety Testing. DOT-PB94-780 228.

Rosenthal, R. & Rosnow, R. L. (1991). Essentials of Behavioral Research. (2nd ed.) New York: McGraw-Hill.

Snapper, K. J., Seaver, D. A., & Schwartz, J. P. (1981). An assessment of behavioral tests to detect impaired drivers. Final Report, DOT-HS-806-211. 

Tharp, V., Burns, M. & Moskowitz, H. (1981). Development and field test of psychophysical tests for DWI arrests. Final Report, DOT-HS-805-864. 

 

47 Types of Nystagmus Different Than HGN

January 16, 2007

TYPES OF NYSTAGMUS–SEPARATE FROM HORIZONTAL NYSTAGMUS

(1) Acquired;
(2) Anticipatory (induced);
(3) Arthrokinetic (induced, somatosensory);
(4) Associated (induced, Stransky’s);
(5) Audiokinetic (induced);
(6) Bartel’s (induced);
(7) Brun’s;
(8) Centripetal;
(9) Cervical (neck torsion, vestibular-basilar artery insufficiency);
(10) Circular/Elliptic/Oblique (alternating windmill, circumduction, diagonal, elliptic, gyratory, oblique, radiary);
(11) Congenital (fixation, hereditary);
(12) Convergence;
(13) Convergence-evoked;
(14) Dissociated (disjunctive);
(15) Downbeat;
(16) Drug-induced (barbiturate, bow tie, induced);
(17) Epileptic (ictal);
(18) Flash induced;
(19) Gaze-evoked (deviational, gaze-paretic, neurasthenic, seducible, setting-in);
(20) Horizontal;
(21) Induced (provoked);
(22) Intermittent Vertical;
(23) Jerk;
(24) Latent/Manifest Latent (monocular fixation, unimacular);
(25) Lateral Medullary;
(26) Lid;
(27) Miner’s (occupational);
(28) Muscle-Paretic (myasthenic);
(29) Optokinetic (induced, optomotor, panoramic, railway, sigma);
(30) Optokinetic After-Induced (post-optokinetic, reverse post-optokinetic);
(31) Pendular (talantropia);
(32) Periodic/Aperiodic Alternating;
(33) Physiologic (end-point, fatigue);
(34) Pursuit After-induced;
(35) Pursuit Defect;
(36) Pseudospontaneous;
(37) Rebound;
(38) Reflex (Baer’s);
(39) See-Saw;
(40) Somatosensory;
(41) Spontaneous;
(42) Stepping Around;
(43) Torsional;
(44) Uniocular;
(45) Upbeat;
(46) Vertical;
(47) Vestibular (ageotropic, geotropic, Bechterew’s, caloric, compensatory, electrical/faradic/galvanic, labyrinthine, pneumatic/compression, positional/alcohol, pseudocaloric).

Non-Alcohol-Related Reasons for HGN

January 16, 2007

CAUSES OF HORIZONTAL GAZE NYSTAGMUS NOT ALCOHOL RELATED

(1) problems with the inner ear labyrinth;
(2) irrigating the ears with warm or cold water under peculiar weather conditions;
(3) influenza;
(4) streptococcus infection;
(5) vertigo;
(6) measles;
(7) syphilis;
(8) arteriosclerosis;
(9) muscular dystrophy;
(10) multiple sclerosis;
(11) Korsakoff’s syndrome;
(12) brain hemorrhage;
(13) epilepsy;
(14) hypertension;
(15) motion sickness;
(16) sunstroke;
(17) eyestrain;
(18) eye muscle fatigue;
(19) glaucoma;
(20) changes in atmospheric pressure;
(21) consumption of excessive amounts of caffeine;
(22) excessive exposure to nicotine;
(23) aspirin;
(24) circadian rhythms;
(25) acute trauma to the head;
(26) chronic trauma to the head;
(27) some prescription drugs, tranquilizers, pain medications, anti-convulsants;
(28) barbiturates;
(29) disorders of the vestibular apparatus and brain stem;
(30) cerebellum dysfunction;
(31) heredity;
(32) diet;
(33) toxins;
(34) exposure to solvents, PCBs, dry-cleaning fumes, carbon monoxide;
(35) extreme chilling;
(36) lesions;
(37) continuous movement of the visual field past the eyes;
(38) antihistamine use.

Schultz v. State, 664 A.2d 60, 77 (Md. App. 1995)

The Inadequacy of Instrumental “Mouth Alcohol” Detection Systems in Forensic Breath Alcohol Measurement

January 15, 2007

Rod Gullberg, Washington State Patrol Breath Test Section
811 East Roanoke, Seattle, WA 98102
“Mouth alcohol,” resulting from regurgitation or recently consumed alcohol, has long been a concern in forensic toxicology because of the potential for biasing an end-expiratory breath alcohol measurement. Manufacturers of forensic breath alcohol instruments have attempted to address the issue in part by developing software algorithms that attempt to identify “mouth alcohol” and abort the test if it is detected. These algorithms (as in the case of the BAC Datamaster, National Patent Analytical Systems, Inc.) generally evaluate the slopes of the breath alcohol expirogram and will abort the test if the slope is sufficiently negative.
Experimental breath alcohol expirograms were collected from drinking subjects both with and without the presence of “mouth alcohol.” The data reveal that for subjects already having measurable breath alcohol, biases can exist in end-expiratory measurements and remain undetected by the “mouth alcohol” detection algorithm within the BAC Datamaster instrument. These biases occur at approximately five minutes after exposure to “mouth alcohol” because the expirogram does not conform to that assumed by the instrumental algorithm. These biases are unlikely to occur in sober subjects. Rather than relying on instrumental features to minimize the risk of “mouth alcohol” bias, sound protocol employing a 15 minute observation period and duplicate testing will enhance confidence in results to a much greater extent.

BASIC PRINCIPLE UNDERLYING BREATH TEST PROVEN UNTRUE

January 11, 2007

The Impact of Breathing Pattern and Lung Size on the Alcohol Breath Test

MICHAEL P. HLASTALA1,2,3 and JOSEPH C. ANDERSON4

1Department of Physiology and Biophysics, University of Washington, Box 356522, Seattle, WA 98195-6522, USA; 2Department of Medicine, University of Washington, Box 356522, Seattle, WA 98195-6522, USA; 3Division of Pulmonary and Critical Care Medicine, University of Washington, Box 356522, Seattle, WA 98195-6522, USA; and 4Department of Bioengineering, University of Washington, Seattle, WA 98195-5061, USA

(Received 21 March 2006; accepted 29 September 2006)

Abstract: Highly soluble gases exchange primarily with the bronchial circulation through pulmonary airway tissue. Because of this airway exchange, the assumption that end-exhaled alcohol concentration (EEAC) is equal to alveolar alcohol concentration (AAC) cannot be true. During exhalation, breath alcohol concentration (BrAC) decreases due to uptake of ethanol by the airway tissue. It is therefore impossible to deliver alveolar gas to the mouth during a single exhalation without losing alcohol to the airway mucosa. A consequence of airway alcohol exchange is that EEAC is always less than AAC. In this study, we use a mathematical model of the human lung to determine the influence of subject lung size on the relative reduction of BrAC from AAC. We find that failure to inspire a full inspiration reduces the BrAC at full exhalation, but increases the BrAC at minimum exhalation. In addition, a reduced inhaled volume can lead to an inability to provide an adequate breath volume. We conclude that alcohol exchange with the airways during the single-exhalation breath test is dependent on the lung size of the subject, with a bias against subjects with smaller lung size.

Keywords: Ethyl alcohol, Ethanol, Bronchial circulation, Airway gas exchange.

INTRODUCTION

An assumption used in the development of the
alcohol breath test (ABT) is that the ethanol concentration
in the last part of the exhaled breath is equal to
that in the alveolar gas. This long-held assumption is
the basis for justifying the ABT1 as an accurate
measure of blood alcohol concentration (BAC).
However, under normal circumstances, a single-exhalation
alcohol breath test shows a gradually and
continually increasing breath alcohol concentration
(BrAC) if the subject exhales at a constant rate
(Fig. 1). The end-exhaled alcohol concentration
(EEAC) is always lower than the alveolar alcohol
concentration (AAC). As more volume is exhaled the
BrAC continues to increase. It has recently been shown
that EEAC is less than AAC due to the exchange of
alcohol in the airways during both inspiration and
expiration.2,3,8
Earlier studies have examined the assumption of
equality between end-exhaled and AAC by comparing
ABT values with blood measurements and found a
considerable amount of variation in the ratio of EEAC
to BAC. For further evidence regarding the lack of
end-exhaled and alveolar equality, two studies10,13
have shown that EEAC is approximately 15–20%
lower than AAC on average (obtained using isothermal
rebreathing). The explanation for this variation
has been discussed before.2,8 The physiological
importance of the discrepancy between EEAC and
AAC is the subject of this study.
Two recent studies have demonstrated a relationship
between the blood:breath2 ratio (BBR) for alcohol and
body weight14 or gender11 in normal subjects. Thus, it
may be possible that the BBR for alcohol is dependent
on physiological or anatomic differences among individual
subjects.9 One anatomical feature, lung size,
depends on body size, age, gender and ethnicity.
Address correspondence to Michael P. Hlastala, Division of Pulmonary and Critical Care Medicine, University of Washington, Box 356522, Seattle, WA 98195-6522, USA. Electronic mail: hlastala@u.washington.edu
1 A list of abbreviations used in this paper is shown in Table 1.
2 The blood:breath ratio is equal to the blood alcohol concentration divided by the end-exhaled alcohol concentration (BAC/EEAC).
Annals of Biomedical Engineering (2006)
DOI: 10.1007/s10439-006-9216-3
© 2006 Biomedical Engineering Society
When an ABT is performed, subjects are not
required to control either the volume inhaled or the
volume exhaled. Under normal resting conditions, a
subject inhales and exhales a tidal volume (VT)
beginning from a functional residual capacity (FRC)
(Fig. 2). When administering an ABT, the subject is
asked to inhale ambient air and exhale into the breath
test instrument as far as possible. Although the subject
is asked to take a full inhalation, he/she is not required
to inhale to total lung capacity (TLC). Because it takes
some effort to inhale from FRC to TLC, a volume
known as inspiratory capacity (IC), it is most likely
that a subject’s lung size is less than TLC at the time
exhalation is initiated (gray line in Fig. 2). Some subjects
may exhale after inhaling only a very small volume.
The expiratory volume also varies naturally
between tests. To obtain a valid ABT, a subject can
exhale any amount between the minimum exhaled
volume required by the particular breath test instrument
(usually either 1.1 or 1.5 l)5 and the maximum
exhaled volume of the lungs, which is limited by the
vital capacity (VC), the difference between TLC and
residual volume (RV). The exhaled volume depends on
the mechanical limitations of the lungs and the relative
effort of the subject, which may vary from time to
time. For the calculations below, we assume that an
average exhaled volume is the average of the minimum
volume and the VC.
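The averaging assumption stated above can be written as a one-line calculation. The following is a minimal sketch, not code from the paper; the function name and the default minimum volume of 1.5 l are illustrative choices (the text notes that instruments usually require either 1.1 or 1.5 l).

```python
def average_exhaled_volume(vital_capacity_l, minimum_volume_l=1.5):
    """Mean of the instrument's minimum accepted volume and the vital capacity (liters).

    Hypothetical helper: it only encodes the averaging assumption in the text.
    """
    if vital_capacity_l < minimum_volume_l:
        raise ValueError("vital capacity is below the minimum accepted volume")
    return (minimum_volume_l + vital_capacity_l) / 2.0


for vc in (2.0, 4.0, 6.0):
    avg = average_exhaled_volume(vc)
    print(f"VC = {vc:.1f} l -> average exhaled volume = {avg:.2f} l")
```

For a vital capacity of 6 l this gives 3.75 l, the value quoted later in the Results.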
Lung volume varies substantially among individual
human subjects (both normal and with lung disease). In
1991, the American Thoracic Society (ATS) compiled
data from three international societies (the ATS, the
European Community for Coal and Steel, and the
European Society for Clinical Respiratory Research)
and published a summary document of lung volumes in
normal, non-smoking, human subjects for clinical use
in interpretation of pulmonary function tests.1 Collectively,
the summary of data (Table 2) shows that, in
adults, lung volume increases with body height and
decreases with age. Lung volumes are smaller in African
Americans, both males and females, than in their Caucasian
height-, age-, and gender-matched counterparts.
For either racial group, females have smaller vital
capacities than males. Because individuals with smaller
lung size must exhale a greater fraction of their lung
volume to fulfill any minimum volume requirement for
a valid sample, we reasoned that a subject with a smaller
lung volume would exhale farther along the increasing
exhaled partial pressure profile before an end-exhaled
sample is taken (see Fig. 3). Consequently, the alcohol
breath test would tend to overpredict the BAC for
individuals with small lung volumes.
We use a mathematical model2 to explore the
dependence of BrAC on lung size (a function of height,
age, gender, and race), inspiratory volume, and expiratory
volume. We hypothesize that BBR will depend
on the subject’s physical characteristics as well as the
level of cooperation.
FIGURE 2. Lung volume tracing for a single exhalation
maneuver. A subject breathes tidal volumes (VT) at functional
residual capacity (FRC) and then expands his lungs to total
lung capacity (TLC) by inhaling a volume equal to the inspiratory
capacity (IC). The subject exhales his vital capacity (VC)
at a constant flow rate, which causes his lung volume to
approach residual volume (RV). The gray tracing shows the
lung volume dimensions if the subject only inhales 50% of IC
during the prolonged inhalation.
FIGURE 1. Exhaled ethanol concentration, normalized by alveolar alcohol concentration (BrAC/AAC, vertical axis), plotted against exhaled volume in liters (horizontal axis) over a full exhalation at a constant flow, with phases I, II, and III of the profile marked (from Ref. 4).
TABLE 1. Glossary of abbreviations.
AAC Alveolar alcohol concentration
ABT Alcohol breath test
ATS American Thoracic Society
BAC Blood alcohol concentration
BBR Blood:breath ratio
BrAC Breath alcohol concentration
EEAC End-exhaled alcohol concentration
FRC Functional residual capacity
IC Inspiratory capacity
RR Respiratory rate
RV Residual volume
TLC Total lung capacity
VC Vital capacity
VI Volume of inspiration
VT Tidal volume
M.P. HLASTALA AND J.C. ANDERSON
METHODS
Mathematical Model
A detailed description of the model has been published
previously.2,4,15 Only the essential features will
be described here. The airway tree has a symmetric
bifurcating structure through 18 generations. The
respiratory bronchioles and alveoli are lumped
together into a single well-mixed alveolar unit. Axially,
the airways are divided into 480 control volumes.
TABLE 2. Predicted forced vital capacity (l) for healthy, non-smoking subjects: Caucasian and African American, male and female.
Columns: height (in), height (m), age (years), then predicted vital capacity (l) for Caucasian male, Caucasian female, African American male, African American female.
51 1.30 20 2.587 2.137 2.866 2.244
51 1.30 40 2.195 1.721 2.430 1.810
51 1.30 60 1.803 1.305 1.994 1.376
55 1.40 20 3.178 2.560 3.191 2.541
55 1.40 40 2.786 2.144 2.755 2.107
55 1.40 60 2.394 1.728 2.319 1.673
59 1.50 20 3.770 2.984 3.517 2.838
59 1.50 40 3.378 2.568 3.081 2.404
59 1.50 60 2.986 2.152 2.645 1.970
63 1.60 20 4.361 3.407 3.842 3.135
63 1.60 40 3.969 2.991 3.406 2.701
63 1.60 60 3.577 2.575 2.970 2.267
67 1.70 20 4.952 3.830 4.167 3.432
67 1.70 40 4.560 3.414 3.731 2.998
67 1.70 60 4.168 2.998 3.295 2.564
71 1.80 20 5.544 4.254 4.493 3.729
71 1.80 40 5.152 3.838 4.057 3.295
71 1.80 60 4.760 3.422 3.621 2.861
75 1.90 20 6.135 4.677 4.818 4.026
75 1.90 40 5.743 4.261 4.382 3.592
75 1.90 60 5.351 3.845 3.946 3.158
79 2.00 20 6.727 5.100 5.144 4.323
79 2.00 40 6.335 4.684 4.708 3.889
79 2.00 60 5.943 4.268 4.272 3.455
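The entries of Table 2 appear to vary linearly with height and with age within each column. As a hedged illustration (this is a reading of the tabulated numbers, not the ATS reference equations themselves), the sketch below recovers the two slopes for the Caucasian male column from three table entries and checks them against a fourth. All names are hypothetical.

```python
# Three entries copied from the Caucasian male column of Table 2:
# (height in m, age in years) -> predicted vital capacity in liters.
vc_caucasian_male = {
    (1.30, 20): 2.587,
    (1.30, 40): 2.195,
    (2.00, 20): 6.727,
}

# Slopes estimated from the table entries (an assumption of linearity).
height_slope_l_per_m = (
    vc_caucasian_male[(2.00, 20)] - vc_caucasian_male[(1.30, 20)]
) / (2.00 - 1.30)
age_slope_l_per_year = (
    vc_caucasian_male[(1.30, 40)] - vc_caucasian_male[(1.30, 20)]
) / (40 - 20)


def predicted_vc(height_m, age_years):
    """Linear extrapolation from the (1.30 m, 20 years) table entry."""
    base = vc_caucasian_male[(1.30, 20)]
    return (base
            + height_slope_l_per_m * (height_m - 1.30)
            + age_slope_l_per_year * (age_years - 20))


print(f"height slope: {height_slope_l_per_m:.3f} l per m")
print(f"age slope:    {age_slope_l_per_year:.4f} l per year")
print(f"VC at 1.70 m, 40 years: {predicted_vc(1.70, 40):.3f} l (table lists 4.560)")
```

The same check can be repeated for the other three columns by swapping in their entries.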
FIGURE 3. Effect of lung size (as represented by vital capacity) on the exhalation profile. At a given exhaled volume (e.g., 1.5 l), BrAC/AAC is inversely related to lung size. The model simulated a lung performing an IC inhalation (IC = 0.75 · VC) and a VC exhalation at a rate of 200 ml s⁻¹. The horizontal solid bars indicate the end-exhaled normalized BrAC at an average exhaled volume. The relative average end-exhaled breath to alveolar concentration ratios are 0.767, 0.722 and 0.705 for subject vital capacities of 2.0, 4.0, and 6.0 l, respectively.
Single-Exhalation Alcohol Breath Test
Radially, the airways are divided into six concentric
layers: (1) the airway lumen, (2) a thin mucous layer,
(3) connective tissue (epithelium and mucosal tissue),
(4) the bronchial circulation, (5) the adventitia, and (6)
the pulmonary circulation. Functionally, the upper
respiratory tract and cartilaginous airways (generation < 10)
only have the first four layers. Within each
radial layer, concentration and temperature values are
bulk averages for the entire layer. Mass and energy are
transported between lumenal control volumes by bulk
convection and axial diffusion. Radial transport
between the gas phase and mucous layer is described
with heat and mass transfer coefficients. Radial
transport of water and soluble gas between concentric
layers occurs via filtration (from bronchial circulation
to mucus) and diffusion (Fick’s law). In the alveolar
unit, the concentration of soluble gas is allowed to vary
with time and depends on the pulmonary blood flow,
ventilation, blood solubility, and concentration of
soluble gas in the incoming blood as described by a
mass balance on the alveolar compartment.
Because airway volume increases with increasing
lung size, the lengths and diameter of the intraparenchymal
airways were scaled to ensure the ratio of the
airway volume to the VC was constant. Since the VC
of the Weibel lung model is 5000 ml, these dimensions
were scaled by the factor (VC/5000)^(1/3), with VC in ml. None of
these airway dimensions changed dynamically during
the breathing cycle. The dimensions of the airway wall
compartments were calculated using data and a
method outlined previously.2
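The scaling rule can be illustrated numerically. Assuming that every airway length and diameter is multiplied by the same factor, the cube root of VC divided by the 5000 ml reference, airway volume scales in proportion to VC. The sketch below only evaluates that factor; the names are hypothetical and nothing here reproduces the authors' model code.

```python
WEIBEL_VC_ML = 5000.0  # vital capacity of the reference (Weibel) lung model, from the text


def airway_scale_factor(vc_ml):
    """Linear scale factor applied to airway lengths and diameters."""
    return (vc_ml / WEIBEL_VC_ML) ** (1.0 / 3.0)


for vc_ml in (2000.0, 4000.0, 6000.0):
    factor = airway_scale_factor(vc_ml)
    # Volume scales with the cube of a linear dimension.
    print(f"VC = {vc_ml:.0f} ml: linear factor = {factor:.3f}, "
          f"volume factor = {factor ** 3:.3f}")
```

Because the volume factor equals VC/5000, the ratio of airway volume to VC stays constant across lung sizes, which is the condition stated in the text.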
Mass and energy balances around a control volume
produce three partial differential equations in time, t, and space,
z, and nine ordinary differential equations. The equations
are solved simultaneously for the following 12
dependent variables: the mole fraction of soluble gas in
the air, mucous, connective tissue, bronchial bed, and
adventitial tissue layers; the temperature of the air,
mucous, connective tissue, bronchial bed, and adventitial
tissue layers; the mole fraction of water in the air;
and the mucous thickness. The 12 differential equations
are solved numerically using previously published
boundary conditions.2 The spatial derivatives are
approximated by upwind finite difference while the
time derivatives are solved using LSODE, an integrating
software package developed by Hindmarsh.7
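The numerical strategy described above, upwind differences in space followed by time integration of the resulting ordinary differential equations, can be shown on a toy problem. The sketch below is not the authors' model: it solves a single advection equation on a periodic domain, and it swaps LSODE for a plain forward Euler step so that it runs without external libraries. Every name and parameter value is an assumption chosen for illustration.

```python
def upwind_rhs(u, velocity, dz):
    """Semi-discrete right-hand side of du/dt = -velocity * du/dz.

    First-order upwind difference for velocity > 0. Index -1 wraps around,
    which makes the domain periodic.
    """
    return [-velocity * (u[i] - u[i - 1]) / dz for i in range(len(u))]


def integrate(u0, velocity, dz, dt, steps):
    """Forward Euler time stepping (a stand-in for a stiff solver such as LSODE)."""
    u = list(u0)
    for _ in range(steps):
        rhs = upwind_rhs(u, velocity, dz)
        u = [ui + dt * ri for ui, ri in zip(u, rhs)]
    return u


n_cells = 50
dz = 1.0 / n_cells
velocity = 1.0
dt = 0.5 * dz / velocity  # Courant number 0.5 keeps the explicit scheme stable

# Square pulse of concentration as the initial condition.
u0 = [1.0 if 10 <= i < 20 else 0.0 for i in range(n_cells)]
u = integrate(u0, velocity, dz, dt, steps=40)

print(f"total before: {sum(u0):.6f}, total after: {sum(u):.6f}")
print(f"min = {min(u):.6f}, max = {max(u):.6f}")
```

With a Courant number between 0 and 1, each update is a weighted average of neighboring values, so the solution stays within its initial bounds and the total is conserved on the periodic domain. The upwind scheme smears the pulse, which is the usual price of its robustness.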
Computer Simulations
Before an ABT was simulated, the model first had to
reach breath-to-breath steady-state conditions. The
temperature, water concentrations, and ethanol concentrations
within the mathematical model were
brought to steady-state conditions by simulating tidal
breathing at FRC. A respiratory rate of 12 br min⁻¹, a
sinusoidal flow waveform, and a tidal volume equal to
10% of VC were used for the case study (Table 2). For
the parameter study, tidal volume was varied between
200 and 600 ml in 100 ml increments. The inspired air
temperature and relative humidity were set to 23 °C
and 50%, respectively. The bronchial blood flow rate
was set to 1 ml s⁻¹. The concentration of ethanol in the
pulmonary arterial blood was constant and equal to
0.10 g dl⁻¹ of blood. Steady-state conditions were
reached when the end-exhaled water and ethanol
concentrations changed by less than 0.1% between
breaths. Then, the model simulated a single inhalation
of a volume equal to or a fraction of IC, the volume
from FRC to TLC, at a constant rate of 1500 ml s⁻¹.
Inspiratory capacity was approximated to be 75% of
the VC.6 Then, the model simulated a prolonged
exhalation; the lung was emptied at a rate of
200 ml s⁻¹ until the lung volume reached RV.
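The simulated maneuver implies simple durations for each phase. The sketch below is an illustration under the stated assumptions (IC = 0.75 · VC, inhalation at 1500 ml per second, exhalation at 200 ml per second down to RV); the function and parameter names are hypothetical.

```python
def maneuver_durations(vc_ml, inhaled_fraction_of_ic=1.0,
                       inhale_rate_ml_s=1500.0, exhale_rate_ml_s=200.0):
    """Return (inhalation time, exhalation time) in seconds.

    The exhaled volume is the volume between FRC and RV (VC - IC)
    plus whatever was inhaled above FRC.
    """
    ic_ml = 0.75 * vc_ml                      # inspiratory capacity, as assumed in the text
    inhaled_ml = inhaled_fraction_of_ic * ic_ml
    exhaled_ml = (vc_ml - ic_ml) + inhaled_ml
    return inhaled_ml / inhale_rate_ml_s, exhaled_ml / exhale_rate_ml_s


for fraction in (1.0, 0.5):
    t_in, t_out = maneuver_durations(4000.0, fraction)
    print(f"VC = 4000 ml, inhaled {fraction:.0%} of IC: "
          f"inhale {t_in:.1f} s, exhale {t_out:.1f} s")
```

For a 4 l vital capacity and a full inhalation, the simulated exhalation lasts 20 s, which is much longer than the 4 to 6 s minimum that instruments typically require.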
RESULTS
For highly soluble gas like ethyl alcohol, exhaled
concentration continues to increase with continued
exhalation due to airway gas exchange. An example of
an exhaled ethyl alcohol profile is shown in Fig. 1. In
this example, a male subject with a BAC of about 0.09 g/dl
inhaled quickly to TLC, exhaled at a constant flow
rate, and stopped exhalation at RV.4 Several different
expiratory profiles for the same subject are shown.
During exhalation at a constant exhaled flow rate, the
exhaled ethanol concentration rises continuously during
the final phase (phase III) of the ethanol profile.
When the subject stops exhalation (either due to
reaching RV or simply because the subject chooses to
stop), the alcohol concentration plotted against time
levels off because exhalation has stopped and no new
air enters the breath test machine.8 At this time, a
sample is taken and assumed to be “alveolar” in nature.
However, any breath sample is “always” lower in
alcohol concentration than AAC. The classical interpretation
assumes that the EEAC is related to the BAC
with an average BBR of 2100. This factor neglects the
exchange of alcohol with the airways of the lungs and
any variability in this ratio among individuals.
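The classical conversion amounts to multiplying the measured breath concentration by a fixed ratio. The sketch below shows that arithmetic with the 2100 figure from the text; the unit choices (mg of ethanol per liter of breath, g per deciliter of blood) and the function names are assumptions made for illustration.

```python
ASSUMED_BBR = 2100.0  # blood:breath ratio used by the classical interpretation


def brac_from_bac(bac_g_per_dl, bbr=ASSUMED_BBR):
    """Breath concentration in mg per liter implied by a blood concentration."""
    bac_mg_per_l = bac_g_per_dl * 10.0 * 1000.0  # g/dl -> g/l -> mg/l
    return bac_mg_per_l / bbr


def bac_from_brac(brac_mg_per_l, bbr=ASSUMED_BBR):
    """Blood concentration in g per dl reported for a breath concentration."""
    return brac_mg_per_l * bbr / 1000.0 / 10.0   # mg/l -> g/l -> g/dl


brac = brac_from_bac(0.10)
print(f"BAC 0.10 g/dl corresponds to about {brac:.3f} mg per liter of breath")
print(f"Converted back: {bac_from_brac(brac):.2f} g/dl")
```

Because the ratio is fixed in advance, any subject whose actual ratio differs from it receives a reported BAC that is off by the same proportion.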
From the model’s predictions of exhaled ethanol
profiles from human subjects,4 we can describe the
mechanisms underlying ethanol exchange in the airways.
As fresh air is inhaled, it absorbs ethanol from
the mucous layer, thereby depleting the ethanol concentration
in the airway wall. Because of the small
bronchial blood flow (Qbr) and the significant diffusion
barrier between the bronchial circulation and mucous
layer, the mucus is not replenished with ethanol before
exhalation begins. During exhalation, respired air
encounters a lower concentration of ethanol in the
mucus and, therefore, a large driving force for the
deposition of ethanol onto the mucus. This large
air-to-mucus gradient promotes recovery of ethanol by the
mucous layer, decreases the ethanol concentration in
the air, and delays the rise in ethanol concentration at
the mouth. A large (small) air-to-mucus gradient causes
a slowly (rapidly) increasing phase III slope. These
absorption–desorption phenomena decrease the ethanol
concentration leaving the lung (relative to the
alveolar concentration) throughout exhalation and
are the major mechanisms of pulmonary ethanol
exchange.
The mathematical model simulated the effect of lung
size on the exhalation profile (Fig. 3). After a steady
state was reached during tidal breathing (RR = 12 br
min⁻¹ and VT = 400 ml), the model simulated a full
inhalation from FRC to TLC and then a constant
(200 ml s⁻¹) exhalation to RV. These conditions were
simulated in five lung sizes as represented by the VC
that varied from 2 l to 6 l. The normalized BrAC after
a maximum exhalation (to RV) was 0.79 for all five
lung sizes and appears to be unaffected by lung size
(i.e., VC). However, many times subjects do not exhale
their entire VC and, in addition, most alcohol breath-testing
instruments only require a minimum exhaled
volume (e.g., 1.5 l) before a breath test is acceptable.
We examined the normalized BrAC in Fig. 3 after 1.5 l
of air had been exhaled from lungs of different sizes:
small (VC = 2 l), medium (VC = 4 l) and large
(VC = 6 l). The normalized BrAC was 0.74, 0.61, and
0.55, respectively. At this exhaled volume, the ratio of
change in normalized BrAC to change in lung size is
−0.048 l⁻¹. Additionally, we examined how lung size
affected the normalized BrAC (Fig. 3) after an average
exhalation. We assumed that, on average, an individual
would exhale a volume that is the mean of the
minimum (1.5 l) and maximum (VC) volume. Thus,
for an individual with VC = 6 l, an average exhaled
volume (after an IC inhalation) is 3.75 l and results in a
normalized BrAC of 0.705. Subjects with smaller lung
size, 4 and 2 l, and providing an average exhalation
have normalized BrAC of 0.722 and 0.767, respectively.
For an average exhalation, individuals with
smaller lung size provide BrAC samples that are
greater than those with larger lung size because of the
minimum exhalation volume requirement in combination
with the mechanics of airway gas exchange. The
effect of lung size on this average BrAC is −0.015 l⁻¹.
Thus, a one liter increase in VC decreases the normalized
BrAC at this average volume by 0.015.
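The two sensitivities quoted above can be checked from the normalized BrAC values given for the 2, 4, and 6 l lungs. The sketch below uses a simple end-to-end slope between the smallest and largest lung; how the authors obtained their figures is not stated, so this is only a consistency check with hypothetical names.

```python
# Normalized BrAC (BrAC/AAC) by vital capacity in liters, as quoted in the text.
brac_at_minimum_exhalation = {2.0: 0.74, 4.0: 0.61, 6.0: 0.55}     # after 1.5 l exhaled
brac_at_average_exhalation = {2.0: 0.767, 4.0: 0.722, 6.0: 0.705}  # after the average volume


def end_to_end_slope(values_by_vc):
    """Change in normalized BrAC per liter of VC between the extreme lung sizes."""
    vcs = sorted(values_by_vc)
    smallest, largest = vcs[0], vcs[-1]
    return (values_by_vc[largest] - values_by_vc[smallest]) / (largest - smallest)


slope_min = end_to_end_slope(brac_at_minimum_exhalation)
slope_avg = end_to_end_slope(brac_at_average_exhalation)
print(f"minimum exhalation: {slope_min:.4f} per liter of VC")
print(f"average exhalation: {slope_avg:.4f} per liter of VC")
```

Both slopes are close to the values reported in the text, and the comparison shows that the lung-size effect is roughly three times stronger at the minimum exhalation than at the average one.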
The minimum, average, and maximum BrAC values
for subjects with different vital capacities are shown in
Fig. 4. Results are shown for vital capacities varying
between 2.0 and 7.0 l and for an inspiration of a full
IC. As lung VC increases, the average BrAC decreases.
For lungs with vital capacities less than 2.0 l, it is often
difficult for the subject to fulfill the 1.5 l
minimum exhalation volume.
We simulated the effect of inspiratory volume on the
exhalation profile for a given lung size (Fig. 5). Once a
periodic steady-state was achieved (VT = 400 ml), the
model simulated an inhalation from FRC. The inhaled
volume depended on the simulation. For a maximum
IC inhalation, the inhaled volume was assumed to be
0.75 · VC. Smaller inhaled volumes of 66%, 33%, and
10% of IC were simulated. After inhalation, a constant
(200 ml s⁻¹) exhalation to RV was simulated. Figure 5
shows the effect of inhaled volume on normalized
BrAC from three lungs of varying size, VC = 2 l
(panel A), 4 l (panel B) and 6 l (panel C). For every VC
studied, a decrease in inhaled volume causes: (1) an
increase in normalized BrAC at a given exhaled volume;
(2) an increase in the normalized BrAC from a
minimum (1.5 l) and average exhalation; and (3) a
decrease in the normalized BrAC after a maximum
exhalation to RV. Specifically, a decrease in inspired
volume in a lung with VC = 4 l causes the normalized
BrAC after a minimum exhalation to increase by
0.048 l⁻¹, the normalized BrAC after an average
exhalation to increase by 0.004 l⁻¹, and the normalized
BrAC after a maximum exhalation to decrease by
0.022 l⁻¹. These rates of change of normalized BrAC
per inspired volume are a function of VC. A two liter
increase (decrease) in VC causes these rates to decrease
(increase) by 15%. As compared with individuals with
small VC, subjects with large VC can choose from
more possible inspired volumes that will result in a
minimum exhaled volume and an acceptable breath
test. We examined the effect of tidal volume on BrAC
and found that a 100 ml increase in tidal volume
decreased all three measures (minimum, average, and
maximum exhalation) of normalized BrAC by 0.01.
FIGURE 4. The relationship between normalized breath alcohol concentration and lung size (based on vital capacity) is shown for IC inhalations followed by different exhaled volumes: maximum (VC), average and minimum (1.5 l). See text for definitions.
FIGURE 5. Effect of inspiratory volume on the exhalation profile for a given lung size. At a given exhaled volume (e.g., 1.5 l), BrAC/AAC is inversely related to volume of gas inhaled (VI). The model simulated a lung inhaling a volume, VI, from FRC and exhaling to RV at a rate of 200 ml s⁻¹. VC represents lung size. For each panel, VC is 2 l (panel A), 4 l (panel B), and 6 l (panel C).
The variation of lung volume among individuals of
differing gender, body height and age is shown in
Table 2. Typical values are presented in Table 2 for
normal Caucasian and African American male and
female adults. Lung volumes are greater in equally sized
and aged males compared with females, in Caucasians
compared with African Americans and in younger
adults compared with older adults. Table 3 shows the
predicted BrAC normalized by AAC taken from Fig. 4.
The predictions of the mathematical model show a
greater BrAC (relative to AAC) in all cases comparing a
smaller lung volume with a larger lung volume.
DISCUSSION
Alcohol breath-testing instruments require a minimum
exhaled volume before a breath sample is taken
at the end of an exhalation. For a subject with a small
lung size, a greater fraction of the VC must be exhaled
before the sample criteria are fulfilled. Most breath test
instruments require a minimum exhalation pressure (or
flow) for a minimal duration of time (4–6 s) and a
minimal exhalation volume (between 1.1 l and 1.5 l).
For our calculations, we chose 1.5 l as the minimum
exhaled volume. Once the minimum criteria are fulfilled,
a sample will be taken when the change in
exhaled alcohol partial pressure levels off (always
achievable when the exhaled flow is stopped). For a
subject with a VC of 6 l using a BAC Verifier Datamaster
(minimum volume is 1.5 l), a sample can be
obtained anywhere between 1.5 and 6.0 l of exhalation
because the subject may choose to stop exhalation
anywhere between 1.5 l and VC. For a subject with a VC
of 2 l, a sample can be obtained using a BAC Verifier
Datamaster anywhere between 1.5 and 2.0 l of exhalation.
A subject with a small lung size will proceed
further up the increasing BrAC exhaled profile before a
sample is taken (Fig. 3).
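The size of the allowed sampling window, expressed as a fraction of vital capacity, makes the point concrete. The following is a minimal sketch with hypothetical names, assuming the 1.5 l minimum volume used in the paper's calculations.

```python
def minimum_fraction_of_vc(vc_l, minimum_volume_l=1.5):
    """Smallest fraction of the vital capacity that yields an accepted sample."""
    if vc_l < minimum_volume_l:
        raise ValueError("subject cannot reach the minimum volume")
    return minimum_volume_l / vc_l


for vc_l in (6.0, 4.0, 2.0):
    fraction = minimum_fraction_of_vc(vc_l)
    print(f"VC = {vc_l:.1f} l: sample accepted anywhere from "
          f"{fraction:.0%} to 100% of VC")
```

A subject with a 6 l vital capacity can stop after a quarter of a full exhalation, while a subject with a 2 l vital capacity must exhale at least three quarters of it, which places the sample much further along the rising BrAC profile.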
One of the fundamental assumptions of the ABT is
that during exhalation, the BrAC continues to increase
until alveolar air reaches the mouth. At this point the
BrAC levels off. This observation has been assumed to
indicate that EEAC is equal to AAC. However, breath
alcohol always increases during exhalation as air
moves out of the mouth,4 never reaching AAC. The
flatness of the slope of the exhaled alcohol profile
simply means that exhalation has stopped. It is not an
indication of alveolar air. Additional support of this
idea follows from two studies using isothermal rebreathing
in human subjects,10,13 which showed that
EEAC (with a single-exhalation maneuver) is always
less than AAC. The difference, on average, is
approximately 15%8 and consistent with the ideas
described in this paper. Individuals with smaller lung
size are predicted to have a smaller difference between
EEAC and AAC such that an individual with a smaller
lung size, would have an ABT that is greater than an
individual with a larger lung size.
TABLE 3. Relative BrAC comparisons.
                                              Predicted   BrAC/AAC
Subject                                       VC (l)      Min     Avg     Max
55 in vs. 75 in male, 40 years
  55 in male, 40 years                        2.786       0.681   0.747   0.794
  75 in male, 40 years                        5.743       0.509   0.675   0.770
  BrAC ratio of small to large volume                     1.34    1.11    1.03
67 in female vs. 67 in male, 40 years
  67 in female, 40 years                      3.414       0.629   0.723   0.785
  67 in male, 40 years                        4.560       0.560   0.696   0.776
  BrAC ratio of small to large volume                     1.12    1.04    1.01
67 in AA male vs. 67 in Caucasian male, 40 years
  67 in AA male, 40 years                     3.731       0.607   0.714   0.782
  67 in Caucasian male, 40 years              4.560       0.560   0.696   0.776
  BrAC ratio of small to large volume                     1.08    1.03    1.01
75 in male, 60 years vs. 20 years
  75 in male, 60 years                        5.544       0.516   0.678   0.771
  75 in male, 20 years                        6.351       0.487   0.667   0.767
  BrAC ratio of small to large volume                     1.06    1.02    1.00
The major thesis of this paper is that lung size and
breathing pattern influence the BrAC reading determined
with a breath-testing instrument. Figure 3
shows exhaled alcohol profiles for subjects taking a full
inspiration followed by a full expiration. For each lung
size (represented by VC), the end exhaled BrAC is the
same. In other words, if a subject takes a full inspiration
followed by a full exhalation, there would be no
size dependence. If these subjects were to exhale just to
the minimum volume requirement (1.5 l), the greatest
discrepancy is predicted between subjects with differing
lung size. Everything else being equal (including
BAC), the subject with the smallest lung size would
have the greatest BrAC. Table 3 summarizes this effect
by comparing the relative ratio of BrAC between two
hypothetical subjects that differ in height, gender, race,
or age. Comparing a male and female of the same
height, the female has a minimum exhalation BrAC
that is approximately 12% greater than the male.
Comparing a 55-inch tall male with a 75-inch tall male,
at minimum exhalation, the smaller male has a 34%
greater BrAC than the taller male. With a minimum
exhalation, the overestimate for the smaller lung individual
is substantial.
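The ratios reported in Table 3 follow directly from the BrAC/AAC entries for each pair of subjects. The sketch below recomputes two of the comparisons from the tabulated values; the data structure and names are illustrative only.

```python
# BrAC/AAC at (minimum, average, maximum) exhalation, copied from Table 3.
table3 = {
    "55 in male, 40 years": (0.681, 0.747, 0.794),
    "75 in male, 40 years": (0.509, 0.675, 0.770),
    "67 in female, 40 years": (0.629, 0.723, 0.785),
    "67 in male, 40 years": (0.560, 0.696, 0.776),
}


def small_to_large_ratios(small, large):
    """Ratio of BrAC for the smaller-lung subject to the larger-lung subject."""
    return tuple(s / l for s, l in zip(table3[small], table3[large]))


height_ratios = small_to_large_ratios("55 in male, 40 years", "75 in male, 40 years")
gender_ratios = small_to_large_ratios("67 in female, 40 years", "67 in male, 40 years")

for label, ratios in (("height", height_ratios), ("gender", gender_ratios)):
    formatted = ", ".join(f"{r:.2f}" for r in ratios)
    print(f"{label} comparison (min, avg, max): {formatted}")
```

In both comparisons the bias shrinks from the minimum to the maximum exhalation, which matches the 34% and 12% figures quoted for the minimum exhalation.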
On the average, a subject with a valid breath test can
exhale to any point between the minimum volume and
the maximum exhalation. When the subject stops
exhaling, new breath is no longer being delivered for
analysis. Therefore, the BrAC levels off when plotted
against time. An average of the different exhalation
volumes would be approximately equal to the mean of
the volumes exhaled at 1.5 l and the maximum exhalation.
For hypothetical subjects that differ in either
their height, gender, race or age, the ratios of average
BrAC between matched subjects are shown in Column
4 of Table 3. Comparing a 67-inch tall 40-year-old
male with a female of the same height and age, the
female has an average exhaled BrAC that is approximately
4% greater than the male. Comparing a 55-inch
tall 40-year-old male with a 75-inch tall 40-year-old
male, at average exhalation, the smaller male has an
11% greater BrAC than the taller male. Comparing a
67-inch tall 40-year-old African American male with a
67-inch tall 40-year-old Caucasian male, at average
exhalation, the African American male has a 3%
greater BrAC than the Caucasian male. Comparing a
75-inch tall 60-year-old Caucasian male with a 75-inch
tall 20-year-old Caucasian male, at average exhalation,
the older male has a 2% greater BrAC
than the younger male. With an average exhalation,
the bias for the smaller lung individual is less than the
bias predicted for the minimum exhalation. The largest
discrepancy is related to body height because of the
greatest difference in relative lung size.
The mechanism of airway gas exchange has been
described briefly above and used to explain how ethanol
exchanges in the lung.2–4 Based on this mechanism
of ethanol exchange, the effect of changes in
inspired volume on BrAC can be understood. A small
inhaled volume will reduce the ethanol concentration
in the airway mucus and tissue layers to a lesser extent
than a large inhaled volume. During exhalation, the
former case will have a smaller air-to-mucus gradient
than the latter case. A smaller gradient causes less
ethanol to be deposited to the airway surface and, as a
result, the BrAC rises more rapidly when the inhaled
volume is small than when it is large (Fig. 5). The
maximum BrAC/AAC depends on the ratio of inspiratory-
to-expiratory time, but because the flow rates
are prescribed, inhaled volumes are defined by percent
of VC and exhalation always proceeds to RV, the
maximum BrAC/AAC only depends on inhaled volume
(VI) as shown in Fig. 5.
The ability to fulfill the minimum exhalation criteria
for a breath test instrument is limited in individuals
with smaller lungs and less than full inhalations. Figure
5 illustrates the combined impact lung size and
inspiratory volume have on the ability to provide a
minimum sample volume. As the size of the individual’s
lungs decreases, it becomes more important to
inspire a greater volume before exhalation. This finding
is consistent with the observations of Jones and
Andersson12 showing the probability of failing to
provide a minimum sample is greater in females than
males. Both genders show an increase in the probability
of an insufficient sample with increasing age.
There are two recent studies that can be used to
compare with our model predictions. Skåle et al.14 and
Jones and Andersson11 determined the blood–breath
ratio (or partition ratio) for several subjects (male and
female) with varying heights, ages and body weight.
Jones and Andersson studied 9 male and 9 female subjects and found
average BBRs of 2553±576 for males and 2417±494
for females. Although not statistically significant, the
trend agrees with our predictions. The ratio of the male
to the female BBR is 1.056; that is, the females, with their smaller lung size, had a
5.6% greater BrAC than the males. Skåle et al. studied
9 male and 15 female subjects and found that the
blood–breath ratio was dependent on body weight.
The average BBR for subjects with body weights of
50–70 kg was approximately 2250 while the BBR for
subjects with body weights of 90–100 kg was approximately
2476. The ratio is 1.10. The BrAC for the
smaller subjects was 10% greater than the larger
subjects. Neither of these two papers measured lung
VC as this was not part of their hypotheses. So we
cannot directly compare our data. However the trends
are consistent with the hypothesis put forward in this
paper that individuals with smaller lung size have
greater BrAC in comparison to the BAC.3
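The percentage differences quoted from the two studies follow from the reported mean BBRs. For a fixed BAC, BrAC is inversely proportional to BBR, so the BrAC of the smaller group relative to the larger group equals the larger group's BBR divided by the smaller group's BBR. The sketch below performs that arithmetic on the published means only; it ignores the reported standard deviations, and the names are hypothetical.

```python
def relative_brac(bbr_larger_subjects, bbr_smaller_subjects):
    """BrAC of the smaller-lung group relative to the larger-lung group at equal BAC."""
    return bbr_larger_subjects / bbr_smaller_subjects


# Mean BBRs as quoted in the text.
gender_ratio = relative_brac(bbr_larger_subjects=2553.0, bbr_smaller_subjects=2417.0)
weight_ratio = relative_brac(bbr_larger_subjects=2476.0, bbr_smaller_subjects=2250.0)

print(f"females vs. males:        {gender_ratio:.3f} ({gender_ratio - 1:.1%} greater BrAC)")
print(f"50-70 kg vs. 90-100 kg:   {weight_ratio:.3f} ({weight_ratio - 1:.1%} greater BrAC)")
```

These are comparisons of group means, so they indicate the direction of the trend rather than a tested effect.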
3 The Blood–Breath Ratio (BBR) is a commonly used term in forensic science. Because alcohol is a very highly soluble gas, the ratio of concentration in the blood normalized by that in the breath is a very large number (typically around 2000). For a given Blood Alcohol Concentration (BAC), the Breath Alcohol Concentration (BrAC) is about 1/2000 × BAC. With smaller lung volumes, the BrAC is greater, hence the BBR (= BAC/BrAC) is lesser. In one case the BrAC is in the numerator (BrAC/AAC). In the other case, the BrAC is in the denominator. So a greater BBR is the same as a lesser BrAC/AAC.
The present hypothesis is consistent with published
data and with the mechanisms of pulmonary gas exchange.
We encourage future investigators to include
the measurement of lung VC with the measurements
of BBR in order to provide data to test our
hypothesis. Surely, if there is anatomically dependent
variation in the alcohol breath test, it is important to
make corrections for the bias of the test. Once these
data are obtained, several possible alternative solutions
can be used: appropriate corrections to the
BrAC values can be made; adjustable legal limits can
be used for individuals of differing lung size; or rebreathing
can be used to obtain a better sample of
AAC.
In conclusion, alcohol exchanges between the respired
air and the airway tissue during both inspiration
and expiration. This airway gas exchange causes the
exhaled alcohol concentration to always be less than
the AAC. A consequence of this airway exchange is
that BrAC depends on lung size and the amount of
effort provided by the subject.
ACKNOWLEDGMENTS
This work was supported, in part, by National
Institute for Biomedical Imaging and Bioengineering
Grant T32 EB001650 and by National Heart, Lung,
and Blood Institute Grants HL24163 and HL073598.
REFERENCES
1. American Thoracic Society. Lung function testing: Selection of reference values and interpretative strategies. Am. Rev. Respir. Dis. 144:1202–1218, 1991.
2. Anderson, J. C., A. L. Babb, and M. P. Hlastala. Modeling soluble gas exchange in the airways and alveoli. Ann. Biomed. Eng. 31:1402–1422, 2003.
3. Anderson, J. C. and M. P. Hlastala. Breath tests and airway gas exchange. Pulm. Pharmacol. Ther. in press, 2006.
4. George, S. C., A. L. Babb, and M. P. Hlastala. Dynamics of soluble gas exchange in the airways. III. Single-exhalation breathing maneuver. J. Appl. Physiol. 75:2439–2449, 1993.
5. Harding, P. Methods for breath analysis. In: Medical–Legal Aspects of Alcohol (4th ed.), edited by Garriott J. C. Tucson: Lawyers & Judges Publishing Co., 2003, pp. 185–211.
6. Hildebrandt, J. Structural and mechanical aspects of respiration. In: Textbook of physiology, edited by Patton H. D., Fuchs A. F., Hille B., Scher A. M., and Steiner R. Philadelphia: W.B. Saunders Co., 1989, pp. 991–1011.
7. Hindmarsh, A. LSODE (computer software). Lawrence Livermore Laboratory, Livermore, CA.
8. Hlastala, M. P. The alcohol breath test – a review. J. Appl. Physiol. 84:401–408, 1998.
9. Hlastala, M. P. Invited editorial on “the alcohol breath test”. J. Appl. Physiol. 93:405–406, 2002.
10. Jones, A. W. Role of rebreathing in determination of the blood–breath ratio of expired ethanol. J. Appl. Physiol. 55:1237–1241, 1983.
11. Jones, A. W. and L. Andersson. Comparison of ethanol concentrations in venous blood and end-expired breath during a controlled drinking study. Forensic Sci. Int. 132:18–25, 2003.
12. Jones, A. W. and L. Andersson. Variability of the blood/breath alcohol ratio in drinking drivers. J. Forensic. Sci. 41:916–921, 1996.
13. Ohlsson, J., D. D. Ralph, M. A. Mandelkorn, A. L. Babb, and M. P. Hlastala. Accurate measurement of blood alcohol concentration with isothermal rebreathing. J. Stud. Alcohol 51:6–13, 1990.
14. Skåle, A. G., L. Slørdal, G. Wethe, and J. Mørland. Blood/breath ratio at low alcohol levels: A controlled study. Ann. Toxicol. Analytique. XIV:41.
15. Tsu, M. E., A. L. Babb, D. D. Ralph, and M. P. Hlastala. Dynamics of heat, water, and soluble gas exchange in the human airways: 1. A model study. Ann. Biomed. Eng. 16:547–571, 1988.

Breath Tests and Airway Gas Exchange (abstract)

January 10, 2007

Anderson, J.C., and M.P. Hlastala.
Breath tests and airway gas exchange
Pulm. Pharmacol. Ther. 20: 112-117, 2007.

Departments of Medicine, and Physiology and Biophysics.
University of Washington, Seattle, Washington 98195.


ABSTRACT

Measuring soluble gas in the exhaled breath is a non-invasive technique used to estimate levels of respiratory, solvent, and metabolic gases. The interpretation of these measurements is based on the assumption that the measured gases exchange in the alveoli. While the respiratory gases have a low blood-solubility and exchange in the alveoli, high blood-soluble gases exchange in the airways. The effect of airway gas exchange on the interpretation of these exhaled breath measurements can be significant. We describe airway gas exchange in relation to exhaled measurements of soluble gases that exchange in the alveoli. The mechanisms of airway gas exchange are reviewed and criteria for determining if a gas exchanges in the airways are provided. The effects of diffusion, perfusion, temperature and breathing maneuver on airway gas exchange and on measurement of exhaled soluble gas are discussed. A method for estimating the impact of airway gas exchange on exhaled breath measurements is presented. We recommend that investigators carefully control the inspired air conditions and type of exhalation maneuver used in a breath test. Additionally, care should be taken when interpreting breath tests from subjects with pulmonary disease.

Physiological Responses to Ethanol

January 9, 2007

Physiological Responses To Ethanol

Blood Alcohol Concentration / Category of Influence / Dose Response Relationship for the Non-Habituated Consumer

0.01%-0.04% SUBACUTE
Social/Emotional: No behavioral changes apparent to the casual observer. Influence negligible.
Cognitive: Shared attention deficits detectable in sensitive individuals.
Memory: Unaffected.
Fine Motor Abilities: Slight changes detectable by specialized tests.
Balance: Normal.
Depressant Effect: Very slight.
Vision: Normal.

0.03%-0.12% EUPHORIA
Social/Emotional: Mild state of euphoria; increase in self-confidence and a decline in inhibitions; increase in sociability and gregariousness; decline in judgment and ability to comply with social controls.
Cognitive: Decline in information processing ability. Decreased attention.
Memory: Long-term memory basically unaffected. Short-term memory deficits apparent at 0.07% and above.
Orientation: Slightly narrowed.
Fine Motor Abilities: Sensory motor impairment begins. Information processing slowed, resulting in a decline in performance on specialized tests.
Balance: Baseline coordination deficits apparent on certain balance and coordination tasks.
Depressant Effect: Slight.
Vision: Pursuit tracking significantly affected above 0.06%. Pendular nystagmus consistently present above 0.05%.

0.09%-0.20% INTOXICATED
Social/Emotional: Emotional instability with a loss of critical judgment. Behavior manifested that is uncharacteristic of the subject in the sober state.
Cognitive: Impairment of perception and comprehension.
Memory: Declines evident in both long- and short-term abilities.
Orientation: Loss of social awareness.
Fine Motor Abilities: Gross deterioration.
Balance: Coordination grossly affected and inability to maintain balance evident. Sensory motor incoordination.
Depressant Effect: Drowsiness.
Vision: Reduced visual acuity with decrease in peripheral vision and glare recovery. Flicker fusion consistently appears.

0.18%-0.30% CONFUSION
Social/Emotional: Mental confusion. Exaggerated emotional states (fear, rage, sorrow).
Cognitive: Incoherence.
Memory: Continuing decline in ability to recollect past and present events.
Orientation: Disorientation.
Fine Motor Abilities: Only slightly functional.
Balance: Staggering gait, increasing muscular incoordination with corresponding increased pain threshold.
Depressant Effect: Apathy and lethargy.
Vision: Diplopia (double vision). Disturbances of perception, color, form, motion and dimensions.

0.25%-0.40% STUPOR
Social/Emotional: Complete loss of social awareness. General inertia.
Cognitive: Markedly decreased response to stimuli.
Memory: Unreliable.
Orientation: Non-existent.
Fine Motor Abilities: Lost.
Balance: Inability to stand and marked muscular incoordination.
Depressant Effect: Impaired consciousness, sleep.
Vision: Only slightly functional and very blurred.

0.35%-0.50% COMA
Social/Emotional: Unconsciousness.
Cognitive: Coma.
Orientation: Life threatening state. Abolished reflexes with incontinence of urine and feces.
Fine Motor Abilities: Anesthesia state with depressed or abolished reflexes.
Depressant Effect: Anesthesia state. Impairment of circulation and respiration. Subnormal temperature. Death possible.
Vision: Non-functional.

0.45%+ DEATH
Death from respiratory arrest can occur.

Dubowski, Kurt M. “Stages of Acute Alcohol Influence/Intoxication,” University of Oklahoma College of Medicine, 1985; Giguire, Williams E., “The Quantitative Measurement of Driving Impairment in the Field,” Tenth International Conference of Alcohol, Drugs and Traffic Safety, Amsterdam, The Netherlands, 1986.

BREATH ALCOHOL MEASUREMENT

January 8, 2007

ALCOHOL AND BREATH ALCOHOL MEASUREMENT

There are two means of electronically measuring the breath alcohol concentration. One method uses an infrared cell and the other uses a fuel cell.

Infrared Cell

Alcohol strongly absorbs infrared energy at two main wavelengths, namely 3,4 microns and 9,5 microns. The 9,5 micron wavelength is referred to as the primary wavelength in the measurement of ethyl alcohol, or ethanol, the alcohol that is present in alcoholic beverages. Other substances also absorb infrared energy at 3,4 microns; therefore absorption at 9,5 microns, the primary wavelength, is used to determine alcohol concentration. In the infrared cell, an infrared beam is passed through the breath/alcohol mixture and detected by an infrared detector. The greater the concentration of alcohol in the breath sample, the greater the amount of infrared light that is absorbed (the Beer-Lambert law).

The process of analysis of a breath sample for alcohol by the infrared cell is as follows:

1. The breath sample is captured in the infrared cell.

2. Infrared energy from the source passes through the breath sample. The alcohol in the breath sample absorbs some of the infrared energy.

3. The energy absorbed is related to the amount of alcohol present.

4. The reduction of infrared energy is detected and measured by the infrared detector, the amount of reduction being proportional to the concentration of alcohol in the breath sample.
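
The four steps above can be sketched with the Beer-Lambert law, A = absorptivity x concentration x path length, where the absorbance A is log10 of the incident intensity divided by the transmitted intensity. The example below is a minimal illustration only: the absorptivity and path length are hypothetical placeholder values, not specifications of any instrument described here.

```python
import math

def absorbance(incident: float, transmitted: float) -> float:
    """Absorbance A = log10(I0 / I) from the two detector intensities."""
    if incident <= 0 or transmitted <= 0:
        raise ValueError("intensities must be positive")
    return math.log10(incident / transmitted)

def concentration_from_absorbance(a: float, absorptivity: float,
                                  path_length: float) -> float:
    """Invert the Beer-Lambert law A = absorptivity * c * path_length."""
    return a / (absorptivity * path_length)

# Hypothetical values, for illustration only.
ABSORPTIVITY = 0.5   # absorbance per (mg/l) per metre (assumed)
PATH_LENGTH = 0.25   # metres (assumed)

a = absorbance(incident=1.0, transmitted=0.9)
c = concentration_from_absorbance(a, ABSORPTIVITY, PATH_LENGTH)
print(f"absorbance={a:.4f}, concentration={c:.3f} mg/l")
```

Because the relation is linear in concentration, doubling the alcohol in the cell doubles the absorbance, which is what allows the detector reading to be scaled directly to a concentration.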

Fuel Cell

In the fuel cell an electro-chemical reaction between alcohol and oxygen produces an electric current proportional to the concentration of alcohol in air.

The process of analysis of a breath sample for alcohol by the fuel cell is as follows:

1. The breath sample is introduced to the fuel cell.

2. The alcohol in the sample is chemically oxidized at the anode.

3. At the same time, oxygen (from the atmosphere) is chemically reduced at the cathode.

4. A current flow, proportional to the concentration of alcohol, is produced between the two electrodes.

DRAGER ALCOTEST 7110

The Drager Alcotest 7110 uses two means of measuring the breath alcohol concentration. One measurement is performed in an infrared cell and the other in a fuel cell. Substances other than alcohol will also produce a voltage at the terminals of the fuel cell, so the purpose of the fuel cell measurement is to detect the presence of any such substance. Should another substance be present, the readings from the fuel cell and the infrared cell will differ. If the difference exceeds 5%, the measurement process is stopped, no reading is displayed or printed, and the instrument indicates that an interferent is present.
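
The two-sensor comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the instrument's actual logic: the text does not say what the 5% is measured against, so the sketch assumes it is a fraction of the infrared reading, and the function names are hypothetical.

```python
from typing import Optional

def readings_agree(infrared: float, fuel_cell: float,
                   tolerance: float = 0.05) -> bool:
    """True when the readings differ by no more than `tolerance`,
    taken as a fraction of the infrared reading (an assumption)."""
    if infrared == 0:
        return fuel_cell == 0
    return abs(infrared - fuel_cell) / abs(infrared) <= tolerance

def reported_result(infrared: float, fuel_cell: float) -> Optional[float]:
    """Return the infrared reading, or None when the test is stopped
    because the two cells disagree (interferent suspected)."""
    if readings_agree(infrared, fuel_cell):
        return infrared
    return None

print(reported_result(0.40, 0.39))  # cells agree, reading is reported
print(reported_result(0.40, 0.30))  # cells disagree, no reading
```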

Measurement Process

After the instrument's self test and zero test, the breath sample is introduced into the instrument via a delivery tube. From the delivery tube the sample enters the infrared cell and is analysed. A small portion of the breath sample in the infrared cell is taken into the fuel cell and analysed there. The instrument performs both analyses automatically. The two results are compared, a second self test and zero test take place, and then the result is displayed and printed out. The instrument's self-checking process takes place continually while the instrument is in use. If any of the self-checking processes detects a fault, or one result is not confirmed by the other, the instrument will indicate that a fault has been detected and will automatically abort the analysis. The instrument also aborts the analysis if a self test or zero test fails.

Operating the equipment

After being switched on, the instrument takes approximately 15 minutes to warm up during which period the testing is inhibited. After warm up the testing is started and details entered via the keyboard.

After the operator information has been entered, the instrument automatically pumps ambient air through the sample hose and the internal measuring cells and performs a zero test and self test. On completion of a successful zero test and self test, the driver has approximately two minutes to blow into the mouthpiece. The sample hose is removed from its storage recess and the driver is then required to blow into the mouthpiece. Sufficient air has been blown through the sample hose when the bar graph is complete. The breath sample is now analysed for alcohol content. The measurement cell is then automatically flushed with ambient air and another zero test and self test are performed. The result of the measurement is displayed on the LCD display and then printed on the printout.

Conditions when Measurements are not taken

The following conditions will result in no reading being displayed or printed (the conditions are displayed on the LCD display):

1. Check Airway: obstruction of the breath sampling system

2. Zero Test Error: contamination of ambient air

3. Insufficient Sample: no sample provided by driver

4. Alcohol in Mouth: contaminant alcohol in breath sample

5. Range Exceeded: result of analysis exceeds range of accurate measurement

6. Interferent Detected: presence of interfering substance detected

In each case the cause of termination of the test is printed together with the time and date of occurrence, followed by TEST DISCONTINUED.

LION INTOXILYSER 5000P-SA

In the Lion Intoxilyser the absorption of infrared light by ethanol at the so-called primary analytical wavelength is used to determine its concentration. To differentiate between ethanol and other organic contaminants, the absorption by the breath specimen of infrared light at three additional but characteristic secondary wavelengths is also measured. The ratios of these four absorption measurements to each other are then compared with those taken during the initial factory calibration of the instrument, the values of which are stored in memory. If the relative absorption values obtained on the breath specimen differ by more than a specified amount from the stored values, then the presence of a substance other than ethanol, i.e. an interfering substance, is detected, in which case the message “interfering substance” is displayed.
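
The ratio comparison described above can be illustrated with a short sketch. Everything numeric here is hypothetical: the stored calibration ratios, the permitted deviation, and the choice to express each secondary absorption as a ratio to the primary one are assumptions made for illustration, not details of the instrument.

```python
def absorption_ratios(primary, secondaries):
    """Each secondary-wavelength absorption relative to the primary one."""
    if primary <= 0:
        raise ValueError("primary absorption must be positive")
    return [s / primary for s in secondaries]

def interfering_substance(primary, secondaries, stored_ratios, max_deviation):
    """True when any measured ratio departs from its stored calibration
    ratio by more than `max_deviation`."""
    measured = absorption_ratios(primary, secondaries)
    return any(abs(m - s) > max_deviation
               for m, s in zip(measured, stored_ratios))

# Hypothetical calibration values for pure ethanol (assumed).
STORED = [0.5, 0.25, 0.125]
MAX_DEVIATION = 0.02

print(interfering_substance(0.8, [0.4, 0.2, 0.1], STORED, MAX_DEVIATION))
print(interfering_substance(0.8, [0.4, 0.3, 0.1], STORED, MAX_DEVIATION))
```

The first call matches the stored pattern; the second has excess absorption at one secondary wavelength, which is the signature the text attributes to a substance other than ethanol.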

Any changes in the light beam used in the infrared cell are detected by a fifth filter, which acts as a reference. This filter transmits light at a wavelength where neither ethanol nor any other contaminant absorbs the infrared light. The infrared light is detected and compensation for any changes in light intensity is made.

These five filters are housed on a wheel that rotates at 2 400 revolutions per minute. This means that infrared absorption is measured at each of the five wavelengths forty times per second.

Measurement Process

After the instrument's self test and zero test, the breath sample is introduced into the instrument via a delivery tube. From the delivery tube the sample enters the infrared cell and is analysed. The instrument does the whole process automatically. A second self test and zero test take place, and then the result is displayed and printed out.

The instrument's self-checking process takes place continually while the instrument is in use. If any of the self-checking processes detects a fault, the instrument will indicate that a fault has been detected and will automatically abort the analysis. The instrument also aborts the analysis if a self test or zero test fails.

Operating the equipment

After being switched on, the instrument takes approximately 15 minutes to warm up, during which period testing is inhibited. After warm up the testing is started and details are entered via the keyboard.

After the operator information has been entered, the instrument automatically pumps ambient air through the breath tube and the internal measuring cells and performs an Air Blank test. On completion of a successful Air Blank test, the driver has approximately three minutes to blow into the mouthpiece. The sample hose is removed from its storage recess and the driver is then required to blow into the mouthpiece. This 3-minute period allows up to 5 attempts to blow. Sufficient air has been blown through the sample hose when the tone stops. The breath sample is now analysed for alcohol content. The measurement cell is then automatically flushed with ambient air and another Air Blank test is performed. The result of the measurement is displayed on the LCD display and printed on the printout.

Conditions when Measurements are not taken

The following conditions will result in no reading being displayed or printed (the conditions are displayed on the LCD display):

1. Specimen Incomplete: subject has not provided the required specimen (1,2 litres of breath) within 3 minutes

2. Mouth Alcohol: residual mouth alcohol detected

3. Interfering Substance: substance other than ethanol detected

4. Ambient Air Fail: Air Blank reading of 0 mg/l not obtained

5. Out of Range: subject's breath alcohol level exceeded 2.2 mg/l

In each case the cause of termination of the test is printed together with the time and date of occurrence.

6. Computer and software requirements: Computer and Peripheral Hardware, Software Requirements, Software Operational Requirements.

As the specification covers a large number of tests, some of which are destructive, type testing is performed on a sample of each make and model of evidential breath tester (EBT). Note that not every EBT of a particular make and model is subject to testing to SABS 1793, as this is impractical.

In order to ensure that EBTs remain within the limits of accuracy specified by SABS 1793, each and every unit should be subject to periodic calibration testing. This usually takes place every six months. This does not mean that the equipment is no longer accurate after six months; rather, the purpose is to give confidence, firstly to the operator and secondly to the courts, that the equipment is accurate.

Calibration testing only tests the accuracy of measurement at a number of concentrations of ethanol in air, usually at the legislated limit and at one level below and one level above the limit. It does not test for compliance with all the requirements of the SABS specification. Calibration can be done by using a calibrated gas concentration (dry gas method) or a wet bath simulator. At present the CSIR calibrates the EBTs using the dry gas method.
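
A three-point calibration check of the kind described above might be sketched as follows. The reference concentrations, the instrument readings, and the tolerance are all hypothetical values chosen for illustration; the text does not give the limits specified by SABS 1793.

```python
def calibration_errors(reference_levels, readings):
    """Pair each reference concentration with the instrument's error."""
    return [(ref, reading - ref)
            for ref, reading in zip(reference_levels, readings)]

def within_tolerance(reference_levels, readings, tolerance):
    """True when every reading is within `tolerance` of its reference."""
    return all(abs(err) <= tolerance
               for _, err in calibration_errors(reference_levels, readings))

# Hypothetical reference gas levels in mg/l: below, at, and above a limit.
REFERENCES = [0.10, 0.25, 0.40]
TOLERANCE = 0.02  # assumed, not taken from the specification

print(within_tolerance(REFERENCES, [0.11, 0.25, 0.39], TOLERANCE))
print(within_tolerance(REFERENCES, [0.14, 0.25, 0.39], TOLERANCE))
```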

RELATIONSHIP BETWEEN BLOOD ALCOHOL AND BREATH ALCOHOL

Alcohol is absorbed through the alimentary canal into the blood stream in which it is distributed throughout the body affecting the nervous system, especially the brain. The blood alcohol level is a measure of the degree to which a person is likely to be affected. In the past blood samples have been taken and subjected to a complicated analysis to determine the concentration of alcohol in the blood.

Taking a blood sample is invasive and its analysis is intricate and time-consuming. Fortunately alcohol is a drug which is volatile enough to appear in the expired breath. Thus some of the alcohol which has been consumed and absorbed into the blood evaporates into the air in the lungs. The concentration of alcohol in a person’s breath is dependent on the concentration of alcohol in the blood. The higher the blood alcohol level the higher the breath alcohol level, and vice versa. This relationship follows well-established physical and physiological principles. Thus breath alcohol analysis is also a measure of the degree to which a subject is affected by the consumption of alcohol.

Alcohol gets into the blood stream, and thence into the breath, initially through the stomach and intestines. The liquid (alcoholic drink) passes quickly from the mouth into the stomach and then more slowly into the small intestine. Very little alcohol is absorbed through the mucous lining of the mouth and the stomach; the vast majority is absorbed through the walls of the small intestine. The duodenal walls are permeable to small compounds such as nutrients from digested food. Food is broken down by enzyme action in the intestine and the nutrients are small and soluble enough to pass through the walls of the gut, which are richly supplied by the blood vessels. The blood absorbs and carries away the nutrients from the digested food. This process of digestion requires time, whereas alcohol, being a small soluble molecule, requires no such breakdown.

It diffuses directly into the walls of the digestive tract, entering the blood in the network of capillaries supplying these organs. (This explains the rapid effect of alcohol consumption.) The capillary vessels feed ultimately into the portal vein, which carries the blood with the nutrients and the alcohol to the liver, where some of the alcohol (and certain desirable components from the digestive tract) is eliminated. The rest is transported to all parts of the body, including the brain.

The blood from the intestine goes via the liver and joins the used blood from other organs in the vena cava and flows to the right-hand side of the heart. From the right ventricle the blood is pumped to the lungs via the pulmonary artery. In the lungs the blood vessels divide and subdivide, becoming capillaries that line the tiny airspaces, the alveoli, containing deep lung air. The alveoli are tiny air sacs at the ends of the branching and rebranching bronchial tubes. The tissues of the lung are so thin that the blood and the air are virtually in contact, facilitating the exchange of oxygen and carbon dioxide. If the blood contains alcohol, some of the alcohol is lost into the alveolar air; the amount of alcohol going into the air of the lungs is proportional to the amount of alcohol in the blood. This is how the alcohol gets into the breath.

From the lungs the blood returns to the left side of the heart via the pulmonary veins, and from the left ventricle the fresh blood (and alcohol if present) is pumped to the various organs of the body. The brain is richly supplied. Alcohol that is absorbed into the blood is quickly distributed throughout the body.

From the liver the blood flows to the heart whence it is pumped through the lungs. In the lungs the blood is aerated gaining oxygen and giving up carbon dioxide. If the blood contains alcohol some of the alcohol is lost into the alveolar air: the amount going into the air of the lungs is proportional to the amount of alcohol in the blood. The relationship between breath alcohol level and blood alcohol level follows Henry’s Law which states that if one has a given concentration of vapour (alcohol) in the air then, at equilibrium, one would have a definite concentration of that material in the liquid (blood).
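
The proportionality between blood and breath alcohol can be sketched as a simple partition calculation. The default blood/breath ratio of 2100 is an assumption: it is the conventional value implied by reporting breath alcohol in grams per 210 litres, and, as the airway exchange article above argues, the actual ratio varies between individuals.

```python
def breath_from_blood(bac_g_per_dl: float,
                      blood_breath_ratio: float = 2100.0) -> float:
    """Breath alcohol in grams per 210 litres of breath for a given
    blood alcohol in grams per decilitre."""
    bac_g_per_l = bac_g_per_dl * 10.0               # 1 dl = 0.1 litre
    brac_g_per_l = bac_g_per_l / blood_breath_ratio
    return brac_g_per_l * 210.0

# The same blood level gives different breath readings when the
# subject's own ratio differs from the assumed 2100.
for ratio in (2000.0, 2100.0, 2400.0):
    print(ratio, round(breath_from_blood(0.08, ratio), 4))
```

A subject whose ratio is higher than the assumed value produces a lower breath reading for the same blood level, and a subject with a lower ratio produces a higher one.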

From the lungs the oxygenated blood (containing alcohol if present) returns to the heart for distribution throughout the body, the brain being richly supplied. For an average healthy man each heartbeat displaces about 70 ml of blood. Since at rest the average heart rate is about 70 beats per minute, the heart will pump about 5 litres per minute. As an average man has about 5 litres of blood (a woman about 4.5 litres) in the blood vascular system, the blood circulates quickly and freely. Alcohol that is absorbed into the blood is therefore quickly distributed throughout the body.
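
The arithmetic in the paragraph above can be checked in a few lines, using only the figures given in the text (70 ml per beat, 70 beats per minute, 5 litres of blood). The function names are illustrative.

```python
def cardiac_output_l_per_min(stroke_volume_ml: float,
                             heart_rate_bpm: float) -> float:
    """Volume pumped per minute, in litres."""
    return stroke_volume_ml * heart_rate_bpm / 1000.0

def circulation_time_min(blood_volume_l: float,
                         output_l_per_min: float) -> float:
    """Minutes needed to pump a volume equal to the total blood volume."""
    return blood_volume_l / output_l_per_min

output = cardiac_output_l_per_min(70, 70)
print(f"cardiac output: {output:.1f} litres per minute")
print(f"time to pump 5 litres: {circulation_time_min(5.0, output):.2f} minutes")
```

The product is 4.9 litres per minute, which the text rounds to 5, so a volume equal to the whole blood supply passes through the heart in roughly one minute.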

The alcohol in the blood is eliminated mainly through the action of the liver (95%), where it is broken down to carbon dioxide and water. The remaining alcohol (5%) is eliminated unchanged in the urine; very small amounts are eliminated in the breath and perspiration.

Evidential Breath Testing

The test for breath alcohol requires that breath in equilibrium with the blood, that is, alveolar air, be tested. For this reason sufficient breath must be exhaled during the test to ensure that alveolar air is measured. Corridor air contains lesser amounts of alcohol, and the exhaled breath shows increasing levels of alcohol until a plateau is reached, which will be the alcohol in the alveolar air. This will require about one litre of breath to be exhaled through the instrument. Should the instrument detect an aberration in the test, the test will be aborted.

Mention is sometimes made that the presence of mouth alcohol can affect the results of breath alcohol testing since the procedure requires a sample of lung air to be blown into the instrument through a mouthpiece. Mouth alcohol would result in the instrument initially detecting a high alcohol level followed by a lower level from the upper lung air and then a different level from the deep lung air. This would be a deviation from the expected normal test, and the test would be terminated and no result recorded. Before a subject is to be tested the procedure should include a waiting period of at least 20 minutes before the test commences. After this time any mouth alcohol would have then been swallowed or absorbed.
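
The two breath profiles described above, a normal rise to a plateau and an early spike from mouth alcohol, can be told apart with a simple check on successive readings. This is a hedged sketch of the idea only: the window size, tolerances, and sample profiles are hypothetical, and it is not the algorithm used by any instrument.

```python
def reached_plateau(profile, window=3, tolerance=0.005):
    """True when the last `window` readings differ by no more than
    `tolerance`, taken here as the sign of alveolar air."""
    if len(profile) < window:
        return False
    tail = profile[-window:]
    return max(tail) - min(tail) <= tolerance

def shows_mouth_alcohol(profile, drop=0.02):
    """True when any reading falls more than `drop` below an earlier peak,
    the deviation from the expected steadily rising curve."""
    peak = float("-inf")
    for reading in profile:
        peak = max(peak, reading)
        if peak - reading > drop:
            return True
    return False

# Hypothetical breath alcohol readings (mg/l) sampled during one exhalation.
normal = [0.05, 0.15, 0.22, 0.25, 0.25, 0.25]
spiked = [0.60, 0.40, 0.25, 0.26, 0.26]

print(reached_plateau(normal), shows_mouth_alcohol(normal))
print(reached_plateau(spiked), shows_mouth_alcohol(spiked))
```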

Any mixture which contains alcohol and which is taken by mouth will contribute to the alcohol in the body, so the alcohol in a cough mixture will be absorbed in the same way as alcohol in alcoholic beverages.

The specification for evidential breath testing equipment requires that foreign gases (such as methanol, isopropanol, acetone, ethyl acetate and toluene) that could be in a breath sample together with alcohol should cause no cross-sensitivity in the evidential breath test, or cause no excessive variation, failing which the test would be automatically discontinued. Sometimes persons with severe diabetes could have acetone in their breath at a higher level than the equipment is designed to tolerate, in which case the instrument would automatically terminate the test.

Statistical Evaluation of Standardized Field Sobriety Tests

January 8, 2007

In Press: Journal of Forensic Sciences. May, 2005
Statistical Evaluation of Standardized Field Sobriety Tests
Michael P. Hlastala, Ph.D. [1]; Nayak L. Polissar, Ph.D. [2]; and Steven Oberman, J.D. [3]
[1] Division of Pulmonary and Critical Care Medicine, Department of Medicine, Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195-6522
[2] The Mountain-Whisper-Light Statistical Consulting, Seattle, WA 98112
[3] Daniel and Oberman, an Association of Trial Lawyers, 550 W. Main St., Suite 950; Knoxville, TN 37902
Running Head: Field Sobriety Test Accuracy
ABSTRACT: Standardized Field Sobriety Tests (SFSTs) are used as qualitative indicators of impairment by alcohol in individuals suspected of DUI. Stuster and Burns authored a report on this testing and presented the SFSTs as being 91% accurate in predicting Blood Alcohol Concentration (BAC) as lying at or above 0.08%. Their conclusions regarding accuracy are heavily weighted by the large number of subjects with very high BAC levels. The present study re-analyzes the original data with a more complete statistical evaluation. Our evaluation indicates that the accuracy of the SFSTs depends on the BAC level and is much poorer than that indicated by Stuster and Burns. While the SFSTs may be usable for evaluating suspects for BAC, the means of evaluation must be significantly modified to represent the large degree of variability of BAC in relation to SFST test scores. The tests are likely to be mainly useful in identifying subjects with a BAC substantially greater than 0.08%. Given the moderate to high correlation of the tests with BAC, there is potential for improved application of the test after further development, including a more diverse sample of BAC levels, adjustment of the scoring system and a statistically-based method for using the SFST to predict a BAC greater than 0.08%.
KEYWORDS: forensic science, alcohol, intoxication, horizontal gaze nystagmus, one leg stand, walk and turn.
In August of 1998, the National Highway Traffic Safety Administration published on its web page a final report entitled “Validation of the Standardized Field Sobriety Test Battery at BACs Below 0.10%” (1) as a follow-up to the original work of Burns and Moskowitz (2) and that of Tharp et al (3). This report has been used as a standard for Field Sobriety Testing (FST) by law enforcement agencies around the US. In the report, authors Stuster and Burns conclude that the use of SFSTs for “estimates of the 0.08% level were accurate in 91 percent of the cases, or as high as 94 percent if explanations for some of the false positives are accepted”. However, the conclusion regarding accuracy is strongly influenced by the large number of subjects with BAC levels much greater than the 0.08% level. The accuracy is substantially less for individuals with lower BAC levels, as will be shown below.
Three additional papers have recently been published addressing accuracy of sobriety tests at lower alcohol levels. McKnight et al (4) evaluated BAC levels below 0.10% using Horizontal Gaze Nystagmus (HGN) and other modified tests. These authors used correlation analysis and concluded that HGN was the only valid indicator effective in identifying subjects between BAC levels of 0.04% and 0.08%. Another study by Heishman et al (5) focused on ethanol at low levels, cocaine and marijuana, using correlation analysis with a variety of variables in addition to the SFSTs, so it is difficult to compare with the Stuster and Burns data. Cole and Nowaczyk (6) studied 21 sober (non-drinking) subjects, using trained police officers to evaluate videotapes of the individuals performing SFSTs. Forty-six percent of the officers’ decisions were that the individual had “too much to drink”.
SFSTs are usually used as tools by officers in the field to determine if an arrest followed by a breath test is justified. However, breath test results are often not available in court for a variety of reasons. Under these circumstances, the SFSTs are frequently used as an indication of impairment and sometimes as an indicator that the subject has a BAC greater than 0.08 g/dl.
The purpose of this report is to outline the statistical strengths and weaknesses of the Stuster and Burns report (1) (SBR) and to suggest some improvements in the use of SFSTs. Our findings suggest that the SFSTs may be helpful in estimating blood alcohol concentration (BAC) or breath alcohol concentration (BrAC), but the results of the SBR must be interpreted more conservatively than suggested by the authors.
Methods
The original study was funded by the National Highway Traffic Safety Administration (NHTSA) and carried out in the San Diego area by seven police officers who administered the SFSTs to those stopped for suspicion of driving under the influence (DUI) of alcohol. The officers were instructed to carry out the SFSTs on the subjects, and then to note an estimated BAC based only on the SFST results, including the walk and turn (WAT), the one leg stand (OLS) and the horizontal gaze nystagmus (HGN) tests. Subjects driving appropriately were not stopped or tested. However, “poor drivers” were included because they attracted the attention of the officers. The data collection did not include body weight, presence of prior injuries, and other factors that might influence either the SFSTs or the measured BAC (7, 8).
The officers were asked to estimate the BAC values (see footnote 1) using SFSTs. Some of the subjects were arrested and given a breath test. The criteria used by the officers for estimation of BAC were not described in the report. There appears to be no specific quantitative combination of the FSTs, but rather a subjective estimate of BAC. In other words, the estimated BAC was left to the subjective judgment of each officer. Each set of FSTs (for a given subject) was scored by only one officer, so it was not possible to assess inter-officer variability.
The data of Stuster and Burns were obtained via a request to the National Highway Traffic Safety Administration (NHTSA) using the Freedom of Information Act (FOIA). Figure 1 shows the raw data, estimated BAC (EBAC) vs. measured BAC (MBAC), for 297 subjects, who had a mean EBAC and MBAC of 0.117% and 0.122%, respectively. The figure shows the line of identity (EBAC = MBAC) and a least-squares regression line for EBAC vs. MBAC. In some cases the EBAC was greater than the MBAC, resulting in a greater probability of arrest than if the MBAC had been used (points above the line of identity). In other cases EBAC was lower than MBAC, resulting in a lower probability of arrest than if MBAC had been used (points below the line of identity). EBAC is plotted against MBAC for all observations. The MBAC of these points varies over a range of BAC = 0.00% to 0.33%.
Footnote 1. The SFSTs are designed to estimate the blood alcohol concentration (BAC) in units of gm/dl. However, the SFSTs are evaluated with the breath alcohol concentration (BrAC) in units of gm/210L. We will use the term BAC and express the values with units of % to be consistent with the original study.
Statistical Methods
The accuracy with which officers classified drivers as having a BAC above or below 0.08% is presented graphically by sorting the data on increasing MBAC and then using a moving window of 21 observations, shifting upward one observation at a time. The accuracy is calculated as the percentage of observations in the window that are correctly classified as < 0.08% or ≥ 0.08% MBAC. The accuracy for the group of 21 observations in the window is plotted vs. the mean of the MBAC measurements in the window.
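The moving-window calculation described above can be sketched as follows. This is an illustrative reading of the method, not the authors' code: the function name is hypothetical, and the small data set in the example is synthetic, used only to show the mechanics with a window of 2 rather than 21.

```python
from statistics import mean

THRESHOLD = 0.08  # percent BAC

def moving_window_accuracy(pairs, window=21, threshold=THRESHOLD):
    """pairs: iterable of (mbac, ebac) values.
    Returns a list of (mean MBAC in window, percent correctly classified),
    one entry per window position after sorting on MBAC."""
    ordered = sorted(pairs, key=lambda p: p[0])
    results = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        # A classification is correct when EBAC and MBAC fall on the
        # same side of the threshold.
        correct = sum((e >= threshold) == (m >= threshold) for m, e in chunk)
        results.append((mean(m for m, _ in chunk), 100.0 * correct / window))
    return results

# Synthetic (MBAC, EBAC) pairs, for illustration only.
demo = [(0.02, 0.03), (0.05, 0.04), (0.07, 0.09), (0.10, 0.12)]
for mean_mbac, accuracy in moving_window_accuracy(demo, window=2):
    print(f"mean MBAC {mean_mbac:.3f}: {accuracy:.0f}% correct")
```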
Four traditional test evaluation statistics were also calculated, namely, 1) sensitivity (percent of true positives who are correctly classified as such by the test), 2) specificity (percent of true negatives who are correctly classified as such by the test), 3) positive predictive value (percent of those with a positive test result who are true positives), and 4) negative predictive value (percent of those with a negative test result who are true negatives) (9). These test evaluation statistics are more commonly used than the accuracy measure defined by SBR. However, the term “accuracy” is used in related literature and in legal proceedings, and, therefore, we use it in this article along with the four more traditional test statistics. It is important to note that one may have very high accuracy yet have much weaker performance on one or more of the four traditional statistics, as happened with SBR.
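As a worked illustration of these definitions, the short Python sketch below computes accuracy and the four traditional statistics from the four cells of a decision matrix. The function name is hypothetical; the counts are those reported for the 0.08% cut-point in Figure 5 of this article.

```python
def screening_statistics(tp, fp, tn, fn):
    """Percentages for a yes/no test, from the four cells of a decision matrix."""
    total = tp + fp + tn + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,      # all correct decisions
        "sensitivity": 100.0 * tp / (tp + fn),      # true positives found
        "specificity": 100.0 * tn / (tn + fp),      # true negatives found
        "ppv": 100.0 * tp / (tp + fp),              # positive calls that are right
        "npv": 100.0 * tn / (tn + fn),              # negative calls that are right
    }

# Counts from the decision matrix in Figure 5 (cut-point 0.08%).
stats = screening_statistics(tp=210, fp=24, tn=59, fn=4)
for name, value in stats.items():
    print(f"{name}: {value:.0f}%")
```

Rounded to whole percentages, the results agree with the "All cases" row of Table 3: high accuracy, sensitivity, PPV and NPV alongside a specificity of only about 71%.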
The relationships of MBAC with the three sub-tests of the SFST, with the total SFST score, and with EBAC were analyzed using simple and multivariate linear regression, with Pearson correlation coefficients as a descriptive measure (10).
Results
The accuracy of the SFST is not a single percentage, but depends very much on the level of MBAC. Using the 21-observation moving window, the accuracy of classifying individuals as above or below 0.08% MBAC can be pictured in relation to measured breath alcohol concentration (Figure 2). If the MBAC is lower than 0.04%, the officer is generally 80% or more accurate at predicting a subject’s category (above or below 0.08% MBAC) in the sample studied. If the MBAC is greater than 0.09%, the officer is about 90% or more accurate at predicting the subject’s category. However, if the MBAC is close to the limit, specifically between 0.06% and 0.08%, the SFSTs are only about 30-60% accurate in correctly predicting whether a subject’s MBAC is ≥ 0.08% or < 0.08%. The minimum accuracy in Figure 2 is 33%.
The data also provide evidence that the officers’ estimates were not based only on the SFST. This is shown by an analysis where even very liberal use of only the SFST in a predictive model yields a BAC estimate with precision that is substantially inferior to the precision of the officers’ estimates, even though the officers were instructed to base their estimates only on the SFST.
Specifically, regression models provide a method to estimate MBAC based only on the three tests in the SFST. A regression model was fitted to predict MBAC from independent variables including linear and quadratic (squared) terms in the three tests: HGN, HGN², OLS, OLS², WAT, and WAT². The model is liberal in using the three tests, because not all of the variables add significantly or substantially to prediction of the MBAC. Nevertheless, all variables were retained (yielding an over-fitted model), in order to maximize use of the tests within this sample, attempting to mimic or even improve on how an officer might combine test results in practice. Interaction terms between tests were also tried (e.g., HGN*WAT), but they added so little to prediction of MBAC, with a negligible increase in R-squared, that they were not used in the liberal model. (A more appropriate regression model is presented later.)
The amount of variation in MBAC explained by the model based on the three tests alone (and their quadratic terms) is 56%, which increases to 76% when EBAC (the officer estimate of BAC) is added to the model. This gain in precision in predicting the quantitative value of MBAC is highly statistically significant (an increase of 20 percentage points in R-squared, p < 0.001). The mean absolute difference between the officer estimate, EBAC, and the measured value, MBAC, is 0.024% (in BAC units). The mean absolute difference between the model-based estimate and MBAC is larger, 0.031%, indicating less precision.
The striking increase in precision when the officer estimates are added to a liberally-fitted model using only the tests suggests that the officers did not base their estimate solely on the test scores but most likely used other clues. This suggests that it may be impractical to evaluate the three tests in isolation from other non-test clues used by the officers, such as slurred speech, odor of alcohol, appearance, admitted drinking or driving behavior. Another explanation may be the presence of other drugs in addition to alcohol. Or, as suggested by critics of the study, Price and Cole (9), it may be that the officers used portable breath testers (PBT) prior to recording their BrAC estimate and were then influenced by the known PBT values. The Stuster and Burns report (1, page 10) notes that “all police officers participating in the study were equipped with NHTSA-approved, portable breath testing devices to assess the BACs of all drivers who were administered the SFST…”.
The utility of individual tests (HGN, OLS and WAT) and the combination of tests to predict MBAC can be evaluated by plotting MBAC against the total score from the individual tests. Figure 3 shows a plot of the measured breath alcohol concentration versus the total score from the three tests, with a reference line at MBAC = 0.08%. For Figure 3 only, a small amount of “jitter” (random noise) has been added to the score of each subject to avoid overlapping points. The jitter is less than ±0.25 points horizontally. The considerable variation in MBAC above each point score is apparent, and in addition, for total scores 4-18, there are MBAC values lying on both sides of the 0.08% cut-point. In order to be 95% confident that the subject has an MBAC greater than 0.08%, the total score (HGN + OLS + WAT) must exceed approximately 17 (based on the 95% lower confidence limit for predicted MBAC for an individual from the regression of MBAC on total score).
Figure 4 shows the percentage of measured breath alcohol concentration values that are above 0.08% in relation to each of the three individual test scores. For each score (horizontal axis), the percent of subjects with that score or higher who have an MBAC larger than 0.08% is plotted (vertical axis). In order to observe 95% of persons with MBAC > 0.08% in this sample, the score for WAT (circles in the plot) must be 5 or larger. None of the scores for HGN (crosses) reach the 95% point, and the scores for OLS (triangles) reach the 95% point only at 10 points and higher, where there are only two subjects. Note that the “failure” scores for these three tests, as specified by Stuster and Burns, are 4 for HGN, 2 for OLS, and 2 for WAT (12). Failure of an FST according to NHTSA standards simply estimates a 50% likelihood that a subject is > 0.08%. The data show that in order to be considerably more confident that the subject is above 0.08%, the scores should be much higher than the “failure” scores.
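The quantity plotted in Figure 4 (the percent of subjects at or above a given score whose MBAC exceeds 0.08%) can be computed as in the following Python sketch. The function name and the toy data are hypothetical and serve only to show the "that score or higher" grouping.

```python
def percent_over_limit_by_score(scores, mbac, limit=0.08):
    """For each observed score s, percent of subjects scoring >= s whose MBAC exceeds the limit."""
    result = {}
    for s in sorted(set(scores)):
        # Keep every subject with this score or a higher one.
        group = [m for score, m in zip(scores, mbac) if score >= s]
        result[s] = 100.0 * sum(m > limit for m in group) / len(group)
    return result

# Hypothetical toy data (not from the study).
toy_scores = [0, 1, 2, 2, 4, 5]
toy_mbac = [0.00, 0.09, 0.05, 0.11, 0.10, 0.20]
for score, pct in percent_over_limit_by_score(toy_scores, toy_mbac).items():
    print(f"score >= {score}: {pct:.0f}% above the limit")
```

Reading off the lowest score at which the percentage reaches 95% gives the kind of threshold quoted above for WAT.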
The correlation coefficients for individual tests vs. both MBAC and EBAC are shown in Table 1. Among the individual FSTs, HGN has the strongest correlation with MBAC, followed by WAT and OLS. The strongest correlation overall is with the total score (determined by summing the scores for the three FSTs). However, total score and HGN have very similar correlations with MBAC and EBAC.
Discussion
Figure 5, redrawn from Figure 4 of SBR, illustrates the logic used by Stuster and Burns to describe the accuracy of the SFST. A correct decision was registered if both MBAC and EBAC are ≥ 0.08% (upper, right quadrant; N=210) or both are < 0.08% (lower, left quadrant; N=59). An incorrect decision occurred with a false positive (upper, left quadrant; N=24), when EBAC ≥ 0.08% and MBAC < 0.08%, or with a false negative (lower, right quadrant; N=4), when EBAC < 0.08% and MBAC ≥ 0.08%. In all, N = 214 subjects had MBAC ≥ 0.08% [...] > 0.12%. Stuster and Burns’ conclusion that the tests have 91% accuracy was strongly affected by the fact that a majority of points are in this high MBAC range, where correct classification as above 0.08% is more reliable. Of the correct results, 210 data points out of a study total of 297 were in the 0.08% to 0.33% range and 59 were in the 0.000% to 0.079% range. (The accuracy estimated by Stuster and Burns as 91% was calculated from the values in Figure 5 as (210 + 59)/297 = 0.91.) The number of false positives (N=24) was much greater than the number of false negatives (N=4). In the range of data near the 0.08% level, the BAC estimated by these experienced officers overestimates the measured BAC, introducing a bias against the subjects (see Figure 1). Using EBAC to determine whether the subject’s MBAC is greater than 0.08% is 100% accurate for all subjects with MBAC > 0.12%. In other words, if the subject is highly intoxicated, the SFSTs provide an accurate indication. It is not surprising that if the subject is clearly intoxicated, the officers can make this determination. If the MBAC is < 0.08%, there is a 24 / (24 + 59) = 29% chance of a false arrest (determined from Figure 5).
To illustrate the problem with the SBR statistical strategy, let’s apply the same logic to determine the level of accuracy at hypothetical cut-point (“legal limit”) levels lower than 0.08%. For example, if Stuster and Burns were to use the same data set to examine the accuracy at lower threshold BAC (0.07% down to 0.01%) levels, they would determine an increasing accuracy level at lower threshold levels. The relative increase in apparent accuracy with decreasing BAC threshold is shown in Table 2, which indicates a hypothetical cut-point for designating a driver as “over the limit”. For example, if the legal limit were 0.04%, the SBR method would conclude that SFST are 93.9% accurate. At a legal limit of 0.01%, the SBR conclusion would be that the SFST are 99.3% accurate. The method used by Stuster and Burns has determined a high degree of accuracy simply because most of the data points are at MBACs much greater than the cut-point of 0.08% used in their study. What underlies this problem is the weakness of “accuracy” as the sole performance statistic for this test, as well as the specific nature of this sample, weighted heavily toward individuals with high levels of MBAC.
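One way to see why apparent accuracy climbs as the cut-point falls is to compare it with a rule that ignores the SFST entirely and labels every driver as over the limit. The Python sketch below makes that comparison using the counts and accuracies in Table 2. The "always over" rule is a hypothetical baseline introduced here for illustration; it does not appear in the SBR.

```python
# Counts of subjects with MBAC at or above each cut-point, from Table 2 (N = 297).
N_TOTAL = 297
n_at_or_above = {0.10: 190, 0.08: 214, 0.04: 268, 0.01: 293}
# Accuracy reported in Table 2 for the officers' estimates at the same cut-points.
reported_accuracy = {0.10: 90.6, 0.08: 90.6, 0.04: 93.9, 0.01: 99.3}

def always_over_accuracy(n_over, n_total=N_TOTAL):
    """Accuracy (%) of a hypothetical rule that labels every driver as over the limit."""
    return 100.0 * n_over / n_total

for cut in sorted(n_at_or_above, reverse=True):
    baseline = always_over_accuracy(n_at_or_above[cut])
    print(f"limit {cut:.2f}%: reported {reported_accuracy[cut]:.1f}%, "
          f"always-over rule {baseline:.1f}%")
```

At a hypothetical limit of 0.01%, the blanket rule is right for 293 of 297 subjects, which is nearly the 99.3% shown in Table 2. The high figure at low cut-points therefore reflects the make-up of the sample more than the discriminating power of the tests.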
An alternative way to explore the accuracy of SFST is to assess the accuracy over a range of points that is symmetric about the 0.08% cut-point (limit). In addition to accuracy, four traditional statistics of test performance also help in this exploration: sensitivity, specificity, positive predictive value and negative predictive value. Table 3 shows the accuracy of SFST when the range of interest extends above and below 0.08% by the same amount, along with the four traditional performance statistics. For data with MBAC ranging between 0.07% and 0.09%, the SFST are 72.2% accurate. As the range broadens, the calculated apparent accuracy increases. At the broadest range of 0.04% – 0.12%, the calculated apparent accuracy is 82.2%. Taken to the extreme, using all of the data points (MBAC = 0.00% to 0.33%), the apparent accuracy is 91% as calculated by Stuster and Burns. The accuracy of SFST in the vicinity of 0.08% is poorer than estimated in the SBR for the whole data set.
Parallel with the reduced level of accuracy in the range 0.07-0.09% MBAC, the four traditional test performance statistics in Table 3 also show varying performance in this range. Specificity is low (36%), indicating that a large fraction of subjects below the limit (64%) would be falsely declared over the limit. The sensitivity is excellent in this range, 96%, due to the tendency of EBAC to overestimate alcohol level compared to MBAC. Positive predictive value (PPV) is fair, 70%, indicating that 30% of the subjects declared over-limit would not be so. Negative predictive value (NPV) is good, 83%, indicating that most of those declared under-limit would really be so, but this is, again, due to the over-estimation by EBAC. As the range of MBAC in Table 3 steadily widens to finally include all cases, specificity increases to a maximum of only 71%, while sensitivity, PPV and NPV all reach at least 90%, due to the predominance in this sample of high levels of measured alcohol.
A closer examination of the data between 0.04% and 0.12% is shown in Figure 6 (an expanded section of Figure 1). Another way of determining the officers’ accuracy in estimating the BAC is to compare the fraction of observations (EBAC) overestimating and underestimating the MBAC. If we consider three ranges of MBAC, 0.00% ≤ MBAC < [...] > 0.10%, 50 are overestimates and 108 are underestimates of MBAC. Thus, the experienced officers used in this study tended to overestimate the BAC at low levels (< [...] > 0.10).
The optimal predictive capability of the SFST depends on the scaling for the particular test and the predictive capacity of the test. The maximum scores permitted for HGN, OLS and WAT are 6, 4, and 8, respectively. However, some officers assigned scores that were greater than the maximum score allowable for a given FST. The highest scores assigned in this study were 6, 12, and 9 for the HGN, OLS and WAT, respectively.
By adjusting the weight given to each test and taking account of the precision of the test in predicting MBAC, we find that the following linear regression model (Equation 1) maximizes the precision of the SFST for estimating MBAC, using only linear versions of the three test variables. The quadratic terms (squared values of the three test variables), while statistically significant as a group (p = 0.004), increase R-squared by only 2%, from 54% to 56%, and have been omitted for parsimony. The model is based on the 261 cases without any missing values for the three tests. Note in the equation below that the increase in BAC per point increase in the score is largest for HGN, with a 0.017 increase in BAC, on average, for each point increase in the HGN score.
MBAC = -0.007 + 0.017 × (HGN Score) + 0.0012 × (OLS Score) + 0.011 × (WAT Score)    (Eq. 1)
The equation does quite well in predicting the mean MBAC, but there is still a large spread of individuals around the predicted value. The standard deviation of individual MBAC values around the predicted regression value is 0.044%. A 95% confidence interval for the true MBAC of an individual, predicted from this regression model, would have a minimum width of ± 0.09%, certainly a wide range.
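To make the size of this uncertainty concrete, the Python sketch below evaluates Equation 1 for one set of scores and attaches a rough 95% range. It assumes that the ±0.09% figure corresponds to about 1.96 residual standard deviations with normally distributed residuals, and it ignores uncertainty in the fitted coefficients; both are assumptions made for illustration, and the function names are hypothetical.

```python
RESIDUAL_SD = 0.044  # SD of individual MBAC values around the regression value (%)

def predict_mbac(hgn, ols, wat):
    """Predicted MBAC (%) from Equation 1."""
    return -0.007 + 0.017 * hgn + 0.0012 * ols + 0.011 * wat

def approximate_95_range(hgn, ols, wat):
    """Rough 95% range for an individual's MBAC: prediction +/- 1.96 residual SDs.

    Assumption: normal residuals, no allowance for coefficient uncertainty,
    so this is a minimum width. The lower bound is clamped at zero.
    """
    center = predict_mbac(hgn, ols, wat)
    half_width = 1.96 * RESIDUAL_SD
    return max(0.0, center - half_width), center + half_width

# Example: the "failure" scores cited earlier (HGN = 4, OLS = 2, WAT = 2).
low, high = approximate_95_range(4, 2, 2)
print(f"predicted {predict_mbac(4, 2, 2):.3f}%, "
      f"approximate range {low:.3f}% to {high:.3f}%")
```

For a subject who just fails all three tests, the predicted MBAC is close to 0.085%, but under these assumptions the plausible range runs from essentially zero to about 0.17%.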
Using the predictors (HGN, OLS, WAT), the additive model from equation 1 accounted for 54% of the variability in MBAC (corresponding to a correlation of 0.73). Including EBAC as an additional predictor in the model resulted in a substantial and significant increase (p < 0.001) in the variance of MBAC explained, increasing it to 75%. As noted earlier, this marked increase in predictability of MBAC by adding in the officer’s EBAC indicates that the officers’ estimates were probably influenced by factors other than the three FSTs.
We believe that the accuracy of the SFST can be improved if a weighted sum of scores from the three standard tests is combined as described in Equation 1. However, this relationship should be tested in a variety of populations, and, in a larger sample, it is possible that non-linear and other functions of the test scores may help in prediction. The evaluation should include an assessment of accuracy and bias in estimating the numerical BAC and, as well, the accuracy in classifying individuals above or below specified limits (such as 0.08%) for various low, medium and high levels of measured BAC. In follow-up trials of the FST, the instructions given to officers for converting test scores into estimates of BAC should be stated more explicitly (such as using equation 1 above, or another algorithm). Further, some attempt should be made to identify and incorporate (or control) other factors, aside from the SFST scores, that influence BAC estimates. It may be difficult or impossible to “turn off” other cues that officers use in estimating BAC or in making a decision about an arrest.
The magnitude of the correlations between the tests and MBAC suggests that this type of testing could be developed further, either through re-formulation of the tests, or through different scoring systems, or by other means. In the current framework, the test scores have to be quite high to provide confidence that the subject is above 0.08%, but further development could potentially improve confidence in the three test results, both singly and in combination. And, anticipating the possibility that some jurisdictions may now or in the future have lower (or higher) legal limits than 0.08%, testing could include more representation from lower levels of BrAC.
The SFST total score and sub-test scores are undoubtedly correlated with breath alcohol level (Table 1). However, predicting a numeric blood alcohol concentration from the SFST scores, as the SFST methodology is defined in the Stuster and Burns report, has limited accuracy and precision. The evidence for this is a) considerable over- and under-estimation of MBAC (see Results section); b) a large range of observed MBAC values corresponding to any given total SFST score (Figure 3); and, c) a large spread of observed MBAC values around predicted MBAC values from a liberal regression model that attempts to optimize the use of the SFST, yet has a minimum prediction uncertainty of ±0.09%.
If our interest is not in quantitative prediction, but in classifying individuals, such as below vs. equal to or above a limit of 0.08%, the utility of the SFST depends very much on how intoxicated an individual is. Accuracy (and specificity) are low when individuals are close to 0.08% MBAC (Figure 2 and Table 3), but if the individuals are quite intoxicated, such as above 0.12%, then accuracy is high (Figure 2).
The use of a single test performance statistic, accuracy, and the calculation of this one statistic for the entire study sample is an over-simplification of the more complex relationship between the SFST score and the MBAC level.
SFSTs could become more useful if much more data are accumulated and analyzed using statistical methods such as those presented in this paper, including some of the traditional test evaluation statistics. It is likely that the usefulness of SFSTs will be greatest for drivers who have high test scores. The moderate to strong correlations between the tests and MBAC suggest a potential for further test development. Enhanced understanding would come from tests applied to a more diverse population sample as well as from the development of a statistical approach to predicting the probability of a subject having a BAC greater than 0.08 % from a particular set of SFST scores.
References:
1. Stuster J, Burns M. Validation of the standardized field sobriety test battery at BACs below 0.10 percent. August, 1998. National Highway Traffic Safety Administration.
2. Burns M, Moskowitz H. Psychophysical tests for DWI arrest. Technical Report DOTHS-5-01242. National Highway Traffic Safety Administration. Washington, DC.
3. Tharp V, Burns M, Moskowitz H. Development and field test of psychophysical tests for DWI arrest. US Department of Transportation, National Highway Traffic Safety Administration Final Report DOT-HS-805-864, Washington, DC.
4. McKnight AJ, Langston EA, McKnight AS, Lange JE. Sobriety tests for low blood alcohol concentrations. Acc Anal & Prevent 2002; 34: 305-311.
5. Heishman SJ, Singleton EG, Crouch DJ. Laboratory validation study of drug evaluation and classification program: ethanol, cocaine, and marijuana. J Anal Toxicol 1996; 20: 468-481.
6. Cole S, Nowaczyk RH. Field sobriety tests: Are they designed for failure? Perceptual and Motor Skills 1994; 79: 99-104.
7. Hlastala M. The alcohol breath test – A brief review. J Appl Physiol 1998; 84: 401-408.
8. Hlastala M. Invited editorial on “The alcohol breath test”. J Appl Physiol 2002; 93: 405-406.
9. Price P, Cole S. NHTSA field sobriety tests validation v. invalidation. The Champion 2001; 25: 37-42.
10. Fisher LD, van Belle G. Biostatistics. Wiley, 1993.
11. Weisberg S. Applied linear regression, 2nd edition. Wiley, 1985.
12. NHTSA DWI Detection and Standardized Field Sobriety Testing Student Manual, DOT-HS-178-R1/02.
Additional information and reprint requests:
Michael P. Hlastala, Ph.D. Division of Pulmonary and Critical Care Medicine, Department of Medicine Department of Physiology and Biophysics Box 356522 University of Washington Seattle, WA 98195-6522 Email: hlastala@u.washington.edu
TABLES
Table 1. Pearson correlation of three Field Sobriety Tests with measured breath alcohol (MBAC) and officer-estimated breath alcohol (EBAC).
TEST                               MBAC    EBAC
TOTAL score (3 tests)              0.69    0.74
HGN (Horizontal Gaze Nystagmus)    0.65    0.71
WAT (Walk and turn)                0.61    0.64
OLS (One leg stand)                0.45    0.51
Table 2. Accuracy of “over-limit” designation based on estimated breath alcohol concentration for defined cut-points (hypothetical legal “limit”) of measured breath alcohol concentration (MBAC)
Legal “limit” (%)    N: All    N: MBAC < cut-point    N: MBAC ≥ cut-point    Accuracy*
0.10                 297       107                    190                    90.6%
0.09                 297       97                     200                    89.2%
0.08                 297       83                     214                    90.6%
0.07                 297       69                     228                    89.6%
0.06                 297       58                     239                    90.6%
0.05                 297       43                     254                    92.3%
0.04                 297       29                     268                    93.9%
0.03                 297       19                     278                    93.9%
0.02                 297       9                      288                    97.6%
0.01                 297       4                      293                    99.3%
*Accuracy = 100% × (# correctly classified as ≥ limit or < limit)/total
Table 3. Accuracy and other statistics related to “over-limit” designation based on estimated breath alcohol concentration for defined ranges of MBAC.
Range of MBAC    Total in Range    Accuracy    Sensitivity    Specificity    PPV    NPV
0.07 – 0.09      36                72%         96%            36%            70%    83%
0.06 – 0.10      65                75%         95%            44%            73%    85%
0.05 – 0.11      97                79%         97%            55%            75%    92%
0.04 – 0.12      135               82%         95%            63%            79%    90%
All cases        297               91%         98%            71%            90%    94%
Accuracy = (# correctly classified as ≥ 0.08 or < 0.08)/total
PPV = positive predictive value
NPV = negative predictive value
Figure Legends:
Figure 1. Estimated BAC vs. Measured BAC for all subjects in the Stuster and Burns study. The line of identity (Estimated BAC = Measured BAC; thin line) and linear regression line (heavy solid line) are shown.
Figure 2. Accuracy of classification of individuals as ≥ 0.08% or < 0.08% MBAC using the officer estimate. Accuracy is plotted vs. measured breath alcohol concentration (horizontal axis).
Figure 3. Measured breath alcohol concentration versus total of three test scores.
Figure 4. Percent of subjects with MBAC greater than 0.08% vs. the individual test score, with the percentage calculated for all individuals at or above the designated score.
Figure 5. Decision matrix at 0.08% BAC (modified from figure 4 in Stuster and Burns).
Figure 6. Data from Figure 1 expanded to show points between 0.04% and 0.12%. The line of identity (EBAC = MBAC; dashed line) and linear regression line (heavy solid line) are shown.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5. Decision matrix of estimated BAC (EBAC) vs. measured BAC (MBAC):
                 MBAC < 0.08%    MBAC ≥ 0.08%
EBAC ≥ 0.08%     N=24            N=210
EBAC < 0.08%     N=59            N=4
Figure 6

Are Standardized Field Sobriety Tests Designed for Failure?

January 8, 2007

Perceptual and Motor Skills, 1994, 79, 99-104. © Perceptual and Motor Skills 1994

FIELD SOBRIETY TESTS: ARE THEY DESIGNED FOR FAILURE?¹

SPURGEON COLE AND RONALD H. NOWACZYK

Clemson University

Summary–Field sobriety tests have been used by law enforcement officers to identify alcohol-impaired drivers. Yet in 1981 Tharp, Burns, and Moskowitz found that 32% of individuals in a laboratory setting who were judged to have an alcohol level above the legal limit actually were below the level. In this study, two groups of seven law enforcement officers each viewed videotapes of 21 sober individuals performing a variety of field sobriety tests or normal-abilities tests, e.g., reciting one’s address and phone number or walking in a normal manner. Officers judged a significantly larger number of the individuals as impaired when they performed the field sobriety tests than when they performed the normal-abilities tests. The need to reevaluate the predictive validity of field sobriety tests is discussed.

Field sobriety tests have been used throughout this century by police officers to help them assess whether an individual is too impaired to drive an automobile. A classic paper by Bjerver and Goldberg (1951) examined the relationship between performance on the field sobriety test and driving. Over the past two decades the National Highway Traffic Safety Administration (NHTSA) has funded several studies to examine the effectiveness of field sobriety tests in predicting a person’s level of intoxication and driving impairment (e.g., Anderson, Schweitz, & Snyder, 1983; Burns & Moskowitz, 1977; Tharp, Burns, & Moskowitz, 1981).
In a 1977 report, Burns and Moskowitz examined a number of different tests commonly used by officers. Based on the results from a laboratory study, they recommended three tests, the Horizontal Gaze Nystagmus (HGN) test, the walk-and-turn test, and the one-leg stand test, for further research. The HGN measures the angle of gaze at the onset of jerking movement, which can be influenced by alcohol consumption as well as other physiological factors. The other two tests require dividing attention among mental and physical tasks. Briefly, the walk-and-turn test requires a person to stand on a line in a heel-to-toe position while listening to instructions and then to take nine steps in a heel-to-toe fashion, pivot, and take nine more steps along a straight line. The one-leg stand requires an individual to stand with arms at the side and extend one foot six inches off the ground and maintain that position while counting for 30 seconds without extending the arms or losing balance. (For complete instructions see “DWI Detection and Divided Attention Field Sobriety Testing” by NHTSA, 1987.) Although these tests seemed to hold the most promise, the authors reported that false alarms are a concern. In the 1977 study, 47 percent of the subjects who would have been arrested based on test performance actually had a blood alcohol concentration (BAC) lower than .10 percent, the decision level used by officers.

¹Requests for reprints can be sent to either author at the Department of Psychology, Clemson University, Clemson, SC 29634. The authors thank Ronnie Cole for his assistance in the completion of this study and Jack Davenport for his comments on an earlier draft of this manuscript.
A 1981 report by Tharp, et al. employed the three previously mentioned tests in another laboratory study. The error rate improved somewhat; 32 percent of the participants judged to have BACs greater than .10 actually had BACs lower than .10, the decision point used in many states for assuming driving impairment. Reliability coefficients for these tests, however, were often below accepted levels for standardized clinical tests. Reliable tests have coefficients of approximately .85 or higher (Rosenthal & Rosnow, 1991). Test-retest reliability coefficients for the field sobriety tests ranged from .61 to .72 for individual tests and .77 for the total test score for 77 individuals who were dosed to the same BAC level on two occasions. Interrater reliability coefficients, based on having different officers score performance on each occasion, were even lower, ranging from .34 to .60 with .57 as an over-all test score.
Problems in scoring can be attributed, in part, to the lack of standardization across many of the field sobriety test studies. In addition, a few miscues in performance can result in an individual being scored as impaired (Anderson, et al., 1983). For example, a person is viewed as impaired for missing two of nine points on the walk-and-turn test or two of five points on the one-leg stand test. The stringent scoring criteria as well as potential subjectivity in determining whether a point should be awarded may account for accuracy rates that vary from 72 to 96 percent among police agencies using these tests in the Anderson, et al. study. The fact that these tests are largely unfamiliar to most people and not well practiced may make it more difficult for people to perform them. As few as two miscues in performance can result in an individual being classified as impaired because of alcohol consumption when the problem may actually be the result of their unfamiliarity with the test.
This study tested the hypothesis that sober individuals will find the field sobriety tests difficult to perform and, as a result, will be judged to be impaired by officers viewing their performance. Individuals who were completely sober were asked to perform several field sobriety tests and several “normal-abilities” tests which should be well known to individuals. These latter tests included answering personal data questions, such as stating one’s address and phone number, as well as walking in a normal manner. Performance on the field sobriety tests and normal-abilities tests was videotaped. Law enforcement officers were asked to view these tapes and determine if these individuals were impaired (“too drunk to drive”). If the field sobriety tests are difficult to perform under normal circumstances, then we can expect officers to incorrectly judge individuals as being impaired on the basis of the field sobriety test performance as compared with scores on the normal-abilities tests.

METHOD
Subjects and Design

Fourteen police officers from the local municipality or county sheriff’s office rated the performance of 21 individuals who had completed the field sobriety and normal-abilities tests. These officers, with 1 to 17 years of law enforcement experience (M = 11.7 yr.), were volunteers who were certified by the South Carolina Academy for Police Officers, which is a state requirement. As part of this certification requirement they had completed the state DUI training program and had field experience with DUI detection. All officers were assigned to duties in the field.
Ten males, seven white and three African-American, and eleven white females served as participants. They were recruited from local businesses. The owners of these businesses were asked if they had any employees who were willing to volunteer to serve in an experiment involving psychomotor tasks. Participants were currently employed, between 21 and 55 years of age, and not overweight, and had no known physical disabilities.
All individuals and officers were paid for their participation. The individuals performed both field sobriety tests and normal-abilities tests. Half of the officers were randomly assigned to each condition, in which they viewed performance on either the field sobriety or normal-abilities tests.

Tests Performed

Prior to the administration of the tests, each participant was administered the Datamaster breathalyzer test. All participants had a BAC level of .00. Each participant performed six field sobriety tests and four normal-abilities tests in the same order in an indoor setting. The field sobriety tests included the walk-and-turn test, alphabet recitation, one-leg stand, a one-leg stand while tilting backward with the eyes closed and touching the nose, a one-leg stand with counting, and a one-leg extension test. These tests were selected after interviewing a number of officers concerning tests they used in the field. None of these officers served in this study. The Horizontal Gaze Nystagmus test was not included because it requires officers individually to monitor the participants’ eye movements, which would have been difficult to videotape in a controlled fashion. It is also not included in the 1987 NHTSA self-instructional guide (NHTSA, 1987). The four normal-abilities tests included counting from 1 to 10, reciting one’s Social Security number, driver’s license number or date of birth, reciting one’s home address and phone number, and walking in a normal manner, turning around, and walking back to the starting point. These tests were selected by the experimenters to sample motor and cognitive activities that are commonly performed by most individuals.
Standard instructions for each test were read by the experimenter. Participants were told that they would perform a number of motor-coordination tasks that would last approximately 30 minutes. These instructions were based on those used by law enforcement in South Carolina and followed NHTSA guidelines. If participants had questions regarding the instructions, the experimenter reread the appropriate section. The reading of instructions was included on the videotape. The tests were performed indoors in a meeting room where distractions were minimal. A 7.62-cm wide strip of tape was placed on the floor for the walk-and-turn test as per NHTSA requirements.

Procedure

Each officer watched a videotape of the 21 individuals performing one of the two sets of tests. The order of performance of the individuals was the same for both the field sobriety tests and normal-abilities tests. The officers were provided with sheets of paper listing the participants by number. The officers were allowed to take notes and were asked, “Do you feel, as a law enforcement officer, that the following subjects, based on field sobriety tests performed on videotape, have had too much to drink to drive?”
Their responses, either “yes” or “no,” were recorded for each individual. The decision was recorded by the officer immediately after viewing the individual’s performance and prior to viewing the next individual’s performance. Each officer participated in an individual session.

RESULTS
The proportion of officers who decided that an individual had “too much to drink” was recorded for each individual separately for the field sobriety and normal-abilities tests. There was a significant difference as a function of test (t(29) = 4.38, p < .01). Forty-six percent of the officers’ decisions were that an individual had “too much to drink” from viewing the field sobriety tests. Fifteen percent of the decisions from the normal-abilities tests were that a person had “too much to drink.”
Differences among individuals were apparent. Only three individuals were rated as “unimpaired” by all officers on both the field sobriety and normal-abilities tests. One individual’s performance was rated as showing he had had “too much to drink” based more on the normal-abilities tests (by three officers) than on the field sobriety tests (none of the officers). Five individuals were rated as having had “too much to drink” by all the officers who viewed the field sobriety tests. One other individual was rated as having had “too much to drink” by all but one officer. Of these six individuals, only one was rated as “impaired” by as many as four of the officers who saw the same individuals performing the normal-abilities tests. Four of these individuals were rated as having had “too much to drink” by two or fewer of the officers viewing the normal-abilities tests.

Discussion

The data indicate that judgments of impairment are influenced by the type of test performed. An individual was more likely to be judged as impaired on the basis of field sobriety test performance than on performance of the normal-abilities tests. Even without alcohol, the number of errors made by individuals performing the field sobriety tests was sufficient for officers to judge that the individuals had had too much to drink. These findings are consistent with other studies reporting sizable percentages of individuals judged to be impaired when they were not (Burns & Moskowitz, 1977; Tharp, et al., 1981).
While training of officers and standardization of test instructions, administration, and scoring may reduce the number of incorrect classifications, the major obstacle may be the field sobriety tests themselves. The fact that these tests require unfamiliar and unpracticed motor sequences may put an individual at a disadvantage when performing them. To the law enforcement officer who has demonstrated the tests many times, the motor sequences may seem easy and straightforward. To the casual observer as well, the tests may appear easy to perform. Yet when an untrained individual actually performs the tests, the difficulty of performing them at an acceptable level may become evident.
The reliance on field sobriety test performance by law enforcement officers in their decision to arrest or not and by juries in their decision whether to convict a person of driving under the influence underscores the need to examine field sobriety tests critically. The tests should discriminate between the two populations of individuals, those who are impaired and those who are not. Ideally, the tests should separate the two populations, that is, increase d′, the mean difference between the two populations. The tests, however, may be doing nothing more than adjusting the officer’s β, or criterion measure, downward.
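The distinction between d′ and β can be illustrated with a minimal sketch of the standard equal-variance Gaussian signal-detection model. This is not an analysis performed in the study. It treats the two reported percentages (15% and 46% of decisions being “too much to drink” for sober participants) as false-alarm rates, and it assumes a purely hypothetical d′ of 1.5, because the study tested no impaired participants and so provides no hit rate. The function names are illustrative.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()  # sober ("noise") distribution: mean 0, sd 1

# Hypothetical separation between impaired and sober populations.
# The study provides no estimate of this value.
ASSUMED_D_PRIME = 1.5


def criterion_from_false_alarm_rate(fa_rate):
    """Cutoff x on the sober distribution such that P(X > x) = fa_rate."""
    return STD_NORMAL.inv_cdf(1.0 - fa_rate)


def beta(criterion, d_prime):
    """Likelihood ratio at the cutoff: impaired density / sober density."""
    return exp(d_prime * (criterion - d_prime / 2.0))


def hit_rate(criterion, d_prime):
    """P(judged impaired | impaired), with impaired ~ Normal(d_prime, 1)."""
    return 1.0 - STD_NORMAL.cdf(criterion - d_prime)


# False-alarm rates taken from the percentages reported in RESULTS.
conditions = (("normal-abilities tests", 0.15), ("field sobriety tests", 0.46))

for label, fa_rate in conditions:
    x = criterion_from_false_alarm_rate(fa_rate)
    print(
        f"{label}: criterion = {x:.2f}, "
        f"beta = {beta(x, ASSUMED_D_PRIME):.2f}, "
        f"hit rate = {hit_rate(x, ASSUMED_D_PRIME):.2f}"
    )
```

Under these assumptions d′ is held fixed, yet the false-alarm rate rises from 15% to 46% simply because the cutoff moves toward the sober population and β falls. More impaired drivers would be detected, but only at the cost of many more sober individuals being judged impaired, which is the pattern the paragraph above describes.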
These tests must be held to the same standards the scientific community would expect of any reliable and valid test of behavior. This study brings the validity of field sobriety tests into question. If law enforcement officials and the courts wish to continue to use field sobriety tests as evidence of driving impairment, then further study needs to be conducted addressing the direct relationship of performance on these and other tests with driving. To date, research has concentrated on the relationship between test performance and BAC and officers’ perceptions of impairment. This study indicates that these perceptions may be faulty.

REFERENCES

Anderson, T. E., Schweitz, M. B. (1983) Field evaluation of a behavioral battery for DWI. Final Report, DOT-HS-806476.

Bjerver, K., & Goldberg, L. (1951) Effect of alcohol ingestion on driving ability: results of practical road tests and laboratory experiments. Quarterly Journal of Studies on Alcohol, 11, 1-30.

Burns, M., & Moskowitz, H. (1977) Psychophysical tests for DWI arrest. Final Report, DOT-HS-802-424, NHTSA.

NHTSA. (1987) DWI Detection and Divided Attention Field Sobriety Testing. Final Report, DOT-HS-807-186.

Rosenthal, R., & Rosnow, R. L. (1991) Essentials of behavioral research: methods and data analysis. (2nd ed.) New York: McGraw-Hill.

Tharp, V., Burns, M., & Moskowitz, H. (1981) Development and field test of psychophysical tests for DWI arrests. Final Report, DOT-HS-805-864, NHTSA.

Accepted May 23, 1994.