introducing a new feature that will investigate the many dynamics of the changing landscape of doping in sports. topics will include the expected subjects, such as detection methods and athlete behaviours. however, we will also take a wider view, covering the larger societal impact, media and marketing responses to doping, and the science of detection.
the topic i am personally most interested in is the explicit, and the more subtle or implicit, goals behind testing for drugs, and what it means to have clean sports. however, to kick things off i want to look at a common misconception about drug testing. several of us have been batting around a number of doping topics, and the one that rears up all too often is the idea that samples should be sent to different labs, or split and saved, so that additional independent confirmations can be made later. it's important to think carefully about whether this idea is a good one. the proposed solution presupposes that the problem with catching dopers is that the lab is not doing a sufficiently rigorous or thorough job. this is very likely not the case. to understand why, one needs to look at the possible outcomes of a doping detection regime.
this is an issue with two dimensions: actual state and test result. the first dimension is the athlete's actual state or behaviour. you have an athlete. there are two mutually exclusive possibilities: athlete doped (we'll call this one (1)) or athlete not doped (we'll call this zero (0)). like schrödinger's cat, you know nothing about the state of doping until you intervene by testing. the second dimension is the test for doping. as before, there are two outcomes: test revealed the presence of a compound associated with doping (we'll call this one (1)) or test did not reveal the presence of a compound associated with doping (we'll call this zero (0)).
you can obviously never get an answer to the first dimension directly without cooperation. the second dimension, however, you control completely. and this is what makes it both interesting and frustrating, because this is the dimension where the breakdown occurs. the breakdown is not in the process itself but in the inferences you try to draw from it.
- possible outcome 1: athlete dopes, tests positive. correct positive identification
- possible outcome 2: athlete dopes, does not test positive. false negative
- possible outcome 3: athlete does not dope, tests positive. false positive
- possible outcome 4: athlete does not dope, does not test positive. correct negative identification
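the four outcomes above form a standard two-by-two table. here is a minimal python sketch of it, using the 1/0 coding from the text. the function name `classify_outcome` is my own label for illustration, not part of any testing protocol.

```python
def classify_outcome(doped: int, test_positive: int) -> str:
    """map (actual state, test result) to one of the four outcomes.

    both arguments use the 1/0 coding from the text:
    doped = 1 if the athlete doped, 0 otherwise;
    test_positive = 1 if the test revealed a doping compound, 0 otherwise.
    """
    outcomes = {
        (1, 1): "outcome 1: correct positive",
        (1, 0): "outcome 2: false negative",
        (0, 1): "outcome 3: false positive",
        (0, 0): "outcome 4: correct negative",
    }
    return outcomes[(doped, test_positive)]


# walk through every combination of the two dimensions
for doped in (1, 0):
    for test_positive in (1, 0):
        print(doped, test_positive, classify_outcome(doped, test_positive))
```

note that in practice only the second argument is ever observed, which is why the two error cells are so hard to count.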
it's helpful to understand what it means to produce a false positive and a false negative. false positives, where a rider is determined to have doped without having done so, can only arise during testing. the rider cannot contribute to the cause of a false positive. a false positive can result from bad handling (contamination or poor care of the sample) or from poor analysis, which has several possible causes including, but not restricted to, operator error, bad calibration or bad standards, and machinery malfunction. false negatives, on the other hand, can arise for a variety of reasons:
- avoidance: many performance enhancing products are only in the body (system) for a few days, so they cannot be detected afterwards. if an athlete takes the drug outside the window in which testing will occur, they get the effect but will not show the evidence. this is well documented and well understood. cycling has an extensive system of out-of-competition tests, and the real meat of the current doping scandal is how lax the international sanctioning body has been about athletes who thwart this system of out-of-competition checks. special bonus for cycling fans: anyone who hasn’t read the section of willy voet’s book on how cyclists delivered “clean” urine samples after races under close scrutiny needs to see my prior post on qualifications for discussing these topics.
- masking agents: there is well-established chemistry that will prevent sure recognition of doping compounds. ambiguous test results favor the athlete (tie goes to the runner), so masking can be an effective strategy. unless you give yourself away. garzelli, a very good cyclist, was almost dumped during the giro several years back for having vast amounts of probenecid in his system. http://www.cyclingnews.com/news/?id=2002/may02/may20news the link is almost comical in indicating no real benefit in either performance or masking. classic euro old school flavor in that one.
- undetectable compounds. mostly because they are not known. think how many molecules are in your blood or urine. they can only identify things they are looking for. if they don’t know athletes are taking it, they can’t look for it. if they can’t look for it, they can’t find it. and if they can’t find it, the athlete is not guilty. qed.
- maintenance of the sample. one of hamilton’s samples degraded, due to bacterial contamination, to the point where it could not be tested. again, tie goes to the runner. under the current regime, you are innocent until proven guilty. as it should be. but this happens more often than the labs would like to admit. because it should never happen. it’s a custodial and stewardship issue.
- limits of detection. this is a complicated issue. the labs are able to detect compounds at levels much smaller than the limits that are published. the limits are there in part because they represent what would be a strictly performance enhancing level of the compound (e.g. caffeine: all riders drink coffee but no one ever tests positive for that) and in part because of the levels of detection. the compounds may be there, but not in sufficient amounts to pass the threshold for calling it a positive test.
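the last item on the list, the gap between what a lab can see and what it reports, can be sketched in a few lines of python. this is only an illustration under my own assumptions: the function name, the three labels and the numbers in the usage lines are hypothetical and are not taken from any lab's published limits.

```python
def report_result(measured: float, instrument_limit: float,
                  reporting_threshold: float) -> str:
    """classify a measured concentration (all values in the same, arbitrary unit).

    instrument_limit: smallest level the lab equipment can see (assumed).
    reporting_threshold: published level needed to call a positive (assumed).
    """
    if reporting_threshold < instrument_limit:
        raise ValueError("this sketch assumes the threshold is at or above the instrument limit")
    if measured < instrument_limit:
        return "not detected"
    if measured < reporting_threshold:
        # the compound is visibly there, but the test is still reported as negative
        return "detected, below threshold"
    return "positive"


# hypothetical numbers, chosen only to hit each branch
for level in (0.5, 5.0, 15.0):
    print(level, report_result(level, instrument_limit=1.0, reporting_threshold=12.0))
```

the middle branch is the interesting one: the compound is present and seen, yet the outcome recorded is a negative.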
the implicit assumption described above is that the existing samples contain enough information to classify the cases correctly. it reflects a misunderstanding of what really causes type i and type ii errors (false positives and false negatives) and how they are identified. it's like sleight of hand. you could impose twice the amount of testing on the existing samples, four times, ten times, and in spite of all that extra work, you'd be no more certain about anything other than the variation in laboratory analysis. and yet changes in the frequency of outcome 1 relative to outcome 2 would be ascribed to this testing. but it's not real. the factors that are likely causing outcome 2 cannot be identified by refinements in the way existing samples are tested. short of testing every 2-3 days for epo, there is no way to be positive that your sampling regime is capturing its presence. the mistake here is in the collection regime, not the analysis regime.
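a back-of-the-envelope python sketch makes the point concrete. it rests on two assumptions of mine, not on measured data: the compound is in the collected sample with some probability `p_in_sample` (that is the collection regime), and each lab analysis of a sample that does contain it catches it with probability `lab_sensitivity`, with repeat analyses treated as independent. independence is the most generous case for re-testing. all the numbers are made up.

```python
def detection_probability(p_in_sample: float, lab_sensitivity: float,
                          n_analyses: int) -> float:
    """chance that a doped athlete produces at least one positive analysis.

    p_in_sample: probability the compound is in the collected sample at all
                 (set by when and how often samples are collected).
    lab_sensitivity: probability a single analysis flags the compound when
                     it is present.
    n_analyses: number of independent analyses run on the same sample.
    """
    p_lab_catches = 1.0 - (1.0 - lab_sensitivity) ** n_analyses
    return p_in_sample * p_lab_catches


# hypothetical values: compound present in 20% of samples, lab 95% sensitive
for n in (1, 2, 4, 10):
    print("analyses:", n, "detection:", detection_probability(0.20, 0.95, n))

# same lab, one analysis, but collection timed so the compound is present twice as often
print("better collection:", detection_probability(0.40, 0.95, 1))
```

however large `n_analyses` gets, the result can never exceed `p_in_sample`. under this model, the only way to move the ceiling is to change the collection regime.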
the interesting but unintended effect of subscribing to the "more analysis of existing samples" approach is that it further obscures the important factors contributing to false negatives. we have a social idea that more testing makes one more confident, but it’s an illusion. the list above is only a partial account of the reasons why false negatives might occur, and neither cross-lab testing nor additional sample splitting addresses any of them. it’s a classic case of addressing the symptom and not the cause. there’s no good evidence right now that the labs are the problem, so why would people spend a lot of time and money addressing possible inter-laboratory variation in analysis? this is the easiest factor to quantify, and the laboratory accreditation process addresses it directly.
assuming that more testing of the current samples would provide better results is not correct. if this were true, and easy to demonstrate, no one would support the current testing regime. it's expensive and a hell of a lot of work, and people's livelihoods are on the line. a simple statistical power analysis would be enough to show that the protocol was inadequate. you need to know that your detection regime is robust before you start. and the thing is that the detection regime is fine. in other words, more testing, in statistical or probability terms, won't tell you anything you don't know already. hence, one can expect negligible change for the better in the prevalence of type i and type ii errors based solely on increased attention to this factor.
add to all these factors that adjudication of this process takes place in what is, obviously, a legal context. as a result, all protocols must be followed exactly and spelled out explicitly. one mistake in a remarkably long chain of administrative minutiae is effectively a get out of jail free card. in addition, pitting labs against each other when they produce different test results will throw any scientific credibility out the window. laboratories will be reduced to the humiliating status of dueling subject experts in a trial.
assuming that we desire "clean" sports, and this is a discussion which is emerging, we need to focus on the sources of the false negatives.
(ref: http://www.washingtonpost.com/wp-dyn/content/article/2007/07/31/AR2007073101997.html )
- posted by scott