Shambhavi Thakur


When your doctor prescribes a treatment, there is no guarantee that it will do you more good than harm. Just as in many other aspects of life, there are no certainties in healthcare. A diagnosis may turn out to be wrong and a treatment may prove ineffective or, worse, harmful. Statistics offers a way of quantifying and dealing with uncertainties in clinical medicine.

In a video clip tweeted recently, a patient described his “miraculous recovery” from Covid-19 and ascribed it to a drug called Itolizumab that is manufactured and marketed by India’s Biocon Limited under the brand name Alzumab. While his recovery is without a doubt reason to celebrate, we can, and indeed must, be a great deal more sceptical of the uncritical attribution of his survival to the drug in question.

How do we know that he recovered despite the drug and not because of it? Would he have recovered anyway even if he had not been given the drug? How can we be sure?

The specific questions to do with the quality of evidence for the effectiveness of Itolizumab in severe Covid have been critically examined in previous publications; see here, here and here.

The question I want to deal with now is a more general one: How do we determine that a proposed treatment for a particular clinical illness works? If the manufacturers of a drug believe that their wonder drug prevents death and saves lives, how can we be sure that there is solid scientific evidence to justify such a claim?

Enter the world of clinical trials — a poorly understood and badly taught subject. Sadly, many doctors, even senior ones, have little knowledge of the intricacies and nuances of the subject.

**Randomised, placebo-controlled clinical trials, or RCTs**

Of all the advances in medicine, the concept of an RCT is among the most fundamental ideas to have revolutionised clinical practice in the last 75 years. The first RCT to be published was a famous 1948 paper establishing the role of streptomycin in the treatment of tuberculosis.

The modern RCT is based on three linked ideas:

However strongly we may believe that a new drug “must work” — often based on our understanding of a disease process, coupled with experiments in the test tube or in an animal model — we can never be sure that it will do so without a formal high quality RCT.

“Well-conducted RCTs have repeatedly contradicted practices supported by common sense and clinical observation”, as has been argued in this paper warning us not to be beguiled by theoretical bio-plausibility. Indeed, drug regulatory authorities around the world now license a treatment only on the basis of RCTs.

Clinical doctors — those men in white coats with stethoscopes adorning their necks — may think they are in charge, but when it comes to RCTs to test out new treatments for existing diseases (or existing drugs for new diseases), it is the statisticians who are in the lead roles, not clinicians, unless they have specialised in clinical trial work.

The design, the sample size estimation, the process of randomisation, the blinding of the treating clinician or patient or both as to whether the patient is in the placebo arm or the active treatment arm, and certainly the final writing up and reporting of the RCT: all these are specialist subjects, and it is best for statisticians to be closely involved.

The doctors involved in an RCT may think they are trying out a new drug but in fact they are as much participants as their patients in a scientific experiment with the sole objective of answering the question, “Does this drug have the same effect on the course of this clinical illness as a worthless inert bit of glucose (placebo)?” That “worthless inert bit of glucose” is the placebo that leads to the idea of the placebo-controlled trial. In a remarkable bit of research, Dr Vinay Prasad of the University of California found that when clinical practices in routine use were put to the test, 40 percent were shown in RCTs to be no more useful than a placebo.

**Reporting the results of an RCT**

Conceptually, the simplest RCTs are those where the outcome of interest is a binary event such as mortality, and the patients are randomly assigned to one of two groups: the active treatment group and the control group. If assignment to the two groups is determined not by deliberate clinician or patient choice but by a strict process of random allocation, then we can be as sure as is humanly possible that any differences observed in the outcomes in the two groups can be confidently attributed to the drug under study.
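To make "a strict process of random allocation" concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, a single shuffle splitting patients 1:1 into two arms. The function name `randomise`, the seed and the patient numbering are my own illustrative choices; real trials typically use more elaborate schemes with concealed allocation, which this does not attempt to show.

```python
import random


def randomise(patient_ids, seed=None):
    """Split patients into two equal-sized arms by a single random shuffle.

    Illustrative only: simple 1:1 allocation, no blocking or stratification.
    """
    rng = random.Random(seed)      # seeded so the allocation can be audited
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)          # chance, not the clinician, decides the order
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical example: 86 patients numbered 1 to 86.
arms = randomise(range(1, 87), seed=2020)
print(len(arms["treatment"]), len(arms["control"]))
```

The point of the design is that neither doctor nor patient can steer the sicker (or healthier) patients into one arm.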

Let’s take the example of the Dutch trial to investigate the role of convalescent plasma in patients with severe Covid illness. This is a form of treatment that has intrinsic common sense behind it as well as serious political backing. But injecting plasma taken from another person carries risks and so it is wise to do a formal trial to make sure that the benefits outweigh the harms.

These researchers decided to carry out an RCT. Assuming a background mortality rate of 20 percent and seeking not to miss a 50 percent improvement on this (i.e. a reduction of the risk of death with convalescent plasma treatment to 10 percent), they figured they would need to study 426 patients with half assigned to receive plasma.
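For a feel of where a figure like this comes from, here is a rough sketch using a standard textbook formula for comparing two proportions. The two-sided five percent significance level and the 80 percent power are my assumptions, not details given above, and the function name `patients_per_arm` is purely illustrative. The sketch lands at roughly 200 patients per arm, the same ballpark as the 426 planned; the researchers' exact method and any allowances they made are not stated here, so it is not expected to reproduce their number.

```python
import math
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate sample size per arm for comparing two proportions.

    Normal-approximation formula; alpha (two-sided) and power are assumed values.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # about 0.84 for 80% power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated                  # absolute difference to detect
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# 20 percent background mortality, hoping to detect a fall to 10 percent.
n_per_arm = patients_per_arm(0.20, 0.10)
print(n_per_arm, 2 * n_per_arm)
```

Notice how quickly the required numbers grow: halving the difference you want to detect roughly quadruples the sample size.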

As events turned out, they stopped the trial after 86 patients at which point the data looked like this:

In percentage terms, the death rate was 6/43 = 14.0 percent in the plasma-treated patients, and 11/43 = 25.6 percent in the control group.
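The arithmetic is easy to check. The short sketch below also separates two quantities that are often muddled: the absolute difference in death rates (in percentage points) and the relative reduction (as a fraction of the control group's rate). The variable names are mine.

```python
# Deaths and group sizes as reported for the stopped trial.
deaths_plasma, n_plasma = 6, 43
deaths_control, n_control = 11, 43

risk_plasma = deaths_plasma / n_plasma        # observed death rate with plasma
risk_control = deaths_control / n_control     # observed death rate without plasma

absolute_reduction = risk_control - risk_plasma           # in percentage points
relative_reduction = absolute_reduction / risk_control    # as a share of control risk

print(f"plasma: {risk_plasma:.1%}, control: {risk_control:.1%}")
print(f"absolute difference: {absolute_reduction * 100:.1f} percentage points")
print(f"relative reduction: {relative_reduction:.0%}")
```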

**Interpretation of these numbers**

This is where it becomes tricky. On the face of it, most clinicians would jump to the conclusion that plasma therapy conferred an 11-12 percentage point reduction in the risk of death. So, hey, it works!

But that would be a hasty, ill-judged, and erroneous conclusion or, as HL Mencken put it, “neat, plausible and wrong”.

As I said earlier, this is no longer clinical medicine. It is now a statistical experiment and the statistician’s logical analysis would go along the following lines.

Recall the nature of the experiment. We are testing a hypothesis. That is what science is about. The hypothesis here is called the null hypothesis and it is that there is, in reality, no difference in the mortality rate between the population of patients treated with plasma and the population of patients not given plasma.

Our clinical trial is but an experiment. We cannot study the entire population of patients so we take a sample with which to generate data to test that hypothesis. If the null hypothesis was in fact the correct one, then we would expect to observe the same mortality rate in the two groups in our sample — about 20 percent in each.

Of course, we know that in practice, due merely to chance, if we tossed a coin 20 times, we would be unlikely to get exactly 10 heads. Similarly, it is highly unlikely, if the null hypothesis were true and if we treated a randomly allocated 43 patients with plasma and another 43 without plasma, that we would observe exactly 20 percent mortality in both groups.
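The coin-toss intuition can be put into numbers with the binomial distribution. The sketch below is my own illustration: it computes the chance of exactly 10 heads in 20 fair tosses and, assuming a true mortality of 20 percent, the chance that a group of 43 patients shows 8 or 9 deaths, the two counts closest to 20 percent (20 percent of 43 is 8.6, so an exact match is not even possible).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials, each with chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Exactly 10 heads in 20 tosses of a fair coin.
p_exactly_10 = binomial_pmf(10, 20, 0.5)
print(round(p_exactly_10, 3))  # 0.176

# 8 or 9 deaths among 43 patients when the true mortality is 20 percent.
p_8_or_9 = binomial_pmf(8, 43, 0.2) + binomial_pmf(9, 43, 0.2)
print(round(p_8_or_9, 3))
```

In other words, even a perfectly fair coin gives exactly 10 heads fewer than one time in five.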

So, here is the question: What is the probability that we would observe a departure from that 20 percent expected value by as much as six or more percentage points either side? How far either side of the expected 20 percent must the observed mortality rates fall before we conclude that the null hypothesis must be rejected?

This is where the idea of statistical significance and the p-value comes in. In the standard null hypothesis statistical test, we accept by convention that if the p-value is under five percent, we reject the null hypothesis. This is not a rule and there is no science behind it; it is just a convention. In the case of our example, it turns out that, if the null hypothesis were true, the probability of observing a departure of at least six percentage points either side of the expected 20 percent mortality rate in each group is, in fact, 0.28 or 28 percent. This then is our p-value.
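For readers who want to check the figure, here is a sketch of two common tests for a two-by-two table of deaths and survivors, written with only the Python standard library. Which test produced the 0.28 is not stated above; my assumption is a chi-square test with Yates' continuity correction, which gives a value close to it, and Fisher's exact test is included for comparison. The function names are illustrative.

```python
from math import comb, erfc, sqrt


def yates_chi_square_p(a, b, c, d):
    """Two-sided p-value for the table [[a, b], [c, d]]: chi-square, Yates-corrected, 1 df."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    return erfc(sqrt(chi2 / 2))    # upper tail of chi-square with 1 degree of freedom


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value: sum of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):                   # hypergeometric probability of x events in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Rows: plasma, control. Columns: died, survived.
p_yates = yates_chi_square_p(6, 37, 11, 32)
p_fisher = fisher_exact_p(6, 37, 11, 32)
print(round(p_yates, 2), round(p_fisher, 2))
```

Both values sit far above the conventional five percent threshold, so either way the null hypothesis is not rejected.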

Another way of expressing this idea is that the results we observed in our experiment with 43 patients in each arm of this RCT are not surprising enough to conclude that we should reject the null hypothesis of no difference between plasma and no plasma.

Bottom line conclusion: Though convalescent plasma looked like a great idea, this trial does not provide sufficient evidence that it in fact saves lives.

But try explaining that to President Trump!
