I continue covering COVID-19 on my blog. In the last post, when I was talking about the real statistics, I mentioned that we don't know the real prevalence of coronavirus in the general population and that much more research is needed. I also mentioned the study¹ that Dr. Ioannidis announced as new "breakthrough" research, and I made a commitment to review it.
The study aimed to estimate the real number of people infected with coronavirus, and it has already attracted a lot of attention for obvious reasons: there is a lot of speculation in the mass media that relates or refers to it. I will be honest: I was going to do a full critical appraisal of this study, like the one I did on Dr. Ornish's Lifestyle Heart Trial, but just after reading the abstract I realized that it is flawed beyond measure, so today's post will be rather short.
The snapshot of the study:
The authors advertised their study via Facebook ads, which invited people to volunteer to be tested for coronavirus antibodies. They tested the volunteers using a lateral flow immunoassay and reported an unadjusted antibody prevalence of 1.5% and an estimated population-based prevalence of 2.81%. They interpreted these results as an indication that 50-85 times more people in the general population have been infected than the official case counts show, and accordingly that the virus has spread much wider than we expected and its case fatality rate is much lower than previously thought, around the same level as the CFR of the regular flu.
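To make that chain of reasoning concrete, here is a minimal sketch of the arithmetic behind such claims. Only the 2.81% prevalence comes from the study; the population size, confirmed case count and death count are round placeholder numbers that I made up for illustration, and the function name is my own.

```python
def implied_figures(prevalence: float, population: int,
                    confirmed_cases: int, deaths: int) -> dict:
    """Translate a prevalence estimate into an infection multiplier and fatality rates."""
    # Number of people who would have been infected if the prevalence
    # estimate applied to the whole population.
    implied_infections = prevalence * population
    return {
        "implied_infections": implied_infections,
        # How many times more infections than officially confirmed cases.
        "multiplier": implied_infections / confirmed_cases,
        # Fatality rate computed against confirmed cases only.
        "fatality_per_confirmed_case": deaths / confirmed_cases,
        # Fatality rate computed against all implied infections.
        "fatality_per_implied_infection": deaths / implied_infections,
    }


if __name__ == "__main__":
    # Placeholder inputs, NOT the study's data (except the 2.81% prevalence).
    result = implied_figures(prevalence=0.0281, population=1_000_000,
                             confirmed_cases=500, deaths=25)
    for name, value in result.items():
        print(f"{name}: {value:.5f}")
```

The point of the sketch is that the multiplier and the fatality rate both hinge entirely on the prevalence estimate: if that number is inflated, everything derived from it is off by the same factor.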
I believe that any study in this area is commendable at this point, but the results should be properly evaluated and communicated to us: just the facts, the study limitations and the implications. No sensationalism! Unfortunately, this is not one of those reports. It has several issues. For example, a lot of people have already commented on the testing method and its shortcomings, but I wouldn't worry too much about that: almost all tests have limitations, so this is not a huge one. I would say it's not even close to the main design problem.
The critical flaw of the study
Normally I would call a study design problem an "issue", a "concern" or a "bias", but in this case it is literally a flaw, and a critical one, because it completely eliminates the study's external validity. Let me explain: for results from a sample to be generalizable to a larger population, the sample must be representative of that population. This study has two major sources of bias arising from its selection process:
- First, they used Facebook ads to target a "representative sample of the county by demographic and geographic characteristics", which is nearly impossible to do to begin with. The problem is that they selectively addressed only a portion of Santa Clara County: those who had internet access, had a Facebook account and actually used it. It might sound odd in 2020, but a lot of people do not use Facebook, and some may be cut off from the internet altogether, such as homeless people, prisoners, military personnel and, to an extent, shift workers. Thus the target audience is already quite different from the population of Santa Clara County.
- Then we have the second problem: the people who saw the ads volunteered to participate. They were not randomly selected; they chose to take part, and it's safe to assume that many of them were in a risk group for coronavirus. They might have been in contact with somebody with COVID-19, they might have had flu-like symptoms, they might have suspected that they were exposed, and so on. Also, health care workers and people who volunteer in hospitals are likely to respond to such ads, which again increases their chances of having contracted the virus and, accordingly, of testing positive for it.
Taken together, these two issues are expected to produce a much higher share of positive tests and a sample that is not representative even of Santa Clara County, let alone the US or the world. And we are not talking about a 20-30% increase in prevalence; we are likely talking about order-of-magnitude differences, which quite predictably led to such sensational findings as a "50-85 times" higher prevalence of infection in the sample compared with the previously reported official data.
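Here is a minimal sketch of how self-selection alone can inflate a prevalence estimate. It assumes a deliberately simple model that is mine, not the study's: infected people are some fixed number of times more likely to volunteer than uninfected people. The 0.3% "true" prevalence and the relative volunteer rates are hypothetical numbers chosen only to illustrate the mechanism.

```python
def observed_prevalence(true_prevalence: float,
                        relative_volunteer_rate: float) -> float:
    """Expected share of positives among volunteers.

    relative_volunteer_rate is a hypothetical parameter: how many times
    more likely an infected person is to volunteer than an uninfected one.
    """
    # Relative weight of infected and uninfected people among volunteers.
    infected_weight = true_prevalence * relative_volunteer_rate
    uninfected_weight = 1.0 - true_prevalence
    return infected_weight / (infected_weight + uninfected_weight)


if __name__ == "__main__":
    true_prevalence = 0.003  # hypothetical 0.3% true prevalence
    for rate in (1, 2, 5, 10):
        observed = observed_prevalence(true_prevalence, rate)
        print(f"volunteer rate x{rate}: observed prevalence {observed:.2%}")
```

Under these made-up assumptions, a true prevalence of 0.3% would show up as roughly 1.5% in the sample if infected people were five times more likely to sign up. I am not claiming these are the real numbers, only that a raw estimate in that range cannot be separated from volunteer bias without knowing who chose to respond and why.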
In conclusion, although this study corroborates my message about the real statistics of COVID-19, and although I believe that we need more research and need it fast, its findings are not generalizable and, quite frankly, I am shocked that Stanford epidemiologists could come up with something like this. Moreover, I find it utterly irresponsible of them to endorse and sensationalize research findings like these without publicly acknowledging their limitations and giving proper context.
I will definitely discuss it on my YouTube channel, so if you are curious, please go ahead and check out my video on it. Also, don't forget to subscribe to my channel (I hope you'll like it), and you are always welcome to leave comments and make suggestions.
Stay safe and strong,
1. Bendavid E, Mulaney B, Sood N, et al. COVID-19 Antibody Seroprevalence in Santa Clara County, California. medRxiv. 2020. doi:10.1101/2020.04.14.20062463.