This is Part 10 of my Case for Vaccine Data Science series.
For now, this is the last part of this series, but I will update it if I see any real need.
Recently, Albert Benavides from VAERSAware posted something really interesting on Twitter:
He has many dashboards on his website that explain these “throttling” delays in VAERS reports. In my opinion, these delays have affected a lot of VAERS research.
In this article, I will use some charts specifically for VAERS death reports to show some examples of what is actually going on. I chose the death reports because death is a binary Yes/No outcome, and there cannot really be good reasons to delay at least these reports (if you ran a government agency which actually cared about health).
The dates used in VAERS death reports
But first, we should go over all the dates which are used in VAERS death reports.
VAX_DATE = The date of vaccination
ONSET_DATE = The date when the symptom was first observed by the patient
DATEDIED = In the case of death, this is the date the patient died. Even if the symptom onset is immediate, the date of death could be many months later. We need to consider that in our analysis.
RECVDATE = The date the VAERS report was received
APPEARED_ON = This is not one of the columns in the CSV file, but rather this is some information I scraped from the MedAlerts website. This is the date when the report gets added to the CSV file and becomes public.
I took all the death reports for the COVID19 vaccines and constructed a dataset.
You can use the dataset above to verify everything else I am writing in this article.
15 day delay allowance
I allow for a 15 day delay from the date of death to the date the report appears in public. There are actually some cases where APPEARED_ON is the same as DATEDIED (click here for an example).
However, I allow a 15 day delay for death reports (the delay from the date of death until the report appears online) because in some cases there could be some PII which needs to be redacted, etc.
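The delay check described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names `publication_delay` and `exceeds_allowance` are hypothetical, and the dates in the example are made up rather than taken from the dataset.

```python
from datetime import date

ALLOWANCE_DAYS = 15  # the grace period described in this section


def publication_delay(date_died: date, appeared_on: date) -> int:
    """Days from the date of death (DATEDIED) to the date the report
    became public (APPEARED_ON)."""
    return (appeared_on - date_died).days


def exceeds_allowance(date_died: date, appeared_on: date,
                      allowance_days: int = ALLOWANCE_DAYS) -> bool:
    """True when the report became public later than the allowance permits."""
    return publication_delay(date_died, appeared_on) > allowance_days


# Made-up dates, for illustration only
print(publication_delay(date(2021, 2, 14), date(2021, 3, 1)))   # 15
print(exceeds_allowance(date(2021, 2, 14), date(2021, 3, 1)))   # False
print(exceeds_allowance(date(2021, 2, 14), date(2021, 6, 4)))   # True
```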
Chart explanation
For each chart which follows, the X axis is the first day of a given month (beginning 1st March 2021).
I subtract 15 days from that date (e.g. for 1st March 2021 the result is 14th Feb 2021).
Let N be the number of death reports where DATEDIED <= 14 Feb 2021
I sort the reports into two buckets:
The reported bucket consists of all death reports with DATEDIED on or before 14th Feb 2021 which had appeared online by 1st March 2021. Let us call this R.
The unreported bucket consists of all death reports with DATEDIED on or before 14th Feb 2021 which had not yet appeared online by 1st March 2021. Let us call this U. This number is equal to N-R.
It goes without saying that you want R to be as close to N as possible, and you want U to be as close to zero as possible.
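Here is a minimal sketch of how N, R and U could be computed for one point on the chart. It is not the exact code behind the charts. It assumes each report has already been reduced to a (DATEDIED, APPEARED_ON) pair of dates, that "by 1st March" includes 1st March itself, and that reports with a missing DATEDIED have been dropped beforehand. The function name `bucket_counts` and the sample reports are made up for illustration.

```python
from datetime import date, timedelta

ALLOWANCE_DAYS = 15  # the delay allowance described earlier


def bucket_counts(reports, month_start, allowance_days=ALLOWANCE_DAYS):
    """Return (N, R, U) for one chart point.

    reports: iterable of (date_died, appeared_on) pairs.
    month_start: the X axis date, i.e. the first day of a month.
    """
    cutoff = month_start - timedelta(days=allowance_days)
    # N: deaths that occurred on or before the cutoff date
    eligible = [(died, appeared) for died, appeared in reports if died <= cutoff]
    n = len(eligible)
    # R: of those, the reports that were public by the first of the month
    r = sum(1 for _, appeared in eligible if appeared <= month_start)
    # U: everything else, which is N - R by construction
    u = n - r
    return n, r, u


# Made-up reports, for illustration only
sample = [
    (date(2021, 1, 10), date(2021, 1, 22)),  # public before 1st March: reported
    (date(2021, 2, 14), date(2021, 3, 1)),   # public on 1st March: reported
    (date(2021, 2, 1), date(2021, 6, 4)),    # public months later: unreported
    (date(2021, 2, 20), date(2021, 2, 26)),  # died after the cutoff: not in N
]

n, r, u = bucket_counts(sample, date(2021, 3, 1))
print(n, r, u)  # 3 2 1
```

Running the same function once per month, from 1st March 2021 onwards, would give one (N, R, U) triple per X axis position.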
As a specific example, take March 2022. Fifteen days before 1st March 2022 is 14th Feb 2022.
According to the chart, there were 12269 death reports where DATEDIED was on or before 14th Feb 2022.
Of these, only 9299 death reports were seen in the VAERS dataset by 1st March 2022 (i.e. if you downloaded the CSV files on that date).
This means only 75.79% of the VAERS death reports with a date of death on or before 14th Feb 2022 were available online by 1st March 2022. The rest appeared after that date, some of them much later. Note: The rightmost axis in the chart is the percentage.
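The arithmetic for this example is easy to check. The two counts below are the ones quoted above; the variable names are just mine.

```python
n = 12269  # death reports with DATEDIED on or before 14th Feb 2022
r = 9299   # of those, reports that were public by 1st March 2022

u = n - r                      # reports that had not yet appeared
reported_pct = 100 * r / n     # the percentage axis of the chart
unreported_pct = 100 * u / n

print(u)                        # 2970
print(f"{reported_pct:.2f}%")   # 75.79%
print(f"{unreported_pct:.2f}%") # 24.21%
```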
Same day reports
I did the same analysis for same day death reports. These are death reports where NUMDAYS is zero. In other words, the symptoms began on the same day as the vaccination (but the date of death need not be the same day).
Same week reports
I also did the same analysis for death reports where NUMDAYS <= 7.
These are deaths where the symptoms began close enough to the vaccination date that they should at least be thoroughly investigated.
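The two subsets can be sketched as simple filters on NUMDAYS. This is an illustration, not the code behind the charts. It assumes each report is a dict with NUMDAYS already parsed to an integer, or None when the field is blank. The function names and the VAERS_ID values are made up.

```python
def same_day(reports):
    """Death reports where symptom onset was on the day of vaccination."""
    return [rep for rep in reports if rep["NUMDAYS"] == 0]


def same_week(reports):
    """Death reports where symptom onset was within 7 days of vaccination."""
    return [
        rep for rep in reports
        if rep["NUMDAYS"] is not None and rep["NUMDAYS"] <= 7
    ]


# Made-up records, for illustration only
sample = [
    {"VAERS_ID": 1, "NUMDAYS": 0},
    {"VAERS_ID": 2, "NUMDAYS": 3},
    {"VAERS_ID": 3, "NUMDAYS": 7},
    {"VAERS_ID": 4, "NUMDAYS": 30},
    {"VAERS_ID": 5, "NUMDAYS": None},  # blank NUMDAYS: excluded from both subsets
]

print(len(same_day(sample)))   # 1
print(len(same_week(sample)))  # 3
```

Each subset can then go through the same reported/unreported bucketing as the full set of death reports.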
An example
Let us consider a concrete example.
This paper was published in April 2022, and states that there were 7674 death reports. (Emphasis mine)
From the Methods:
METHODS: A retrospective analysis examined VAERS reports between 14 December 2020 and 8 October 2021 and focused on AE reports related to COVID-19 vaccines and AE outcomes [e.g., emergency room (ER) visits after being vaccinated, hospitalization, prolongation of existing hospitalization, life-threatening events, disability, birth defect, and death]. Reporting odds ratios (RORs) and Breslow-Day statistics were used to compare AE reporting between COVID-19 and non-COVID vaccines and between individual COVID-19 vaccines.
From the Results:
RESULTS: A total of 604,157 AEs of COVID-19 vaccines were reported, including 43.51% for the Pfizer-BioNTech vaccine, 47.13% for the Moderna vaccine, and 9.12% for the Janssen COVID-19 vaccine. About 12.56% of patients visited ER after being vaccinated, 5.96% reported hospitalization, and 1.52% reported life-threatening events. Among the number of death cases (n = 7,674; mean age = 73), 2,025 patients (26.39%) had hypertension and 1,237 (16.12%) patients had cancer.
If we look at the chart, we see that there were already 8074 death reports by Oct 2021 when the authors looked at the VAERS CSV file.
What explains the difference of 400 death reports, which constitutes nearly 5% of the total reports as of that date? I think most of it can be attributed just to the delay in reporting.
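For reference, here is the same arithmetic written out, using the two counts quoted above. The gap is a little under 5% of the chart total and a little over 5% of the paper's total.

```python
chart_count = 8074  # death reports on the chart as of Oct 2021
paper_count = 7674  # death reports stated in the paper

gap = chart_count - paper_count
print(gap)                                 # 400
print(f"{100 * gap / chart_count:.2f}%")   # 4.95% of the chart total
print(f"{100 * gap / paper_count:.2f}%")   # 5.21% of the paper's total
```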
Unfortunately, almost no VAERS paper fully explains the exact methods used. For example, in this specific paper, I cannot say for sure whether a death report is simply one where DIED='Y', whether it also includes reports where one of the symptoms is death, or whether it also includes reports where the write-up contains the word “death”. The methods do not elaborate on this, and I think all VAERS papers should.
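To show how much the definition matters, here is a rough sketch of the three possible definitions side by side. I do not know which one the paper used, so this is purely illustrative. It assumes each report is a dict that merges the DIED and SYMPTOM_TEXT columns from the VAERS data file with the SYMPTOM1 to SYMPTOM5 columns from the symptoms file. The function names, the simple substring match on "death", and the sample records are my own assumptions.

```python
SYMPTOM_COLUMNS = ["SYMPTOM1", "SYMPTOM2", "SYMPTOM3", "SYMPTOM4", "SYMPTOM5"]


def died_flag(report):
    """Definition 1: the DIED column is 'Y'."""
    return report.get("DIED") == "Y"


def death_symptom(report):
    """Definition 2: one of the coded symptoms mentions death."""
    return any(
        "death" in (report.get(col) or "").lower() for col in SYMPTOM_COLUMNS
    )


def death_in_text(report):
    """Definition 3: the free-text write-up contains the word 'death'."""
    return "death" in (report.get("SYMPTOM_TEXT") or "").lower()


# Made-up records, for illustration only
sample = [
    {"DIED": "Y", "SYMPTOM1": "Death", "SYMPTOM_TEXT": "Patient passed away."},
    {"DIED": "", "SYMPTOM1": "Sudden death", "SYMPTOM_TEXT": "Found unresponsive at home."},
    {"DIED": "", "SYMPTOM1": "Headache", "SYMPTOM_TEXT": "Family reported the death of the patient."},
    {"DIED": "", "SYMPTOM1": "Fatigue", "SYMPTOM_TEXT": "Recovered."},
]

by_flag = sum(died_flag(rep) for rep in sample)
by_symptom = sum(death_symptom(rep) for rep in sample)
by_text = sum(death_in_text(rep) for rep in sample)
by_any = sum(
    died_flag(rep) or death_symptom(rep) or death_in_text(rep) for rep in sample
)

print(by_flag, by_symptom, by_text, by_any)  # 1 2 1 3
```

Even on four made-up records the counts differ depending on the definition, which is why the exact definition should be spelled out in the methods.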
Summary
Regardless of the specific percentage, which hit a low of just under 76% in March 2022, you can see that these delays can lead to significant undercounting of even death reports.