If VAERS is merely "hypothesis-generating", why do we even need IRB protocols?
The Case for Vaccine Data Science - Part 9
This is Part 9 of my Case for Vaccine Data Science series.
I mentioned this briefly in the previous article, but David Gorski is upset that the so-called “anti-vaxxers” are analyzing VAERS “even without a protocol approved by an institutional review board”:
The complete VAERS dataset (scrubbed of personally identifiable information, such as names) can be downloaded and analyzed by anyone, even without a protocol approved by an institutional review board.
Elsewhere in the same article, he refers to VAERS as “hypothesis-generating, not hypothesis testing”:
VAERS is, of course, what is known as a passive reporting system in that it relies on doctors, nurses, healthcare workers, and, yes, people receiving vaccines (or their families) to report AEs after vaccines. As a system, it was never intended to provide an accurate estimate of the frequency of AEs related to vaccines, but rather to serve as an early warning system, a “canary in the coal mine”, if you will, for possible new vaccine-related AEs. In other words, VAERS is a hypothesis-generating, not a hypothesis testing, system, and its hypotheses are tested using better systems, like VSD, CISA, and PRISM.
I have already discussed how the closed nature of VSD, CISA and PRISM prevents the type of open critique that VAERS allows.
Update: a reader asked me about the schema used by PRISM. I am afraid that is basically my whole point! Unlike VAERS, you cannot really get access to these databases without jumping through many hoops. And I found something even more interesting (and hilarious).
Here is how David Gorski describes the alternative active surveillance systems:
Notice how antivaxxers always cite VAERS and only rarely, if ever, cite other, much better and more reliable, vaccine safety monitoring databases, such as the Vaccine Safety Datalink (VSD), the Clinical Immunization Safety Assessment (CISA) project, or FDA’s Post-licensure Rapid Immunization Safety Monitoring System (PRISM)
If you click through each of those links, one of them does not even point to an actively maintained website. It points to the Internet Archive instead!
Guess which one?
I suppose nothing screams “data transparency” more than linking to an active surveillance system which is not even accessible on the internet anymore!
But even if you simply restrict VAERS to “hypothesis generating”, why do we still need a protocol submitted to an Institutional Review Board?
I am not claiming that David Gorski intentionally wants to add friction into the process, but that is the outcome no matter what his intention is.
“The Speed of Science”
You might remember the comical admission by Pfizer representative Janine Small that the vaccine was not tested for transmission because Pfizer was moving at the “speed of science”, so it was considered OK.
By that same token, why shouldn’t we do “hypothesis generating” for VAERS at the same “speed of science”? Why should we restrict ourselves to sending protocols to an institutional review board? (All three words in that phrase individually scream “bureaucracy”. Imagine smushing them together into a single phrase and expecting good outcomes!)
Deleted VAERS reports
In Nov 2022, I did some VAERS analysis and generated a hypothesis: the deleted VAERS reports seemed to “become less complete, less serious or less conclusive”.
As far as hypotheses go, this was actually a pretty good one, mainly because it was partly incorrect (which is what you want in a hypothesis - it should at least lead to something useful even if the hypothesis itself is wrong).
Reader Devon Brewer asked if we were just spinning our wheels and whether the CDC merely kept the first report and discarded followups. I was vaguely aware of this, but now I had a dataset of duplicates (i.e. a mapping between deleted and retained reports, which the CDC does not publish).
So I tested this improvement to the hypothesis and it seemed to explain all the deletions.
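A check of this kind can be sketched roughly as follows. This is a minimal illustration and not the actual analysis code: it assumes you have kept two snapshots of the VAERS data file taken at different times, and that you have built your own mapping from each deleted report to the report you believe it duplicates. The names `load_ids`, `explained_fraction` and `DUPLICATE_MAP` are hypothetical, and the IDs are made up.

```python
import csv
import io


def load_ids(csv_text):
    """Return the set of VAERS_ID values found in one snapshot of the data file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["VAERS_ID"] for row in reader}


def explained_fraction(old_ids, new_ids, duplicate_map):
    """Return the deleted IDs and the share of them explained as duplicates.

    duplicate_map is a HYPOTHETICAL dict {deleted_id: retained_id} built by
    the analyst; the CDC does not publish such a mapping.
    """
    deleted = old_ids - new_ids  # present in the older snapshot only
    if not deleted:
        return deleted, 1.0
    # A deletion counts as explained if its mapped report still exists.
    explained = {d for d in deleted if duplicate_map.get(d) in new_ids}
    return deleted, len(explained) / len(deleted)


# Toy snapshots with made-up IDs (not real VAERS data).
OLD_SNAPSHOT = "VAERS_ID,DIED\n1001,\n1002,Y\n1003,\n1004,\n"
NEW_SNAPSHOT = "VAERS_ID,DIED\n1001,\n1003,\n"
DUPLICATE_MAP = {"1002": "1001", "1004": "9999"}  # 9999 is absent from the new snapshot

old_ids = load_ids(OLD_SNAPSHOT)
new_ids = load_ids(NEW_SNAPSHOT)
deleted_ids, fraction = explained_fraction(old_ids, new_ids, DUPLICATE_MAP)
print(sorted(deleted_ids), fraction)  # ['1002', '1004'] 0.5
```

In this toy data only half of the deletions are explained. A fraction close to 1.0 on real snapshots would be consistent with the idea that deletions are mostly duplicate handling.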
But even though this shows that there isn’t anything overtly nefarious about the deleted VAERS reports, the hypothesis did lead us down some interesting rabbit holes.
For example, the deleted VAERS reports did not “become less complete, less serious or less conclusive” (which makes it sound like the followup report was filed before the original report), but rather the followup report was often “more complete, more serious or more conclusive” than the original. This is of course vital information if you are actually serious about analyzing VAERS.
For example, this meant that the followup process sometimes missed even death reports.
It also meant that nearly every research paper published on the topic of VAERS understated vaccine dangers.
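To make the comparison between an original report and its followup concrete, here is a minimal sketch under stated assumptions. It treats each report as a dict keyed by VAERS data file column names (`DIED`, `L_THREAT`, `HOSPITAL`, `DISABLE`, `SYMPTOM_TEXT`), uses the number of outcome flags set as a crude stand-in for “more serious”, and uses narrative length as a crude stand-in for “more complete”. The helper names and the sample pair are hypothetical.

```python
# Outcome flags that mark a report as serious when set to "Y".
SERIOUS_FLAGS = ("DIED", "L_THREAT", "HOSPITAL", "DISABLE")


def seriousness(report):
    """Count how many outcome flags are set to 'Y' in one report (a dict)."""
    return sum(1 for flag in SERIOUS_FLAGS if report.get(flag) == "Y")


def compare_pair(original, followup):
    """Summarise how a followup differs from the original report it duplicates."""
    return {
        "more_serious": seriousness(followup) > seriousness(original),
        "longer_narrative": len(followup.get("SYMPTOM_TEXT", ""))
        > len(original.get("SYMPTOM_TEXT", "")),
        # The case that matters most: a death recorded only in the followup.
        "death_only_in_followup": followup.get("DIED") == "Y"
        and original.get("DIED") != "Y",
    }


# Made-up pair for illustration (not real VAERS data).
original = {"VAERS_ID": "1001", "DIED": "", "HOSPITAL": "Y",
            "SYMPTOM_TEXT": "Chest pain."}
followup = {"VAERS_ID": "1002", "DIED": "Y", "HOSPITAL": "Y",
            "SYMPTOM_TEXT": "Chest pain. Patient later died."}

result = compare_pair(original, followup)
print(result)
```

Running `compare_pair` over every pair in a duplicate mapping and counting the `death_only_in_followup` cases would show how often a death is recorded only in the report that was removed.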
The Pros and Cons of Vaccine Data Science
The analysis of deleted reports is a very good example of both the pros and the cons of doing data science using the VAERS dataset.
I recently wrote this:
I will come back to this in a later article in this series, as one of the major advantages of data science techniques is the ability to “look at” and answer these types of questions. But this approach does have a downside which needs to be discussed in detail, and I will do that in that later article.
As you can see with the deleted VAERS reports, using Data Science principles generates hypotheses, but they are not full-fledged protocols either. If anything, we need to do a lot more of it precisely for the reason that David Gorski states.
And it is also the opposite of submitting a protocol to an Institutional Review Board and waiting for approval. And don’t forget that for the non-VAERS systems you don’t even have access to the raw data.
So how exactly could someone have gotten the same results that I did, if they went through an IRB approval process based on a dataset they could not even access? Personally, I don’t think that is even possible.
Summary
The advantage of doing Data Science over the VAERS dataset is the speed of analysis. Since you are usually generating and partially confirming hypotheses (and discarding ones which don’t make sense), you are not constrained by the need to follow a protocol.
The limitation of doing Data Science over the VAERS dataset is that it does not provide a complete analysis: what you get is a hypothesis, not a full-fledged protocol.
I've spent some time searching for the actual PRISM database on line and, though I've found descriptions of the database, I've not found the database itself and, of course, no query mechanism. Where is the database and what are the requirements to gain access? What is the schema? What is the current methodology for inclusion? (I've found some old papers, but nothing current.) Interested analysts would like to know.