Should CDC keep the original report or the followup report?
This is a false dilemma, but let us explore the question of which one is usually "better".
I got this comment on my previous article:
Another question is why VAERS staff have to delete follow-up or duplicate reports for a given case if they are applying the supposed rule of including only the first report for a case.
The problem with leaving them in the CSV files is clear: it would inflate the number of injuries in the public database beyond what actually occurred, and it would confound all the aggregate calculations done by outside researchers. In the worst case, it could become really bad if people intentionally sent in extra follow-up reports just to make VAERS look worse.
So I think the CDC is doing the right thing by deleting follow-up reports.
In my view, the CDC should move all the follow-up reports to another set of files called DELETEDDATA, DELETEDSYMPTOMS and DELETEDVAX, and also add a mapping between the deleted VAERS_ID and the retained VAERS_ID. And they need to be transparent about this and publish these CSV files on the same page as the original CSV files.
People who want to include followup reports in their analysis can do a lookup inside these files and “merge” the information however they want. People who don’t want to do it can just stick to the original reports.
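As a rough sketch of what that merging could look like, here is one possible approach, assuming the proposed DELETEDDATA-style files exist and assuming a hypothetical mapping file (I call it ID_MAPPING.csv here, with columns DELETED_VAERS_ID and RETAINED_VAERS_ID; the names and the merge rule are illustrative, not anything the CDC actually publishes):

```python
import csv

def load_rows(path, key="VAERS_ID"):
    """Read a VAERS-style CSV into a dict keyed by VAERS_ID."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[key]: row for row in csv.DictReader(f)}

def merge_reports(original, followup):
    """One possible custom merge rule: keep the original's value
    for each field unless it is blank and the follow-up has one."""
    merged = dict(original)
    for field, value in followup.items():
        if not merged.get(field) and value:
            merged[field] = value
    return merged

def merged_dataset(data_path, deleted_path, mapping_path):
    """Re-attach deleted follow-up reports to their retained originals
    using the hypothetical DELETED_VAERS_ID -> RETAINED_VAERS_ID mapping."""
    originals = load_rows(data_path)
    deleted = load_rows(deleted_path)
    with open(mapping_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            kept = originals.get(row["RETAINED_VAERS_ID"])
            gone = deleted.get(row["DELETED_VAERS_ID"])
            if kept and gone:
                originals[row["RETAINED_VAERS_ID"]] = merge_reports(kept, gone)
    return originals
```

Researchers who prefer the original reports alone can simply skip the merge step; the rule inside merge_reports is exactly the part each researcher would customize.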
Which report is better?
But the comment also brought up a much trickier problem: there is no way to know whether a follow-up report is better than the original, so there is no particular reason to choose one over the other.
Follow-up reports aren't always more complete than the original reports. For instance, in your last example, the original report includes several symptoms that are not in the follow-up report, and at least some of these symptoms appear in the write-up in both reports. There is also information in the original report on other medications, current illness, pre-existing conditions, and allergies that doesn't appear in the follow-up report.
Now, the first thought that comes to mind is that this is a false dilemma.
The reports actually need to be merged, and if there is conflicting information, some expert should be asked to resolve it based on all the information in both reports.
And if the CDC is merely transparent about the deletions and publishes a mapping, I don’t think this is much of a concern anyway.
But for the sake of discussion, let us suppose we need to decide whether follow-up reports are better than original reports, and work out how we could decide this.
Adding a ✅ flag for original reports that are better
Usually I use the ⚠️ flag to indicate that the deleted follow-up report had some important information which was omitted from the original.
So I simply added a ✅ flag to indicate the opposite: the original report had some important information which was omitted from the follow-up.
Since I used the exact same rules (with only the condition reversed), the comparison is consistent and symmetric.
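The symmetric rule can be sketched as a single function. This is a guess at the comparison: I use a simple "one side has content, the other is blank" check per field, whereas the article's actual notion of "more/better information" may be richer:

```python
def flag(original_value, followup_value):
    """Return '⚠️' if the follow-up has information the original lacks,
    '✅' for the reverse (same rule with the condition reversed), else ''."""
    o = (original_value or "").strip()
    f = (followup_value or "").strip()
    if f and not o:
        return "⚠️"   # follow-up filled in a field the original left blank
    if o and not f:
        return "✅"   # original had a field the follow-up left blank
    return ""          # both filled or both blank: no flag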
Here is what the comparison list looks like:
Let us now filter these reports to see which occurs more often.
Out of a total of 1801 rows, there are 388 rows with the ⚠️ flag (meaning the follow-up report had more/better information for that particular field than the original report).
And there are 196 rows with the ✅ flag (meaning the original report had more/better information for that particular field than the follow-up report).
In other words, you can say with a reasonable amount of certainty that the follow-up reports do contain more information on average.
But even with this analysis, it is quite clear that the original report also often contains important information which is not present in the followup.
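The arithmetic behind both observations is straightforward, using the counts reported above:

```python
# Counts reported in the analysis above.
total, warn, check = 1801, 388, 196

warn_share = warn / total    # rows where the follow-up was better
check_share = check / total  # rows where the original was better
ratio = warn / check         # follow-up-better rows per original-better row

print(f"follow-up better: {warn_share:.1%}")   # about 21.5% of rows
print(f"original better:  {check_share:.1%}")  # about 10.9% of rows
print(f"ratio: {ratio:.2f}")                   # roughly 2 to 1
```

So follow-up reports win about twice as often as originals, but the original wins in a non-trivial 1 in 9 rows, which is why neither report can be safely discarded.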
Conclusion
To conclude, I think this is a false dilemma, and the better approach is:
a) CDC publishes the deleted reports and includes a mapping file
b) outside researchers then merge the reports based on whatever custom algorithm they want so as to get the maximum possible information
Note: I do read the comments doubting the entire VAERS report submission pipeline. I don't have any opinion on that (because if you think the reports are completely doctored, what exactly is your proposed next step?).
For now, I am going to proceed as if this is simply a complex system where sometimes the left hand really does not know what the right hand is doing.
It looks like VAERS used to append follow-up data to the initial reports. You can actually see it if you study reports pre-2011. See item #4 in image: https://i.imgur.com/IZ4K9iU.jpg . I won't bother with much explanation, but maybe check out this relevant video: https://www.bitchute.com/video/YmI5hQeAjSfd/ Only publishing initial reports raises the question: how many people in VAERS have since died? How many of the ~13K "inappropriate aged" kids that are basically in the lowest event category (None of the Above) now have myocarditis? The public won't be made aware, but CDC knows, as they continue to collect follow-up data. Makes me wonder if the Harvard Pilgrim Study had anything to do with VAERS making this huge paradigm shift of not "appending" to initial reports since 2011? https://www.vaersaware.com/deleted-reports-2007-2022
Thank you for your follow-up analysis, Aravind, and for your history lesson, Albert (WelcomeTheEagle88).
I didn't argue for favoring the original report over the follow-up report, or vice versa. I wholeheartedly agree with you that the best policy is to have ALL of the data, even if they include inconsistencies between original and follow-up reports.
The results of your most recent analysis are consistent with my point that there is no consistent bias in the deletion of reports. Follow-up reports naturally will tend to have somewhat more information than original reports, as symptoms develop and clinicians perform more diagnostic procedures.
My question of how the duplicate reports enter the public version of the database remains. It is very easy to design a database and entry interface to prevent this. Are the VAERS contractors so incompetent that they haven't done so? Maybe they have an incentive to be sloppy, if they negotiate the contract with FDA/CDC based on amount of work. Creating easily preventable errors and then fixing them adds up to a lot of time. And CDC/FDA probably don't care -- they want something that meets just the letter of the law, and might like that VAERS looks shoddy.
A greater concern is deleted reports that aren't duplicates. Albert Benavides has investigated this, and while such reports represent a very small proportion of all reports, the practice is very troubling.