Discussion about this post

Jessica Rose

Great write-up. When I run my code to check for duplicates, I check for duplicate IDs and I also search for duplicate entries by matching a number of variable fields simultaneously. The search algorithm I use is actually quite strict: I check for dups based on exact field matches in the AGE_YRS, STATE, SEX, VAX_LOT, VAX_SITE, VAX_ROUTE, VAX_DATE, VAX_NAME, VAX_MANU, RECVDATE, CAGE_YR, CAGE_MO, DIED, HOSPITAL, ER_ED_VISIT, ALLERGIES, ONSET_DATE, PRIOR_VAX, VAX_DOSE_SERIES, L_THREAT, DISABLE, BIRTH_DEFECT variables.

I really think you might be onto something with the idea that the follow-up filings might be the ones removed because they match on multiple fields. Even if I got that wrong from your write-up, I think it may be right. The algorithm I made to suss out duplicates checks for both ID dups and simultaneous multi-field dups (the variables listed above), since the same person can remain in VAERS under two different permanent IDs. So theoretically, the VAERS data wranglers might do the same thing to remove duplicates, but perhaps their 'algorithm' for finding and removing duplicates based on simultaneous multi-field matches is always 'running' (for lack of a better way to express this), and maybe that's precisely why people's follow-ups don't get into the front end of VAERS.

We have to assume that the temporary VAERS ID entries would be cross-checked against permanent VAERS IDs: how else would they know whether 'someone' had previously filed a report? So if matches were found across many variables, maybe their 'algorithm' removes the follow-up prematurely? The only weird part about this idea is that, well, we would see none of it. But I agree, this would not be nefariousness; it would be stupid, and something that needs to be fixed. The shitty part is there's no way for us to check this theory, since we don't have access to the uncooked books with the temp IDs.
We need a study: people who have received temp IDs and then perm IDs try to submit a follow-up. Hundreds of them. Just to see what happens. Would any of their perm IDs get updated? Would they get a new perm ID? Jess
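
For readers who want to try the kind of check described in this comment, here is a minimal sketch of a two-part duplicate search: repeated IDs, plus groups of different IDs whose listed fields all match exactly. It is not the commenter's actual code. The function name `find_duplicates`, the ID column name `VAERS_ID`, and the sample rows are assumptions for illustration, and it assumes the fields have already been combined into one row per report.

```python
from collections import defaultdict

# Fields taken from the list in the comment above.
MATCH_FIELDS = [
    "AGE_YRS", "STATE", "SEX", "VAX_LOT", "VAX_SITE", "VAX_ROUTE",
    "VAX_DATE", "VAX_NAME", "VAX_MANU", "RECVDATE", "CAGE_YR", "CAGE_MO",
    "DIED", "HOSPITAL", "ER_ED_VISIT", "ALLERGIES", "ONSET_DATE",
    "PRIOR_VAX", "VAX_DOSE_SERIES", "L_THREAT", "DISABLE", "BIRTH_DEFECT",
]


def find_duplicates(rows, id_field="VAERS_ID", fields=MATCH_FIELDS):
    """Return (repeated_ids, field_match_groups) for a list of dict rows.

    repeated_ids: IDs that appear on more than one row.
    field_match_groups: lists of distinct IDs whose rows agree exactly
    on every field in `fields`.
    """
    id_counts = defaultdict(int)
    groups = defaultdict(set)
    for row in rows:
        id_counts[row[id_field]] += 1
        # Exact match on all fields at once; a missing field is treated as "".
        key = tuple(row.get(f, "") for f in fields)
        groups[key].add(row[id_field])
    repeated_ids = sorted(i for i, n in id_counts.items() if n > 1)
    field_match_groups = sorted(
        sorted(ids) for ids in groups.values() if len(ids) > 1
    )
    return repeated_ids, field_match_groups


# Hypothetical sample rows, made up purely to exercise the function.
base = {f: "" for f in MATCH_FIELDS}
base.update(AGE_YRS="42", STATE="CA", SEX="F", VAX_LOT="AB123")
rows = [
    {**base, "VAERS_ID": "1001"},
    {**base, "VAERS_ID": "1002"},                 # same fields, different ID
    {**base, "VAERS_ID": "1003", "STATE": "NY"},
    {**base, "VAERS_ID": "1003", "STATE": "NY"},  # same ID repeated
]
repeated_ids, field_match_groups = find_duplicates(rows)
print(repeated_ids)
print(field_match_groups)
```

One design choice to be aware of: empty fields match each other here, so sparsely filled reports are more likely to be flagged together. A stricter variant could skip rows where too many of the fields are blank.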

Gary Hawkins

I would agree with having the CDC publish an additional new file with a column explaining each deletion.

The discussion above must be about reports that were visible but then deleted.

But there seem to be 60K reports that have never been published. The figure was 230K at one time, so at least 170K of those were eventually released. That's the effect we all know in action: each week some reports are held back and show up later, while others are held back in turn.

See the never published chart:

https://deepdots.substack.com/p/new-vaers-flat-file-easy-data-mining
