Jikkyleaks vs Jeffrey Morris opens a real Pandora's box
It makes a strong case for a full audit of NIH's RECOVER consortium data
Summary:
The Twitter exchange between Jikkyleaks and Professor Jeffrey Morris about the NIH RECOVER consortium data has opened up a Pandora’s box
RECOVER consortium data relies heavily on EPIC EHR systems, which hold medical records for 78% of US hospital patients as of 2022
But EPIC EHR systems lacked a checkbox option to select or change unvaccinated status
Looking at the data, we can see approximately 44% of patients marked as "UNKNOWN" vaccination status were actually vaccinated
How this misclassification impacts Morris's paper findings:
Original claim: Unvaccinated people were 95% more likely to get COVID-19 and long COVID
After recalculating with the corrected classification:
COVID-19 infections: The results reverse, showing more vaccinated cases
Long COVID: The numbers become equal between vaccinated and unvaccinated groups
Based on these findings, we can see the clear need for a full audit of the NIH RECOVER dataset and all papers that relied on this data
My examination reveals this issue could affect multiple research papers using the RECOVER consortium data, potentially reversing their conclusions
While this may not constitute fraud, the evidence points to a serious data classification problem that demands investigation at the very least
There was a fascinating thread between Jikkyleaks and Prof Jeffrey Morris on Twitter recently
After patiently answering a few questions,[1] Prof Morris somehow concluded with this:
So he says - “There is absolutely no justification for alleging fraud in this well known and used consortium data set.”
Jeffrey has slightly moved the goalposts, from whether an audit is required (which can be justified for many different reasons) to whether actual fraud is being alleged (which obviously needs a much higher bar in terms of evidence). But in his defense, the conversation also moved in that direction.
So I will connect some dots here (with some help from an LLM[2]) to show quite clearly that there is a need for a full audit[3] of the NIH RECOVER consortium data.
1: RECOVER data relied heavily on EPIC EHR
This was mentioned in one of the NIH RECOVER online meetings (emphasis mine):
So I do have one question, if I may ask, it seems like you are [00:15:00] going to be using the electronic health records. I know that Cerner and Epic are the major two electronic health systems that are utilized. If there are providers that aren't on those systems, whether there'll be alternative ways that they can participate if they don't have access to those systems.
2: 78% of US hospital patients have their data in EPIC EHR
This is from Wikipedia, and the figure is in fact touted by EPIC itself to promote the company.
Epic Systems Corporation (commonly known as Epic) is an American privately held healthcare software company based in Verona, Wisconsin. According to the company, hospitals that use its software held medical records of 78% of patients in the United States and over 3% of patients worldwide in 2022.
Based on 1 and 2, it is possible the entire dataset used by Jeffrey Morris came just from EPIC EHR.
This is of course the worst-case analysis. But since there seems to be no easy way to find out what percentage of the data used in Jeffrey Morris’s paper came from EPIC EHR, we will assume all of it did unless we get an update from Jeffrey with the actual number.
3: EPIC EHR did not have a checkbox to select unvaccinated status
This is from Chief Nerd, who posts many data-driven threads on Twitter:
And even more worryingly, there is no option to change the status to unvaccinated.
Read the full thread on Twitter because it has a lot more information on this topic.
4: Nearly half of those with UNKNOWN status were in fact vaccinated
This is the real kicker, and it is from the same thread:
How does this affect the Jeffrey Morris paper?
The paper, which is based on the RECOVER network as Jeffrey mentions at the top of his thread, makes the following claim: people who were unvaccinated were 95% more likely to get a COVID19 infection, and about 95% more likely to eventually get long COVID.
I asked Google’s NotebookLM (which is pretty good at this task) to calculate the underlying numbers, which were not mentioned in the paper.[4]
COVID19 infection
This is for the number of people who got a COVID19 infection
During Delta, about 300 people who were vaccinated got COVID19, while about 2100 people who were unvaccinated got COVID19 (2112 is the figure used below). Suppose 44% of the unvaccinated group was actually vaccinated.
0.44 x 2112 = ~930
This means we will have 300 + 930 = 1230 vaccinated people who got COVID19
And 2112 - 930 = 1182 unvaccinated people who got COVID19
In other words, this has reversed the results!
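The arithmetic above can be sketched in a few lines of Python. This is only a restatement of the count reallocation in the text: the function name `reallocate` and its parameter names are my own, and it works on raw case counts exactly as the calculation above does.

```python
def reallocate(vaccinated_cases, unvaccinated_cases, misclassified_fraction):
    """Move a fraction of the 'unvaccinated' cases into the vaccinated group."""
    moved = misclassified_fraction * unvaccinated_cases
    return vaccinated_cases + moved, unvaccinated_cases - moved


# Delta-period infection counts quoted above; 0.44 is the assumed
# share of the unvaccinated group that was actually vaccinated.
vaccinated, unvaccinated = reallocate(300, 2112, 0.44)
print(f"vaccinated: {vaccinated:.2f}, unvaccinated: {unvaccinated:.2f}")
print("vaccinated cases exceed unvaccinated:", vaccinated > unvaccinated)
```

The unrounded figures (1229.28 and 1182.72) differ from the 1230 and 1182 above by less than one case, because the text rounds 929.28 up to 930.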
Long COVID
Let us do a similar calculation for the people who got long COVID
Suppose 44% of the unvaccinated group was actually vaccinated.
0.44 x 141 = ~62
This means we have 17 + 62 = 79 vaccinated people who developed long COVID
And 141 - 62 = 79 unvaccinated people who developed long COVID
Which means there was practically no difference between the two cohorts!
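One way to see why the two long COVID counts come out almost identical is to compute the break-even fraction, i.e. the share of unvaccinated cases that would have to be moved for the two counts to match. With v vaccinated cases and u unvaccinated cases, setting v + f*u = u - f*u gives f = (u - v) / (2u). The sketch below is my own illustration (the function name `break_even_fraction` is hypothetical) and, like the calculation above, it uses raw case counts only.

```python
def break_even_fraction(vaccinated_cases, unvaccinated_cases):
    """Fraction f of unvaccinated cases to move so both groups have equal counts.

    Solves v + f*u == u - f*u for f.
    """
    return (unvaccinated_cases - vaccinated_cases) / (2 * unvaccinated_cases)


long_covid = break_even_fraction(17, 141)    # long COVID counts from this section
infection = break_even_fraction(300, 2112)   # infection counts from the previous section
print(f"long COVID break-even: {long_covid:.3f}")
print(f"infection break-even:  {infection:.3f}")
```

Both break-even points sit at or just below 0.44, which is why a 44% reallocation roughly equalizes the long COVID counts and tips the infection counts the other way.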
Other papers
And generally speaking you will notice that all this is just an effect of transferring 44% of the people in the unvaccinated cohort into the vaccinated cohort.
And even in papers which somehow came up with a scarcely believable 95%+ efficacy like the Jeffrey Morris paper, this reallocation almost entirely reverses the result.
For papers based on the RECOVER consortium data with less pronounced effects, the final results will probably be even worse and may even favor the unvaccinated cohort in some cases.
Full audit
We cannot just generalize and say “Let us move 44% of the unvaccinated people into the vaccinated cohort for every paper and see the results”
There are probably many other mitigating factors.[5]
However, this certainly calls for a proper audit of the NIH RECOVER dataset to see how many of the papers suffer from this problem of vaccinated people being wrongly classified as unvaccinated.
And whether or not Prof Morris thinks it is “justified”, no one is going to trust the results from his work until this happens.
The vaccine pushers might be tempted to ask “So are we now supposed to audit every single paper which relied on this dataset?”
Not really.
You only need to audit the papers that you don’t want to be automatically discounted!
[1] And I genuinely commend Prof Morris for this, because some of us were definitely behaving like proper trolls :-)
[2] You might have noticed a theme on my Substack recently - LLMs are going to soon make it very hard for these Pharma companies to obfuscate data.
[3] Personally I don’t think this qualifies as actual fraud. But it could be, I am just saying I don’t know yet.
[4] They just left it as percentages, but that just seems to be the norm these days.
[5] I don’t know what they are, but we can usually count on the vaccine pushers to find a million confounders if a result ever favors the unvaccinated! :-)
Why do you assume that the vaccinated people who were misclassified as unvaccinated would've had a much higher incidence of PASC than the vaccinated people who were properly classified as vaccinated? What different characteristics did the mislabeled vaccinated people have from properly labeled vaccinated people that would explain such a drastic difference?
If 44% of unvaccinated person-weeks were contributed by mislabeled vaccinated people who had the same rate of PASC as properly labeled vaccinated people, the mislabeled vaccinated people would've had a total of only about 1.9 cases of PASC (from .44*397762/10000*.11). So properly labeled unvaccinated people would've still had about 139 cases of PASC (from 397762/10000*3.54-.44*397762/10000*.11).
Another problem with your calculation is that even if 44% of people with unknown status would've been vaccinated at some point, it doesn't mean that 44% of person-weeks under the unvaccinated/unknown label would've been contributed by vaccinated people, since people with known vaccination status would've contributed unvaccinated person-weeks before they got vaccinated (if they got their first dose after the observation period had already started).
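A minimal sketch reproducing the commenter's first calculation. The variable names are illustrative, and the reading of the figures is inferred from the comment's own formula: 397762 is taken as the person-weeks under the unvaccinated label, and 3.54 and 0.11 as PASC rates per 10,000 person-weeks for the unvaccinated-labeled and vaccinated groups respectively. None of these figures has been checked against the paper here.

```python
# Figures as they appear in the comment (interpretation is an assumption).
PERSON_WEEKS_UNVAX_LABEL = 397_762   # person-weeks labeled unvaccinated
RATE_UNVAX_LABEL = 3.54              # PASC cases per 10,000 person-weeks
RATE_VAX = 0.11                      # PASC cases per 10,000 person-weeks
MISCLASSIFIED_FRACTION = 0.44        # share of person-weeks assumed mislabeled

units = PERSON_WEEKS_UNVAX_LABEL / 10_000

# All PASC cases recorded under the unvaccinated label.
total_cases = units * RATE_UNVAX_LABEL

# Cases expected from mislabeled vaccinated people, if they had the
# same PASC rate as correctly labeled vaccinated people.
mislabeled_cases = MISCLASSIFIED_FRACTION * units * RATE_VAX

# Cases left over for people who really were unvaccinated.
remaining_cases = total_cases - mislabeled_cases

print(f"mislabeled vaccinated cases: {mislabeled_cases:.1f}")
print(f"remaining unvaccinated cases: {remaining_cases:.0f}")
```

This matches the comment's figures of about 1.9 and about 139 cases.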
FFS everything is a shitshow