Peer-reviewed Lancet paper highlights the CDC's v-safe text mining failures
How did this paper even clear peer review?
Summary:
A CDC paper published in August 2022 is the only one (I know of) which published sample verbatim free-text responses from v-safe
The paper tries to filter free-text responses related to menstruation, and to further classify them along dimensions such as severity, using a pretrained machine-learning classifier
The sample code used in the paper shows that the v-safe free-text information is already in a format that can be easily analyzed, if the CDC wanted to do it
Before the v-safe app was released, a list of symptoms known from the vaccine clinical trials was omitted from its check-the-box options. This meant people who experienced these symptoms could only report them using the free-text option.
To date, the CDC has not done any text mining of v-safe free text for these omitted symptoms
The paper cleared peer review because
a) it was written by the CDC, and
b) the left hand of the CDC does not know what the right hand does
Here is the paper (published August 2022):
Let us call this Paper 2. (Paper 1 will be introduced later in the article)
So what is so unique about this paper?
As far as I am aware, it is the only peer reviewed paper which provides sample verbatim free-text responses from v-safe1.
They look like this:
So how does this paper highlight CDC’s text mining failures?
First of all, here is an excerpt from the paper:
Data sharing
Individual participant text responses in v-safe are not available to others because they might contain personally identifying information.
And this is another excerpt from the paper:
Implications of all the available evidence
Menstrual irregularities and vaginal bleeding after COVID-19 vaccination are being reported, although this study is unable to assess whether these events are caused by COVID-19 vaccination. Text analytic methods can speed up the identification of potential vaccine reactions that are not prespecified in structured data collection. Further studies are needed to better understand their prevalence and pathophysiology.
When you combine these two excerpts, a few things become clear:
1. Only the CDC has access to the v-safe free-text entries, and hence only the CDC can text mine them. If someone is to be blamed for the failure to do proper text mining, it is the CDC.
2. They acknowledge that “text analytic methods” can “speed up” the identification of potential vaccine reactions that are “not prespecified in structured data collection”. They are referring to unsolicited responses, which we already know the CDC has not analyzed to date2
3. The paper also shares some sample code, which tells me that all the free-text data has already been compiled into an easy-to-analyze CSV format (which I will discuss in a future article)
4. Given the clear admission from the CDC that it has not done text analytics on unsolicited symptoms, it is pretty clear that they are trying to pass off absence of evidence as evidence of absence (of vaccine danger).
If it wanted to be certain there was actually an absence of evidence in the nearly 6 million v-safe free-text responses it collected, the CDC should have done a much more thorough job of text mining (explained later in the article).
And if the CDC felt it did not have the necessary skillset in-house, it should have asked for help from people in the tech industry instead of declaring that the mRNA vaccines were completely safe.
What kind of skillsets should they have been looking for?
I would say an “AI Scientist” working at one of the tech giants should have been able to sort this out for the CDC in a matter of a few days.
But I am not sure if the tech giants would have cooperated3.
Publication Timeline
Now I will provide another heavily cited paper which looked at pregnancy outcomes, so you can get an idea of the timeline of how things evolved. This was published in June 2021.
Preliminary Findings of mRNA Covid-19 Vaccine Safety in Pregnant Persons
We will call this Paper 1. It has almost 1000 citations according to Google Scholar.
This paper (and the timeline) is important because the two papers share two authors: Tanya Myers and Christine Olson.
In other words, at least two co-authors of Paper 1 were on the team writing Paper 2, and they could have asked the rest of that team to read Paper 1.
On top of that, Paper 2 does not even cite Paper 1, even though
a) they are talking about a related topic, and
b) Paper 1 explicitly states that more follow-up using v-safe will be needed (emphasis mine). In other words, Paper 2 would have been a natural vehicle for that follow-up:
We were unable to evaluate adverse outcomes that might occur in association with exposures earlier in pregnancy, such as congenital anomalies, because no pregnant persons who were vaccinated early in pregnancy have had live births captured in the v-safe pregnancy registry to date; follow-up is ongoing.
Why I think CDC’s v-safe text mining was a failure
Is it too strong to suggest CDC’s text mining of v-safe was a failure?
I don’t think so. Here is why.
Let me first give you a summary of the text mining that Paper 2 tried to do.
In Paper 2, you see the following excerpt:
We examined reports to v-safe from Dec 14, 2020, to Jan 9, 2022, among adults aged 18 years and older. To identify text responses potentially related to menstrual irregularities or vaginal bleeding, we filtered all 6 168 832 free-text responses to select those containing the word fragments (“menses”, “menst”, “spotting”, “period”, “cycle”, “miscarr”, “menorrh”, or “metrorrh”) or (“bleed” or “blood” and “menop”, “uter”, “vag”, “breakthrough”, “break through”, “endomet”, “gest”, “term”, or “trimester”; appendix p 1).
In other words, they use string matching to flag a response if it contains any fragment from the first list (Pattern 1), or if it contains “bleed” or “blood” (Pattern 2) together with one of the context fragments such as “menop” or “uter” (Pattern 3).
See the figure below.
About 95,000 free-text responses, out of the roughly 6.17 million total, matched the pattern described.
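As a concrete illustration, here is a minimal sketch of that filter in Python with pandas. The file name responses.csv and the column name text are my assumptions (we only know the data is in CSV form); the fragment lists are copied from the excerpt above, and the paper's actual code may differ in details:

import pandas as pd

# Pattern 1: any of these fragments alone flags a response
pattern1 = ["menses", "menst", "spotting", "period", "cycle",
            "miscarr", "menorrh", "metrorrh"]
# Pattern 2 only flags a response when a Pattern 3 fragment also appears
pattern2 = ["bleed", "blood"]
pattern3 = ["menop", "uter", "vag", "breakthrough", "break through",
            "endomet", "gest", "term", "trimester"]

def matches(text: str) -> bool:
    t = text.lower()
    if any(frag in t for frag in pattern1):
        return True
    return (any(frag in t for frag in pattern2)
            and any(frag in t for frag in pattern3))

# Hypothetical file and column names: one free-text response per row
df = pd.read_csv("responses.csv")
filtered = df[df["text"].astype(str).map(matches)]
print(f"{len(filtered)} of {len(df)} responses matched")

Substring matching like this is deliberately crude (“period” would also match “periodically”, for example), which is presumably why a classifier step comes afterwards.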
Then they used a pretrained machine-learning model called BART, which can score these free-text responses against a set of hypothesis statements (seen at the bottom of the picture above):
We applied zero-shot text classification to these filtered responses with the Bidirectional and Auto-Regressive Transformers (known as BART)13 model for natural language inference, pretrained for sentence pair classification on the Multi-Genre Natural Language Inference14 corpus
The classifier was able to label ~86K out of the 95K responses from the search string filtering.
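For readers who want to see what this looks like in practice, here is a minimal sketch using the Hugging Face transformers pipeline with the facebook/bart-large-mnli checkpoint (BART pretrained on the Multi-Genre NLI corpus, matching the setup the excerpt describes). The example response is invented, and the paper's actual code may differ:

from transformers import pipeline

# BART fine-tuned on Multi-Genre NLI, used as a zero-shot classifier
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

# A small subset of the paper's hypothesis statements (full list shown later)
topics = ["This mentions a period that came early",
          "This mentions spotting",
          "This mentions a heavy menstrual period"]

# Invented example response, for illustration only
response = "My period came a week early and was much heavier than usual."

# multi_label=True scores each topic independently,
# since one response can match several topics
result = classifier(response, topics, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")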
Then they did a manual review of a subset of the classifier's output, to see how many responses it got right. You can see that the classifier had different levels of accuracy for different topics.
The accuracy, however, is not the focus or even concern of my critique.
Why wasn’t miscarriage included as one of the classifier topics?
We can see that the word fragment ‘miscarr’ was already part of the search strings.
Here is the list of topics, taken from the paper's code sample, that they wanted the BART classifier to assign:
topics = ['This mentions a period that came early',
'This mentions menstruation',
'This mentions a period that came late',
'This mentions spotting',
'This mentions a heavy menstrual period',
'This mentions vaginal bleeding',
'This mentions uterine bleeding',
'This mentions a painful menstrual period',
'This mentions prolonged bleeding specifically',
'This mentions an irregular period specifically',
'This specifically mentions missing or skipping a period',
'This mentions not having a period for years']
core_top = ['This mentions menstruation',
'This mentions spotting',
'This mentions vaginal bleeding',
'This mentions uterine bleeding',
'This specifically mentions missing or skipping a period']
timing_top = ['This mentions a period that came early',
'This mentions a period that came late',
'This mentions spotting',
'This mentions an irregular period specifically',
'This specifically mentions missing or skipping a period']
severe_top = ['This mentions a heavy menstrual period',
'This mentions a painful menstrual period',
'This mentions prolonged bleeding specifically']
yrs_top = ['This mentions not having a period for years']
As you can see, miscarriage is not one of the topics they wanted BART to classify.
Why was it omitted?
Were there ANY miscarriage reports in v-safe?
One possibility is that there were no miscarriage reports at all, so there was no need to analyze them.
But miscarriages had already been reported in Paper 1.
With so much of the groundwork already done in the code, it would not have taken much more effort to run topic classification for miscarriages (see the sketch below).
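For example, extending the paper's own topic lists (quoted above) would have been roughly a two-line change. The group name preg_top is my invention, in the style of core_top and severe_top:

# Hypothetical additions in the style of the paper's existing topic lists
topics.append('This mentions a miscarriage')
preg_top = ['This mentions a miscarriage']  # analogous to core_top, severe_top, etc.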
Or was it omitted because they did not want the public to see sample free-text responses about this topic?
But it actually gets much worse.
At least the CDC established a pregnancy-specific v-safe registry which they could rely on for their assessments.
As an aside, Arkmedic is clearly not impressed with the results from the v-safe pregnancy registry, but I don’t have enough of a background in biology or medicine to comment on it.
But the CDC did not publish any text mining for the other known serious adverse reactions for the mRNA vaccines.
Omitted solicited symptoms from the original v-safe
If you are new to this topic, you might not know that the original design of v-safe also solicited some additional, serious symptoms (here, ‘solicited’ means there was a check-the-box option for them).
This was not some random list; it was based on symptoms that the vaccine manufacturers identified during the clinical trials.
These symptoms were then omitted before v-safe was released, so that only the more benign symptoms were offered to patients as check-the-box options.
Here is how Aaron Siri describes it:
And what happened as a result of moving these solicited symptoms to the free-text boxes? This is how Aaron describes it:
Had the CDC included the check-the-box options for these adverse events, it could have clearly calculated a rate for each harm for the 10 million v-safe users. For example, if 400,000 reported myocarditis, then that would be around a 4% reported rate for this condition. The CDC, however, chose not to include these harms as check-the-box options. It instead relegated them to only potentially be captured in free-text fields!
So what is the next logical step if you omit these symptoms from the solicited list?
You would do the same kind of text mining analysis that they did in Paper 2, but based on the omitted symptoms. A sketch of what that could look like follows.
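Here is a minimal sketch reusing the fragment-matching approach from Paper 2. The fragment lists below are my own illustrative guesses (myocarditis is the only omitted symptom named in the excerpt above), not the actual list from the original v-safe design, and the file and column names are the same assumptions as in the earlier sketch:

import pandas as pd

# Illustrative fragments only; the real list should come from the solicited
# symptoms that were dropped from the original v-safe design
omitted_fragments = {
    "myocarditis/pericarditis": ["myocard", "pericard"],
    "seizure": ["seizure", "convulsion"],
    "anaphylaxis": ["anaphyla"],
}

def flag_omitted(text: str) -> list[str]:
    """Return the omitted-symptom buckets whose fragments appear in the text."""
    t = text.lower()
    return [name for name, frags in omitted_fragments.items()
            if any(frag in t for frag in frags)]

df = pd.read_csv("responses.csv")  # hypothetical file name, as before
df["flags"] = df["text"].astype(str).map(flag_omitted)
hits = df[df["flags"].map(bool)]
print(f"{len(hits)} responses mention an omitted symptom")

The flagged subset could then be run through the same kind of zero-shot classification shown earlier, exactly as Paper 2 did for the menstruation-related responses.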
And as far as I know, the CDC has not published any paper mining the v-safe free-text entries for any of these omitted symptoms4.
Why wasn’t this done?
Conclusion
How many times have you heard vaccine pushers point to v-safe as an example of how well the COVID-19 vaccines have been studied and analyzed?
Even allowing for the fact that only the CDC has access to the v-safe free text entries (a red flag all by itself), why has there been only ONE paper written on the topic of text mining v-safe free text entries in 2+ years?
If I am wrong, please provide the link to the peer-reviewed publication in the comments. Not a blog or Substack article, but an actual peer reviewed paper.
I am somewhat glad I found this paper. In my previous article about v-safe text mining, I made some assertions that would have been hard to prove without seeing at least a few sample verbatim free-text responses.
I suppose they could still have asked someone in a smaller organization. But this is the kind of quick-and-dirty work that is probably beyond the capability of a giant, slow-moving bureaucracy.
If I am wrong about this, please provide a link to the paper.