Analysis of v-safe deaths free text by batch

Was the release of the free text information driven by political considerations?

Mar 01, 2025

CDC promised to release 7.7 million v-safe free text entries by mid-Jan 2025 but deliberately withheld RESPONSE_IDs needed to determine dates.
In fact, they initially released only MedDRA codes for the free text entries, until ICAN sued them to publish the contents of the free text.
I analyzed the release schedule for the free text for v-safe deaths and noticed suspicious timing: CDC released batches 1-6 by Oct 10, 2024; batches 7-8 by Nov 4 (just before the election); and batches 9-11 by Jan 26, 2025.
CDC published less than 3% of reports blaming vaccines for the death[1] before the election (Batch 6), while releasing about 20% of "Unknown" reports in the same timeframe.
Is the CDC actively stonewalling transparency efforts regarding COVID-19 vaccine safety monitoring due to political motivations?

The CDC promised to release all 7.7 million v-safe free text entries by mid-Jan 2025 and to follow this schedule.

There are many different ways they could have complied with this order, and in my opinion they chose to be as unhelpful as possible. In fact, it is even possible that the way they released free text information for v-safe death reports was driven by political considerations.

Background

Each v-safe checkin could have an associated free text entry if the user wished to provide some information that wasn’t captured by the check-the-box fields.

Each checkin has a corresponding RESPONSE_ID in the checkins.csv file.

Since a given registrant will do multiple checkins, there are often multiple responses which contain free text information.

If the CDC provided the RESPONSE_ID corresponding to the free text entries (which they didn’t publish), v-safe analysts could have automatically inferred the date on which the user wrote a specific free text entry.

The CDC read this free text information, converted it into MedDRA codes, and released only the MedDRA codes at first. Then ICAN sued them to release the free text entries.

Since the CDC still hasn’t published the RESPONSE_IDs to date, we need to manually read the free text and match it with the symptoms to see if the CDC coded the free text into the most suitable MedDRA codes.

Why didn’t the CDC release the RESPONSE_ID information? Why was this needless friction added into the process?

In any case, despite all this CDC stonewalling, Closed VAERS recently provided multiple examples where the CDC played down the danger of the vaccines by using ambiguous MedDRA codes.

CDC is stonewalling efforts to introduce transparency

A few weeks after the release of the first batches of v-safe free text information, I noticed that there was an effort to withhold some of the free text information associated with a given registrant. When you combine this with the fact that CDC did not provide the RESPONSE_ID for free text entries (it can be used to calculate the date of the checkin), it was already pretty clear that the CDC was trying to comply with the letter of the law but not its spirit.

In my opinion, it should not even be debatable that the CDC was ignoring the spirit of the law. There were over 1700 registrants in v-safe who died. If the CDC did not want to be accused of third-rate pharmacovigilance, they should have provided all the free text for these 1700+ registrants in the very first batch released in Feb 2024.

The CDC has now released 11 of the 12 installments they promised, and I think we can already notice some patterns in how they decided what information to release for the death reports.

Wayback Machine Archive

Let us look at the ICAN page which has all the v-safe free text information.

We will suppose that ICAN published the free text without any delays[2].

Notice that the CDC released the first 6 batches more or less on time, then published batches 7 and 8 on 4th November 2024, and then released batches 9, 10 and 11 at the end of January 2025.

Here is the release schedule according to my analysis:

First 6 batches: released by 10th Oct 2024

Batch 7 and 8: around 4 Nov 2024

Batch 9, 10 and 11: around 26 Jan 2025

This is what the Wayback Machine looks like for that page:

Obviously, 4th November was the last possible date the CDC could have published any updates without being accused of doing it just because of the election results.

Not all v-safe deaths have all their free text yet

I constructed a dataset to demonstrate this topic.

As you can see, there are a total of 1731 reports with PT_NAME = ‘Death’.

I added a column called COMPLETION_BATCH_NUMBER, which is the batch number at which all the free text for a given registrant had been published.

For example, suppose a registrant has two RESPONSE_IDs in the SRO MedDRA file and two RESPONSE_IDs in the SD MedDRA file.

Suppose the free text for 1 SRO and 1 SD are released in Batch 5, and the free text for the other SRO and SD are released in Batch 7. Then the completion number is 7, because when Batch 7 was published, we had access to all the free text entries for that registrant.

Note: even though we don’t have a mapping between a given free text entry and the MedDRA values (because CDC did not provide us with the RESPONSE_IDs for free text entries), we are able to verify whether or not all the information was published just by counting the total number of entries.

If we don’t have all entries yet, the COMPLETION_BATCH_NUMBER is -1.
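The counting rule above can be sketched as follows. This is a minimal illustration, not the exact code behind the dataset: the function name completion_batch and the shape of its inputs (an expected entry count per registrant, and a mapping from batch number to the number of entries released in that batch) are assumptions made for the example.

```python
def completion_batch(expected_count, released_per_batch):
    """Return the batch at which all free text for one registrant was available.

    expected_count: total number of free text entries expected for the
        registrant (e.g. RESPONSE_IDs in the SRO file plus those in the SD file).
    released_per_batch: dict mapping batch number -> number of that
        registrant's entries released in that batch (hypothetical input shape).
    Returns -1 if the released entries do not yet add up to expected_count.
    """
    cumulative = 0
    for batch in sorted(released_per_batch):
        cumulative += released_per_batch[batch]
        if cumulative >= expected_count:
            return batch
    return -1


# The example from the text: 2 SRO + 2 SD entries, half released in
# Batch 5 and the other half in Batch 7.
example = completion_batch(4, {5: 2, 7: 2})
incomplete = completion_batch(4, {5: 2})
```

Here example is 7 and incomplete is -1, matching the description above. Only counts are compared, because no mapping between a free text entry and its RESPONSE_ID is available.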

We see that there are still 316 reports where the full text has not been published (they should be published in the 12th and last installment).

How did the CDC decide the order?

So how did the CDC decide the order in which to publish the v-safe reports for Death?

Obviously, they couldn’t have pushed every death report out to Batch 12 since that would have been a bit too obvious.

But they also needed to stay true to their core mission of being as unhelpful as possible to v-safe analysts.

This section is a bit speculative and might need a bit more work to verify[3]. But there is certainly a pattern I have noticed.

The checkins CSV file has a field called VACCINE_CAUSED_HEALTH_ISSUES. People can check the box with Yes or No or also leave it blank.

I calculated a field called BLAMED_VACCINE for each registrant using the following rules:

default value is ‘Unknown’ (this happens if they never fill out the field)
if they ever fill it as ‘Yes’ then the value is ‘Yes’
if they ever fill it as ‘No’ then the value is ‘No’
if they change a ‘Yes’ to a ‘No’ then the value becomes ‘Maybe’

If they change it from ‘No’ to ‘Yes’ then BLAMED_VACCINE is ‘Yes’ because they have some new data which makes them suspect the vaccine.
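One way to express these rules in code is sketched below. It assumes the answers for a registrant are supplied in chronological checkin order, with blanks represented as None or an empty string. The function name blamed_vaccine is hypothetical, and two cases the rules do not spell out are resolved by assumption: a ‘Yes’ after a ‘Maybe’ becomes ‘Yes’ again, and a further ‘No’ after a ‘Maybe’ stays ‘Maybe’.

```python
def blamed_vaccine(answers):
    """Derive BLAMED_VACCINE from one registrant's answers, oldest first.

    answers: iterable of 'Yes', 'No', None or '' taken from the
        VACCINE_CAUSED_HEALTH_ISSUES field of successive checkins.
    """
    value = "Unknown"  # default when the field is never filled out
    for answer in answers:
        if answer == "Yes":
            # A 'Yes' always wins, including after an earlier 'No'.
            value = "Yes"
        elif answer == "No":
            # A 'No' after a 'Yes' is treated as a retraction: 'Maybe'.
            # Assumption: once 'Maybe', a further 'No' keeps it 'Maybe'.
            value = "Maybe" if value in ("Yes", "Maybe") else "No"
        # Blank answers leave the current value unchanged.
    return value


never_filled = blamed_vaccine([None, ""])
retracted = blamed_vaccine(["Yes", None, "No"])
changed_mind = blamed_vaccine(["No", "Yes"])
```

With these inputs, never_filled is ‘Unknown’, retracted is ‘Maybe’ and changed_mind is ‘Yes’.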

Here is an example of a ‘Maybe’:

There are a total of 391 people who blamed the vaccine for the death.

Of these, only 9 were fully published by Batch 6 (well before the election), which is less than 3% of the reports.

The number jumps to 35, still less than 10% of the ‘Yes’ reports, for Batch 8 which was released on 4th Nov 2024.

In contrast, 511 people marked it as ‘Unknown’ (meaning they never filled out the VACCINE_CAUSED_HEALTH_ISSUES field). Of these, 100 were published by Batch 6, which is already about 20% of the reports.

And 168 were published by Batch 8, a much higher percentage of nearly 33%.
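The percentages quoted above can be checked with a few lines of arithmetic. The counts are the ones given in the text; the dictionary layout and the helper name share_published are only for illustration.

```python
# Counts of death reports fully published by each cutoff, from the text above.
counts = {
    "Yes": {"total": 391, "by_batch_6": 9, "by_batch_8": 35},
    "Unknown": {"total": 511, "by_batch_6": 100, "by_batch_8": 168},
}


def share_published(group, cutoff):
    """Percentage of a group's reports fully published by the given cutoff."""
    row = counts[group]
    return round(100 * row[cutoff] / row["total"], 1)


for group in counts:
    for cutoff in ("by_batch_6", "by_batch_8"):
        print(group, cutoff, share_published(group, cutoff))
```

This gives 2.3% and 9.0% for the ‘Yes’ group, against 19.6% and 32.9% for the ‘Unknown’ group.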

If there is a good explanation for this, I would love to hear it from the CDC.

[1] According to the relative of the v-safe registrant. We can infer this information using the VACCINE_CAUSED_HEALTH_ISSUES field in the checkins CSV file.

[2] And if I am wrong, and ICAN was actually causing the delay, then the vaccine skeptics need to be asking questions about ICAN.

[3] If the pattern repeats for other severe adverse events, then my speculation is quite likely to be correct.

Vaccine Data Science
