Analysis of Cardiac MRI tests in VAERS

The LAB_DATA field should be studied more systematically

Aug 09, 2023

Summary

There are over 1400 reports in the foreign VAERS dataset where the LAB_DATA field mentions either “Cardiac MRI” or “Cardiac Magnetic Resonance Imaging”.
The LAB_DATA field is actually highly structured, and we can parse out information for individual tests quite easily using Python code.
I parsed ALL the tests (i.e. not just the cardiac MRI) which were done for these 1400 reports and constructed a dataset with one test per row.
In the best case, each row has 5 fields: test name, test date, test numerical value, test units and test comment. Many tests do not have a corresponding numerical value, but have a “test result” which is just provided as text.
Where available, I parse the test date and calculate DAYS_AFTER_VAX, the number of days between the vaccination and the test. Note: this field can even be negative sometimes. This is not an error; it just means the reporter included information about tests which were done before the most recent vaccine dose was administered.
I also constructed an “Aggregate Statistics” dataset for each test which was conducted: number of tests, maximum value, minimum value, median and standard deviation.
You can click on the individual test name in the “Aggregate Statistics” dataset and see individual test information. Inside those rows, you can then click on the MedAlerts URL to verify that the information is correct.
Click here to see the Aggregate Statistics dataset
There is a lot of scope to improve this study: normalizing the unit of measurement, deduping test names, quantitative text analysis of non-numerical test results, extending it to other health conditions caused by the vaccine, etc.
This makes it quite clear that the LAB_DATA field has not been analyzed enough. Instead of allowing independent researchers to pool all this vital information and identify patterns, the CDC chose to stop publishing the writeups and LAB_DATA information for EU reports.

Recently I read an article in the Epoch Times which said that one way to identify vaccine-induced heart issues is a test called a cardiac MRI.

Myocarditis Caused by COVID-19 Vaccination Often Evades Normal Tests
According to Dr. Milhoan, obtaining an accurate diagnosis of vaccine-associated myocarditis is challenging.
“The way the vaccine injury works, the heart often forms a scar that we don’t always pick up on our other usual tests. Normally if we study someone with suspected myocarditis, we will get labs that reveal damage to the myocardial cell, such as a troponin level, an EKG to see how the heart looks electrically, an echocardiogram, and a stress test,” he said. “But these are often normal in someone with myocarditis following COVID-19 vaccination.”
This is why the gold standard for detecting myocarditis following COVID-19 vaccination is cardiac magnetic resonance imaging, also known as a cardiac MRI, Dr. Milhoan said. A cardiac MRI is used for more complex heart conditions and shows a more detailed picture of what’s happening in the heart. It can detect damage to the heart muscle that goes undetected by other tests.

I wanted to see if I could search for these tests inside VAERS reports.

I first created a list of VAERS reports where the LAB_DATA field had the text “cardiac mri” or “cardiac magnetic resonance imaging”.

Since people seem to ask this question often - do note that all analysis of the foreign dataset is based on the last “good” version from Nov 2022, the most recent version before the CDC stopped publishing the writeups for EU

Unacceptable Jessica

The foreign data set was gutted this week in VAERS and the cancer signal was halved, the myocarditis dose 3 response signal was lost and 994 spontaneous abortions/still births were dropped

As most of you know, me and a bunch of other people are monitoring VAERS data very closely week-by-week. This week (11.18.22), the first thing I noticed was that the Foreign data set was less than a fraction of the size it was last week (11.11.22): down from 283.51 MB to 96.81 MB. There is a disclaimer under the VAERS data that states the following, so …

Jessica Rose

There were 1410 such reports.
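The selection step can be sketched roughly as follows. This is a minimal illustration, not the exact script used for this article: it assumes the VAERS data file has been read into a list of dicts with a `LAB_DATA` column, and the helper names, the file path argument and the `latin-1` encoding are all assumptions.

```python
import csv

SEARCH_TERMS = ("cardiac mri", "cardiac magnetic resonance imaging")


def mentions_cardiac_mri(lab_data) -> bool:
    """Case-insensitive check for either search term in a LAB_DATA value."""
    text = (lab_data or "").lower()
    return any(term in text for term in SEARCH_TERMS)


def select_reports(rows):
    """Keep only the rows whose LAB_DATA mentions a cardiac MRI."""
    return [row for row in rows if mentions_cardiac_mri(row.get("LAB_DATA", ""))]


def load_reports(csv_path: str, encoding: str = "latin-1"):
    """Read a VAERS data CSV into a list of dicts (path and encoding are assumptions)."""
    with open(csv_path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))
```

Calling `select_reports(load_reports(path))` on the foreign data file would then give the list of matching reports.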

A while back I wrote an article about how you can parse the LAB_DATA text using a state-machine-based approach.

How to parse the LAB DATA field in VAERS

Aravind Mohanoor

January 9, 2023

I recently realized that the LAB_DATA field is in fact more structured than I initially thought when I wrote that article.

As a result, I was able to construct a state machine to extract the relevant information from individual reports.

For those who are interested, this is what the state machine looks like:

def empty_result():
    # Placeholder for a test with no result text, numeric value or units.
    return {"result": "", "value": "", "units": ""}


def make_test(date, name, test_result, comments):
    # One output row per test.
    return {
        "date": date,
        "name": name,
        "result": test_result['result'],
        "value": test_result['value'],
        "units": test_result['units'],
        "comments": comments
    }


def parse_lab_data(lab_data: str):
    # parse_result_string (not shown here) turns a result string into a dict
    # with the keys "result", "value" and "units".
    # Step 1: put each "; <Field>:" marker on its own line.
    parsed_lab_data = lab_data.replace('\n', '')
    for marker in ('Test Date:', 'Test Name:', 'Comments:', 'Test Result:',
                   'Result Unstructured Data:'):
        parsed_lab_data = parsed_lab_data.replace('; ' + marker, '\n' + marker)

    # Step 2: walk the lines. A test is complete once a result or comment line
    # has been seen (state 'results') and the next test name or date begins.
    state = 'Unknown'
    list_items = [l.strip() for l in parsed_lab_data.split('\n')]
    tests = []
    curr_date = ''
    curr_name = ''
    test_result = empty_result()
    comments = ''
    for list_item in list_items:
        # Splitting on every ':' drops anything after a second colon in a value.
        parts = [p.strip() for p in list_item.split(':')]
        field = parts[0].lower()
        if field.startswith('test name') or field.startswith('test date'):
            if state == 'results':
                tests.append(make_test(curr_date, curr_name, test_result, comments))
                test_result = empty_result()
                comments = ''
            state = 'name_date'
            if field.startswith('test name'):
                curr_name = parts[1]
            else:
                curr_date = parts[1]
        elif len(parts) > 1:
            if field == 'test result':
                state = 'results'
                test_result = parse_result_string(parts[1])
            elif field == 'result unstructured data':
                state = 'results'
                # Usual form: "Result Unstructured Data: Test Result:<text>"
                if len(parts) > 2 and parts[1].lower() == 'test result':
                    test_result = parse_result_string(parts[2])
            elif field == 'comments':
                state = 'results'
                comments = parts[1]
            else:
                # Unrecognized field name; the line is skipped.
                print('Unrecognized LAB_DATA line:', list_item)
    # Flush the last test.
    if state == 'results':
        tests.append(make_test(curr_date, curr_name, test_result, comments))

    return tests

As you can see, it is a fairly simple state machine [1].

The original dataset

The code tries to parse out 6 pieces of information from the LAB_DATA text for each test:

test date
test name
test result = this is just the full text of the test result
test value = if the test result starts with a numerical value, I extract that number. This makes it possible to construct an “Aggregate Statistics” where I can extract a number. This field is empty if the test result does not start with a numerical value
test units = if the test result starts with a numerical value, I just use the second part of the text as the “units”. Sometimes this is correct, sometimes it is not. For now, a human should take a second look at the dataset before using the Aggregate Statistics [2].
test comments = if there are any comments, they are stored in this column
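The helper `parse_result_string` that the state machine calls is not shown in this article. The value and units rules described above could be sketched as below. This is an assumption about how such a helper might look, not the original code: in particular, treating everything after the leading number as the units is a guess at what "the second part of the text" means.

```python
import re

# A leading number (optional sign, optional decimals) followed by the rest of the text.
_LEADING_NUMBER = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*(.*)$", re.DOTALL)


def parse_result_string(result: str) -> dict:
    """Split a test result into full text, leading numeric value and units."""
    text = result.strip()
    parsed = {"result": text, "value": "", "units": ""}
    match = _LEADING_NUMBER.match(text)
    if match:
        parsed["value"] = float(match.group(1))
        parsed["units"] = match.group(2).strip()
    return parsed
```

With this sketch, a result like "964 IU/l" yields a value of 964.0 and units "IU/l", while "Left ventricular ejection fraction 50 %" yields no value because the number is not at the start. Results that use a decimal comma or a range would be misparsed, which is one reason the units need a human check.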

Here is an example LAB_DATA field (read the report on MedAlerts):

Test Date: 20220627; Test Name: CK; Result Unstructured Data: Test Result:964 IU/l; Comments: Elevated; Test Date: 20220627; Test Name: CK-MB; Result Unstructured Data: Test Result:68 IU/l; Comments: Elevated; Test Date: 20220624; Test Name: Body temperature; Result Unstructured Data: Test Result:36.2 Centigrade; Comments: before vaccination; Test Date: 20220625; Test Name: Body temperature; Result Unstructured Data: Test Result:39.8 Centigrade; Test Date: 20220627; Test Name: coronary CT; Result Unstructured Data: Test Result:no significant stenosis; Comments: No Coronary artery stenosis; Test Date: 20220627; Test Name: CRP; Test Result: 5.76 mg/dl; Comments: Elevated; Test Date: 20220627; Test Name: Cardiac ultrasonography; Result Unstructured Data: Test Result:Left ventricular ejection fraction 50 %; Comments: Reduction or abnormality in systolic function or diastolic function of the entire cardiac ventricle; Test Date: 20220627; Test Name: Echocardiography; Result Unstructured Data: Test Result:mild systolic function decreased; Test Date: 20220627; Test Name: Electrocardiography; Result Unstructured Data: Test Result:ST elevated or negative T wave; Test Date: 20220627; Test Name: Twelve-lead electrocardiography; Result Unstructured Data: Test Result:PQ elevation in the aVR lead; Test Date: 20220627; Test Name: D-dimer; Result Unstructured Data: Test Result:1.1 ug/ml; Comments: Elevated; Test Date: 20220629; Test Name: Cardiac MRI examination; Result Unstructured Data: Test Result:Suspected Abnormal findings; Comments: (Myocardial injury) Late gadolinium enhancement image on T1 weighted image. However, myocardial signal intensity is higher than skeletal muscle signal intensity, and typically late gadolinium enhancement image was confirmed in at least one non-ischemic region. 
Contrast-enhanced: Yes; Test Date: 20220627; Test Name: Cardiac enzymes; Result Unstructured Data: Test Result:increased; Test Date: 20220627; Test Name: NT-proBNP; Result Unstructured Data: Test Result:324 pg/mL; Test Date: 20220627; Test Name: Troponin I; Result Unstructured Data: Test Result:17966 ng/ml; Comments: Elevated

When you run the Python script, it first splits the field into lines which look like these:

Test Date: 20220627
Test Name: CK
Result Unstructured Data: Test Result:964 IU/l
Comments: Elevated
Test Date: 20220627
Test Name: CK-MB
Result Unstructured Data: Test Result:68 IU/l
Comments: Elevated
Test Date: 20220624
Test Name: Body temperature
Result Unstructured Data: Test Result:36.2 Centigrade
Comments: before vaccination
Test Date: 20220625
Test Name: Body temperature
Result Unstructured Data: Test Result:39.8 Centigrade
Test Date: 20220627
Test Name: coronary CT
Result Unstructured Data: Test Result:no significant stenosis
Comments: No Coronary artery stenosis
Test Date: 20220627
Test Name: CRP
Test Result: 5.76 mg/dl
Comments: Elevated
Test Date: 20220627
Test Name: Cardiac ultrasonography
Result Unstructured Data: Test Result:Left ventricular ejection fraction 50 %
Comments: Reduction or abnormality in systolic function or diastolic function of the entire cardiac ventricle
Test Date: 20220627
Test Name: Echocardiography
Result Unstructured Data: Test Result:mild systolic function decreased
Test Date: 20220627
Test Name: Electrocardiography
Result Unstructured Data: Test Result:ST elevated or negative T wave
Test Date: 20220627
Test Name: Twelve-lead electrocardiography
Result Unstructured Data: Test Result:PQ elevation in the aVR lead
Test Date: 20220627
Test Name: D-dimer
Result Unstructured Data: Test Result:1.1 ug/ml
Comments: Elevated
Test Date: 20220629
Test Name: Cardiac MRI examination
Result Unstructured Data: Test Result:Suspected Abnormal findings
Comments: (Myocardial injury) Late gadolinium enhancement image on T1 weighted image. However, myocardial signal intensity is higher than skeletal muscle signal intensity, and typically late gadolinium enhancement image was confirmed in at least one non-ischemic region. Contrast-enhanced: Yes
Test Date: 20220627
Test Name: Cardiac enzymes
Result Unstructured Data: Test Result:increased
Test Date: 20220627
Test Name: NT-proBNP
Result Unstructured Data: Test Result:324 pg/mL
Test Date: 20220627
Test Name: Troponin I
Result Unstructured Data: Test Result:17966 ng/ml
Comments: Elevated

Then each individual line will be checked for what type of information it has, and then the Python script will use the state machine to extract the individual test information into something which looks like this:

[
  {
    "date": "20220627",
    "name": "CK",
    "result": "964 IU/l",
    "value": 964.0,
    "units": "IU/l",
    "comments": "Elevated"
  },
  {
    "date": "20220627",
    "name": "CK-MB",
    "result": "68 IU/l",
    "value": 68.0,
    "units": "IU/l",
    "comments": "Elevated"
  },
  {
    "date": "20220624",
    "name": "Body temperature",
    "result": "36.2 Centigrade",
    "value": 36.2,
    "units": "Centigrade",
    "comments": "before vaccination"
  },
  {
    "date": "20220625",
    "name": "Body temperature",
    "result": "39.8 Centigrade",
    "value": 39.8,
    "units": "Centigrade",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "coronary CT",
    "result": "no significant stenosis",
    "value": "",
    "units": "",
    "comments": "No Coronary artery stenosis"
  },
  {
    "date": "20220627",
    "name": "CRP",
    "result": "5.76 mg/dl",
    "value": 5.76,
    "units": "mg/dl",
    "comments": "Elevated"
  },
  {
    "date": "20220627",
    "name": "Cardiac ultrasonography",
    "result": "Left ventricular ejection fraction 50 %",
    "value": "",
    "units": "",
    "comments": "Reduction or abnormality in systolic function or diastolic function of the entire cardiac ventricle"
  },
  {
    "date": "20220627",
    "name": "Echocardiography",
    "result": "mild systolic function decreased",
    "value": "",
    "units": "",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "Electrocardiography",
    "result": "ST elevated or negative T wave",
    "value": "",
    "units": "",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "Twelve-lead electrocardiography",
    "result": "PQ elevation in the aVR lead",
    "value": "",
    "units": "",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "D-dimer",
    "result": "1.1 ug/ml",
    "value": 1.1,
    "units": "ug/ml",
    "comments": "Elevated"
  },
  {
    "date": "20220629",
    "name": "Cardiac MRI examination",
    "result": "Suspected Abnormal findings",
    "value": "",
    "units": "",
    "comments": "(Myocardial injury) Late gadolinium enhancement image on T1 weighted image. However, myocardial signal intensity is higher than skeletal muscle signal intensity, and typically late gadolinium enhancement image was confirmed in at least one non-ischemic region. Contrast-enhanced"
  },
  {
    "date": "20220627",
    "name": "Cardiac enzymes",
    "result": "increased",
    "value": "",
    "units": "",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "NT-proBNP",
    "result": "324 pg/mL",
    "value": 324.0,
    "units": "pg/mL",
    "comments": ""
  },
  {
    "date": "20220627",
    "name": "Troponin I",
    "result": "17966 ng/ml",
    "value": 17966.0,
    "units": "ng/ml",
    "comments": "Elevated"
  }
]

I know this might all seem a bit boring, but please do make note of how the parsing happens if you wish to cite or use this work. Without a good understanding of the limitations of such parsing, it is quite possible to draw erroneous conclusions about what the data actually says [3].
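One such limitation is visible in the example above: the raw Cardiac MRI comment ends with "Contrast-enhanced: Yes", but the parsed comment ends with "Contrast-enhanced". The parser splits each line on every colon and keeps only the first piece after the field name. A possible alternative, sketched here with hypothetical helper names, is to split on the first colon only:

```python
def split_field(line: str):
    """Split 'Field: value' on the first colon only, keeping later colons in the value."""
    field, _, value = line.partition(":")
    return field.strip(), value.strip()


def split_unstructured_result(value: str) -> str:
    """Drop a leading 'Test Result:' label from a 'Result Unstructured Data' value."""
    label, sep, rest = value.partition(":")
    if sep and label.strip().lower() == "test result":
        return rest.strip()
    return value.strip()
```

This is not what the script above does, so the published dataset still contains the truncated text.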

Calculating the days after vaccination

As you can see, there is a field called DAYS_AFTER_VAX in the dataset.

It is a calculated field - if there is a full date in the TESTDATE column, and the VAX_DATE is known, I calculate TESTDATE - VAX_DATE as a number of days.

As you can see, it also has some negative values.

But this is not an error or a bug. Sometimes the reporter also adds information about tests which were done prior to the most recent vaccination.
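A minimal sketch of this calculation is shown below. It assumes the test date is in YYYYMMDD form (as in the example LAB_DATA above) and that VAX_DATE is in MM/DD/YYYY form; both formats and the function name are assumptions, so adjust them to match your copy of the data.

```python
from datetime import datetime
from typing import Optional


def days_after_vax(test_date: str, vax_date: str) -> Optional[int]:
    """Return TESTDATE - VAX_DATE in days, or None when either date is unusable."""
    if not test_date or not vax_date:
        return None
    if len(test_date.strip()) != 8:
        return None  # partial dates such as "2022" or "202206" are skipped
    try:
        tested = datetime.strptime(test_date.strip(), "%Y%m%d")
        vaccinated = datetime.strptime(vax_date.strip(), "%m/%d/%Y")
    except ValueError:
        return None
    return (tested - vaccinated).days
```

A test dated before the vaccination gives a negative number, which is the behavior described above.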

Then I also try and infer other information as in my previous articles.

Click here to see the full dataset

Aggregate Statistics Dataset

The best part about extracting numerical values from the LAB_DATA field is that you can do aggregate statistical and mathematical operations on them.

Here is a screenshot of what this dataset looks like:

Here are the field definitions:

RESULT Count = number of times this test name was seen in the 1410 reports

RESULT_VALUE Count = of the RESULT Count, how many had numerical values?

Std of RESULT_VALUE = Standard Deviation for the numerical values for this particular test among the 1410 reports

And then of course we have the MIN, MAX and MEDIAN values.
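For readers who want to reproduce these columns from the one-test-per-row data, here is a rough sketch using only the Python standard library. The function name and the input keys ("name", "value") are assumptions, and whether the published standard deviation is the sample or the population version is not stated; this sketch uses the sample version.

```python
import statistics
from collections import defaultdict


def aggregate_by_test(rows):
    """Compute per-test counts and summary statistics from one-test-per-row data."""
    results = defaultdict(int)
    values = defaultdict(list)
    for row in rows:
        name = row["name"]
        results[name] += 1
        # Only rows with a parsed numeric value contribute to the statistics.
        if isinstance(row["value"], (int, float)):
            values[name].append(float(row["value"]))

    summary = {}
    for name, count in results.items():
        nums = values[name]
        summary[name] = {
            "RESULT Count": count,
            "RESULT_VALUE Count": len(nums),
            "MIN": min(nums) if nums else None,
            "MAX": max(nums) if nums else None,
            "MEDIAN": statistics.median(nums) if nums else None,
            # Sample standard deviation needs at least two values.
            "Std of RESULT_VALUE": statistics.stdev(nums) if len(nums) > 1 else None,
        }
    return summary
```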

You can drill down into individual tests

Notice that all the test names are clickable. When you click on the test name, you see the full list of tests.

Here is a snapshot for “troponin”

You can check the individual reports

You can then click on the URL link to read the full VAERS report on the MedAlerts website.

Download the dataset for individual tests

Once you have clicked into a given test, you can also export the data and do your own analysis on it.

You can see how to do this in the screenshot below:

Limitations

There are some important limitations that you should be aware of if you are going to be using this for further analysis.

The test names need to be deduped - this is quite clear when you see the names of the tests

The units are not the same and need to be normalized - one of the challenges with the aggregate statistics is that they may be calculated across different units of measurement!
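Both limitations could be addressed along the following lines. This is only a sketch under stated assumptions: the function names are hypothetical, the name normalization only merges spelling and punctuation variants (it will not merge synonyms such as "Cardiac MRI" and "Cardiac MRI examination"), and the conversion table covers just a few mass-concentration units for illustration.

```python
import re

# Conversion factors into ng/mL for a few mass-concentration units.
# Illustrative only, not a complete list of the units seen in VAERS.
TO_NG_PER_ML = {
    "ng/ml": 1.0,
    "pg/ml": 0.001,
    "ug/ml": 1000.0,
    "ng/l": 0.001,
}


def normalize_test_name(name: str) -> str:
    """Lowercase a test name and collapse punctuation and whitespace to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def to_ng_per_ml(value: float, units: str):
    """Convert a value to ng/mL, or return None for units not in the table."""
    factor = TO_NG_PER_ML.get(units.strip().lower())
    return None if factor is None else value * factor
```

Values whose units are not in the table come back as None, so they can be reviewed by hand instead of being silently mixed into the statistics.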

Future work

In addition to deduping test names and normalizing test result units, I can see two more projects for future work.

Qualitative analysis of Test Result field

One of the surprising things in the Aggregate Statistics is the number of tests where there is a large difference between RESULT count and RESULT_VALUE count, including for cardiac-mri itself. This means the results of these tests are qualitative and not quantitative.

When you click into cardiac-mri for example, you see the Test Result is mostly just text information and does not have any numbers

In turn, this means that we need to do more text analysis on individual test result values to get a deeper insight into the existing VAERS reports.

And we can use text ML to categorize these test results if there are a lot of rows.
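Before reaching for a trained model, a simple keyword baseline can give a first rough split of the text results. The sketch below is not text ML and is not something used for this article; the keyword lists are hypothetical and would need clinical review, and it ignores negation (so "not elevated" would be labeled abnormal).

```python
from collections import Counter

# Hypothetical keyword lists; a real study would need clinically reviewed terms.
ABNORMAL_TERMS = ("abnormal", "elevated", "increased", "decreased", "myocarditis", "enhancement")
NORMAL_TERMS = ("normal", "unremarkable", "no significant")


def categorize_result(text: str) -> str:
    """Label a free-text test result as abnormal, normal or unclassified."""
    lowered = text.lower()
    # Check abnormal terms first, because "abnormal" contains "normal".
    if any(term in lowered for term in ABNORMAL_TERMS):
        return "abnormal"
    if any(term in lowered for term in NORMAL_TERMS):
        return "normal"
    return "unclassified"


def count_categories(results):
    """Count how many results fall into each category."""
    return Counter(categorize_result(text) for text in results)
```

The size of the "unclassified" bucket is a quick indicator of how much of the text such rules fail to cover.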

Extending the analysis to other vaccine injuries

We can also extend the analysis to other vaccine injuries. I read something very interesting on Twitter recently:

I think John Beaudoin is right [4].

If someone is interested in doing such an analysis, they can look only at VAERS reports corresponding to a specific class of vaccine injury, run a similar analysis, and see what the corresponding test data says.

[1] Because the state machine is simple, there is a chance it cannot handle all the corner cases while parsing the text, e.g. when the LAB_DATA field is not in the usual format but just has some plain text. These corner cases are very rare, so I just ignore them and don’t try to do any error handling or error recovery.

[2] Since I don’t do any advanced analysis of the test units, they are not “normalized”. This means the aggregate statistics are sometimes calculated across different units of measurement.

[3] A good rule of thumb is to go and verify the original VAERS report and see if the test results are correctly parsed.

[4] Obviously I don’t have the background in medicine or biology to be sure of this. But intuitively, and based on what I have seen till now in VAERS, I think myocarditis is only the tip of the vaccine-injury iceberg, unfortunately :-(

Vaccine Data Science
