In a previous article, I discussed how we can use a state machine to extract all the test results from the LAB_DATA field in VAERS.
I used that idea to analyze cardiac MRI tests in VAERS.
Given that I don’t have a background in biology or medicine, obviously I could not go very far in my analysis :-)
But I did notice something recently.
There has been almost no research on the topic of text mining the LAB_DATA field.
You see exactly six results on Google Scholar for a quoted search for VAERS and LAB_DATA, and one of them is from a paper written by
The other five publications mention it only because they list all the attributes in the VAERS DATA CSV file!
In my previous analysis, I used some simple Python string-matching code to do the parsing.
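To give a rough idea of what that kind of string matching looks like, here is a minimal sketch. It is not the code from the previous article: the function names, the keyword pattern, the assumption that entries are separated by semicolons or newlines, and the sample text are all made up for illustration.

```python
import re

# Hypothetical keyword pattern; the terms used in the earlier analysis may differ.
CARDIAC_MRI_PATTERN = re.compile(
    r"\b(cardiac mri|cardiac magnetic resonance|cmr)\b", re.IGNORECASE
)


def split_lab_entries(lab_data):
    """Split a LAB_DATA string into entries.

    Assumes entries are separated by semicolons or newlines, which is
    only a guess at how reporters format this free-text field.
    """
    parts = re.split(r"[;\n]+", lab_data)
    return [part.strip() for part in parts if part.strip()]


def cardiac_mri_entries(lab_data):
    """Return only the entries that mention a cardiac MRI keyword."""
    return [
        entry
        for entry in split_lab_entries(lab_data)
        if CARDIAC_MRI_PATTERN.search(entry)
    ]


# Made-up sample text, not taken from a real VAERS report.
sample = "Troponin: elevated; Cardiac MRI: myocardial edema; ECG: normal"
print(cardiac_mri_entries(sample))
```

Matching like this is easy to write but brittle: it misses misspellings, abbreviations that are not in the keyword list, and reports that do not use the assumed separators.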
It is now possible to improve on this approach quite a bit using Large Language Models (LLMs), and I will be writing more on this topic over the coming weeks.