Understanding the VAERS data format
If you are not familiar with the VAERS data format, reading this article will make it much easier for you to understand the remaining articles.
You can download the VAERS (Vaccine Adverse Event Reporting System) data here.
Downloading the files for a given year
The VAERS dataset comprises three CSV files: the DATA file, the SYMPTOMS file and the VAX file. You can get all the information corresponding to a given year by downloading the three files for that year.
Every VAERS report is assigned an identifier called VAERS_ID.
The Symptoms CSV file lists all the symptoms associated with each VAERS_ID.
The vaccine CSV file lists information about the vaccine type.
The VAERS data CSV file contains the details of the report itself, including a text narrative of what happened, and is the largest of the three files.
Here is the list of column names in the symptoms CSV file:
VAERS_ID,SYMPTOM1,SYMPTOMVERSION1,SYMPTOM2,SYMPTOMVERSION2,SYMPTOM3,SYMPTOMVERSION3,SYMPTOM4,SYMPTOMVERSION4,SYMPTOM5,SYMPTOMVERSION5
Here is the list of column names in the vaccine CSV file:
VAERS_ID,VAX_TYPE,VAX_MANU,VAX_LOT,VAX_DOSE_SERIES,VAX_ROUTE,VAX_SITE,VAX_NAME
And here is the list of column names in the data CSV file:
VAERS_ID,RECVDATE,STATE,AGE_YRS,CAGE_YR,CAGE_MO,SEX,RPT_DATE,SYMPTOM_TEXT,DIED,DATEDIED,L_THREAT,ER_VISIT,HOSPITAL,HOSPDAYS,X_STAY,DISABLE,RECOVD,VAX_DATE,ONSET_DATE,NUMDAYS,LAB_DATA,V_ADMINBY,V_FUNDBY,OTHER_MEDS,CUR_ILL,HISTORY,PRIOR_VAX,SPLTTYPE,FORM_VERS,TODAYS_DATE,BIRTH_DEFECT,OFC_VISIT,ER_ED_VISIT,ALLERGIES
As you can see, the data CSV file has many more columns than the other two. Also notice that VAERS_ID is the common “link” joining these tables. For a given VAERS_ID in the data CSV file, you can look up the vaccine type as well as the list of associated symptoms in the other two files.
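The lookup described above can be sketched in a few lines of Python. This is only a minimal illustration: the rows are made up (they are not real VAERS reports), only a few of the real columns are included, and the helper names index_by_vaers_id and lookup_report are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only (not real VAERS reports),
# using a small subset of the real column names.
DATA_CSV = """VAERS_ID,RECVDATE,STATE,AGE_YRS,SEX
1000001,01/05/2022,CA,34,F
1000002,01/06/2022,TX,61,M
"""
VAX_CSV = """VAERS_ID,VAX_TYPE,VAX_MANU
1000001,FLU,EXAMPLE MANUFACTURER A
1000002,DTAP,EXAMPLE MANUFACTURER B
"""
SYMPTOMS_CSV = """VAERS_ID,SYMPTOM1,SYMPTOM2
1000001,Headache,Pyrexia
1000002,Fatigue,
"""


def index_by_vaers_id(csv_text):
    """Group the rows of one CSV file by their VAERS_ID value."""
    rows_by_id = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows_by_id[row["VAERS_ID"]].append(row)
    return rows_by_id


def lookup_report(vaers_id, data, vax, symptoms):
    """Collect everything the three files say about one report."""
    return {
        "data": data.get(vaers_id, []),
        "vax": vax.get(vaers_id, []),
        "symptoms": symptoms.get(vaers_id, []),
    }


data = index_by_vaers_id(DATA_CSV)
vax = index_by_vaers_id(VAX_CSV)
symptoms = index_by_vaers_id(SYMPTOMS_CSV)

report = lookup_report("1000001", data, vax, symptoms)
print(report["vax"][0]["VAX_TYPE"])
print(report["symptoms"][0]["SYMPTOM1"])
```

With the real files you would pass an open file object to csv.DictReader instead of a string; the grouping by VAERS_ID stays the same.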
An example
Here is an example which shows what the data looks like. I have taken a single VAERS_ID (2002051) to zoom in on the information inside the three CSV files. Since the VAERSDATA CSV file has a lot of columns, its row has been truncated.
You can see all the details for this VAERS report here.
You can observe three things:
a) the top row is from the VAERSDATA CSV file and has information about what happened after the vaccine was administered
b) the second row is from the VAERSVAX CSV file which has information about the vaccine which was administered, including VAX_TYPE (which disease) and VAX_MANU (manufacturer)
c) the third row lists all the symptoms associated with a given VAERS_ID. In this case, you can see that this particular VAERS report had a lot of symptoms associated with it. The format supports a maximum of 5 symptoms per row, so any additional symptoms go into further rows with the same VAERS_ID. The list of symptoms for a single report is therefore unbounded in size.
The simplicity of the VAERS CSV file format has a lot of benefits, but you need to remember details such as the one-to-many mapping from a VAERS_ID to its symptom rows.
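To make the one-to-many mapping concrete, here is a minimal sketch that merges all symptom rows sharing a VAERS_ID into one list. The rows, including the version numbers, are made up for illustration, and symptoms_by_report is a hypothetical helper name.

```python
import csv
import io

# Made-up symptom rows for illustration only (not real VAERS reports).
# Report 1000001 has 7 symptoms, so it needs two rows of up to 5 each.
SYMPTOMS_CSV = """VAERS_ID,SYMPTOM1,SYMPTOMVERSION1,SYMPTOM2,SYMPTOMVERSION2,SYMPTOM3,SYMPTOMVERSION3,SYMPTOM4,SYMPTOMVERSION4,SYMPTOM5,SYMPTOMVERSION5
1000001,Arthralgia,24.1,Chills,24.1,Dizziness,24.1,Fatigue,24.1,Headache,24.1
1000001,Nausea,24.1,Pyrexia,24.1,,,,,,
1000002,Fatigue,24.1,,,,,,,,
"""


def symptoms_by_report(csv_text):
    """Map each VAERS_ID to the full list of its symptoms across all rows."""
    symptoms = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        terms = symptoms.setdefault(row["VAERS_ID"], [])
        for i in range(1, 6):
            term = row.get(f"SYMPTOM{i}")
            if term:  # skip blank (unused) symptom slots
                terms.append(term)
    return symptoms


all_symptoms = symptoms_by_report(SYMPTOMS_CSV)
print(len(all_symptoms["1000001"]))
print(all_symptoms["1000002"])
```

Keeping the symptoms in a list per report, rather than in fixed columns, avoids the 5-per-row limit once the data is in memory.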
An important thing to understand is that VAERS is self-reported information. As a result, you will find that a lot of information is missing in many of these reports. Here is a good overview of these issues.
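One rough way to see how much information is missing is to count blank values per column. The sketch below does this on a few made-up rows (not real VAERS reports); count_blank_fields is a hypothetical helper, and treating whitespace-only values as blank is an assumption.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only (not real VAERS reports).
DATA_CSV = """VAERS_ID,STATE,AGE_YRS,SEX,VAX_DATE,ONSET_DATE
1000001,CA,34,F,01/03/2022,01/04/2022
1000002,,,M,,01/06/2022
1000003,TX,,,01/02/2022,
"""


def count_blank_fields(csv_text):
    """Return (blank count per column, total number of rows)."""
    blanks = Counter()
    total = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        total += 1
        for column, value in row.items():
            # Assumption: empty or whitespace-only values count as missing.
            if not (value or "").strip():
                blanks[column] += 1
    return blanks, total


blanks, total = count_blank_fields(DATA_CSV)
for column, count in blanks.most_common():
    print(f"{column}: {count} of {total} rows blank")
```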
Field descriptions
I have just copied and pasted this information from the VAERS data use guide.
VAERSDATA.CSV
VAERS identification number (VAERS_ID): A sequentially assigned number used for identification purposes. It serves as a link between the three data files.
Receive date (RECVDATE): The date the VAERS form information was received to our processing center.
State (STATE): The two-letter US Postal Service abbreviation for the home state of the vaccinee. Please note that all foreign reports are contained in a separate data file.
Age in years (AGE_YRS): The recorded vaccine recipient’s age in years.
Age in years (CAGE_YR): Age of patient in years calculated by (vax_date-birthdate).
Age in months (CAGE_MO): Age of patient in months calculated by (vax_date-birthdate). The values for this variable range from zero to less than one (due to rounding, the value in this field may be one). It is only calculated for patients age 2 years or less. The sum of the two variables CAGE_YR and CAGE_MO provide the calculated age of a person. For example, if CAGE_YR=1 and CAGE_MO=0.5 then the age of the individual is 1.5 years or one year six months.
Sex (SEX): Sex of the vaccine recipient (M = Male, F = Female, Unknown = Blank).
Date form completed (RPT_DATE): Date the VAERS form was completed by the reporter as recorded on the specified field of the form. This is a VAERS 1 form field only.
Reported symptom text (SYMPTOM_TEXT): This is the symptom text recorded in the form. MedDRA Terms are derived from this text and placed in the VAERSSYMPTOMS file.
Patient outcomes: The reporter’s assessment of the vaccine recipient outcome is recorded on the VAERS form. Selections checked in the form determine whether a report is considered to be a non-serious report, a serious report, or a death report.
• Died (DIED): If the vaccine recipient died a “Y” is used; otherwise, the field will be blank.
• Date of death (DATEDIED): If the vaccine recipient died there is space in this field to record the date of death; otherwise, the field will be blank.
• Life threatening (L_THREAT): If the vaccine recipient had a life-threatening event associated with the vaccination a “Y” is used; otherwise, the field will be blank.
• Emergency room (ER_VISIT): If the vaccine recipient required an emergency room or doctor visit a “Y” is placed in this field; otherwise, the field will be blank. If this is the only option checked the report is not considered serious. This is a VAERS 1 form field only.
• Hospitalized (HOSPITAL): If the vaccine recipient was hospitalized as a result of the vaccination a “Y” is used; otherwise, the field will be blank.
• Days hospitalized (HOSPDAYS): If the reporter checked that the vaccine recipient was hospitalized a space is provided in this field to record the number of days hospitalized; otherwise, the field will be blank.
• Prolonged hospitalization (X_STAY): If a patient’s hospitalization is prolonged as a result of the adverse event associated with the vaccination a “Y” will be placed in this field; otherwise, the field will be blank.
• Disability (DISABLE): If the vaccine recipient was disabled as a result of the vaccination a “Y” is placed in this field; otherwise, the field will be blank.
• Congenital anomaly or birth defect (BIRTH_DEFECT): If the vaccine recipient had a congenital anomaly or birth defect associated with the vaccination, a “Y” is used; otherwise, the field will be blank. This is a VAERS 2 form field only.
• Doctor or other healthcare professional office/clinic visit (OFC_VISIT): If the vaccine recipient had a doctor or other healthcare professional office/clinic visit associated with the vaccination a “Y” is used; otherwise, the field will be blank. This is a VAERS 2 form field only.
• Emergency room/department or urgent care (ER_ED_VISIT): If the vaccine recipient had an emergency room/department or urgent care visit associated with the vaccination a “Y” is used; otherwise, the field will be blank. This is a VAERS 2 form field only.
Recovered (RECOVD): A “Y” is placed in the field if the vaccine recipient recovered from the adverse event. “N” indicates that the vaccinee has not recovered from the adverse event. “U” or blank indicates that the vaccine recipient’s recovery status is unknown.
Vaccination date (VAX_DATE): The date of vaccination as recorded in the specified field of the form.
Onset date (ONSET_DATE): The date of the onset of adverse event symptoms associated with the vaccination as recorded in the specified field of the form.
Onset interval (NUMDAYS): The calculated interval (in days) from the vaccination date to the onset date.
Relevant diagnostic tests/laboratory data (LAB_DATA): This text field contains narrative about any relevant diagnostic tests or laboratory results as recorded on the specified field of the form.
Vaccine administered at (V_ADMINBY): The reporter may note on the VAERS form the type of facility administering the vaccine. The options are different depending on the form version; additional options were added on the VAERS 2 form.
• VAERS 1: PUB = Public, PVT = Private, MIL = Military, OTH = Other, UNK = Unknown
• VAERS 2: PUB = Public, PVT = Private, MIL = Military, PHM = Pharmacy or store, SCH = School or student health clinic, SEN = Nursing home or senior living facility, WRK = Workplace clinic, OTH = Other, UNK = Unknown
Vaccine purchased with (V_FUNDBY): This is a VAERS 1 field only. The reporter may note in Box 16 on the VAERS form which type of funds were used to purchase the vaccines administered in Box 13 (PUB = Public, PVT = Private, MIL = Military; OTH = Other/Unknown).
Other medications (OTHER_MEDS): This text field contains narrative about any prescription or non-prescription drugs the vaccine recipient was taking at the time of vaccination as recorded on the specified field of the form.
Current illnesses (CUR_ILL): This text field contains narrative about any illnesses at the time of the vaccination as noted on the specified field of the form.
Pre-existing conditions (HISTORY): This text field contains narrative about any pre-existing physician-diagnosed birth defects or medical condition that existed at the time of vaccination as noted on the specified field of the form. For the VAERS 1 form, this field also includes pre-existing physician-diagnosed allergies.
Allergies to medications, food or other products (ALLERGIES): This text field contains narrative about any pre-existing physician-diagnosed allergies that existed at the time of vaccination as noted in the specified field of the form. This is a VAERS 2 form field only.
Prior vaccination event information (PRIOR_VAX): This field provides prior vaccination event information as recorded on the specified field of the form.
Manufacturer number (SPLTTYPE): Manufacturer number or Immunization Project number as recorded on the specified field of the form.
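Two of the calculated fields above (the CAGE_YR plus CAGE_MO age, and the NUMDAYS onset interval) can be reproduced with a short sketch. The row is made up, the helper names are hypothetical, and the MM/DD/YYYY date format is an assumption that you should check against the files you download.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumption: dates are written as MM/DD/YYYY


def calculated_age(row):
    """Return CAGE_YR + CAGE_MO in years, or None if CAGE_YR is blank."""
    cage_yr = row.get("CAGE_YR", "").strip()
    if not cage_yr:
        return None
    cage_mo = row.get("CAGE_MO", "").strip()
    # CAGE_MO is only filled in for young patients; treat blank as zero.
    return float(cage_yr) + (float(cage_mo) if cage_mo else 0.0)


def onset_interval_days(row):
    """Return days from VAX_DATE to ONSET_DATE, or None if either is blank."""
    vax_date = row.get("VAX_DATE", "").strip()
    onset_date = row.get("ONSET_DATE", "").strip()
    if not vax_date or not onset_date:
        return None
    delta = datetime.strptime(onset_date, DATE_FORMAT) - datetime.strptime(vax_date, DATE_FORMAT)
    return delta.days


# Made-up row for illustration only (not a real VAERS report).
row = {"CAGE_YR": "1", "CAGE_MO": "0.5", "VAX_DATE": "01/03/2022", "ONSET_DATE": "01/05/2022"}
print(calculated_age(row))
print(onset_interval_days(row))
```

Both helpers return None rather than guessing when an input is blank, since many reports leave these fields empty.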
VAERSVAX.CSV
VAERS Identification Number (VAERS_ID): A sequentially assigned number used for identification purposes. It serves as a link between the three data files.
Vaccine Type (VAX_TYPE): The data list the vaccines group name by code. Similar vaccines are grouped together (e.g., FLU, DTAP).
Vaccine manufacturer (VAX_MANU): This field identifies the manufacturer of each of the vaccines listed.
Manufacturers vaccine lot (VAX_LOT): This field identifies the lot number of the vaccines listed.
Doses administered (VAX_DOSE_SERIES): This field identifies the vaccine dose of the recorded vaccines listed. The VAERS 1 field VAX_DOSE was discontinued in the VAERS 2 form; when a value exists, a 1 is added to equate to the VAX_DOSE_SERIES field.
Vaccination route (VAX_ROUTE): This field identifies the vaccine route of administration.
Vaccination site (VAX_SITE): This field identifies the anatomic site where the vaccination was administered.
Vaccine name (VAX_NAME): This field provides the brand name of the vaccine administered.
VAERSSYMPTOMS.CSV
VAERS identification number (VAERS_ID): A sequentially assigned number used for identification purposes. It serves as a link between the three data files.
MedDRA term (SYMPTOM1-5): The data in these fields are equivalent to the PT TERM from the MedDRA codebook. MedDRA terms are extracted from the narrative text in VAERS 2 (Item 18 and 19) and VAERS 1 (Box 7 and 12). Duplicates may appear in data and terms are listed in alphabetical order. In case a report has more than 5 terms multiple rows with 5 terms each will be listed for that VAERS ID.
MedDRA term version (SYMPTOMVERSION1-5): Version of MedDRA dictionary from which the MedDRA term was first created.
In the SPLTTYPE field, when the reports are foreign (about 35% of all covid), the first two characters are often the country code like GB for Great Britain, applying only up to 2022-11-11. CDC now censors this information plus blanking out all of the writeups for those marked foreign (FR) in the STATE field. Censorship helps to keep depop safe and effective.