Is the TGA DAEN understating COVID19 vaccine injuries?
An investigation mapping the DAEN Case Number and the VAERS ID
Key takeaways:
Australia’s adverse event reporting system, DAEN, reports the case number, entry date, age, gender, the suspected medicine for which the adverse reaction is reported, and a list of MedDRA coded terms
It does not report the days to symptom onset or provide a text writeup
We can map the case number to the corresponding VAERS report by parsing the VAERS writeup
We can find out who made the report (i.e. was it a healthcare professional?) by parsing the writeup
There are also some data discrepancies between VAERS reports from Australia and the DAEN database, which make the mRNA vaccines look safer
Update 5th May 2023 12:00 UTC: I realized that the DAEN data I downloaded was not the full COVID19 dataset, and that my calculations below are not accurate. I will be writing an update to this article soon with my new findings. But it does not substantially change my findings (the numbers do change though).
Update 5th May 2023 14:50 UTC - all updates done.
There are about 10K injury reports in the foreign VAERS dataset where the SPLTTYPE field starts with AU. These are all reports from Australia.
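The filter described above can be sketched in a few lines of Python. This is only an illustration on a tiny made-up sample, not the code used for this article; the VAERS_ID and SPLTTYPE values below are placeholders, and the case-insensitive comparison is my assumption.

```python
import csv
import io

# Made-up rows in the shape of a VAERS DATA CSV (only two columns shown).
# The SPLTTYPE values are placeholders, not real manufacturer report numbers.
SAMPLE = """VAERS_ID,SPLTTYPE
1000001,AU-EXAMPLE-0001
1000002,GB-EXAMPLE-0002
1000003,
1000004,AU-EXAMPLE-0003
"""


def australian_reports(rows):
    """Keep rows whose SPLTTYPE value starts with 'AU'."""
    return [
        row for row in rows
        if (row.get("SPLTTYPE") or "").strip().upper().startswith("AU")
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
au_rows = australian_reports(rows)
print([row["VAERS_ID"] for row in au_rows])
```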
Interestingly, over 85% of the Australian VAERS reports mention the case number from the DAEN database maintained by the TGA (which is Australia’s health regulator).
What is a DAEN “case number”?
As best as I can see, the DAEN system also does not “vet” the information attached to a case number, so it is somewhat similar to VAERS in that non-healthcare professionals may be able to report directly to the DAEN.
Similar to VAERS, DAEN does not publish the reporter information.
However, as you will see later, most of the VAERS entries are made by healthcare professionals.
What DAEN does not report
DAEN is a much simpler reporting system than VAERS and in fact publishes only six pieces of information -
case number
entry date
age
gender
suspected medicine for which adverse reaction is reported
list of MedDRA coded terms
You can download this information as a CSV file from the DAEN website, and I downloaded a copy of all the adverse event reports for the COVID-19 vaccine a few days back.
Recently, someone asked me this question on Twitter:
Of course, if you have been following my research for a while, you probably know that VAERS provides a lot more information, simply because each report has a writeup describing the vaccine injury[1].
However, the DAEN database is more comprehensive in terms of the number of reports.
As of May 2023, there are about 90K adverse event reports associated with the COVID19 vaccine in the DAEN database, while there are only about 10K VAERS reports from Australia.
But individual VAERS reports are much better than DAEN reports because they capture a lot more information.
As you can see, just the “DATA” CSV file alone in VAERS has over 30 different fields. Also DAEN does not provide the NUMDAYS field, which is the number of days between vaccination and symptom onset. It is a crucial piece of information to assess causality according to the Bradford Hill criteria, and the fact that DAEN omits this is not a good look.
A reasonable pushback (against publishing this information) is that sometimes it is not possible to exactly determine the date of symptom onset. In that case, I think health agencies should do what VAERS does[2] and provide the month of symptom onset. This will at least help us calculate a bound on the number of days to symptom onset, which can still be very helpful.
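To make the idea of a bound concrete, here is a small sketch of the arithmetic. It assumes we know the exact vaccination date and only the month of symptom onset, and that onset cannot come before vaccination; the function name and the dates are made up for illustration.

```python
import calendar
from datetime import date


def onset_day_bounds(vax_date, onset_year, onset_month):
    """Return (min_days, max_days) from vaccination to symptom onset
    when only the month of onset is known."""
    first_day = date(onset_year, onset_month, 1)
    days_in_month = calendar.monthrange(onset_year, onset_month)[1]
    last_day = date(onset_year, onset_month, days_in_month)
    if last_day < vax_date:
        raise ValueError("onset month ends before the vaccination date")
    # Assumption: onset cannot precede vaccination, so clamp the lower bound.
    earliest = max(first_day, vax_date)
    return (earliest - vax_date).days, (last_day - vax_date).days


# Vaccinated on 20 Aug 2021, onset some time in Sep 2021.
print(onset_day_bounds(date(2021, 8, 20), 2021, 9))  # (12, 41)
```

Even a range like this is enough to tell apart an onset within a few weeks from one that happened many months later.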
How to map the case number to the VAERS report ID
I wrote some code with a small condition added, and from what I can see it works very well[3].
I split the writeup into individual sentences using spaCy, and if the sentence has a number (identified using the part-of-speech tag in spaCy) and also has one of the words “regulatory”, “reference” or “case” elsewhere in the same sentence, it is a candidate for being a case number.
Then I convert it to a number and check whether it is within the range 500000 to 999999[4] (in other words, a six-digit number over 500000). If it is, I consider it a case number from DAEN.
Note the code sample above also does other things related to the dataset I am creating. The actual heuristic for identifying the case number is pretty simple.
Using this, I was able to identify over 8000 case numbers in the writeup.
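For readers who want to try the heuristic, here is a simplified sketch. It is not my actual code: it swaps spaCy for plain regular expressions (a naive sentence splitter and a digits-only number match instead of part-of-speech tags), and the writeup text and function names are made up for illustration.

```python
import re

KEYWORD_PATTERN = re.compile(r"\b(regulatory|reference|case)\b", re.IGNORECASE)
LOW, HIGH = 500000, 999999  # range described in the text


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_case_numbers(writeup):
    """Return candidate DAEN case numbers found in a writeup."""
    found = []
    for sentence in split_sentences(writeup):
        # The sentence must contain one of the keywords.
        if not KEYWORD_PATTERN.search(sentence):
            continue
        for token in re.findall(r"\b\d+\b", sentence):
            value = int(token)
            if LOW <= value <= HIGH and value not in found:
                found.append(value)
    return found


# Made-up writeup, loosely in the style of a foreign VAERS report.
writeup = (
    "This report was received from a regulatory authority. "
    "The regulatory authority case number is 612345. "
    "The patient was 45 years old. "
    "Lot number 700123 was recorded."
)
print(find_case_numbers(writeup))  # [612345]
```

Note that 700123 is ignored because its sentence has none of the keywords, and 45 is ignored because it is outside the range.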
Derived field values
In addition to identifying the DAEN case number where possible, I also added the following fields to the VAERS DATA CSV file.
Please take some time to read and understand them because you cannot follow the remaining article unless you are clear about the definitions (and there are many).
TGA_CASE_NUMBER: already discussed before
ESTIMATE: if the case number is missing in the VAERS report, I try to use the combination of age, gender and symptom matches to guess the case number. This only worked for about 130 reports (ESTIMATE = True), but it is still useful to note.
MISSING_CASE_NUMBER: if the VAERS report mentions a case number, but I cannot find this case number in the DAEN database, I mark it as true. Unfortunately, the DAEN database skips a lot of numbers, so you cannot infer much merely from the fact that it is missing.
URL: The URL is the link to the VAERS report on MedAlerts Wayback Machine
TGA_SYMPTOMS_LIST: A text field which displays the list of symptoms for the DAEN Case number (based on parsing the DAEN database) and uses the pipe symbol (|) as the separator.
NUM_TGA_SYMPTOMS: The number of symptoms in the list. If the case number cannot be identified, this will be zero.
VAERS_SYMPTOMS_LIST: A text field which displays the list of symptoms for the given VAERS ID. I calculate this by using the VAERS SYMPTOMS CSV file.
NUM_VAERS_SYMPTOMS: The number of VAERS symptoms
RATIO: This is a percentage calculated as
(NUM_TGA_SYMPTOMS * 100)/(NUM_VAERS_SYMPTOMS)
Out of a total of 9965 rows, the ratio is 100 for 6035 rows, less than 100 for 3383 rows and greater than 100 for 547 rows. In other words, for a majority of the cases (about 60%) the number of symptoms in VAERS and DAEN match. For a significant percentage (over one-third), the number of symptoms in DAEN is less than in VAERS. For about 5% of the cases, DAEN has more symptoms than VAERS.
DERIVED_AGE: There are a lot of foreign reports where the AGE_YRS is missing even though you can find it in the writeup. Where possible, I infer this value using some code and add it into this field
IS_HCP: This field is based on my previous article where I parse the writeup to identify the reporter. This has three possible values - Unknown, Yes or No
REPORTER_VALUE: If I can parse the writeup and identify the reporter, I add it into this field. As you can see, I can identify multiple different types of reporters (also discussed in the previous article)
THIRD_PERSON: this field is True if there are at least two instances of third person keywords (patient, he, she, her, him) in the writeup
FIRST_PERSON: this field is True if there are at least two instances of first person keywords (I, me, my) in the writeup
IS_SERIOUS: this field is True if one of the following fields is marked ‘Y’ - DIED, L_THREAT, HOSPITAL or DISABLE. I created this field to make it easier to filter the serious AEs inside this dataset.
The next two fields hold complete sentences from the writeup. They are not usually useful by themselves, but I use them primarily to double-check and verify my results.
REPORT_NUMBER: this is the sentence which contains the DAEN case number I parsed from the writeup
REPORT_FROM: this is the complete sentence which contains the REPORTER_VALUE I parsed from the writeup. It is a superset of REPORTER_VALUE, and I found it more useful as it is more comprehensive (while REPORTER_VALUE is more readable as it is often much shorter)
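A few of the derived fields above are simple enough to sketch in code. This is an illustration of the definitions, not the code used to build the dataset. It assumes the writeup is in the SYMPTOM_TEXT column, that both symptom lists use the pipe separator, and that RATIO is left empty when there are no VAERS symptoms; the sample row is made up.

```python
import re

THIRD_PERSON_WORDS = {"patient", "he", "she", "her", "him"}
FIRST_PERSON_WORDS = {"i", "me", "my"}
SERIOUS_FLAGS = ("DIED", "L_THREAT", "HOSPITAL", "DISABLE")


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of the keywords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in keywords)


def split_symptoms(value):
    """Split a pipe-separated symptom list, ignoring empty entries."""
    return [s.strip() for s in value.split("|") if s.strip()]


def derive_fields(row):
    writeup = row.get("SYMPTOM_TEXT", "")
    num_tga = len(split_symptoms(row.get("TGA_SYMPTOMS_LIST", "")))
    num_vaers = len(split_symptoms(row.get("VAERS_SYMPTOMS_LIST", "")))
    return {
        "NUM_TGA_SYMPTOMS": num_tga,
        "NUM_VAERS_SYMPTOMS": num_vaers,
        # Assumption: RATIO is undefined when there are no VAERS symptoms.
        "RATIO": (num_tga * 100) / num_vaers if num_vaers else None,
        "THIRD_PERSON": count_keywords(writeup, THIRD_PERSON_WORDS) >= 2,
        "FIRST_PERSON": count_keywords(writeup, FIRST_PERSON_WORDS) >= 2,
        "IS_SERIOUS": any(row.get(flag) == "Y" for flag in SERIOUS_FLAGS),
    }


# Made-up row for illustration.
row = {
    "SYMPTOM_TEXT": "The patient reported chest pain. She was admitted to hospital.",
    "TGA_SYMPTOMS_LIST": "Chest pain",
    "VAERS_SYMPTOMS_LIST": "Chest pain|Dyspnoea",
    "HOSPITAL": "Y",
}
derived = derive_fields(row)
print(derived)
```

Here DAEN lists one symptom against two in VAERS, so RATIO comes out as 50.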
Who is the Reporter for Australian VAERS reports?
First I will address the question by the person on Twitter.
“I will have a look at VAERs though just to see whats on there. I think those cases would be self reported are they?”
As I mentioned in my previous article, because foreign VAERS report writeups are usually very detailed, it is quite easy to infer the REPORTER for a large majority of the reports. You will see that the same thing holds for Australian VAERS reports.
For example, out of the total of 9965 reports, 8204 reports are made by healthcare professionals (82%).
If we restrict it to just reports which have case numbers (8679), 7885 of those have been reported by healthcare professionals. In other words, over 90% of the VAERS reports where I could map the DAEN number were actually reported by HCPs.
Data discrepancies
Here is the list of data discrepancies I observed between VAERS and DAEN.
Death reports in VAERS where death is not mentioned in DAEN
Do the following steps:
1 Filter the dataset for DIED=‘Y’ (total rows = 498)
2 Keep only rows where TGA_CASE_NUMBER is not empty (total rows = 432)
3 Keep only rows where TGA_SYMPTOMS_LIST does not contain ‘death’ (total rows = 430)
So unless I am just completely missing something, these are death reports in VAERS (you can see that a large majority have been sent in by HCPs according to the writeup), which clearly mention a DAEN case number, but the DAEN case number does not mention Death as the adverse reaction. You can just look up the case number in DAEN and verify this for yourself.
Does anyone know what is going on here? Does the DAEN exclude the adverse reaction called ‘Death’ (note that they do mention other adverse reactions) unless they are certain it was caused by the vaccine?
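If you want to reproduce the three filtering steps listed earlier in this section on the dataset, they can be sketched like this. The rows are made up to show the mechanics only, and the case-insensitive match on ‘death’ is my assumption.

```python
# Made-up rows; the real dataset has many more columns.
rows = [
    {"VAERS_ID": "1", "DIED": "Y", "TGA_CASE_NUMBER": "600001",
     "TGA_SYMPTOMS_LIST": "Myocarditis|Chest pain"},
    {"VAERS_ID": "2", "DIED": "Y", "TGA_CASE_NUMBER": "",
     "TGA_SYMPTOMS_LIST": ""},
    {"VAERS_ID": "3", "DIED": "Y", "TGA_CASE_NUMBER": "600003",
     "TGA_SYMPTOMS_LIST": "Death|Cardiac arrest"},
    {"VAERS_ID": "4", "DIED": "", "TGA_CASE_NUMBER": "600004",
     "TGA_SYMPTOMS_LIST": "Headache"},
]

# Step 1: death reports in VAERS.
step1 = [r for r in rows if r["DIED"] == "Y"]
# Step 2: of those, reports where a DAEN case number was identified.
step2 = [r for r in step1 if r["TGA_CASE_NUMBER"].strip()]
# Step 3: of those, reports where the DAEN symptom list does not mention death.
step3 = [r for r in step2 if "death" not in r["TGA_SYMPTOMS_LIST"].lower()]

print(len(step1), len(step2), len(step3))  # 3 2 1
```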
Serious VAERS reports which are missing in the DAEN database
This is obviously a big concern. There are over 800 serious AEs in VAERS (from Australia). About 40 of them do not have a corresponding entry in the DAEN database.
These are VAERS reports where
a) I identified a case number in the writeup
b) the IS_SERIOUS is true, meaning it was a serious AE according to VAERS
c) the case number is missing in the DAEN database (MISSING_CASE_NUMBER=True)
There are 39 such reports[5].
VAERS reports mentioning heart issues are not reported in DAEN
This is also very concerning.
If I look for reports which mention the case number in the writeup, but the case number is missing in DAEN, and search the VAERS_SYMPTOMS_LIST for the string “cardi” (I am trying to find a term which can capture myocarditis, pericarditis and cardiac arrest), there are 223 such reports[6].
What makes this even worse is that if I remove the filter for “cardi”, there are only a total of 369 such reports[7].
In other words, reports where “cardi” is part of the symptoms make up over 60%[8] of all such missing DAEN case numbers!
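Here is a small sketch of how the “cardi” search and the resulting share can be computed. The rows are made up, and the case-insensitive substring match is my assumption about the simplest way to do it.

```python
def mentions_cardi(symptoms):
    """True if the symptom list contains the substring 'cardi' (any case)."""
    return "cardi" in symptoms.lower()


# The substring captures the three terms I was after.
terms = ["Myocarditis", "Pericarditis", "Cardiac arrest", "Headache"]
matches = {term: mentions_cardi(term) for term in terms}
print(matches)

# Made-up rows for illustration.
rows = [
    {"TGA_CASE_NUMBER": "600001", "MISSING_CASE_NUMBER": True,
     "VAERS_SYMPTOMS_LIST": "Myocarditis|Chest pain"},
    {"TGA_CASE_NUMBER": "600002", "MISSING_CASE_NUMBER": True,
     "VAERS_SYMPTOMS_LIST": "Headache"},
    {"TGA_CASE_NUMBER": "600003", "MISSING_CASE_NUMBER": True,
     "VAERS_SYMPTOMS_LIST": "Pericarditis"},
    {"TGA_CASE_NUMBER": "600004", "MISSING_CASE_NUMBER": False,
     "VAERS_SYMPTOMS_LIST": "Cardiac arrest"},
    {"TGA_CASE_NUMBER": "", "MISSING_CASE_NUMBER": False,
     "VAERS_SYMPTOMS_LIST": "Myocarditis"},
]

# Case number found in the writeup but absent from DAEN.
missing = [r for r in rows if r["TGA_CASE_NUMBER"] and r["MISSING_CASE_NUMBER"]]
cardi = [r for r in missing if mentions_cardi(r["VAERS_SYMPTOMS_LIST"])]
share = 100 * len(cardi) / len(missing)
print(len(cardi), len(missing), round(share, 1))  # 2 3 66.7
```

Keep in mind that a plain substring match like this also picks up other terms containing “cardi”, such as Tachycardia or Electrocardiogram.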
I also encourage readers to go through all the VAERS reports where TGA_CASE_NUMBER is not empty, and MISSING_CASE_NUMBER is set to True. You will see that most of them are fairly severe adverse events.
In other words, unless there is a good explanation for these missing DAEN case numbers, it is already quite clear that DAEN is understating vaccine injuries.
Summary
Based on all these discrepancies, in my opinion, the DAEN database does underreport vaccine injuries.
However, I am not as familiar with the DAEN database as I am with VAERS, so it is possible all the data discrepancies I have noticed here have good explanations.
If you know the explanation for these discrepancies, please do let me know in the comments.
[1] This is another reason that David Gorski is wrong about VAERS given that most of the stuff he says does not apply to foreign reports (e.g. percentage of “crazy” reports). The foreign reports barely registered on anyone’s radar until the COVID19 vaccine rollout. Once it did, it became very clear that the quality of US VAERS reports is not comparable to the foreign ones because outside the US, the healthcare professionals actually seem to take vaccine injuries very seriously.
[2] This is why I keep saying that as bad as VAERS is, it is easily the best vaccine injury database by a long margin. You don’t understand how useful it is until you look at the systems in other countries. It is not a surprise to me that some people with questionable Pharma affiliations do their best to stop people from investigating VAERS by calling it “dumpster diving”.
[3] I did not see any false positives - that is, DAEN case numbers identified by my code which turned out not to be actual DAEN case numbers. However, because the heuristic I used is so simple, it is possible there are some false negatives - that is, actual DAEN case numbers in the text which did not match the conditions I have set.
[4] All the case numbers in DAEN for the COVID19 vaccine are within the range 500000 and 700000
[5] This was 134 before the update
[6] It was 427 before the update
[7] It was 1109 before the update
[8] The percentage was 40% before the update. In other words, while the actual number of such reports has come down, the proportion of serious reports (such as cardiac issues) has gone higher.