Date: 2025-08-21

On 7th August 2025, OpenAI announced GPT-5.

GPT-5 stands out because it is an LLM capable of extracting fairly complex structured data from clinical narratives like VAERS reports.

But there is an important fact that many people in the US missed - OpenAI were forced to reduce prices (compared to previous flagship versions) not just to get more customers, but also because there are now open weight LLMs which produce equally good results at a fraction of the cost of even GPT-5.

Note: “Equally good” is not the same as “much better than”. I think GPT-5 is still the better LLM overall for this task, but

a) its lead over other LLMs has shrunk dramatically and

b) this is soon going to turn into an even closer contest, which is great news for end users who are working on problems which don’t have much commercial demand, like proper vaccine pharmacovigilance :-)

I have built a tool to visualize structured data extracted from VAERS autism reports. Being able to visualize the data is an important part of the analysis itself, especially when you want to present the results to other people. I will be working on the analysis and writing up my findings over the coming weeks.

This is what the tool looks like:

I am using a different LLM called Mistral for this screenshot to demonstrate that my ideas are not GPT-5-specific but apply broadly to any LLM which incorporates some basic “reasoning” skills.

I put reasoning in quotes because I don’t mean it in the sense people normally have in mind. Personally, I don’t think LLMs - at least the ones which the riffraff are allowed to use - are even capable of human-like reasoning. But there are notable improvements in the ability of these LLMs to “put 2 and 2 together” and do basic inference, and that’s what I am referring to as reasoning in this context.

I will also be using this tool to teach a course on Udemy, where I will explain the technical aspects of various techniques you can use for automated data extraction using LLMs - these techniques are often called prompt engineering.
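To give a flavour of what such a technique can look like, here is a minimal sketch in Python. It is not the code behind my tool: the field names (`age_months`, `symptoms`, `onset_days`) and both function names are made up for illustration, and no particular LLM API is assumed. The idea is simply to ask the model for a fixed set of JSON keys and then refuse any reply that does not match.

```python
import json

# Hypothetical fields, chosen only for illustration (not a real VAERS schema).
# Each key maps to the Python types accepted for its value.
FIELDS = {
    "age_months": (int, type(None)),
    "symptoms": (list,),
    "onset_days": (int, type(None)),
}


def build_prompt(narrative: str) -> str:
    """Build a prompt that asks for one JSON object with a fixed set of keys."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following fields from the report below and reply with "
        f"a single JSON object that has exactly these keys: {keys}. "
        "Use null when the report does not state a value.\n\n"
        f"Report:\n{narrative}"
    )


def parse_response(text: str) -> dict:
    """Parse the model's reply; raise ValueError if it does not match FIELDS."""
    data = json.loads(text)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    if set(data) != set(FIELDS):
        raise ValueError(f"unexpected keys: {sorted(data)}")
    for key, allowed in FIELDS.items():
        if not isinstance(data[key], allowed):
            raise ValueError(f"wrong type for {key!r}")
    return data


# A hand-written stand-in for a model reply, not real output or real data.
reply = '{"age_months": 18, "symptoms": ["fever"], "onset_days": null}'
record = parse_response(reply)
```

The prompt from `build_prompt` would be sent to whichever LLM you are using, and the reply passed to `parse_response`. Rejecting malformed replies, rather than trying to repair them, keeps bad rows out of the data you later visualize.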

If there is something specific that you would like me to analyze within VAERS autism reports (especially if it is based on the data that you see in the visualization tool), please leave a comment and I will try to incorporate it into my analysis.