PDF Processing Upgrades: Or, Why Platform Quality Matters
May 6, 2025
The AI world is exploding with an unprecedented number of new tools, features, and noise. It can be difficult to sort through all of it and understand the real-world implications.
We at elvex have a commitment to quality.
To us, product quality is the expectation that “it should just work.” AI, and LLMs in particular, may be the most confounding technology when it comes to reliability. That is why we exist: to make AI usage reliable, scalable, and useful for business workflows.
We know many of our users have tried other AI tools that claim to integrate with other software or handle certain data formats, only to get subpar results when they actually try to use those features. That is why we maintain a deep, continuing focus on improving the base quality and user experience of elvex.
For example, our integrations with Google Drive and similar tools (OneDrive, SharePoint, etc.) sync live. Users expect that if they update the content of a spreadsheet or doc, the AI tool can access that update. Many other tools do not sync live.
PDFs are where huge amounts of corporate data are stored, but many AI applications struggle to extract that data completely and accurately. We've been able to process PDFs for quite some time. Today, however, we are releasing a backend improvement that extends our ability to reliably extract data even further.
Examples include a better ability to extract information from images embedded within PDFs, and a better ability to read complicated table structures.
This is the kind of improvement that isn't immediately visible to users, but is critical to quality. Let's show some examples of how subtle improvements can be the difference between "the AI got it right" and "the AI got it wrong."
(Open the images in a new tab if some of the details are too small to read on your screen.)
Example 1 - Does AI read the table correctly?

Here you can see data inside a table in a PDF. The table has some minor complexity, such as cells that are merged across multiple columns. Let's look at the before and after results of asking AI to use information from this PDF:

You can see that the "before" query got the answer wrong, while the "after" query correctly understood the PDF's data.
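To make the merged-cell problem concrete, here is a minimal sketch of one common way to normalize such a table before handing it to a model: repeat a merged cell's text across every column it spans, so each row ends up with the same number of columns. This is an illustration only, not a description of elvex's backend. The (text, colspan) cell representation, the function names, and the sample data are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: each cell is (text, colspan).
Cell = Tuple[str, int]


def expand_merged_cells(row: List[Cell]) -> List[str]:
    """Repeat each cell's text once for every column the cell spans."""
    expanded: List[str] = []
    for text, colspan in row:
        if colspan < 1:
            raise ValueError("colspan must be at least 1")
        expanded.extend([text] * colspan)
    return expanded


def normalize_table(rows: List[List[Cell]]) -> List[List[str]]:
    """Expand every row and check that all rows have the same width."""
    grid = [expand_merged_cells(row) for row in rows]
    widths = {len(row) for row in grid}
    if len(widths) > 1:
        raise ValueError(f"rows have inconsistent widths: {sorted(widths)}")
    return grid


# Made-up sample table: the "Revenue" header is merged across two columns.
rows = [
    [("Region", 1), ("Revenue", 2)],
    [("", 1), ("Q1", 1), ("Q2", 1)],
    [("North", 1), ("10", 1), ("12", 1)],
]
grid = normalize_table(rows)
for line in grid:
    print(line)
```

Without this kind of expansion, the header row has two cells while the data rows have three, and a value such as "12" can end up associated with the wrong column.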
Example 2 - More tabular data
There are other ways tables can be complex.

Here, we have a PDF that contains two tables that are similar in topic but not identical.

Here, we have asked the LLM to combine the two tables into one new table. In the "before" example, the new table is missing the year 2008. In the "after" example, the two tables are combined correctly.
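For readers who want to check this kind of result themselves, the expected output is what a data person would call an outer join on the year: every year that appears in either table should appear in the combined table, with gaps left empty rather than dropped. The sketch below shows that logic in plain Python. The table layout (a dict keyed by year), the column names, and the numbers are invented for illustration and are not the data from the PDF above.

```python
from typing import Dict, List, Optional

# Hypothetical layout: {year: {column_name: value}}
Table = Dict[int, Dict[str, float]]


def combine_by_year(table_a: Table, table_b: Table) -> List[Dict[str, Optional[float]]]:
    """Outer-join two tables on year, keeping years found in either one."""
    all_years = sorted(set(table_a) | set(table_b))
    columns = sorted(
        {col for table in (table_a, table_b) for row in table.values() for col in row}
    )
    combined: List[Dict[str, Optional[float]]] = []
    for year in all_years:
        merged: Dict[str, Optional[float]] = {"year": year}
        # Start every column as None so missing values stay visible.
        for col in columns:
            merged[col] = None
        merged.update(table_a.get(year, {}))
        # If both tables share a column, table_b's value wins in this sketch.
        merged.update(table_b.get(year, {}))
        combined.append(merged)
    return combined


# Made-up sample data: 2008 appears only in the first table.
units = {2007: {"units": 100.0}, 2008: {"units": 120.0}}
revenue = {2007: {"revenue": 5.0}, 2009: {"revenue": 7.5}}

combined = combine_by_year(units, revenue)
for row in combined:
    print(row)
```

A combined table that silently loses a year, as in the "before" example, behaves more like an inner join, which is rarely what someone asking to "combine" two tables wants.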
Example 3 - Images embedded into PDFs
Many PDF processing tools simply extract the text from the PDF and ignore the images, but that's not always how people create PDFs. People frequently paste images and screenshots into a PDF because that is easier than recreating everything within the PDF software. Similarly, some PDFs consist entirely of images, for example when documents are scanned. This can befuddle PDF processing software.

Here we can see a scanned document: it's entirely imagery.

You can see that the AI in the "before" section really only has the name of the document to work from, and it hallucinates the rest of the answer. In the "after" example, the AI is able to work with the actual data inside the PDF, and supply the correct answer.
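One way to think about the difference is that a processing pipeline has to notice when a page's text layer is missing or incomplete and route that page through image reading (such as OCR or a vision model) instead of trusting the extracted text alone. The sketch below shows that decision in isolation. It is a simplified illustration under stated assumptions, not elvex's implementation: the PageSummary structure, the classify_page function, and the 20-character threshold are all hypothetical, and a real pipeline would fill in the text and image counts using a PDF library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageSummary:
    """Hypothetical per-page summary produced by some PDF parser."""
    text: str          # text extracted from the page's text layer
    image_count: int   # number of embedded images found on the page


def classify_page(page: PageSummary, min_chars: int = 20) -> str:
    """Label a page by whether its content lives in text, images, or both.

    min_chars is an assumed threshold for "meaningful" extracted text.
    """
    has_text = len(page.text.strip()) >= min_chars
    has_images = page.image_count > 0
    if has_text and has_images:
        return "mixed"        # read the text, and also read the images
    if has_images:
        return "image_only"   # e.g. a scanned page: image reading required
    if has_text:
        return "text_only"    # plain text extraction is enough
    return "empty"


# Made-up sample pages.
pages: List[PageSummary] = [
    PageSummary(text="Quarterly report for the finance team.", image_count=0),
    PageSummary(text="See the screenshot below for details.", image_count=1),
    PageSummary(text="", image_count=1),
]
labels = [classify_page(p) for p in pages]
print(labels)
```

A tool that only performs text extraction effectively treats every page as "text_only", which is why a scanned document leaves the model with little more than the file name to work from.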
Quality matters
In a landscape where AI tools often overpromise and underdeliver, elvex remains steadfast in our commitment to reliability and quality. These improvements to our PDF processing capabilities represent just one facet of our ongoing mission to make AI truly useful for business workflows.
While other platforms may focus on flashy features that fail in real-world applications, we're dedicated to the fundamentals, ensuring that "it just works" when you integrate elvex into your processes.
As AI continues to evolve, you can count on us to prioritize the improvements that matter most to your productivity, even when they happen behind the scenes. Because at elvex, we believe the best AI isn't the one that makes the boldest claims—it's the one you can consistently rely on to get the job done right.