
December 8, 2025

Maximum-severity XXE vulnerability in Apache Tika


At a glance: A critical XXE flaw in Apache Tika allows malicious PDF files to trigger information disclosure, server-side request forgery (SSRF), denial of service, or even remote code execution. Versions of tika-core and tika-pdf-module prior to 3.2.2, and versions of tika-parsers prior to 2.0.0, are affected. Update immediately or apply interim hardening measures.

Threat summary

On December 4, 2025, the Apache Software Foundation disclosed a critical vulnerability affecting Apache Tika’s tika-core, tika-pdf-module, and tika-parsers components.

Apache Tika is an open-source toolkit designed to parse and extract metadata and text from a wide range of file formats. It is widely integrated into enterprise search engines, content management systems, and data analysis pipelines.

The flaw, tracked as CVE-2025-66516, is described as an XML External Entity (XXE) injection issue. It received a maximum Common Vulnerability Scoring System (CVSS) score of 10.0.

The flaw allows a threat actor to embed a malicious XML Forms Architecture (XFA) file inside a PDF, which Tika parses when extracting metadata or text. Because XFA content is XML, an external entity declared in it can direct the parser to read local files or contact attacker-chosen URLs. Exploitation can lead to information disclosure, server-side request forgery, denial of service, and remote code execution, giving threat actors access to sensitive internal files or the ability to execute arbitrary code on servers running Tika-based services.
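To make the external-entity mechanism concrete, the following is a minimal Python sketch that flags an XML fragment declaring an external entity. It does not use Tika and is not taken from the advisory: the function name `declares_external_entity`, the sample payload, and the `form` root element are all hypothetical, chosen only for illustration. It relies on the standard library's `xml.parsers.expat`, which reports entity declarations to a handler without fetching what they point to.

```python
from xml.parsers import expat


def declares_external_entity(xml_bytes: bytes) -> bool:
    """Return True if the XML declares an entity with a SYSTEM or PUBLIC identifier."""
    found = False

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        nonlocal found
        # Internal entities carry a literal value; external ones carry an
        # identifier (SYSTEM/PUBLIC) instead and have no inline value.
        if value is None and (system_id is not None or public_id is not None):
            found = True

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError:
        # Malformed input: report whatever was seen before the error.
        pass
    return found


# Hypothetical payload shaped like a classic XXE probe (illustration only).
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE form [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>
<form><field>&xxe;</field></form>"""

if __name__ == "__main__":
    print(declares_external_entity(SAMPLE))
    print(declares_external_entity(b"<form><field>ok</field></form>"))
```

A check like this is a triage aid at best. It assumes the XML has already been extracted from the PDF, and it is not a substitute for upgrading to a fixed Tika release.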

Public advisories emphasize that exploitation is straightforward, requiring only crafted PDF files with embedded malicious XFA content. Patches are available in tika-core version 3.2.2, tika-pdf-module version 3.2.2, and tika-parsers version 2.0.0.

Analyst insight

The vulnerability is exploitable by any adversary capable of submitting malicious files to systems that rely on Apache Tika parsing. This includes opportunistic attackers and potentially advanced persistent threat groups targeting enterprise data pipelines.

XML Forms Architecture (XFA), introduced in 1999 and later incorporated into the PDF 1.5 standard, enables dynamic, interactive forms within PDFs. Enterprises adopted XFA heavily in government, financial services, and regulated industries where complex, data-driven forms were required.

Even though XFA was deprecated in PDF 2.0, parsing libraries such as Apache Tika continue to support it because organizations still encounter XFA content in production. This makes XFA parsing a critical attack surface, and the vulnerability demonstrates how legacy standards remain exploitable in modern infrastructure.

Organizations should upgrade immediately to tika-core 3.2.2, tika-pdf-module 3.2.2, and tika-parsers 2.0.0. Dependency reviews are important because Tika modules are often pulled in indirectly by other packages. Managed service providers should inventory all applications that rely on Apache Tika and confirm their patch levels.
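As a rough aid for the dependency review described above, the sketch below scans dependency-listing text (for example, the output of `mvn dependency:tree`) for Tika coordinates and compares each version against the fixed releases named in this article. The artifact names are copied from the article's wording and may not match the exact Maven artifactIds in a given build. The helper names, the regular expression, and the sample line are assumptions made for illustration, and coordinates that include a classifier are not handled.

```python
import re
from typing import List, Optional, Tuple

# Fixed versions as stated in this article. Artifact names follow the
# article's wording; verify them against the artifactIds in your own build.
FIXED_VERSIONS = {
    "tika-core": (3, 2, 2),
    "tika-pdf-module": (3, 2, 2),
    "tika-parsers": (2, 0, 0),
}

# Assumed coordinate shape: groupId:artifactId:type:version[:scope]
COORDINATE = re.compile(
    r"org\.apache\.tika:([A-Za-z0-9_.-]+):[A-Za-z0-9_.-]+:([0-9][A-Za-z0-9_.-]*)"
)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.2.1' or '3.2.1-SNAPSHOT' into (3, 2, 1), ignoring qualifiers."""
    numeric = re.match(r"\d+(?:\.\d+)*", version)
    if numeric is None:
        raise ValueError("no numeric version in %r" % version)
    return tuple(int(part) for part in numeric.group(0).split("."))


def needs_upgrade(artifact: str, version: str) -> Optional[bool]:
    """True if older than the fixed release, None if the artifact is not tracked."""
    fixed = FIXED_VERSIONS.get(artifact)
    if fixed is None:
        return None
    return parse_version(version) < fixed


def find_outdated(dependency_text: str) -> List[Tuple[str, str]]:
    """Return (artifact, version) pairs that fall below the fixed releases."""
    return [
        (artifact, version)
        for artifact, version in COORDINATE.findall(dependency_text)
        if needs_upgrade(artifact, version)
    ]


if __name__ == "__main__":
    sample = "[INFO] +- org.apache.tika:tika-core:jar:3.2.1:compile"
    print(find_outdated(sample))
```

Because the comparison is purely numeric, vendor-patched or repackaged builds would need to be reviewed by hand.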

Where patching is delayed, restricting file upload sources, sandboxing Tika parsing processes, and monitoring for anomalous PDF parsing activity are recommended interim measures. Logging and alerting on unexpected outbound requests from Tika services can help detect exploitation attempts.
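One way to approach the interim monitoring idea is to flag uploaded PDFs that reference XFA before they reach a Tika parser. The sketch below is a deliberately crude heuristic and not something the advisory prescribes. It looks for the literal `/XFA` key in the raw bytes, so it can miss PDFs where that key sits inside a compressed object stream or is written with escaped characters, and it can also flag harmless forms. The function names and the `"review"` / `"pass"` labels are hypothetical.

```python
def mentions_xfa(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the file looks like a PDF and contains a raw /XFA key."""
    # The PDF header is normally at or near the very start of the file.
    if b"%PDF-" not in pdf_bytes[:1024]:
        return False
    return b"/XFA" in pdf_bytes


def triage(pdf_bytes: bytes) -> str:
    """Label a file for manual review or sandboxed parsing when XFA is referenced."""
    return "review" if mentions_xfa(pdf_bytes) else "pass"


if __name__ == "__main__":
    with_xfa = b"%PDF-1.7\n1 0 obj << /AcroForm << /XFA 2 0 R >> >> endobj\n"
    without_xfa = b"%PDF-1.7\n1 0 obj << /Type /Catalog >> endobj\n"
    print(triage(with_xfa), triage(without_xfa))
```

A result of "pass" should not be treated as proof that a file is safe. The heuristic only narrows down which files deserve closer inspection while patching is pending.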

Field Effect MDR users will be alerted via ARO if vulnerable systems are detected in their environment.
