
GSA_kwCzR0hTQS00dnZtLTR3M3YtNm1yOM4AA0KJ

Severity: Moderate
EPSS: 0.00023% (0.04584 Percentile)

pypdf and PyPDF2 possible Infinite Loop when a comment isn't followed by a character

Affected Packages Affected Versions Fixed Versions
pypi:PyPDF2
PURL: pkg:pypi/pypdf2
Affected versions: >= 2.2.0, <= 3.0.1
Fixed versions: No known fixed version
160 Dependent packages
1,626 Dependent repositories
17,169,680 Downloads last month

Affected Version Ranges

All affected versions

2.2.0, 2.2.1, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.4.2, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.8.1, 2.9.0, 2.10.0, 2.10.1, 2.10.2, 2.10.3, 2.10.4, 2.10.5, 2.10.6, 2.10.7, 2.10.8, 2.10.9, 2.11.0, 2.11.1, 2.11.2, 2.12.0, 2.12.1, 3.0.0, 3.0.1

pypi:pypdf
PURL: pkg:pypi/pypdf
Affected versions: >= 3.1.0, < 3.9.0
Fixed versions: 3.9.0
390 Dependent packages
3,809 Dependent repositories
17,702,019 Downloads last month

Affected Version Ranges

All affected versions

3.1.0, 3.2.0, 3.2.1, 3.3.0, 3.4.0, 3.4.1, 3.5.0, 3.5.1, 3.5.2, 3.6.0, 3.7.0, 3.7.1, 3.8.0, 3.8.1

All unaffected versions

3.9.0, 3.9.1, 3.10.0, 3.11.0, 3.11.1, 3.12.0, 3.12.1, 3.12.2, 3.13.0, 3.14.0, 3.15.0, 3.15.1, 3.15.2, 3.15.3, 3.15.4, 3.15.5, 3.16.0, 3.16.1, 3.16.2, 3.16.3, 3.16.4, 3.17.0, 3.17.1, 3.17.2, 3.17.3, 3.17.4, 4.0.0, 4.0.1, 4.0.2, 4.1.0, 4.2.0, 4.3.0, 4.3.1, 5.0.0, 5.0.1, 5.1.0, 5.2.0, 5.3.0, 5.3.1, 5.4.0, 5.5.0, 5.6.0, 5.6.1, 5.7.0, 5.8.0, 5.9.0, 6.0.0

Impact

An attacker can exploit this vulnerability by crafting a PDF that triggers an infinite loop when __parse_content_stream is executed. This happens, for example, when the user extracts text from such a PDF. The infinite loop blocks the current process and can drive a single CPU core to 100% utilization. Memory usage is not affected.

Example code and a PDF that reproduce the issue:

from pypdf import PdfReader

# https://objects.githubusercontent.com/github-production-repository-file-5c1aeb/3119517/11367871?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230627%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230627T201018Z&X-Amz-Expires=300&X-Amz-Signature=d71c8fd9181c4875f0c04d563b6d32f1d4da6e7b2e6be2f14479ce4ecdc9c8b2&X-Amz-SignedHeaders=host&actor_id=1658117&key_id=0&repo_id=3119517&response-content-disposition=attachment%3Bfilename%3DMiFO_LFO_FEIS_NOA_Published.3.pdf&response-content-type=application%2Fpdf
reader = PdfReader("MiFO_LFO_FEIS_NOA_Published.3.pdf")
page = reader.pages[0]
page.extract_text()

The issue was introduced with https://github.com/py-pdf/pypdf/pull/969

Patches

The issue was fixed with https://github.com/py-pdf/pypdf/pull/1828

Workarounds

It is recommended to upgrade to pypdf>=3.9.0. PyPDF2 users should migrate to pypdf.
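
To decide whether a given installation falls inside the affected ranges listed above, a small check along the following lines may help. This is a minimal sketch, not part of pypdf: the names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical, and the parser assumes plain dotted numeric versions such as "3.8.1" (no pre-release suffixes).

```python
# Hypothetical helper, not part of pypdf or PyPDF2.
# Ranges are copied from this advisory:
#   PyPDF2: >= 2.2.0, <= 3.0.1 (no known fixed version)
#   pypdf:  >= 3.1.0, <  3.9.0 (fixed in 3.9.0)
AFFECTED_RANGES = {
    # package name (lowercase): (lower bound, upper bound, upper bound inclusive?)
    "pypdf2": ((2, 2, 0), (3, 0, 1), True),
    "pypdf": ((3, 1, 0), (3, 9, 0), False),
}


def parse_version(text):
    """Turn a plain dotted version such as '3.8.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(package, version):
    """Return True if the package version lies in the advisory's affected range."""
    entry = AFFECTED_RANGES.get(package.lower())
    if entry is None:
        return False
    low, high, inclusive = entry
    parsed = parse_version(version)
    if parsed < low:
        return False
    return parsed <= high if inclusive else parsed < high


print(is_affected("pypdf", "3.8.1"))
print(is_affected("pypdf", "3.9.0"))
print(is_affected("PyPDF2", "3.0.1"))
```

The installed version string can be obtained with `importlib.metadata.version("pypdf")` from the Python standard library and passed to `is_affected`.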

If you cannot update your version of pypdf, you should modify pypdf/generic/_data_structures.py:

OLD: while peek not in (b"\r", b"\n"):
NEW: while peek not in (b"\r", b"\n", b""):
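
Why the extra b"" matters can be illustrated with a simplified stand-in for the comment-skipping loop. This is a sketch under stated assumptions, not the actual pypdf source: the function `skip_comment` and its `max_reads` cap are hypothetical, and the cap exists only so the unpatched variant terminates in the demonstration. The underlying fact is that `read(1)` on a stream at end-of-file returns b"", which never matches b"\r" or b"\n".

```python
import io


def skip_comment(stream, patched, max_reads=1000):
    """Simplified stand-in for the comment-skipping loop (not the pypdf code).

    Returns the number of read(1) calls made, or None if max_reads was
    reached, which stands in for the infinite loop.
    """
    if patched:
        end_markers = (b"\r", b"\n", b"")  # NEW condition: also stop at end-of-file
    else:
        end_markers = (b"\r", b"\n")  # OLD condition
    peek = stream.read(1)
    reads = 1
    while peek not in end_markers:
        if reads >= max_reads:
            return None  # would loop forever without this demonstration cap
        peek = stream.read(1)  # returns b"" once the stream is exhausted
        reads += 1
    return reads


# A comment that is not followed by any character, as in the advisory title.
data = b"% comment"
unpatched_result = skip_comment(io.BytesIO(data), patched=False)
patched_result = skip_comment(io.BytesIO(data), patched=True)
print(unpatched_result, patched_result)
```

With the old condition the loop only ends because of the artificial cap; with the new condition it ends on the first empty read after the last byte of the comment.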
References: