CVE-2025-46570

GHSA-4qjh-9fv9-r85r LOW
Published May 29, 2025

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the...

Full analysis pending. Showing NVD description excerpt.

Affected Systems

Package Ecosystem Vulnerable Range Patched
vllm pip < 0.9.0 0.9.0
vllm pip No patch

Severity & Risk

CVSS 3.1
2.6 / 10
EPSS
0.1%
chance of exploitation in 30 days
KEV Status
Not in KEV
Sophistication
N/A

Recommended Action

Patch available

Update vllm to version 0.9.0

Compliance Impact

Compliance analysis pending. Sign in for full compliance mapping when available.

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0.

CVSS Vector

CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:N/A:N

Timeline

Published
May 29, 2025
Last Modified
June 27, 2025
First Seen
May 29, 2025