CVE-2025-12638
Severity: UNKNOWN
Keras 3.11.3 has a path traversal vulnerability in get_file() that enables arbitrary file writes via crafted tar archives, bypassing the library's own safety filter through a PATH_MAX symlink resolution bug. Any ML training pipeline or data science environment that uses Keras to fetch external datasets or model weights is at risk of remote code execution. Patch immediately; where patching is not yet possible, restrict get_file() calls to verified internal artifact registries.
Recommended Action
1. Upgrade Keras to the patched version (the fix adds filter='data' to tarfile.extractall()).
2. If patching is blocked, audit all calls to keras.utils.get_file() and restrict them to downloads from internal, checksum-verified artifact registries only.
3. Run training workloads in isolated containers with read-only host mounts and least-privilege write permissions.
4. Enable filesystem integrity monitoring (Falco, AIDE) on training nodes, and alert on writes outside the ~/.keras/ cache during extraction.
5. Pin Keras versions explicitly in requirements files and verify checksums in your supply chain pipeline.
6. In CI/CD, sandbox model download steps away from sensitive mounts before extraction.
Technical Details
NVD Description
Keras version 3.11.3 is affected by a path traversal vulnerability in the keras.utils.get_file() function when extracting tar archives. The vulnerability arises because the function uses Python's tarfile.extractall() method without the security-critical filter='data' parameter. Although Keras attempts to filter unsafe paths using filter_safe_paths(), this filtering occurs before extraction, and a PATH_MAX symlink resolution bug triggers during extraction. This bug causes symlink resolution to fail due to path length limits, resulting in a security bypass that allows files to be written outside the intended extraction directory. This can lead to arbitrary file writes outside the cache directory, enabling potential system compromise or malicious code execution. The vulnerability affects Keras installations that process tar archives with get_file() and does not affect versions where this extraction method is secured with the appropriate filter parameter.
Exploitation Scenario
An adversary publishes a weaponized dataset or model-weights archive to a public repository (Hugging Face, Kaggle, public S3). The archive contains a symlink chain crafted to exceed PATH_MAX during extraction. A victim's automated training pipeline calls keras.utils.get_file(), the standard pattern for fetching benchmarks or fine-tuning weights. Keras runs filter_safe_paths() before extraction, but the PATH_MAX bug makes symlink resolution fail silently, so the check is bypassed. During tarfile.extractall(), files then land at attacker-controlled paths: overwriting ~/.bashrc, SSH authorized_keys, or cron jobs, or dropping a backdoor into a Python site-packages directory. On shared MLOps platforms where training jobs run on multi-tenant GPU clusters, this becomes a privilege-escalation and lateral-movement vector across tenant boundaries.
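A pipeline that must accept third-party archives could screen them before extraction for the traits this scenario relies on: link members, `..` components, absolute paths, and unusually long member names. The following is a heuristic sketch only. `audit_tar_members` and the `MAX_MEMBER_PATH` threshold are hypothetical, and a clean result does not prove an archive is safe, so it complements rather than replaces extraction with filter='data'.

```python
import io
import tarfile
from pathlib import PurePosixPath

# Hypothetical threshold, chosen well below the 4096-byte PATH_MAX
# that is common on Linux.
MAX_MEMBER_PATH = 1024


def audit_tar_members(tar):
    """Return (member name, reason) pairs for suspicious archive entries."""
    findings = []
    for member in tar.getmembers():
        if member.issym() or member.islnk():
            findings.append((member.name, f"link to {member.linkname!r}"))
        if member.name.startswith("/"):
            findings.append((member.name, "absolute path"))
        if ".." in PurePosixPath(member.name).parts:
            findings.append((member.name, "contains '..'"))
        if len(member.name) > MAX_MEMBER_PATH:
            findings.append((member.name, "path longer than threshold"))
    return findings


def audit_archive_bytes(data):
    """Convenience wrapper: audit a tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        return audit_tar_members(tar)
```

Rejecting any archive with findings is the conservative policy; datasets that legitimately ship symlinks would need an explicit allowlist.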