15-year-old Python vulnerability found in “over 350,000” projects • The Register

At least 350,000 open source projects are believed to be potentially vulnerable to exploitation through a Python module flaw that has remained unfixed for 15 years.

Security firm Trellix announced on Tuesday that its threat researchers had stumbled upon a vulnerability in Python's tarfile module, which provides a way to read and write compressed bundles of files known as tar archives. At first, the bug hunters thought they had found a zero-day.

It turned out to be closer to a 5,500-day problem: the bug has been living its best life for the past decade and a half, waiting to be made extinct.

Identified as CVE-2007-4559, the vulnerability was raised on the Python mailing list on August 24, 2007 by Jan Matejek, who at the time was the Python maintainer for SUSE. It can be used to overwrite and hijack files on a victim's machine when a vulnerable application opens a malicious tar archive via tarfile.

“The vulnerability goes like this: If you tar a file named "../../../../../etc/passwd" and then make the admin untar it, /etc/passwd gets overwritten,” Matejek explained at the time.
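A minimal sketch of why such a member name is dangerous, assuming a naive extractor that simply joins the destination directory and the member name. The destination directory and the helper name resolve_member are hypothetical; posixpath is used so the result is the same on any platform.

```python
import posixpath

def resolve_member(dest_dir, member_name):
    """Return the path a naive extractor would end up writing to."""
    # Joining sanitises nothing: the ".." components survive the join and
    # are only collapsed when the path is normalised (or opened by the OS).
    return posixpath.normpath(posixpath.join(dest_dir, member_name))

dest = "/home/admin/downloads/unpacked"  # hypothetical extraction directory
evil = "../../../../../etc/passwd"       # member name from Matejek's example

print(resolve_member(dest, "notes.txt"))  # stays inside dest
print(resolve_member(dest, evil))         # climbs out of dest
```

Each ".." cancels one directory of the destination path, so a name with enough of them reaches the filesystem root no matter where the archive is unpacked.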

The tar directory traversal bug was reported on August 29, 2007 by Tomas Hoger, a software engineer at Red Hat.

But it had already been addressed, in a way. The day before, Lars Gustäbel, the maintainer of the tarfile module, approved a code change that adds a check_paths parameter, true by default, and a helper function to the TarFile.extractall() method, which raises an error if a file path in the tar archive is insecure.

But the fix did not cover the TarFile.extract() method, which, according to Gustäbel, “should not be used at all”, and it left open the possibility that extracting data from untrusted archives could cause problems.
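The kind of check described above can be sketched as follows. This is not Gustäbel's patch: the names checked_extractall, is_within and UnsafeMemberError are hypothetical, and the sketch only validates member names, so symbolic and hard link members would need additional handling.

```python
import os
import tarfile

class UnsafeMemberError(Exception):
    """Raised when an archive member would land outside the destination."""

def is_within(directory, target):
    """True if target, once resolved, is inside (or equal to) directory."""
    directory = os.path.realpath(directory)
    target = os.path.realpath(target)
    return os.path.commonpath([directory, target]) == directory

def checked_extractall(tar, dest="."):
    """Extract every member, but only after all names have been validated."""
    for member in tar.getmembers():
        # An absolute member name makes os.path.join discard dest entirely,
        # so absolute names are rejected by the same test as ".." names.
        target = os.path.join(dest, member.name)
        if not is_within(dest, target):
            raise UnsafeMemberError(f"{member.name!r} escapes {dest!r}")
    tar.extractall(dest)
```

Validating every member before extracting anything means a malicious archive is rejected as a whole rather than partially unpacked.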

In a comment thread, Gustäbel explained that he no longer considered this a security issue. “tarfile.py does nothing wrong, its behavior conforms to the pax definition and pathname resolution guidelines in POSIX,” he wrote.

“There is no known or possible practical exploit. I [updated] the documentation with a warning that extracting archives from untrusted sources may be dangerous. That is the only thing to be done, IMO.”

Indeed, the documentation describes this footgun:

Warning: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, for example members that have absolute filenames starting with "/" or filenames with two dots "..".
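As an illustration of the inspection the warning asks for, here is a minimal sketch that flags exactly the two cases it names. The function name flag_names and the sample names are hypothetical; with a real archive the input would come from TarFile.getnames().

```python
import posixpath

def flag_names(names):
    """Return the names that are absolute or contain a ".." component."""
    flagged = []
    for name in names:
        # Compare whole path components, so "a..b/file" is not flagged.
        if posixpath.isabs(name) or ".." in name.split("/"):
            flagged.append(name)
    return flagged

sample_names = [
    "docs/readme.txt",
    "/etc/passwd",
    "../../home/user/.bashrc",
    "a..b/file",
]
print(flag_names(sample_names))
```

A name-only check like this is a first pass at best: it says nothing about link members whose targets point outside the extraction directory.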

And yet here we are, with both extract() and extractall() still posing the threat of arbitrary path traversal.

“The vulnerability is a path traversal attack in the extract and extractall functions in the tarfile module that allow an attacker to overwrite arbitrary files by adding the ‘..’ sequence to filenames in a tar archive,” explained Kasimir Schulz, a vulnerability researcher at Trellix, in a blog post.

The “..” sequence refers to the parent directory of the current path. Using code like the six-line snippet below, Schulz says, the tarfile module can be told to read and modify a file's metadata before it is added to the tar archive. The result is an exploit.

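import tarfile

# Filter hook: tarfile calls this for each file being added to the archive
def change_name(tarinfo):
    # Prefix the stored name with "../" so it extracts one directory up
    tarinfo.name = "../" + tarinfo.name
    return tarinfo

# "malicious_file" must exist on disk; it is stored as "../malicious_file"
with tarfile.open("exploit.tar", "w:xz") as tar:
    tar.add("malicious_file", filter=change_name)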
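To see what such a filter leaves in the archive without creating any files on disk, the sketch below builds a comparable archive in memory and reads the stored names back. It departs from Schulz's snippet in two ways, both assumptions made for convenience: it uses addfile() with a hand-built TarInfo so that no malicious_file has to exist, and it skips xz compression, which plays no part in the flaw.

```python
import io
import tarfile

def change_name(tarinfo):
    # Same rewrite as in the snippet above: push the name one directory up.
    tarinfo.name = "../" + tarinfo.name
    return tarinfo

payload = b"written outside the extraction directory\n"
buffer = io.BytesIO()

with tarfile.open(fileobj=buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="malicious_file")
    info.size = len(payload)
    # add() would call the filter itself; addfile() stores the TarInfo as
    # given, so the filter is applied by hand here.
    tar.addfile(change_name(info), io.BytesIO(payload))

# Re-open the archive and list what a victim's extractor would see.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    names = tar.getnames()

print(names)
```

The rewritten name is stored verbatim in the archive header, so it is the extracting side that has to decide whether to trust it.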

According to Schulz, Trellix has built a free tool called Creosote to scan for CVE-2007-4559. The software has already found the bug in applications such as Spyder IDE, an open source scientific development environment for Python, and Polemarch, an IT infrastructure management service for Linux and Docker.
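The article does not describe how Creosote works, so the following is not its method. It is a much cruder illustration of the general idea of scanning source code for risky calls: it parses Python source with the standard ast module and reports every call to a method named extract or extractall. The function name find_extract_calls and the sample source are hypothetical, and the approach will produce false positives (zipfile has methods with the same names, for instance) and misses aliased calls.

```python
import ast

SUSPECT_METHODS = {"extract", "extractall"}

def find_extract_calls(source, filename="<string>"):
    """Return sorted (line, method) pairs for .extract()/.extractall() calls."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in SUSPECT_METHODS):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)

# Hypothetical source file that unpacks an uploaded archive unchecked.
sample = '''
import tarfile

with tarfile.open("upload.tar") as tar:
    tar.extractall("/srv/data")
'''

for line, method in find_extract_calls(sample):
    print(f"line {line}: call to .{method}()")
```

A hit only marks a place worth reviewing; whether it is exploitable depends on where the archive comes from and whether its member names are checked first.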

The company estimates that the tarfile flaw can be found “in over 350,000 open source projects and prevalent in closed source projects.” It also points out that tarfile is a default module in any Python project and is present in frameworks created by AWS, Facebook, Google, and Intel, as well as in applications for machine learning, automation, and Docker containers.

Trellix says it is working to make repaired code available to affected projects.

“Using our tools, we currently have patches for 11,005 repositories, ready for pull requests,” explained Charles McFarland, a vulnerability researcher at Trellix, in a blog post. “Each patch will be added to a forked repository and a pull request made over time. This will help individuals and organizations alike become aware of the problem and give them a one-click fix.

“Due to the size of vulnerable projects, we expect to continue this process over the next few weeks. This is expected to reach 12.06 percent of all vulnerable projects, or just over 70,000 projects by the time of completion.”

The remaining 87.94 percent of affected projects may wish to consider other possible options. ®
