Hollow leaf. How vulnerabilities work in the SheetJS library

SheetJS XLSX Package Vulnerabilities

The XLSX package released by SheetJS is widely used by developers to interact with spreadsheets in XLSX and XLSM formats, including those used in enterprise products. While analyzing the package, we found several vulnerabilities. In this article, I will show how they arose and how they can be exploited by an attacker.

How SheetJS Works

Let’s take a quick look at how SheetJS works. When an XLSX spreadsheet file is passed to the package's readFile function, the following happens:
1. The function checks the file type by parsing the first bytes of the header. If the file type is recognized as a ZIP archive, the process continues.
2. The archive is decompressed into process memory, which gives direct access to the XML files that describe the structure and data of the spreadsheet, as well as to other resources, including images and fonts.
3. The parser built into the library starts parsing XML tags. It walks the structure of the file and extracts the necessary data, such as cell values, formatting, and other table properties.
4. The resulting data is usually represented in convenient structures, such as arrays or objects, so that it can be easily used in an application.
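The first step, recognizing a ZIP archive by its leading bytes, can be illustrated with a minimal sketch. The helper name looksLikeZip is hypothetical and is not part of SheetJS; the sketch only assumes the standard ZIP local file header signature (the bytes "PK" followed by 0x03 0x04) and does not reproduce the library's actual detection logic.

```javascript
// ZIP local file header signature: "P", "K", 0x03, 0x04.
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];

// Hypothetical helper (not a SheetJS function): returns true when the
// byte array starts with the ZIP signature.
function looksLikeZip(bytes) {
  if (bytes.length < ZIP_MAGIC.length) return false;
  return ZIP_MAGIC.every((b, i) => bytes[i] === b);
}

// The first bytes of a typical XLSX file (a ZIP container).
const xlsxHeader = Uint8Array.from([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);
// The first bytes of a plain-text file containing "a,b".
const textHeader = Uint8Array.from([0x61, 0x2c, 0x62]);

const xlsxIsZip = looksLikeZip(xlsxHeader);
const textIsZip = looksLikeZip(textHeader);
console.log(xlsxIsZip, textIsZip);
```

A check like this only decides which parser to use. It says nothing about whether the XML inside the archive is well-formed or benign, which is what the rest of this article is about.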

Limited Prototype Pollution

The Limited Prototype Pollution vulnerability occurs when comments in a loaded document are handled by the cmntcommon function. The function assigns a value to an object under a key that the user can control.
For further analysis, it is important to understand what comment.ref is. This value comes into the code from threadedCommentXXX.xml (where XXX is the number of the document containing comments). Example:
<threadedComment ref="G7" dT="2023-04-11T09:41:09.71" personId="{29DB960B-0822-594C-AB20-3D499FA339C7}" id="{962D1EF3-37F7-FF40-983D-B0762466C0AF}">
Usually, when a file is created in a spreadsheet editor, this does not cause problems, since the editor will automatically generate the cell addresses and these will be valid values.
However, the developers of the XLSX package did not take into account that an attacker can manually create an XLSX file with arbitrary content, including specially crafted cell addresses.
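To make "valid value" concrete, here is a minimal sketch of what a strict check on a cell address could look like. The function isValidCellRef is hypothetical and is not taken from SheetJS or from its fix. It assumes single-cell A1-style addresses and the XLSX grid limits (columns A to XFD, rows 1 to 1048576).

```javascript
// Hypothetical validator (an assumption for illustration, not SheetJS code):
// accepts only A1-style single-cell addresses inside the XLSX grid.
function isValidCellRef(ref) {
  const m = /^([A-Z]{1,3})([1-9][0-9]{0,6})$/.exec(ref);
  if (!m) return false;
  // Convert the column letters to a 1-based index (A = 1, Z = 26, AA = 27).
  let col = 0;
  for (const ch of m[1]) col = col * 26 + (ch.charCodeAt(0) - 64);
  // XFD is column 16384; the last row is 1048576.
  return col <= 16384 && Number(m[2]) <= 1048576;
}

console.log(isValidCellRef("G7"));        // an address an editor would generate
console.log(isValidCellRef("__proto__")); // a hand-crafted value
```

An editor-generated address such as G7 passes this kind of check, while a hand-crafted string does not.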
For successful exploitation, let’s load a regular file, but with the following threadedComment:
<threadedComment ref="__proto__" dT="2023-04-11T09:41:09.71" personId="{29DB960B-0822-594C-AB20-3D499FA339C7}" id="{962D1EF3-37F7-FF40-983D-B0762466C0AF}">
In this case, the value of comment.ref will be equal to __proto__. This is not a keyword but a special property name in JavaScript: reading it on an ordinary object returns that object's prototype.
When the cmntcommon function is called, it uses __proto__ as the key, so the assignment lands on the prototype instead of on a cell. This allows an attacker to modify the prototype of an object, which can lead to unexpected behavior and security issues.
For example, an attacker can use this vulnerability to modify Object.prototype, from which ordinary objects inherit. Depending on how the rest of the application uses its objects, this can lead to the execution of malicious code.
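The mechanism can be reproduced in a few lines. The function attachComment below is a hypothetical stand-in written for this illustration; it is not the SheetJS source. It only assumes the general pattern described above: a cell is looked up by a key taken from the file, and a property is then written on whatever that lookup returns.

```javascript
// Hypothetical stand-in for comment handling; NOT the SheetJS implementation.
function attachComment(sheet, ref, text) {
  let cell = sheet[ref];          // ref comes straight from the file
  if (!cell) {
    cell = {};
    sheet[ref] = cell;
  }
  if (!cell.c) cell.c = [];       // for "__proto__", cell is Object.prototype
  cell.c.push({ t: text });
}

const sheet = {};
attachComment(sheet, "G7", "a normal comment");
attachComment(sheet, "__proto__", "attacker-controlled");

// An object that has nothing to do with the sheet now inherits "c".
const unrelated = {};
const polluted = "c" in unrelated;
console.log(polluted);

// Clean up so the demonstration does not affect the rest of the process.
delete Object.prototype.c;
const cleaned = "c" in {};
console.log(cleaned);
```

In this sketch the attacker controls the key but not the name of the property that gets written (here c), which is one way to read the word "limited" in the name of the vulnerability.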

Exploiting the Vulnerability

To exploit the vulnerability, an attacker must first create an XLSX file with a specially crafted threadedComment. This can be done using a variety of tools, such as 7-Zip.
    7z x normal.xlsx
    # make the changes to the desired files
    7z a NotNormal.zip ./\[Content_Types\].xml _rels/ docProps/ xl
    mv NotNormal.zip NotNormal.xlsx
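The attacker then needs to upload the file to a system that parses it with the XLSX package. If the system is vulnerable, the prototype is polluted inside the server process, and depending on the surrounding application code, the attacker may be able to go as far as executing arbitrary code on the system.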

Conclusion

The Limited Prototype Pollution vulnerability in the XLSX package is a serious security issue that, under the right conditions, an attacker can exploit to execute arbitrary code on a system. Make sure the XLSX package is updated to the latest version so that the vulnerability is patched. Where possible, also make sure that the application only processes XLSX files from trusted sources.