This page lists security fixes that the Hadoop PMC felt warranted a CVE. If you think something is missing from this list or if you think the set of impacted or fixed versions is incomplete then please ask on the Security list.
CVEs are presented in most-recent-first order of announcement.
Relative library resolution in the linux container-executor binary in Apache Hadoop 3.3.1-3.3.4 on Linux allows a local user to gain root privileges. If the YARN cluster is accepting work from remote (authenticated) users, this MAY permit remote users to gain root privileges.
Hadoop 3.3.0 updated the YARN Secure Containers to add a feature for executing user-submitted applications in isolated Linux containers.
The native binary HADOOP_HOME/bin/container-executor is used to launch these containers; it must be owned by root and have the suid bit set in order for the YARN processes to run the containers as the specific users submitting the jobs.
The patch YARN-10495, “make the rpath of container-executor configurable”, modified the library loading path for loading .so files from $ORIGIN/ to $ORIGIN/:../lib/native/. This is the path through which libcrypto.so is located. Thus it is possible for a user with reduced privileges to install a malicious libcrypto library into a path to which they have write access, invoke the container-executor command, and have their modified library executed as root.
If the YARN cluster is accepting work from remote (authenticated) users, and these users’ submitted jobs are executed on the physical host rather than in a container, then the CVE permits remote users to gain root privileges.
The fix for the vulnerability is to revert the change, which is done in YARN-11441, “Revert YARN-10495”. This patch is in hadoop-3.3.5.
To determine whether a version of container-executor is vulnerable, use the readelf command. If the RUNPATH or RPATH value contains the relative path ../lib/native/ then it is at risk:
$ readelf -d container-executor|grep 'RUNPATH\|RPATH'
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/:../lib/native/]
If it does not, then it is safe:
$ readelf -d container-executor|grep 'RUNPATH\|RPATH'
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/]
For an at-risk version of container-executor to enable privilege escalation, the owner must be root and the suid bit must be set:
$ ls -laF /opt/hadoop/bin/container-executor
---Sr-s---. 1 root hadoop 802968 May 9 20:21 /opt/hadoop/bin/container-executor
A safe installation lacks the suid bit; ideally it is also not owned by root:
$ ls -laF /opt/hadoop/bin/container-executor
-rwxr-xr-x. 1 yarn hadoop 802968 May 9 20:21 /opt/hadoop/bin/container-executor
This configuration does not support YARN Secure Containers, but all other Hadoop services, including YARN job execution outside secure containers, continue to work.
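If YARN Secure Containers are not needed, an at-risk installation can be switched to the safe configuration above by dropping the setuid/setgid bits and handing ownership to the service account. A minimal sketch, assuming the binary lives at /opt/hadoop/bin/container-executor and the service account is named yarn:
# remove the setuid/setgid bits so the binary can no longer run as root (assumed path)
$ chmod u-s,g-s /opt/hadoop/bin/container-executor
# hand ownership to the yarn service account (assumed account and group names)
$ chown yarn:hadoop /opt/hadoop/bin/container-executor
The listing should then match the safe example shown above.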
ZKConfigurationStore, which is optionally used by the CapacityScheduler of Apache Hadoop YARN, deserializes data obtained from ZooKeeper without validation. An attacker with access to ZooKeeper can exploit this to run arbitrary commands as the YARN user.
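To check whether a CapacityScheduler deployment actually uses ZKConfigurationStore, inspect the scheduler configuration store setting in yarn-site.xml; a rough sketch, assuming the standard yarn.scheduler.configuration.store.class property and the usual configuration directory:
# a value of "zk" selects ZKConfigurationStore; other store types are not affected by this issue
$ grep -A1 'yarn.scheduler.configuration.store.class' $HADOOP_CONF_DIR/yarn-site.xml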
Apache Hadoop’s FileUtil.unTar(File, File) API does not escape the input file name before it is passed to the shell. An attacker can inject arbitrary commands.
This is only used in Hadoop 3.3 InMemoryAliasMap.completeBootstrapTransfer, which is only ever run by a local user.
It has been used in Hadoop 2.x for yarn localization, which does enable remote code execution.
It is used in Apache Spark, from the SQL command ADD ARCHIVE. As the ADD ARCHIVE command adds new binaries to the classpath, being able to execute shell scripts does not confer new permissions to the caller.
SPARK-38305, “Check existence of file before untarring/zipping”, which is included in 3.3.0, 3.1.4 and 3.2.2, prevents shell commands from being executed, regardless of which version of the Hadoop libraries is in use.
In Apache Hadoop 2.2.0 to 2.10.1, 3.0.0-alpha1 to 3.1.4, 3.2.0 to 3.2.2, and 3.3.0 to 3.3.1, a user who can escalate to yarn user can possibly run arbitrary commands as root user.
If you are using an affected version of Apache Hadoop and some users can escalate to the yarn user but cannot escalate to the root user, remove their permission to escalate to the yarn user.
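One way to review which local users can currently escalate to the yarn account is to check sudo rules and group membership; a rough sketch, assuming sudo-based escalation and a service account named yarn:
# list sudo rules that mention the yarn account (assumes the standard sudoers locations)
$ grep -r 'yarn' /etc/sudoers /etc/sudoers.d/ 2>/dev/null
# list members of the yarn group
$ getent group yarn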
There is a potential heap buffer overflow in libhdfs native code. Opening a file path provided by the user without validation may result in a denial of service or arbitrary code execution.
In Apache Hadoop, the unTar function uses the unTarUsingJava function on Windows and the built-in tar utility on Unix and other OSes. As a result, a TAR entry may create a symlink under the expected extraction directory which points to an external directory. A subsequent TAR entry may extract an arbitrary file into the external directory using the symlink name. On Unix this would be caught by the targetDirPath check because of the getCanonicalPath call; on Windows, however, getCanonicalPath does not resolve symbolic links, which bypasses the check. unpackEntries therefore follows symbolic links during TAR extraction, which allows writing outside the expected base directory on Windows.
Users of the affected versions should apply either of the following mitigations:
WebHDFS client might send SPNEGO authorization header to remote URL without proper verification. A crafty user can trigger services to send server credentials to a webhdfs path for capturing the service principal.
Users of the affected versions should apply either of the following mitigations:
Web endpoint authentication check is broken. Authenticated users may impersonate any user even if no proxy user is configured.
When Kerberos authentication is enabled and SPNEGO through HTTP is not enabled, any user can access some servlets without authentication.
There is a mismatch in the size of the fields used to store user/group information between the in-memory and on-disk representations. This causes the user/group information to be corrupted when it is stored in the fsimage and read back.
This vulnerability fix contains an fsimage layout change, so once the image is saved in the new layout format you cannot go back to a version that doesn’t support the newer layout. This means that once 2.7.x users upgrade to the fixed version, they cannot downgrade to 2.7.x because there is no fixed version in 2.7.x. We suggest downgrading to 2.8.5 or a later version that contains the vulnerability fix.
A user who can escalate to yarn user can possibly run arbitrary commands as root user.
After the security fix for CVE-2017-15713, KMS has an access control regression, blocking users or granting access to users incorrectly, if the system uses non-default groups mapping mechanisms such as LdapGroupsMapping, CompositeGroupsMapping, or NullGroupsMapping.
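To check whether a cluster uses one of the non-default group mapping mechanisms named above, look at the configured mapping class; a sketch, assuming the standard hadoop.security.group.mapping property in core-site.xml:
# LdapGroupsMapping, CompositeGroupsMapping or NullGroupsMapping here indicates the regression applies
$ grep -A1 'hadoop.security.group.mapping' $HADOOP_CONF_DIR/core-site.xml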
HDFS exposes extended attribute key/value pairs during listXAttrs, verifying only path-level search access to the directory rather than path-level read permission to the referent. This affects features that store sensitive data in extended attributes, such as HDFS encryption secrets.
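For context, extended attributes on a path can be listed from the HDFS shell; a small illustration, using the standard getfattr shell command with a placeholder path, of the kind of metadata listXAttrs exposes:
# -d dumps the extended attribute names and values for the given path (placeholder path)
$ hdfs dfs -getfattr -d /path/to/file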
In Apache Hadoop 2.7.4 to 2.7.6, the security fix for CVE-2016-6811 is incomplete. A user who can escalate to yarn user can possibly run arbitrary commands as root user.
Vulnerability allows a cluster user to publish a public archive that can affect other files owned by the user running the YARN NodeManager daemon. If the impacted files belong to another already localized, public archive on the node then code can be injected into the jobs of other cluster users using the public archive.
A user who can escalate to yarn user can possibly run arbitrary commands as root user.
Note: The fix for this vulnerability is incomplete in Apache Hadoop 2.7.4 to 2.7.6 (CVE-2018-11766).
In Apache Hadoop 2.7.3 and 2.7.4, the security fix for CVE-2016-3086 is incomplete. The YARN NodeManager can leak the password for credential store provider used by the NodeManager to YARN Applications.
If you use the CredentialProvider feature to encrypt passwords used in NodeManager configs, it may be possible for any Container launched by that NodeManager to gain access to the encryption password. The other passwords themselves are not directly exposed.
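For context, the CredentialProvider feature keeps such passwords in a keystore that the NodeManager configuration points at; a brief sketch, with a placeholder alias and keystore path, of how an entry in such a store is typically created:
# stores a password under the given alias in a local JCEKS keystore (placeholder alias and path)
$ hadoop credential create my.password.alias -provider jceks://file/etc/hadoop/conf/nm-creds.jceks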
Vulnerability allows a cluster user to expose private files owned by the user running the MapReduce job history server process. The malicious user can construct a configuration file containing XML directives that reference sensitive files on the MapReduce job history server host.
In a cluster where the YARN user has been granted access to all HDFS encryption keys, if a file in an encryption zone with access permissions that make it world readable is localized via YARN’s localization mechanism, e.g. via the MapReduce distributed cache, that file will be stored in a world-readable location and shared freely with any application that requests to localize that file, no matter who the application owner is or whether that user should be allowed to access files from the target encryption zone.
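Administrators who want to understand their exposure can list the cluster’s encryption zones and the keys they use before granting the YARN user broad key access; a sketch using the standard crypto admin command, which requires HDFS superuser privileges:
# lists every encryption zone and the name of the key protecting it
$ hdfs crypto -listZones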
The following section describes third-party vulnerabilities that may be of interest to Hadoop users. Please contact the respective project owners for details.
It is understood that the log4shell vulnerability CVE-2021-44228 impacts log4j2. Hadoop, as of 3.3.x, depends on log4j 1.x, which is NOT susceptible to the attack. Once we migrate over to log4j2, we will adopt a version that is not susceptible to the attack, too. Therefore, no ASF version of Hadoop has ever been vulnerable. Third-party products and applications based on Hadoop may be vulnerable; please consult the vendor or the project owner.
JMSAppender in Log4j 1.2, used by all versions of Apache Hadoop, is vulnerable to the Log4Shell attack in a similar fashion to CVE-2021-44228. However, the JMSAppender is not the default configuration shipped in Hadoop. When JMSAppender is not enabled, Hadoop is not vulnerable to the attack.
To mitigate the risk, you can remove JMSAppender from the log4j-1.2.17.jar artifact yourself following the instructions in this link.
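One widely documented way to do this is to delete the JMSAppender class from the jar with the zip tool; a sketch, assuming the jar name shipped with the affected releases:
# removes the vulnerable appender class from the bundled log4j 1.2 jar
$ zip -q -d log4j-1.2.17.jar org/apache/log4j/net/JMSAppender.class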