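The Hadoop service registry is built on top of Apache Zookeeper. It is configured through a Hadoop Configuration class: the instance used to create the service controls the behavior of the client.
This document lists the configuration parameters that control the registry client.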
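The default values of all these settings are defined in core-default.xml. The values in that file may not match those listed in this document; if so, the values in core-default.xml MUST be considered normative.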
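Changes to the configuration values SHOULD be made in core-site.xml. This ensures that clients and non-YARN applications pick up the values, enabling them to read from, and potentially write to, the registry.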
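As a rough illustration of how a site-level value takes precedence over a default, the sketch below parses two small configuration documents and overlays one on the other. It is a simplified model in Python, not Hadoop's Configuration class, which does considerably more. The helper names load_properties and effective_config and the zk1/zk2/zk3 hostnames in the site document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for core-default.xml (values taken from this document).
CORE_DEFAULT = """
<configuration>
  <property>
    <name>hadoop.registry.zk.quorum</name>
    <value>localhost:2181</value>
  </property>
  <property>
    <name>hadoop.registry.zk.root</name>
    <value>/registry</value>
  </property>
</configuration>
"""

# Stand-in for core-site.xml; the hostnames are hypothetical.
CORE_SITE = """
<configuration>
  <property>
    <name>hadoop.registry.zk.quorum</name>
    <value>zk1:2181,zk2:2181,zk3:2181</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = (value or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Overlay site values on top of defaults; site entries win on conflict."""
    merged = load_properties(default_xml)
    merged.update(load_properties(site_xml))
    return merged


config = effective_config(CORE_DEFAULT, CORE_SITE)
print(config["hadoop.registry.zk.quorum"])
print(config["hadoop.registry.zk.root"])
```

Only the quorum is overridden here; the root path falls back to the default because the site document does not mention it.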
hadoop.registry.zk.quorum
This is an essential setting: it identifies the list of zookeeper hosts and the ports on which the ZK services are listening.
<property>
  <description>
    A comma separated list of hostname:port pairs defining the
    zookeeper quorum binding for the registry
  </description>
  <name>hadoop.registry.zk.quorum</name>
  <value>localhost:2181</value>
</property>
It takes a comma-separated list, such as zk1:2181,zk2:2181,zk3:2181
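To make the expected shape of the value concrete, here is a minimal Python sketch that splits a quorum string into hostname and port pairs and rejects malformed entries. The function name parse_quorum is hypothetical, and the tolerance for whitespace around commas is an assumption of this sketch, not a statement about how the registry client or Zookeeper parses the value.

```python
def parse_quorum(quorum):
    """Split a quorum string into (hostname, port) tuples."""
    bindings = []
    for entry in quorum.split(","):
        entry = entry.strip()  # assumption: whitespace around entries is ignored
        if not entry:
            continue  # skip empty items such as a trailing comma
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected hostname:port, got {entry!r}")
        bindings.append((host, int(port)))
    return bindings


for host, port in parse_quorum("zk1:2181,zk2:2181,zk3:2181"):
    print(host, port)
```

Validating the value this way before starting a client can catch a missing port early, instead of surfacing later as a connection timeout.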
hadoop.registry.zk.root
This path sets the base zookeeper node for the registry.
<property>
  <description>
    The root zookeeper node for the registry
  </description>
  <name>hadoop.registry.zk.root</name>
  <value>/registry</value>
</property>
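The default value of /registry is normally sufficient. A different value may be needed for security reasons or because the /registry path is already in use.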
The root value is prepended to all registry paths to create the absolute path. For example:
/ maps to /registry
/services maps to /registry/services
/users/yarn maps to /registry/users/yarn
A different value of hadoop.registry.zk.root would result in a different mapping to absolute zookeeper paths.
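The mapping from registry paths to absolute Zookeeper paths can be sketched in a few lines of Python. The function name to_absolute and the alternative root /clusters/a/registry are hypothetical; the sketch only reproduces the prefixing rule described here and is not the registry client's own code.

```python
def to_absolute(root, registry_path):
    """Prepend the configured root to a registry path."""
    root = "/" + root.strip("/")           # normalise to a single leading slash
    relative = registry_path.strip("/")    # drop leading/trailing slashes
    if not relative:
        return root                        # "/" maps to the root itself
    if root == "/":
        return "/" + relative              # avoid a double slash for a bare root
    return root + "/" + relative


for path in ("/", "/services", "/users/yarn"):
    print(path, "maps to", to_absolute("/registry", path))

# A hypothetical alternative root changes every absolute path.
print(to_absolute("/clusters/a/registry", "/services"))
```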
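Registry security is enabled when the property hadoop.registry.secure is set to true. Once set, nodes are created with permissions so that only a specific user and the configured cluster "superuser" accounts can write under that user's home path beneath ${hadoop.registry.zk.root}/users. Only the superuser accounts can manipulate the root paths, including ${hadoop.registry.zk.root}/services and ${hadoop.registry.zk.root}/users.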
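All write operations on the registry (including deleting entries and paths) must be authenticated. Read operations are still permitted for unauthenticated callers.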
The key settings for secure registry support are:
hadoop.registry.secure
hadoop.registry.system.acls
hadoop.registry.kerberos.realm
hadoop.registry.jaas.context
<property>
  <description>
    Key to set if the registry is secure. Turning it on
    changes the permissions policy from "open access"
    to restrictions on kerberos with the option of
    a user adding one or more auth key pairs down their
    own tree.
  </description>
  <name>hadoop.registry.secure</name>
  <value>false</value>
</property>
Registry clients must identify the JAAS context that they use to authenticate to the registry.
<property>
  <description>
    Key to define the JAAS context. Used in secure mode
  </description>
  <name>hadoop.registry.jaas.context</name>
  <value>Client</value>
</property>
Note: as the Resource Manager is simply another client of the registry, it too must have this context defined.
hadoop.registry.system.acls
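These are the accounts that are given full access to the base of the registry. The Resource Manager needs this option to create the root paths.
Client applications writing to the registry have access to the nodes they create.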
hadoop.registry.system.acls takes a comma-separated list of Zookeeper ACLs which are given full access to created nodes; the permissions are READ | WRITE | CREATE | DELETE | ADMIN.
Zookeeper ACL schemes other than SASL, such as the digest: scheme, may be used in this list.
The sasl: scheme is used to identify which SASL-authenticated callers have full access. These are the superuser accounts.
They may be identified by elements such as sasl:[email protected].
Any sasl: entry without a realm (that is, any entry ending in the @ symbol) has the current realm appended to it.
The realm may be overridden by the property hadoop.registry.kerberos.realm; if that property is empty, the default realm of the running process is used.
<property>
  <description>
    A comma separated list of Zookeeper ACL identifiers with
    system access to the registry in a secure cluster.
    These are given full access to all entries.
    If there is an "@" at the end of a SASL entry it
    instructs the registry client to append the default kerberos domain.
  </description>
  <name>hadoop.registry.system.acls</name>
  <value>sasl:yarn@, sasl:mapred@, sasl:hdfs@</value>
</property>
<property>
  <description>
    The kerberos realm: used to set the realm of
    system principals which do not declare their realm,
    and any other accounts that need the value.
    If empty, the default realm of the running process
    is used.
    If neither are known and the realm is needed, then the registry
    service/client will fail.
  </description>
  <name>hadoop.registry.kerberos.realm</name>
  <value></value>
</property>
Example: a hadoop.registry.system.acls entry of sasl:yarn@, sasl:[email protected], sasl:system@REALM2 would, in a YARN cluster with the realm EXAMPLE.COM, add the following admin accounts to every node:
sasl:yarn@EXAMPLE.COM
sasl:[email protected]
sasl:system@REALM2
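The realm-completion rule can be sketched as follows. This is an illustrative Python model, not the registry client's code: the function name expand_system_acls is hypothetical, and raising ValueError stands in for the failure that the property description says occurs when a realm is needed but unknown.

```python
def expand_system_acls(acls, default_realm):
    """Append the default realm to sasl: entries that end in '@'."""
    expanded = []
    for entry in acls.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if entry.startswith("sasl:") and entry.endswith("@"):
            if not default_realm:
                # Realm is needed but unknown: fail rather than guess.
                raise ValueError(f"no realm available for {entry!r}")
            entry += default_realm
        expanded.append(entry)  # entries with an explicit realm pass through
    return expanded


for acl in expand_system_acls("sasl:yarn@, sasl:system@REALM2", "EXAMPLE.COM"):
    print(acl)
```

Entries that already name a realm, and entries using other schemes, are left untouched; only the trailing @ form is completed.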
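The identity of a client application creating registry entries is automatically included in the permissions of all entries it creates. If, for example, the account creating an entry was hbase, another entry would be created:
sasl:hbase@EXAMPLE.COM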
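Important: when setting the system ACLs, it is critical to include the identity of the YARN Resource Manager.
The RM needs to be able to create the root and user paths, and to delete service records during application and container cleanup.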
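Some low-level options manage the ZK connection; more specifically, its failure handling.
The Zookeeper registry clients use Apache Curator to connect to Zookeeper. Curator is a library that detects timeouts and attempts to reconnect to one of the servers forming the zookeeper quorum. A retry is triggered only after a timeout is detected.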
<property>
  <description>
    Zookeeper session timeout in milliseconds
  </description>
  <name>hadoop.registry.zk.session.timeout.ms</name>
  <value>60000</value>
</property>
<property>
  <description>
    Zookeeper connection timeout in milliseconds
  </description>
  <name>hadoop.registry.zk.connection.timeout.ms</name>
  <value>15000</value>
</property>
<property>
  <description>
    Zookeeper connection retry count before failing
  </description>
  <name>hadoop.registry.zk.retry.times</name>
  <value>5</value>
</property>
<property>
  <description>
  </description>
  <name>hadoop.registry.zk.retry.interval.ms</name>
  <value>1000</value>
</property>
<property>
  <description>
    Zookeeper retry limit in milliseconds, during
    exponential backoff.
    This places a limit even
    if the retry times and interval limit, combined
    with the backoff policy, result in a long retry
    period
  </description>
  <name>hadoop.registry.zk.retry.ceiling.ms</name>
  <value>60000</value>
</property>
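The retry policy used in the registry client is BoundedExponentialBackoffRetry: it backs off exponentially on connection failures before eventually concluding that the quorum is unreachable and failing.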
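To give a feel for how the retry count, interval, and ceiling interact, the following Python sketch computes a simplified, deterministic backoff schedule: the delay doubles on each attempt and is capped at the ceiling. This is an illustrative model only. Curator's actual policy may compute delays differently (for example, by adding randomization), and the function name backoff_schedule is hypothetical.

```python
def backoff_schedule(retry_times, interval_ms, ceiling_ms):
    """Return the capped, doubling delay (in ms) before each retry attempt."""
    return [
        min(ceiling_ms, interval_ms * (2 ** attempt))
        for attempt in range(retry_times)
    ]


# Values from the defaults listed in this document.
delays = backoff_schedule(retry_times=5, interval_ms=1000, ceiling_ms=60000)
print(delays)
print(sum(delays), "ms spent waiting in total under this simplified model")
```

Under this model the default ceiling of 60000 ms is never reached with five retries; it only comes into play with a higher retry count or a longer base interval.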
The complete set of configuration options is:
<!-- YARN registry -->
<property>
  <description>
    A comma separated list of hostname:port pairs defining the
    zookeeper quorum binding for the registry
  </description>
  <name>hadoop.registry.zk.quorum</name>
  <value>localhost:2181</value>
</property>
<property>
  <description>
    The root zookeeper node for the registry
  </description>
  <name>hadoop.registry.zk.root</name>
  <value>/registry</value>
</property>
<property>
  <description>
    Key to set if the registry is secure. Turning it on
    changes the permissions policy from "open access"
    to restrictions on kerberos with the option of
    a user adding one or more auth key pairs down their
    own tree.
  </description>
  <name>hadoop.registry.secure</name>
  <value>false</value>
</property>
<property>
  <description>
    A comma separated list of Zookeeper ACL identifiers with
    system access to the registry in a secure cluster.
    These are given full access to all entries.
    If there is an "@" at the end of a SASL entry it
    instructs the registry client to append the default kerberos domain.
  </description>
  <name>hadoop.registry.system.acls</name>
  <value>sasl:yarn@, sasl:mapred@, sasl:hdfs@</value>
</property>
<property>
  <description>
    The kerberos realm: used to set the realm of
    system principals which do not declare their realm,
    and any other accounts that need the value.
    If empty, the default realm of the running process
    is used.
    If neither are known and the realm is needed, then the registry
    service/client will fail.
  </description>
  <name>hadoop.registry.kerberos.realm</name>
  <value></value>
</property>
<property>
  <description>
    Key to define the JAAS context. Used in secure mode
  </description>
  <name>hadoop.registry.jaas.context</name>
  <value>Client</value>
</property>
<property>
  <description>
    Zookeeper session timeout in milliseconds
  </description>
  <name>hadoop.registry.zk.session.timeout.ms</name>
  <value>60000</value>
</property>
<property>
  <description>
    Zookeeper connection timeout in milliseconds
  </description>
  <name>hadoop.registry.zk.connection.timeout.ms</name>
  <value>15000</value>
</property>
<property>
  <description>
    Zookeeper connection retry count before failing
  </description>
  <name>hadoop.registry.zk.retry.times</name>
  <value>5</value>
</property>
<property>
  <description>
  </description>
  <name>hadoop.registry.zk.retry.interval.ms</name>
  <value>1000</value>
</property>
<property>
  <description>
    Zookeeper retry limit in milliseconds, during
    exponential backoff.
    This places a limit even
    if the retry times and interval limit, combined
    with the backoff policy, result in a long retry
    period
  </description>
  <name>hadoop.registry.zk.retry.ceiling.ms</name>
  <value>60000</value>
</property>