Cloudera Runtime 7.1.

Securing Apache Hive

Date published: 2019-08-21
Date modified:

https://docs.cloudera.com/
Legal Notice
© Cloudera Inc. 2024. All rights reserved.
The documentation is and contains Cloudera proprietary information protected by copyright and other intellectual property
rights. No license under copyright or any other intellectual property right is granted herein.
Unless otherwise noted, scripts and sample code are licensed under the Apache License, Version 2.0.
Copyright information for Cloudera software may be found within the documentation accompanying each component in a
particular release.
Cloudera software includes software from various open source or other third party projects, and may be released under the
Apache Software License 2.0 (“ASLv2”), the Affero General Public License version 3 (AGPLv3), or other license terms.
Other software included may be released under the terms of alternative open source licenses. Please review the license and
notice files accompanying the software for additional licensing information.
Please visit the Cloudera software product page for more information on Cloudera software. For more information on
Cloudera support services, please visit either the Support or Sales page. Feel free to contact us directly to discuss your
specific needs.
Cloudera reserves the right to change any products at any time, and without notice. Cloudera assumes no responsibility nor
liability arising from the use of products, except as expressly agreed to in writing by Cloudera.
Cloudera, Cloudera Altus, HUE, Impala, Cloudera Impala, and other Cloudera marks are registered or unregistered
trademarks in the United States and other countries. All other trademarks are the property of their respective owners.
Disclaimer: EXCEPT AS EXPRESSLY PROVIDED IN A WRITTEN AGREEMENT WITH CLOUDERA,
CLOUDERA DOES NOT MAKE NOR GIVE ANY REPRESENTATION, WARRANTY, NOR COVENANT OF
ANY KIND, WHETHER EXPRESS OR IMPLIED, IN CONNECTION WITH CLOUDERA TECHNOLOGY OR
RELATED SUPPORT PROVIDED IN CONNECTION THEREWITH. CLOUDERA DOES NOT WARRANT THAT
CLOUDERA PRODUCTS NOR SOFTWARE WILL OPERATE UNINTERRUPTED NOR THAT IT WILL BE
FREE FROM DEFECTS NOR ERRORS, THAT IT WILL PROTECT YOUR DATA FROM LOSS, CORRUPTION
NOR UNAVAILABILITY, NOR THAT IT WILL MEET ALL OF CUSTOMER’S BUSINESS REQUIREMENTS.
WITHOUT LIMITING THE FOREGOING, AND TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE
LAW, CLOUDERA EXPRESSLY DISCLAIMS ANY AND ALL IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY, QUALITY, NON-INFRINGEMENT, TITLE, AND
FITNESS FOR A PARTICULAR PURPOSE AND ANY REPRESENTATION, WARRANTY, OR COVENANT BASED
ON COURSE OF DEALING OR USAGE IN TRADE.

Contents

Hive access authorization
Transactional table access
External table access
Accessing Hive files in Ozone
Configuring access to Hive on YARN
    Configuring HiveServer for ETL using YARN queues
    Managing YARN queue users
    Configuring queue mapping to use the user name from the application tag using Cloudera Manager
Disabling impersonation (doas)
Connecting to an Apache Hive endpoint through Apache Knox
Hive authentication
    Securing HiveServer using LDAP
    Client connections to HiveServer
    Pluggable authentication modules in HiveServer
    JDBC connection string syntax
Communication encryption
    Enabling TLS/SSL for HiveServer
    Enabling SASL in HiveServer
Securing an endpoint under AutoTLS
    Securing Hive metastore
Token-based authentication for Cloudera Data Warehouse integrations
Activating the Hive Web UI

Cloudera Runtime Hive access authorization

Hive access authorization

As administrator, you need to understand that the Hive default authorization for running Hive queries is insecure and
what you need to do to secure your data. You need to set up Apache Ranger.
To limit Apache Hive access to approved users, Cloudera recommends and supports only Ranger. Authorization is the
process that checks user permissions to perform select operations, such as creating, reading, and writing data, as well
as editing table metadata. Apache Ranger provides centralized authorization for all Cloudera Runtime Services.
You can set up Ranger to protect managed, ACID tables or external tables using a Hadoop SQL policy. You can
protect external table data on the file system by using an HDFS policy in Ranger.

Preloaded Ranger Policies

In Ranger, preloaded Hive policies are available by default. Users covered by these policies can perform Hive
operations. All users need to use the default database, perform basic operations such as listing database names, and
query the information schema. To provide this access, the preloaded default database tables columns and information_schema
database policies are enabled for group public (all users). Keeping these policies enabled for group public is
recommended. For example, if the default database tables columns policy is disabled preventing use of the default
database, the following error appears:

hive> USE default;

Error: Error while compiling statement: FAILED: HiveAccessControlException
Permission denied: user [hive] does not have [USE] privilege on [default]

Apache Ranger policy authorization

Apache Ranger provides centralized policy management for authorization and auditing of all Cloudera Runtime
services, including Hive. All Cloudera Runtime services are installed with a Ranger plugin used to intercept
authorization requests for that service, as shown in the following illustration.

The following table compares authorization models:

Authorization model | Secure? | Fine-grained authorization (column, row level) | Privilege management using GRANT/REVOKE statements | Centralized management GUI
Apache Ranger | Secure | Yes | Yes | Yes
Hive default | Not secure. No restriction on which users can run GRANT statements | Yes | Yes | No

When you run grant/revoke commands and Apache Ranger is enabled, a Ranger policy is created/removed.
Related Information
HDFS ACLS
Configure a Resource-based Policy: Hive
Row-level Filtering and Column Masking in Hive
Query Hive

Transactional table access

As administrator, you must enable the Apache Ranger service to authorize users who want to work with transactional
tables. These types of tables are the default, ACID-compliant tables in Hive 3 and later.
ACID tables reside by default in /warehouse/tablespace/managed/hive. Only the Hive service can own and interact
with files in this directory. Ranger is the only available authorization mechanism that Cloudera recommends for
ACID tables.

External table access

As administrator, you must set up Apache Ranger to allow users to access external tables.
External tables reside by default in /warehouse/tablespace/external on your object store. To specify some other
location of the external table, you need to include the specification in the table creation statement as shown in the
following example:

CREATE EXTERNAL TABLE my_external_table (a string, b string)

LOCATION '/users/andrena';

Hive assigns a default permission of 777 to the hive user, sets a umask to restrict subdirectories, and provides a
default ACL to give Hive read and write access to all subdirectories. External tables must be secured using Ranger.

Accessing Hive files in Ozone

Learn how to set up policies to give users access to Hive external files in Ozone. For example, if Ozone users are
running SparkSQL statements that query Hive tables, you must set up an Ozone access policy and Ozone file system
access policy.

About this task

When Ranger is enabled in the cluster, any user other than the default admin user, "om", requires the necessary Ranger
permissions and policy updates to access the Ozone filesystem. To create a Hive external table that points to the
Ozone filesystem, the "hive" user must have the required permissions in Ranger.

In this task, you first enable Ozone in the Ranger service, and then set up the required policies.

Procedure
1. In Cloudera Manager, click Clusters > Ozone > Configuration to navigate to the configuration page for Ozone.
2. Search for ranger_service, and enable the property.
3. Click Clusters > Ranger > Ranger Admin Web UI, enter your user name and password, then click Sign In.
The Service Manager for Resource Based Policies page is displayed in the Ranger console.

4. Click the cm_ozone preloaded resource-based service to modify an Ozone policy.

5. In the cm_ozone policies page, click the Policy ID or click Edit against the "all - volume, bucket, key"
policy to modify the policy details.
6. In the Allow Conditions pane, add the "hive" user, choose the necessary permissions, and then click Save.

7. Click the Service Manager link in the breadcrumb trail and then click the Hadoop SQL preloaded resource-based
service to update the Hadoop SQL URL policy.

8. In the Hadoop SQL policies page, click the Policy ID or click Edit against the "all - url" policy to modify
the policy details.
By default, "hive", "hue", "impala", "admin" and a few other users are provided access to all the Ozone URLs.
You can select users and groups in addition to the default. To grant everyone access, add the "public" group to the
group list. Every user is then subject to your allow conditions.

What to do next
Create a Hive external table having source data in Ozone.
Also, it is recommended that you set certain Hive configurations before querying Hive tables in Ozone.
Related Information
Set up Ozone security
Cloudera's Ranger documentation
Creating an Ozone-based Hive external table

Configuring access to Hive on YARN

By default, access to Hive and YARN by unauthorized users is not allowed. You also cannot run unauthorized
workloads on YARN. You need to know how to give end users and workloads the access rules necessary for querying
Hive workloads in YARN queues.

About this task

You must configure the following access to query Hive workloads on YARN:
• Allow the end user to access Hive
• Allow the Hive workload on YARN
• Allow the end user to access YARN
Follow the steps in this topic to configure Hive and YARN for end users to access Hive on YARN.

Procedure
1. In Cloudera Manager, click Clusters > Hive on Tez > Configuration and search for hive.server2.enable.doAs.

2. Set the value of doas to false. Uncheck Hive (Service-Wide) to disable impersonation.
For more information about configuring doas, see "Enabling or disabling impersonation".
Save changes.
3. Search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml setting.

4. In the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml setting, click +.
5. Add the properties and values to allow the Hive workload on YARN.

Name: hive.server2.tez.initialize.default.sessions Value: false

Name: hive.server2.tez.queue.access.check Value: true
Name: hive.server2.tez.sessions.custom.queue.allowed Value: true

For more information about allowing the Hive workload on YARN, see "Configuring HiveServer for ETL using
YARN queues".
Save changes.

6. In Cloudera Manager, click Clusters > YARN > Configuration, and search for ResourceManager Advanced
Configuration Snippet (Safety Valve) for yarn-site.xml.

7. In the ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml setting, click +.
8. Add the properties and values to allow the end user to access YARN using placement rules.

Name: yarn.resourcemanager.application-tag-based-placement.enable Value: true
Name: yarn.resourcemanager.application-tag-based-placement.username.whitelist Value: <Comma-separated list of users who can use application-tag-based placement>

For more information about allowing end user access to YARN, see "Configuring queue mapping to use the user
name from the application tag using Cloudera Manager".
Save changes.
9. Restart the YARN ResourceManager service for the changes to apply.
End users you specified can now query Hive workloads in YARN queues.
Related Information
Disabling impersonation (doas)
Configuring HiveServer for ETL using YARN queues
Managing YARN queue users
Configuring queue mapping to use the user name from the application tag using Cloudera Manager

Configuring HiveServer for ETL using YARN queues

You need to set several configuration properties to allow placement of the Hive workload on the YARN queue
manager, which is common for running an ETL job. These properties effectively disable the reuse of containers,
so each new query gets new containers routed to the appropriate queue.

About this task

Hive configuration properties affect mapping users and groups to YARN queues. You set these properties to use with
YARN Placement Rules.
To set Hive properties for YARN queues:

Procedure
1. In Cloudera Manager, click Clusters > Hive-on-Tez > Configuration.
2. Search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml setting.

3. In the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml setting, click +.
4. In Name, enter the property hive.server2.tez.initialize.default.sessions and in Value, enter false.
5. In Name, enter the property hive.server2.tez.queue.access.check and in Value, enter true.
6. In Name, enter the property hive.server2.tez.sessions.custom.queue.allowed and in Value, enter true.
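
The three name/value pairs above can also be written as hive-site.xml property elements. The following is a minimal sketch, assuming you want to paste XML into the safety valve rather than enter each pair by hand. The class SafetyValveSnippet and its methods are hypothetical helpers written for this illustration; they are not part of Hive or Cloudera Manager.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hive or Cloudera Manager): renders
// name/value pairs as hive-site.xml <property> elements.
final class SafetyValveSnippet {

    // The three properties from the procedure above, in the same order.
    static Map<String, String> etlQueueProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hive.server2.tez.initialize.default.sessions", "false");
        props.put("hive.server2.tez.queue.access.check", "true");
        props.put("hive.server2.tez.sessions.custom.queue.allowed", "true");
        return props;
    }

    // Builds one <property> element per map entry, preserving insertion order.
    static String toXml(Map<String, String> props) {
        StringBuilder xml = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            xml.append("<property>\n")
               .append("  <name>").append(e.getKey()).append("</name>\n")
               .append("  <value>").append(e.getValue()).append("</value>\n")
               .append("</property>\n");
        }
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.print(toXml(etlQueueProperties()));
    }
}
```

The sketch does no XML escaping, which is sufficient for these property names and boolean values but not for arbitrary input.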

Managing YARN queue users

To manage users of secure YARN queues, you need to know how to configure impersonation for Ranger.
To allow access to YARN queues, as administrator, you disable HiveServer user impersonation (doAs=false). You
also need to configure hive.server2.tez.queue.access.check=true. To manage YARN queues, you need the following
behavior:
• User submits the query through HiveServer (HS2) to the YARN queue
• Tez app starts for the user
• Access to the YARN queue is allowed for this user.
As administrator, you can allocate resources to different users.
Managing YARN queues under Ranger
When you use Ranger, you configure HiveServer not to use impersonation (doas=false). HiveServer authorizes only
the hive user, not the connected end user, to access Hive tables and YARN queues unless you also configure the
following parameter:
hive.server2.tez.queue.access.check=true

Configuring queue mapping to use the user name from the application tag using Cloudera Manager
You learn how to add service users to the YARN queue by following a mapping procedure.
You can configure queue mapping to use the user name from the application tag instead of
the proxy user who submitted the job. You can add only service users like hive using the
yarn.resourcemanager.application-tag-based-placement.username.whitelist property
and not normal users.
When a user runs Hive queries, HiveServer2 submits the query to the queue mapped from the end user instead of the
hive user. For example, when user alice submits a Hive query in doAs=false mode, the job runs in YARN as the hive
user. If application-tag-based scheduling is enabled, the job is placed in a target queue based on the queue
mapping configuration of user alice.
For more information about queue mapping configurations, see Manage placement rules. For information about Hive
access, see Apache Hive documentation.
1. In Cloudera Manager, select the YARN service.
2. Click the Configuration tab.
3. Search for ResourceManager. In the Filters pane, under Scope, select ResourceManager.
4. In ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml add the following:
a. Enable the application-tag-based-placement property to enable application placement based on the user ID
passed using the application tags.

Name: yarn.resourcemanager.application-tag-based-placement.enable
Value: true
Description: Set to "true" to enable application placement based on the
user ID passed using the application tags. When it is enabled, it checks
for the userid=<userId> pattern and, if found, the application is placed
onto the found user's queue, if the original user has the required rights
on the passed user's queue.

b. Add the list of users in the allowlist who can use application-tag-based placement. When the submitting
user is included in the allowlist, the application is placed onto the queue defined in the
yarn.scheduler.capacity.queue-mappings property for the user from the application tag. If no user is defined
in the application tag, the submitting user is used.

Name: yarn.resourcemanager.application-tag-based-placement.username.whitelist
Value:
Description: Comma-separated list of users who can use the application
tag based placement, if "yarn.resourcemanager.application-tag-based-placement.enable"
is enabled.
5. Restart the ResourceManager service for the changes to apply.
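
The decision described in this procedure can be sketched roughly as follows. This is a simplified illustration of the documented behavior, not YARN's implementation: the class and method names are hypothetical, and the check that the submitting user has rights on the target queue is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Simplified illustration only; this is not YARN's implementation.
final class TagBasedPlacementSketch {

    private static final String TAG_PREFIX = "userid=";

    // Returns the user whose queue mapping would be applied to the application.
    static String effectiveUser(boolean tagPlacementEnabled,
                                Set<String> allowlist,
                                String submittingUser,
                                List<String> applicationTags) {
        // Tag-based placement applies only when it is enabled and the
        // submitting (service) user is on the allowlist.
        if (!tagPlacementEnabled || !allowlist.contains(submittingUser)) {
            return submittingUser;
        }
        for (String tag : applicationTags) {
            if (tag.startsWith(TAG_PREFIX) && tag.length() > TAG_PREFIX.length()) {
                return tag.substring(TAG_PREFIX.length());
            }
        }
        // No userid tag found: fall back to the submitting user.
        return submittingUser;
    }

    public static void main(String[] args) {
        // hive submits on behalf of alice, as in the example above.
        System.out.println(effectiveUser(true, Collections.singleton("hive"),
                "hive", Arrays.asList("userid=alice")));
    }
}
```

With the inputs in main, the sketch resolves to alice, so the queue mapping for alice would decide placement even though the job runs as hive.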

Disabling impersonation (doas)

As administrator, you must understand that the permissions model supported in CDP Private Cloud Base is Apache
Ranger.
Disable impersonation to use Ranger
When you enable Ranger, you disable user impersonation (doAs=false). This is the Hive default and Ranger is the
only supported and recommended security model. Managed, ACID tables as well as external tables, secured by
Ranger, are supported in this configuration. Impersonation of the end user is disabled, which is the state required by
Hive for managing ACID tables.
In Cloudera Manager, click Hive on Tez > Configuration and search for hive.server2.enable.doAs.

Uncheck Hive (Service-Wide) to disable impersonation.

With no impersonation, HiveServer authorizes only the hive user to access Hive tables.
Related Information
Apache Software Foundation HDFS Permissions Guide
HDFS ACLS

Connecting to an Apache Hive endpoint through Apache Knox
If your cluster uses Apache Knox for perimeter security in CDP Private Cloud Base, you can connect to an Apache
Hive endpoint through Knox. You set the HiveServer transport mode and reference your Java keystore.

Before you begin

Automate the creation of an internal certificate authority (CA) using Auto-TLS (see link below). Set up SSL,
including trust, for Knox Gateway clients.

Procedure
1. In Cloudera Manager, click Clusters > Hive on Tez > Configuration, and change the Hive on Tez service transport
mode to http.

Knox discovers the service automatically and builds a proxy URL for Hive on Tez only when the transport mode
is http.
2. Download the Knox Gateway TLS/SSL client trust store JKS file from Knox, and save it locally.
You can find the location of the JKS file from the value of the Knox property gateway.tls.keystore.path.
3. In the Hive connection string, include parameters as follows:

jdbc:hive2://<host>:8443/;ssl=true;transportMode=http; \
httpPath=gateway/cdp-proxy-api/hive; \
sslTrustStore=/<path to JKS>/bin/certs/gateway-client-trust.jks; \
trustStorePassword=<Java default password>

In this example, changeit is the Java default password for the trust store.
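
The line continuations in the connection string above are for readability; the driver receives a single string. As a minimal sketch, the string can be assembled like this. The class KnoxHiveUrl, the host knox.example.com, and the trust store path are hypothetical placeholders; port 8443 and the httpPath value are taken from the example above.

```java
// Hypothetical helper that assembles the Knox connection string shown above.
final class KnoxHiveUrl {

    static String build(String knoxHost, String trustStorePath, String trustStorePassword) {
        // All parameters are joined with ';' and no whitespace.
        return "jdbc:hive2://" + knoxHost + ":8443/;ssl=true;transportMode=http;"
                + "httpPath=gateway/cdp-proxy-api/hive;"
                + "sslTrustStore=" + trustStorePath + ";"
                + "trustStorePassword=" + trustStorePassword;
    }

    public static void main(String[] args) {
        // Placeholder host and path; substitute the values for your cluster.
        System.out.println(build("knox.example.com",
                "/path/to/gateway-client-trust.jks", "changeit"));
    }
}
```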

Hive authentication
HiveServer supports authentication of clients using Kerberos or user/password validation backed by LDAP.
If you configure HiveServer to use Kerberos authentication, HiveServer acquires a Kerberos ticket during startup.
HiveServer requires a principal and keytab file specified in the configuration. Client applications (for example, JDBC
or Beeline) must have a valid Kerberos ticket before initiating a connection to HiveServer2. JDBC-based clients must
include principal=<hive.server2.authentication.principal> in the JDBC connection string. For example:

String url = "jdbc:hive2://node1:10000/default;principal=hive/HiveServerHost@YOUR-REALM.COM";
Connection con = DriverManager.getConnection(url);

where hive is the principal configured in hive-site.xml and HiveServerHost is the host where HiveServer is running.
To start Beeline and connect to a secure HiveServer, enter a command as shown in the following example:

beeline -u "jdbc:hive2://10.65.13.98:10000/default;principal=hive/_HOST@CLOUDERA.SITE"

Securing HiveServer using LDAP

You can secure the remote client connection to Hive by configuring HiveServer to use authentication with LDAP.

About this task

When you configure HiveServer to use user and password validation backed by LDAP, the Hive client sends a
username and password during connection initiation. HiveServer validates these credentials using an external LDAP
service. You can enable LDAP Authentication with HiveServer using Active Directory or OpenLDAP.

Procedure
1. In Cloudera Manager, select Hive-on-Tez > Configuration.
2. Search for ldap.
3. Check Enable LDAP Authentication for HiveServer2 for Hive (Service Wide).
4. Enter your LDAP URL in the format ldap[s]://<host>:<port>.
LDAP_URL is the access URL for your LDAP server. For example, ldap://ldap_host_name.xyz.com:389
5. Enter the Active Directory Domain or LDAP Base DN for your environment.
• Active Directory (AD)
Enter the domain name of the AD server. For example, corp.domain.com.
• LDAP_BaseDN
Enter the base LDAP distinguished name (DN) for your LDAP server. For example, ou=dev, dc=xyz.

6. Click Save Changes.

7. Restart the Hive service.

8. Construct the LDAP connection string to connect to HiveServer.
The following simple example is insecure because it sends clear text passwords.

String url = "jdbc:hive2://node1:10000/default;user=LDAP_Userid;password=LDAP_Password";
Connection con = DriverManager.getConnection(url);

The following example shows a connection string that enables TLS/SSL, so that data exchanged with HiveServer,
including passwords, is encrypted in transit.

String url = "jdbc:hive2://node1:10000/default;ssl=true;sslTrustStore=/mytruststore_path;trustStorePassword=my_truststore_password";
Connection con = DriverManager.getConnection(url);

For information about encrypting communication, see links below.
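
As a hedged sketch of how the two examples above might be combined, the following code enables TLS/SSL in the URL and passes the LDAP credentials as separate arguments to the standard java.sql.DriverManager.getConnection(url, user, password) method instead of embedding them in the string. The class name, the trust store path, and the assumption that the Hive JDBC driver is on the classpath are illustrative and not stated in this document.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Illustrative sketch; class and method names are hypothetical.
final class LdapOverTlsExample {

    // Builds a TLS-enabled URL that does not contain the user name or password.
    static String tlsUrl(String host, int port, String db,
                         String trustStorePath, String trustStorePassword) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";ssl=true;sslTrustStore=" + trustStorePath
                + ";trustStorePassword=" + trustStorePassword;
    }

    // Opens the connection; assumes the Hive JDBC driver is on the classpath.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        // Placeholder trust store path and credentials.
        String url = tlsUrl("node1", 10000, "default",
                "/path/to/truststore.jks", "my_truststore_password");
        try (Connection con = connect(url, "LDAP_Userid", "LDAP_Password")) {
            System.out.println("Connected: " + !con.isClosed());
        } catch (SQLException e) {
            // Reached when no driver or no reachable HiveServer is available.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Keeping credentials out of the URL also keeps them out of any place where the URL itself is logged or displayed.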

Related Information
Custom Configuration (about Cloudera Manager Safety Valve)

Client connections to HiveServer

You can use Beeline, a JDBC, or an ODBC connection to HiveServer.

JDBC Client-HiveServer Authentication

The JDBC client requires a connection URL as shown below. JDBC-based clients must include a user name and
password in the JDBC connection string. For example:

String url = "jdbc:hive2://node1:10000/default;user=LDAP_Userid;password=LDAP_Password";
Connection con = DriverManager.getConnection(url);

where the LDAP_Userid value is the user ID and LDAP_Password is the password of the client user.

HiveServer modes of operation

HiveServer supports the following modes for interacting with Hive:
Operating Mode | Description
Embedded | The Beeline client and the Hive installation reside on the same host machine or virtual machine. No TCP connectivity is required.
Remote | Use remote mode to support multiple, concurrent clients executing queries against the same remote Hive installation. Remote transport mode supports authentication with LDAP and Kerberos. It also supports encryption with SSL. TCP connectivity is required.

Remote mode: Launch Hive using the following URL:

jdbc:hive2://<host>:<port>/<db>.

The default HiveServer2 port is 10000.

Embedded mode: Launch Hive using the following URL:

jdbc:hive2://

Transport Modes
As administrator, you can start HiveServer in one of the following transport modes:
Transport Mode | Description
TCP | HiveServer uses TCP transport for sending and receiving Thrift RPC messages.
HTTP | HiveServer uses HTTP transport for sending and receiving Thrift RPC messages.

HiveServer Trusted Delegation

HiveServer determines the identity of the connecting user from the authentication subsystem (Kerberos or LDAP).
Any new session started for this connection runs on behalf of this connecting user. If the server is configured to proxy
the user, the identity of the connecting user is used to connect to Hive. Users with Hadoop superuser privileges can
request an alternate user for the given session. HiveServer checks that the connecting user can proxy the requested
userid, and if so, runs the new session as the alternate user.

Pluggable authentication modules in HiveServer

While running in TCP transport mode, HiveServer supports Pluggable Authentication Modules (PAM). Using
Pluggable Authentication Modules, you can integrate multiple authentication schemes into a single API.
You use the Cloudera Manager Safety Valve technique on HIVE_ON_TEZ-1 Configuration to set the following
properties:
• hive.server2.authentication
Value = CUSTOM
• hive.server2.custom.authentication.class
Value = <the pluggable auth class name>
The class you provide must be a proper implementation of the org.apache.hive.service.auth.PasswdAuthenticationProvider
interface. HiveServer calls its Authenticate(user, password) method to authenticate requests. The implementation can
optionally extend Hadoop's org.apache.hadoop.conf.Configured class to grab the Hive Configuration object.
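
A custom authentication class might look like the following minimal sketch. So that it compiles without Hive jars, the sketch declares a local stand-in interface with the same shape as org.apache.hive.service.auth.PasswdAuthenticationProvider; a real plugin would implement the Hive interface instead. The class SampleAuthenticator and its in-memory credential map are hypothetical and for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.AuthenticationException;

// Local stand-in mirroring org.apache.hive.service.auth.PasswdAuthenticationProvider
// so that this sketch compiles on its own. A real plugin would implement
// the Hive interface rather than this one.
interface PasswdAuthenticationProvider {
    void Authenticate(String user, String password) throws AuthenticationException;
}

// Hypothetical provider backed by an in-memory map; illustration only.
final class SampleAuthenticator implements PasswdAuthenticationProvider {

    private final Map<String, String> credentials = new HashMap<>();

    SampleAuthenticator() {
        // Hard-coded demo account; never store real passwords this way.
        credentials.put("demo_user", "demo_password");
    }

    // Returns normally when the credentials match, otherwise throws.
    @Override
    public void Authenticate(String user, String password) throws AuthenticationException {
        String expected = credentials.get(user);
        if (expected == null || !expected.equals(password)) {
            throw new AuthenticationException("Authentication failed for user " + user);
        }
    }
}
```

With the two properties above, hive.server2.custom.authentication.class would be set to the fully qualified name of the real implementation class.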

JDBC connection string syntax

The JDBC connection string for connecting to a remote Hive client requires a host, port, and Hive database name.
You can optionally specify a transport type and authentication.

jdbc:hive2://<host>:<port>/<dbName>;<sessionConfs>?<hiveConfs>#<hiveVars>

Connection string parameters

The following table describes the parameters for specifying the JDBC connection.
JDBC Parameter | Description | Required
host | The cluster node hosting HiveServer. | yes
port | The port number on which HiveServer listens. | yes
dbName | The name of the Hive database to run the query against. | yes
sessionConfs | Optional configuration parameters for the JDBC/ODBC driver in the following format: <key1>=<value1>;<key2>=<value2>;... | no
hiveConfs | Optional configuration parameters for Hive on the server in the following format: <key1>=<value1>;<key2>=<value2>;... The configurations last for the duration of the user session. | no
hiveVars | Optional configuration parameters for Hive variables in the following format: <key1>=<value1>;<key2>=<value2>;... The configurations last for the duration of the user session. | no
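
To make the separators concrete, the following sketch assembles a connection string from the parts in the table: session configurations follow a semicolon, Hive configurations follow a question mark, and Hive variables follow a hash sign. The class HiveJdbcUrlBuilder is a hypothetical helper, and the httpPath value and queue name in the example are placeholders rather than values from this document.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that follows the documented layout:
// jdbc:hive2://<host>:<port>/<dbName>;<sessionConfs>?<hiveConfs>#<hiveVars>
final class HiveJdbcUrlBuilder {

    static String build(String host, int port, String dbName,
                        Map<String, String> sessionConfs,
                        Map<String, String> hiveConfs,
                        Map<String, String> hiveVars) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port).append('/').append(dbName);
        // Each optional section is emitted only when it has entries.
        if (!sessionConfs.isEmpty()) {
            url.append(';').append(join(sessionConfs));
        }
        if (!hiveConfs.isEmpty()) {
            url.append('?').append(join(hiveConfs));
        }
        if (!hiveVars.isEmpty()) {
            url.append('#').append(join(hiveVars));
        }
        return url.toString();
    }

    // Joins entries as key=value pairs separated by ';'. No escaping is done.
    private static String join(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(";"));
    }

    public static void main(String[] args) {
        Map<String, String> session = new LinkedHashMap<>();
        session.put("transportMode", "http");
        session.put("httpPath", "cliservice");   // placeholder endpoint
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("tez.queue.name", "etl");       // placeholder queue name
        System.out.println(build("node1", 10000, "default",
                session, conf, Collections.emptyMap()));
    }
}
```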

TCP and HTTP Transport

The following table shows variables for use in the connection string when you configure HiveServer. The JDBC
client and HiveServer can use either HTTP or TCP-based transport to exchange RPC messages. Because the default
transport is TCP, there is no need to specify transportMode=binary if TCP transport is desired.
transportMode Variable Value | Description
http | Connect to HiveServer2 using HTTP transport.
binary | Connect to HiveServer2 using TCP transport.

The syntax for using these parameters is:

jdbc:hive2://<host>:<port>/<dbName>;transportMode=http;httpPath=<http_endpoint>; \
<otherSessionConfs>?<hiveConfs>#<hiveVars>

User Authentication
If configured in remote mode, HiveServer supports Kerberos, LDAP, Pluggable Authentication Modules (PAM), and
custom plugins for authenticating the JDBC user connecting to HiveServer. The format of the JDBC connection URL
for authentication with Kerberos differs from the format for other authentication models. The following table shows
the variables for Kerberos authentication.
User Authentication Variable | Description
principal | A string that uniquely identifies a Kerberos user.
saslQop | Quality of protection for the SASL framework. The level of quality is negotiated between the client and server during authentication. Used by Kerberos authentication with TCP transport.
user | Username for non-Kerberos authentication model.
password | Password for non-Kerberos authentication model.

The syntax for using these parameters is:

jdbc:hive2://<host>:<port>/<dbName>;principal=<HiveServer2_kerberos_principal>;<otherSessionConfs>?<hiveConfs>#<hiveVars>

Transport Layer Security

HiveServer2 supports SSL and SASL QOP for transport-layer security. The format of the JDBC connection string for
SSL uses these variables:
SSL Variable | Description
ssl | Specifies whether to use SSL.
sslTrustStore | The path to the SSL TrustStore.
trustStorePassword | The password to the SSL TrustStore.

The syntax for using these parameters is:

jdbc:hive2://<host>:<port>/<dbName>; \
ssl=true;sslTrustStore=<ssl_truststore_path>;trustStorePassword=<truststore_password>; \
<otherSessionConfs>?<hiveConfs>#<hiveVars>

When using TCP for transport and Kerberos for security, HiveServer2 uses SASL QOP for encryption rather than SSL.
SASL QOP Variable       Description

principal               A string that uniquely identifies a Kerberos user.

saslQop                 The level of protection desired. For authentication, checksum, and
                        encryption, specify auth-conf. The other valid values do not provide
                        encryption.

The following example JDBC connection string for SASL QOP uses these variables:

jdbc:hive2://fqdn.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM;saslQop=auth-conf

The _HOST is a wildcard placeholder that gets automatically replaced with the fully qualified domain name (FQDN)
of the server running the HiveServer daemon process.
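
To make the _HOST substitution concrete, the following Python sketch mimics it on the client side. It is only an illustration of the idea, not the code Hive runs: the function expand_host_placeholder is hypothetical, and lowercasing the host name is an assumption of this sketch.

```python
def expand_host_placeholder(principal, server_fqdn):
    """Replace the _HOST placeholder in a Kerberos principal with a host name."""
    # A service principal has the form primary/instance@REALM.
    primary, slash, rest = principal.partition("/")
    instance, at, realm = rest.partition("@")
    if slash and instance == "_HOST":
        # Assumption: the host name is normalized to lowercase.
        instance = server_fqdn.lower()
    return f"{primary}{slash}{instance}{at}{realm}"


print(expand_host_placeholder("hive/_HOST@EXAMPLE.COM", "fqdn.example.com"))
```

A principal without the placeholder passes through unchanged.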

Communication encryption
Encryption between HiveServer2 and its clients is independent from Kerberos authentication. HiveServer supports the
following types of encryption between the service and its clients (Beeline, JDBC/ODBC):
• SASL (Simple Authentication and Security Layer)
• TLS/SSL (Transport Layer Security/Secure Sockets Layer)

TLS/SSL requires certificates. SASL QOP encryption does not. SASL QOP is aimed at protecting core Hadoop RPC
communications. SASL QOP might cause performance problems when handling large amounts of data. You can
configure HiveServer to support TLS/SSL connections from JDBC/ODBC clients using Cloudera Manager.

Client connections to HiveServer2 over TLS/SSL

A client connecting to HiveServer2 over TLS/SSL must access the trust store on HiveServer to establish a chain of
trust and verify server certificate authenticity. The trust store is typically not password protected. It might be
password protected to prevent its contents from being modified, but a password-protected trust store can still be
read without the password.
The client needs the path to the trust store when attempting to connect to HiveServer2 using TLS/SSL. You can
specify the trust store in one of the following ways:
• Pass the path to the trust store each time you connect to HiveServer in the JDBC connection string:

jdbc:hive2://fqdn.example.com:10000/default;ssl=true;\
sslTrustStore=$JAVA_HOME/jre/lib/security/jssecacerts;trustStorePassword=extraneous
• Set the path to the trust store one time in the Java system javax.net.ssl.trustStore property:

java -Djavax.net.ssl.trustStore=/usr/java/jdk1.8.0_141-cloudera/jre/lib/security/jssecacerts \
-Djavax.net.ssl.trustStorePassword=extraneous MyClass \
"jdbc:hive2://fqdn.example.com:10000/default;ssl=true"

Enabling TLS/SSL for HiveServer

You can secure client-server communications using symmetric-key encryption in the TLS/SSL (Transport Layer
Security/Secure Sockets Layer) protocol. To encrypt data exchanged between HiveServer and its clients, you can use
Cloudera Manager to configure TLS/SSL.

Before you begin

• HiveServer has the necessary server key, certificate, keystore, and trust store set up on the host system.
• The hostname variable ($(hostname -f)-server.jks) is used with Java keytool commands to create the keystore, as
shown in this example:

$ sudo keytool -genkeypair -alias $(hostname -f)-server -keyalg RSA -keystore \
/opt/cloudera/security/pki/$(hostname -f)-server.jks -keysize 2048 -dname \
"CN=$(hostname -f),OU=dept-name-optional,O=company-name,L=city,ST=state,C=two-digit-nation" \
-storepass password -keypass password
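
If you script keystore creation, the same command can be assembled programmatically. The following Python sketch only builds and prints the argument list; it does not run keytool. The function keytool_genkeypair_args is a hypothetical helper, and the distinguished-name fields are the same placeholders used in the example above.

```python
import shlex
import socket


def keytool_genkeypair_args(fqdn, store_password, pki_dir="/opt/cloudera/security/pki"):
    """Build the keytool argument list for a host-specific server keystore."""
    alias = f"{fqdn}-server"
    dname = (f"CN={fqdn},OU=dept-name-optional,O=company-name,"
             "L=city,ST=state,C=two-digit-nation")
    return [
        "keytool", "-genkeypair",
        "-alias", alias,
        "-keyalg", "RSA",
        "-keystore", f"{pki_dir}/{alias}.jks",
        "-keysize", "2048",
        "-dname", dname,
        "-storepass", store_password,
        # The key password is set to the same value as the keystore password.
        "-keypass", store_password,
    ]


# Print a shell-quoted command line for review before running it by hand.
print(shlex.join(keytool_genkeypair_args(socket.getfqdn(), "password")))
```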

About this task

On the Beeline command line, the JDBC URL requirements include specifying ssl=true;sslTrustStore=<path_to_truststore>.
Truststore password requirements depend on the version of Java running in the cluster:
• Java 11: The truststore format has changed to PKCS12 and the truststore password is required; otherwise, the
connection fails.
• Java 8: The truststore password does not need to be specified.
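
The Java version rule above can be captured in a small pre-flight check. The following Python sketch is illustrative only: beeline_tls_params is a hypothetical helper, and treating Java versions later than 11 the same way as Java 11 is an assumption that this document does not state.

```python
def beeline_tls_params(truststore_path, java_major_version, truststore_password=None):
    """Return the TLS session parameters for a Beeline JDBC URL."""
    params = ["ssl=true", f"sslTrustStore={truststore_path}"]
    if truststore_password is not None:
        params.append(f"trustStorePassword={truststore_password}")
    elif java_major_version >= 11:
        # On Java 11 the connection fails without the truststore password.
        # Assumption: later Java versions behave the same way.
        raise ValueError("trustStorePassword is required on Java 11")
    return ";".join(params)


print(beeline_tls_params("/path/to/truststore.jks", 8))
print(beeline_tls_params("/path/to/truststore.jks", 11, "changeit"))
```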


Procedure
1. In Cloudera Manager, navigate to Clusters > Hive > Configuration.
2. In Filters, select HIVE for the scope.
3. Select Security for the category.
4. Accept the default Enable TLS/SSL for HiveServer2, which is checked for Hive (Service-Wide).
5. Enter the path to the Java keystore on the host system.
/opt/cloudera/security/pki/keystore_name.jks
6. Enter the password for the keystore you used on the Java keytool command-line when the key and keystore were
created.
The password for the keystore must match the password for the key.
7. Enter the path to the Java trust store on the host system.
8. Click Save Changes.
9. Restart the Hive service.
10. Construct a connection string for encrypting communications using TLS/SSL.

jdbc:hive2://<host>:<port>/<dbName>;ssl=true;sslTrustStore=<ssl_truststore_path>; \
trustStorePassword=<truststore_password>;<otherSessionConfs>?<hiveConfs>#<hiveVars>

Enabling SASL in HiveServer

You can provide a Quality of Protection (QOP) that is higher than the cluster-wide default using SASL (Simple
Authentication and Security Layer).

About this task

HiveServer2 by default uses hadoop.rpc.protection for its QOP value. Setting hadoop.rpc.protection to a higher level
than HiveServer (HS2) does not usually make sense. HiveServer ignores hadoop.rpc.protection in favor of
hive.server2.thrift.sasl.qop.
You can determine the value of hadoop.rpc.protection: In Cloudera Manager, click Clusters > HDFS > Configuration >
Hadoop, and search for hadoop.rpc.protection.
If you want to provide a higher QOP than the default, set one of the SASL Quality of Protection (QOP) levels as
shown in the following table:
QOP level       Description

auth            Default. Authentication only.

auth-int        Authentication with integrity protection. Signed message digests (checksums) verify the integrity of
                messages sent between client and server.

auth-conf       Authentication with confidentiality (transport-layer encryption) and integrity. Applicable only if
                HiveServer is configured to use Kerberos authentication.
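
The relationship between the two settings can be sketched in a few lines of Python. This is an illustration of the precedence described above, not HiveServer code. The function names are hypothetical, and the mapping from hadoop.rpc.protection values (authentication, integrity, privacy) to QOP levels comes from general Hadoop documentation rather than from this document. The sketch also assumes hadoop.rpc.protection holds a single value.

```python
QOP_LEVELS = ["auth", "auth-int", "auth-conf"]  # weakest to strongest

# hadoop.rpc.protection uses different names for the same three levels.
RPC_PROTECTION_TO_QOP = {
    "authentication": "auth",
    "integrity": "auth-int",
    "privacy": "auth-conf",
}


def effective_qop(hadoop_rpc_protection, hive_sasl_qop=None):
    """Return the QOP level HiveServer would use under the rules above."""
    if hive_sasl_qop is not None:
        if hive_sasl_qop not in QOP_LEVELS:
            raise ValueError(f"unknown QOP level: {hive_sasl_qop}")
        # hive.server2.thrift.sasl.qop wins over hadoop.rpc.protection.
        return hive_sasl_qop
    return RPC_PROTECTION_TO_QOP[hadoop_rpc_protection]


def is_stronger(qop_a, qop_b):
    """Return True if qop_a offers more protection than qop_b."""
    return QOP_LEVELS.index(qop_a) > QOP_LEVELS.index(qop_b)


print(effective_qop("authentication"))
print(effective_qop("authentication", "auth-conf"))
```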

Procedure
1. In Cloudera Manager, navigate to Clusters > Hive > Configuration.
2. In HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site, click + to add a property and value.
3. Specify the QOP auth-conf setting for the SASL QOP property.
For example,
Name: hive.server2.thrift.sasl.qop
Value: auth-conf
4. Click Save Changes.

5. Restart the Hive service.

6. Construct a connection string for encrypting communications using SASL.

jdbc:hive2://fqdn.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM;saslQop=auth-conf

The _HOST is a wildcard placeholder that gets automatically replaced with the fully qualified domain name
(FQDN) of the server running the HiveServer daemon process.

Securing an endpoint under AutoTLS

The default cluster configuration for HiveServer (HS2) with AutoTLS secures the HS2 WebUI Port, but not the
JDBC/ODBC endpoint.

About this task

The default cluster configuration for HS2 with AutoTLS will secure the HS2 Server WebUI Port, but not the JDBC/
ODBC endpoint.
Assumptions:
• Auto-TLS Self-signed Certificates.
• Proper CA Root certs eliminate the need for any of the following truststore actions.
When HS2 TLS is enabled (hive.server2.use.SSL=true), the auto-connect feature on gateway servers is not supported.
The auto-connect feature uses /etc/hive/conf/beeline-site.xml to automatically connect to Cloudera Manager
controlled HS2 services. Also, with hive.server2.use.SSL=true, ZooKeeper discovery mode is not supported because
the HS2 reference stored in ZooKeeper does not include ssl=true and the other TLS truststore references (self-signed)
needed to connect with TLS.
The beeline-site.xml file managed for gateways does not include ssl=true or a reference to a truststore that includes
a CA for the self-signed TLS certificate used by ZooKeeper or HiveServer.
The best practice, under the default configuration, is to have all external clients connect to Hive (JDBC/ODBC)
through the Apache Knox proxy. With TLS enabled via Auto-TLS with a self-signed certificate, you can use the jks file
downloaded from Knox as the client trusted CA for the Knox host. That certificate works only for Knox. Because the
Knox and HS2 TLS server certificates are from the same CA, Knox connects to HS2 without adjustments.
To connect through Knox:

Procedure
1. Configure the HS2 transport mode as http to support the Knox proxy interface.

jdbc:hive2://<host>:8443/;ssl=true;\
transportMode=http;httpPath=gateway/cdp-proxy-api/hive;\
...

The TLS Public Certificate in <path>/bin/certs/gateway-client-trust.jks will not work.

2. Build a TLS Public Certificate from the self-signed root CA used for the cluster in Cloudera Manager.

keytool -import -v -trustcacerts -alias home90-ca -file \
/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_cacerts.pem \
-keystore <my_cacert>.jks

3. Connect to HS2 on the Beeline command line, using the -u option.

hive -u "jdbc:hive2://<HS2 host>:10001/default;ssl=true;\
transportMode=http;httpPath=cliservice;\
principal=hive/_HOST@<realm>;user=<user name>;\
sslTrustStore=<path>/certs/home90_cacert.jks;\
trustStorePassword=changeit"

The httpPath default is configured in Cloudera Manager. The sslTrustStore is required if you are using a
self-signed certificate.

Securing Hive metastore

Cloudera recommends using Apache Ranger policies to secure Hive data in Hive metastore. You need to perform
a few actions to prevent users from bypassing HiveServer to access the Hive metastore and the Hive metastore
database.

Procedure
1. Add a firewall rule on the metastore service host to allow access to the metastore port only from the HiveServer2
host. You can do this using iptables.
2. Grant access to the metastore database only from the metastore service host.
For example, in MySQL: GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'metastorehost'; where
metastorehost is the host where the metastore service is running.
3. Make sure users who are not administrators cannot log into the HiveServer host.
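
As a sketch of the firewall rule in step 1, the following Python snippet generates a pair of iptables commands and prints them for review; it does not apply them. The function metastore_firewall_rules is a hypothetical helper, the host name is a placeholder, and port 9083 is assumed here as the metastore port, so confirm the port configured for your cluster. Rule order matters: the ACCEPT rule must come before the DROP rule, and any existing rules in the INPUT chain can change the outcome.

```python
import shlex


def metastore_firewall_rules(hiveserver_host, metastore_port=9083):
    """Return iptables commands that admit only HiveServer2 to the metastore port."""
    port = str(metastore_port)
    return [
        # Accept metastore traffic that originates from the HiveServer2 host.
        ["iptables", "-A", "INPUT", "-p", "tcp", "-s", hiveserver_host,
         "--dport", port, "-j", "ACCEPT"],
        # Drop every other connection attempt to the metastore port.
        ["iptables", "-A", "INPUT", "-p", "tcp", "--dport", port, "-j", "DROP"],
    ]


for rule in metastore_firewall_rules("hs2.example.com"):
    print(shlex.join(rule))
```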

Token-based authentication for Cloudera Data Warehouse integrations

Using a token, you can sign on to use Hive and Impala in Cloudera Data Warehouse for a period of time instead
of entering your single sign-on (SSO) credentials every time you need to run a query. This feature is in a technical
preview state. Contact your account team for more information.

Activating the Hive Web UI

The HiveServer2 GUI/Web UI does not display active client connections after you enable Kerberos. You must correct
this problem, which leads to problems getting a Kerberos ticket for browser clients.

About this task

The HiveServer2 GUI/Web UI does not display active client connections after you enable Kerberos. This issue occurs
when SPNEGO authentication is disabled, and it leads to problems getting a Kerberos ticket for browser clients.

Procedure
1. In Cloudera Manager, go to Clusters > Hive-on-Tez > Configuration.
2. Search for HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml.
3. Click + and add the following property and value: hive.server2.webui.spnego.keytab = hive.keytab
4. Click + and add the following property and value: hive.server2.webui.spnego.principal = HTTP/_HOST@<REALM NAME>
5. Click + and add the following property and value: hive.server2.webui.use.spnego = true
6. Click + and add the following property and value: hive.users.in.admin.role = [***USERNAME1,USERNAME2,…***]
Note: [***USERNAME1,USERNAME2,…***] is the comma-separated list of users who want to access
historic query detail from the web UI.
7. Save changes, and restart Hive-on-Tez.

The Hive Web UI shows active client connections.
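
If you keep configuration under version control, the four properties from the procedure can also be rendered as hive-site.xml property elements. The following Python sketch shows one way to generate them. The function safety_valve_snippet is a hypothetical helper, and the realm EXAMPLE.COM and the user names user1 and user2 are placeholders. Whether you can paste such XML directly into the safety valve depends on your Cloudera Manager version, so treat that as an assumption.

```python
import xml.etree.ElementTree as ET


def safety_valve_snippet(properties):
    """Render name/value pairs as hive-site.xml <property> elements."""
    chunks = []
    for name, value in properties.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


spnego_properties = {
    "hive.server2.webui.spnego.keytab": "hive.keytab",
    "hive.server2.webui.spnego.principal": "HTTP/_HOST@EXAMPLE.COM",  # placeholder realm
    "hive.server2.webui.use.spnego": "true",
    "hive.users.in.admin.role": "user1,user2",  # placeholder user names
}
print(safety_valve_snippet(spnego_properties))
```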
