Hive

Steps to prepare and connect to Hive

Lightup account setup

Lightup needs a user account with SELECT privileges on every data object that you want Lightup to see. Hive handles authorization separately from authentication, and you do not need to grant privileges with explicit DDL.

Kerberos authentication settings

hive.server2.authentication.kerberos.principal – Kerberos principal for server.

hive.server2.authentication.kerberos.keytab – Keytab for server principal.
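Both properties are normally set in the server's hive-site.xml. As a rough sketch, the following Python reads them out of hive-site.xml text so you can confirm the values before entering them in Lightup. The sample XML, the principal hive/_HOST@EXAMPLE.COM, the keytab path, and the helper name read_kerberos_settings are illustrative assumptions, not values from your cluster.

```python
import xml.etree.ElementTree as ET

# Illustrative hive-site.xml content; the principal and keytab path are placeholders.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@EXAMPLE.COM</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.keytab</name>
    <value>/etc/security/keytabs/hive.service.keytab</value>
  </property>
</configuration>
"""

KERBEROS_KEYS = (
    "hive.server2.authentication.kerberos.principal",
    "hive.server2.authentication.kerberos.keytab",
)


def read_kerberos_settings(xml_text: str) -> dict:
    """Return the Kerberos principal and keytab properties found in hive-site.xml text."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name in KERBEROS_KEYS:
            settings[name] = prop.findtext("value", default="").strip()
    return settings


if __name__ == "__main__":
    for key, value in read_kerberos_settings(SAMPLE_HIVE_SITE).items():
        print(f"{key} = {value}")
```

To check a real configuration, pass the contents of your own hive-site.xml instead of the sample string.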

Connect to a Hive datasource

  1. In the left pane, open a workspace menu and select Datasources.
  2. On the main page, select Create Datasource +.
  3. Enter a Datasource Name, and then for Connector Type select Hive.
  4. Under Configure connector, provide the following inputs:
    • Auth Type - Specify No Auth or Kerberos (whichever your Hive database is set up to use).
      • No Auth - No identity service is used. Lightup credentials are passed directly to the database.
      • Kerberos - Use a Kerberos key pair/ticketing service. You may be prompted to enter information, especially if it's the first time you're connecting Lightup.
    • Hive Host - Enter the IP address and port for the Hive host.
  5. After entering the required settings and any optional settings that apply, select Test Connection below the Configure connector section.
  6. After a successful connection test, select Save.
  7. Your new datasource appears in the list of available datasources. By default, datasources are listed in alphabetical order, so you might have to scroll or change the sort order to see your new datasource.
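
If Test Connection fails, it can help to first confirm that the Hive host and port are reachable over the network at all. The following is a minimal Python sketch of such a check; the function name can_reach and the address 192.0.2.10 are placeholders, and port 10000 is only the common HiveServer2 default, so your port may differ. It tests TCP reachability only and says nothing about authentication.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address; replace with the IP address and port you enter for Hive Host.
    print(can_reach("192.0.2.10", 10000))
```

Run the check from a machine on the same network path as Lightup, because reachability from your own workstation may differ.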

Kerberos-only settings

You can find the values for the following settings in your /etc/hosts and /etc/krb5.conf files.

  • Hive Master Node Hostname
  • Hive Cluster Realm
  • Hive Master Node Domains
  • Kerberos Authentication Server IP
  • Kerberos Authentication Server Hostname
  • Kerberos Authentication Server Realm
  • Kerberos Authentication Server Domains
  • Kerberos Principal (for information about this setting, see hive.server2.authentication.kerberos.principal)
  • BASE64-Encoded Kerberos Keytab File (for information about this setting, see hive.server2.authentication.kerberos.keytab)
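
The last setting expects the keytab as Base64 text instead of a binary file. One way to produce that text is sketched below in Python. The helper name encode_keytab is hypothetical, and the sketch assumes that a standard, single-line Base64 string is the form Lightup accepts.

```python
import base64
from pathlib import Path


def encode_keytab(keytab_path: str) -> str:
    """Return the keytab file's bytes as a single-line Base64 string."""
    raw = Path(keytab_path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

For example, encode_keytab("/etc/security/keytabs/hive.service.keytab") would return the encoded text for a keytab at that path. The path is illustrative, so use the one from your hive.server2.authentication.kerberos.keytab setting. Treat the encoded string as a secret, exactly as you would the keytab itself.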

Advanced/Schema scan frequency

You can adjust how often schema scans run for a datasource.

  • In section 3 - Advanced, select a value for Schema scan frequency: Hourly, Daily, or Weekly.

Query Governance

Hive datasources support the Query History, Scheduling, Enable data storage, Maximum backfill duration, and Maximum distinct values settings. For steps, see Set query governance settings for a datasource.

Date/time data types

These Hive date/time data types are supported:

Object types

These Hive object types are supported:

  • Internal tables

Partitions

Hive datasources support partitions, multi-partitions, and partition time zones.