Hive

Steps to prepare and connect to Hive

Lightup account setup

Lightup needs a user account with SELECT privileges on all data objects that you want Lightup to see. Hive handles authorization separately from authentication, and no explicit DDL granting of privileges is needed.

Kerberos authentication settings

hive.server2.authentication.kerberos.principal – The Kerberos principal for the server.

hive.server2.authentication.kerberos.keytab – The keytab file for the server principal.
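
To look up the current values of these two properties, you can read them from the server's hive-site.xml. The following is a minimal sketch, assuming the standard Hadoop-style configuration layout (property elements with name and value children). The function name read_kerberos_settings and the sample principal and keytab path are hypothetical placeholders, not values from your cluster.

```python
import xml.etree.ElementTree as ET

# The two HiveServer2 Kerberos properties named in this section.
KERBEROS_KEYS = (
    "hive.server2.authentication.kerberos.principal",
    "hive.server2.authentication.kerberos.keytab",
)


def read_kerberos_settings(hive_site_xml: str) -> dict:
    """Return the Kerberos principal and keytab path found in hive-site.xml text."""
    root = ET.fromstring(hive_site_xml)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name in KERBEROS_KEYS:
            settings[name] = prop.findtext("value", default="").strip()
    return settings


# Hypothetical sample content; replace with the text of your own hive-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@EXAMPLE.COM</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.keytab</name>
    <value>/etc/security/keytabs/hive.service.keytab</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, value in read_kerberos_settings(SAMPLE).items():
        print(f"{key} = {value}")
```

If a property is missing from the file, it is simply absent from the returned dictionary, which makes it easy to spot a server that is not configured for Kerberos.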

Configure connector

  • Auth Type - Specify No Auth or Kerberos (whichever your Hive database is set up to use).
    • No Auth - No identity service is used. Lightup credentials are passed directly to the database.
    • Kerberos - Use a Kerberos key pair/ticketing service. You may be prompted to enter information, especially if it's the first time you're connecting Lightup.
  • Hive Host - Enter the IP address and port for the Hive host.
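
Before entering the Hive host value, it can help to confirm that the address and port are reachable from your network. The sketch below is one way to do that with a plain TCP connection attempt. The function name can_reach and the example address are hypothetical; port 10000 is the HiveServer2 default, but your deployment may use a different one. A successful connection shows only that the port is open, not that authentication will succeed.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hypothetical address; substitute your own Hive host and port):
# can_reach("10.0.0.12", 10000)
```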

Kerberos-only settings

You can find the values for the following settings in your /etc/hosts and /etc/krb5.conf files.

  • Hive Master Node Hostname
  • Hive Cluster Realm
  • Hive Master Node Domains
  • Kerberos Authentication Server IP
  • Kerberos Authentication Server Hostname
  • Kerberos Authentication Server Realm
  • Kerberos Authentication Server Domains
  • Kerberos Principal (for information about this setting, see hive.server2.authentication.kerberos.principal)
  • BASE64-Encoded Kerberos Keytab File (for information about this setting, see hive.server2.authentication.kerberos.keytab)
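
The keytab setting above expects the file contents in Base64 form. As a minimal sketch, the following reads a keytab file as raw bytes and returns a Base64 string. The function name encode_keytab is hypothetical, and the assumption that the field accepts a single unwrapped line is ours, so check the result against what the connector form accepts.

```python
import base64
import sys
from pathlib import Path


def encode_keytab(keytab_path: str) -> str:
    """Return the keytab file's bytes as a single-line Base64 string."""
    raw = Path(keytab_path).read_bytes()
    # b64encode does not insert line breaks, so the result is one line.
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Usage: python encode_keytab.py /path/to/your.keytab
    if len(sys.argv) == 2:
        print(encode_keytab(sys.argv[1]))
```

A keytab holds long-term credentials, so treat the encoded string with the same care as the file itself.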

Advanced/Schema scan frequency

You can adjust how often scans run for a datasource.

  • In section 3 - Advanced, select a value for Schema scan frequency: Hourly, Daily, or Weekly.

Query Governance

Hive datasources support the Query History, Scheduling, Enable data storage, Maximum backfill duration, and Maximum distinct values settings. For steps, see Set query governance settings for a datasource.

Date/time data types

These Hive date/time data types are supported:

Object types

These Hive object types are supported:

  • Internal tables

Partitions

Hive datasources support partitions, multi-partitions, and partition time zones.