Databricks

Steps to prepare and connect to Databricks

Prepare in Databricks

To create a Databricks connection, you'll need an access token and the server hostname and path.

Step 1: Get a Databricks personal access token

  1. Generate a personal access token in Databricks.
  2. Copy the generated token and store it in a secure location.
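
If you prefer to create the token programmatically, the sketch below uses the Databricks SDK for Python (`databricks-sdk`). It assumes the SDK can already authenticate to your workspace (for example through an existing token or CLI profile); the comment and lifetime values are placeholders.

```python
# Sketch: creating a personal access token with the Databricks SDK for Python,
# as an alternative to generating one in the UI. Assumes the SDK can already
# authenticate to the workspace; comment and lifetime are placeholder values.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up credentials from the environment or a CLI profile

# Create a token that expires in 90 days; the comment helps you find it later.
token = w.tokens.create(
    comment="lightup-connector",
    lifetime_seconds=90 * 24 * 60 * 60,
)

print(token.token_value)  # copy this value and store it in a secure location
```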

Step 2: Get your Server Hostname and HTTP Path

Do one of the following:

  • If you're using a SQL warehouse:

    1. Click SQL Warehouses in the left nav.
    2. Choose a warehouse to connect to.
    3. Navigate to the Connection Details tab.
    4. Copy the Server Hostname and the HTTP Path.
  • If you're using a compute cluster:

    1. Click Compute in the left nav.
    2. Choose a cluster to connect to.
    3. Navigate to Advanced Options.
    4. Click on the JDBC/ODBC tab.
    5. Copy the Server Hostname and the HTTP Path.
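
Before entering these values in Lightup, you can optionally confirm that the Server Hostname, HTTP Path, and token work together. The sketch below uses the `databricks-sql-connector` Python package; the hostname, path, and token shown are placeholders.

```python
# Sketch: verifying the Server Hostname, HTTP Path, and personal access token.
# Assumes the databricks-sql-connector package is installed; all connection
# values below are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # Server Hostname
    http_path="/sql/1.0/warehouses/abc123def456",                  # HTTP Path
    access_token="dapiXXXXXXXXXXXXXXXX",                           # personal access token
) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchall())  # a single row means the details are correct
```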

Configure connector

  • Workspace URL - The Server Hostname of the compute cluster or SQL warehouse
  • HTTP Path - The HTTP path for the compute cluster or the SQL Warehouse
  • Token - The Databricks personal access token
  • Catalog (Optional) - To use a specific Databricks Unity Catalog, enter the catalog name here. If you leave this blank, Lightup connects to the default Databricks catalog, hive_metastore.
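
As a rough illustration of what the Catalog setting corresponds to, recent versions of the `databricks-sql-connector` Python package accept an optional catalog argument when opening a session; `my_unity_catalog` and the connection details below are placeholders.

```python
# Sketch: scoping a session to a specific Unity Catalog, analogous to the
# optional Catalog setting above. Assumes a recent databricks-sql-connector;
# my_unity_catalog and the connection values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapiXXXXXXXXXXXXXXXX",
    catalog="my_unity_catalog",  # omit to fall back to the default, hive_metastore
) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT current_catalog()")
        print(cur.fetchone())  # confirms which catalog the session is scoped to
```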

Advanced

  • Schema scan frequency - Set how often scans run for the datasource: Hourly, Daily, or Weekly.

Query Governance

Databricks datasources support the Query History, Scheduling, Enable data storage, Maximum backfill duration, and Maximum distinct values settings. For steps, see Set query governance settings for a datasource.

Date/time data types

These Databricks date/time data types are supported:

Object types

These Databricks object types are supported:

  • Tables
  • Views

Partitions

Databricks datasources support partitions, multi-partitions, and partition time zones.
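
For reference, the sketch below creates a date-partitioned Delta table of the kind these partition settings apply to, using the `databricks-sql-connector` Python package; the table name and connection details are placeholders.

```python
# Sketch: a date-partitioned Delta table of the kind the partition settings
# above apply to. Assumes databricks-sql-connector; the table name and
# connection values are placeholders.
from databricks import sql

ddl = """
CREATE TABLE IF NOT EXISTS hive_metastore.default.events (
    event_id   STRING,
    event_ts   TIMESTAMP,
    event_date DATE
)
USING DELTA
PARTITIONED BY (event_date)
"""

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123def456",
    access_token="dapiXXXXXXXXXXXXXXXX",
) as conn:
    with conn.cursor() as cur:
        cur.execute(ddl)
        # DESCRIBE DETAIL reports partitionColumns for the Delta table
        cur.execute("DESCRIBE DETAIL hive_metastore.default.events")
        print(cur.fetchall())
```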