Azure blob storage (Beta)

If you are interested in using this feature, please contact Lightup Support.


Because this feature is currently in Beta, the following conditions must be met to use Lightup for file inspection:

  1. Lightup currently supports only Parquet files.
  2. Files must be smaller than 4 GB.
  3. Each file must contain all data relating to a specific interval.
    Example: If the file is intended to represent hourly data, all of the data for that hour must be in a single file rather than spread across multiple files.
  4. The folder structure must follow a date-time-oriented, iterative naming convention (data_subject/%Y/%m/%d/%H/file_name.parquet).
    Example: purchase_data/2024/10/11/01/datapoints.parquet
  5. Only raw Parquet files are supported (no zip, gz, gzip, etc.).
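
The naming and size conditions above can be checked before files are uploaded. The following Python sketch is illustrative only and is not part of Lightup: the function names `build_blob_path` and `check_blob` are hypothetical, and it assumes that "4 GB" means 4 GiB and that the interval is hourly, as in the example path.

```python
import re
from datetime import datetime
from typing import List

# Assumption: the 4 GB limit is interpreted as 4 GiB.
MAX_FILE_BYTES = 4 * 1024**3

# Mirrors data_subject/%Y/%m/%d/%H/file_name.parquet
PATH_PATTERN = re.compile(
    r"^(?P<subject>[^/]+)/(?P<year>\d{4})/(?P<month>\d{2})/"
    r"(?P<day>\d{2})/(?P<hour>\d{2})/(?P<name>[^/]+\.parquet)$"
)


def build_blob_path(data_subject: str, interval_start: datetime, file_name: str) -> str:
    """Build a blob path for the hour that starts at interval_start."""
    return f"{data_subject}/{interval_start.strftime('%Y/%m/%d/%H')}/{file_name}"


def check_blob(path: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    match = PATH_PATTERN.match(path)
    if match is None:
        problems.append("path does not match data_subject/%Y/%m/%d/%H/file_name.parquet")
    else:
        try:
            # Reject impossible values such as month 13 or hour 24.
            datetime(int(match["year"]), int(match["month"]),
                     int(match["day"]), int(match["hour"]))
        except ValueError:
            problems.append("path contains an invalid date or hour")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file is not smaller than 4 GB")
    return problems


path = build_blob_path("purchase_data", datetime(2024, 10, 11, 1), "datapoints.parquet")
print(path)
print(check_blob(path, size_bytes=1024))
```

The check covers only the path layout and file size. It does not verify that a file holds all of the data for its interval, which depends on how the files are produced.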

Lightup account setup

Lightup needs an Azure account with read access to the data you want to monitor. You can authenticate with a Shared Key or a Managed Identity, and assign the built-in Storage Blob Data Reader role to grant Lightup sufficient privileges.

Azure Shared Key Services

You can use a Shared Key to access Azure Blob Storage. An Azure Shared Key can take numerous formats and contents. All the information you need to use this authentication method, including syntax examples, is available on the Microsoft page Authorize with Shared Key.
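
For orientation, the core signing step of Shared Key authorization can be sketched in Python. This is a minimal illustration, not Lightup's implementation: the function name `shared_key_authorization` is hypothetical, and the string-to-sign is taken as an input because its exact layout depends on the request and is defined on the Microsoft page. The sketch assumes the account key is Base64-encoded, as Azure issues it.

```python
import base64
import hashlib
import hmac


def shared_key_authorization(account_name: str, account_key_b64: str,
                             string_to_sign: str) -> str:
    """Return an Authorization header value of the form
    'SharedKey <account>:<signature>'.

    The signature is the Base64-encoded HMAC-SHA256 of the UTF-8
    string-to-sign, keyed with the Base64-decoded account key.
    """
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {account_name}:{signature}"


# Placeholder values only; a real string-to-sign must follow Microsoft's format.
demo_key = base64.b64encode(b"not-a-real-key").decode("ascii")
print(shared_key_authorization("myaccount", demo_key, "GET\n\n"))
```

In practice, the Azure SDKs build the string-to-sign and the header for you, so hand-written signing is rarely needed.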

Managed Identity access

If you decide to use a Managed Identity for Lightup, consider setting up a user-assigned managed identity, because this lets you grant the identity access to multiple Azure resources.

Assign the built-in Azure Storage Blob Data Reader role

You can use Azure's built-in Storage Blob Data Reader role to grant sufficient privileges to the Lightup Azure account. The role definition is as follows:

{
  "assignableScopes": [
    "/"
  ],
  "description": "Allows for read access to Azure Storage blob containers and data",
  "id": "/subscriptions/{subscriptionId}/providers/Microsoft.Authorization/roleDefinitions/2a2b9908-6ea1-4ae2-8e65-a410df84e7d1",
  "name": "2a2b9908-6ea1-4ae2-8e65-a410df84e7d1",
  "permissions": [
    {
      "actions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/read",
        "Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action"
      ],
      "notActions": [],
      "dataActions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
      ],
      "notDataActions": []
    }
  ],
  "roleName": "Storage Blob Data Reader",
  "roleType": "BuiltInRole",
  "type": "Microsoft.Authorization/roleDefinitions"
}
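
When you assign this role, you typically need two resource identifiers: the role definition ID and the scope the assignment applies to. The following Python sketch shows how those strings are composed, assuming you scope the assignment to a single container. The function names and all argument values are hypothetical placeholders; substitute your own subscription, resource group, storage account, and container.

```python
# GUID of the built-in Storage Blob Data Reader role, as in the definition above.
STORAGE_BLOB_DATA_READER_ID = "2a2b9908-6ea1-4ae2-8e65-a410df84e7d1"


def role_definition_id(subscription_id: str) -> str:
    """Full resource ID of the Storage Blob Data Reader role definition."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleDefinitions/{STORAGE_BLOB_DATA_READER_ID}"
    )


def container_scope(subscription_id: str, resource_group: str,
                    storage_account: str, container: str) -> str:
    """Scope string that limits a role assignment to one blob container."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
        f"/blobServices/default/containers/{container}"
    )


# Placeholder values for illustration only.
print(role_definition_id("00000000-0000-0000-0000-000000000000"))
print(container_scope("00000000-0000-0000-0000-000000000000",
                      "example-rg", "examplestorage", "purchase-data"))
```

Scoping to a container rather than to the whole storage account or subscription keeps the read access as narrow as possible. A broader scope also works if Lightup needs to read several containers.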

Connect to an Azure blob storage datasource

  1. In the left pane, open a workspace menu and select Datasources.
  2. On the main page, select Create Datasource +.
  3. Enter a Datasource Name, then for Connector Type select Azure blob storage.
  4. Under Configure connector, provide the following inputs:
    • If you're using a Managed Identity for access control, select Managed Identity.
    • For Account Name, enter the name of the Lightup Azure user account.
    • If present, enter the user Account Key.
  5. After entering the required settings and any optional settings that apply, below the Configure connector section select Test Connection.
  6. After a successful connection test, select Save.
  7. Your new datasource appears in the list of available datasources. By default, these are listed in alphabetical order, so you might have to scroll or change the sort order to see your new datasource.

Note that Azure blob storage datasources use virtual tables rather than tables.

Query Governance

Azure blob storage datasources support the Scheduling, Enable data storage, Maximum backfill duration, and Maximum distinct values settings. For steps, see Set query governance settings for a datasource.

Metadata metrics

Azure blob storage datasources currently do not support metadata metrics.

Date/time data types

These Azure blob storage date/time data types are supported:

  • TIMESTAMP WITH TIME ZONE

Object types

These Azure blob storage object types are supported:

Partitions

Azure blob storage datasources support partitions.

Deep metrics

Azure blob storage datasources support all deep metrics except row-by-row and SQL metrics. However, the following features are not supported: