Set up a datasource, configure a table, add a metric, attach a monitor, and create an alerting channel
- You'll need a user account for a Lightup deployment.
- You must be a member of a workspace and have the Workspace Admin role.
- You'll need a Lightup user account in any datasource you want to monitor. For steps to set up the Lightup account, go to the list of datasource connectors.
Connect a datasource
- In the left pane, open your workspace menu and then select Datasources.
- Select + to add a new datasource. Provide the required information for your data's connector, and then select Test Connection at the bottom left. When the test succeeds, select Save.
For more help, go to Manage datasources, or see the support content for your database software.
Some settings are only available when connecting to a specific type of datasource - for example, Databricks. Help configuring these settings is on the connector subpage for the datasource type, which is on Manage datasources.
Activate a table and enable metrics
Before you can create or enable metrics for a table, the table and its parent schema must be active. When data assets are active, they appear in the Explorer tree.
When you activate columns in a table, the number of columns you see may be one less than the count of columns displayed in the Explorer summary for the table. If so, the table has a timestamp column, which is used in many metric calculations and cannot itself be the target data asset for metrics.
- On the Explorer tab, select a datasource in the Explorer tree.
- In the right pane, on the Actions menu, select Manage Metrics.
- The Manage metrics modal opens, listing the schemas for the selected datasource. To activate a schema, slide its Active toggle to the right. It will take a few moments for the schema to become active. When a schema is active, you can activate its tables.
- In the Schema column of the Manage metrics datasource modal, select an active schema to open it in the Manage metrics modal, which then displays a list of the schema's tables. To enable metrics for a table, you must first activate it: slide the Active toggle to the right. You can also use the checkboxes in the far-left column to select multiple tables and perform bulk actions.
- When you first activate a table, the Manage Table Settings modal opens so you can set up the table's default configuration. This default configuration provides values for the table's metrics. Review the values and make any changes. See the next section for detailed help.
Query Scope and Data Collection Schedule
Two inheritable settings aren't covered in this Quick start:
- Query Scope changes the amount of data that's queried during metric data collection. The default value, Incremental, means that only rows with timestamps from the most recent aggregation window are included in the query. If set to Full Table, all rows are included every time. Full Table scope supports tables that don't have time-based columns. Query Scope can't be changed for Active tables.
- Data Collection Schedule controls whether metric data collection happens on a regular schedule (the default value, Scheduled) or only when triggered by API (the value, Triggered).
For more information about using these settings, see Manage Table Settings.
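To make the difference between the two Query Scope values concrete, here is a minimal Python sketch. It is not Lightup's implementation: the function name `rows_in_scope`, the scope labels, the in-memory row format, and the half-open window are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

def rows_in_scope(rows, query_scope, window_start=None, window_end=None):
    """Return the rows one collection run would read (illustrative only).

    rows: list of dicts with a "ts" datetime key (hypothetical row format).
    query_scope: "incremental" or "full_table" (hypothetical labels).
    """
    if query_scope == "full_table":
        # Full Table: every row is read on every run, so no timestamp is needed.
        return list(rows)
    if query_scope == "incremental":
        # Incremental: only rows whose timestamp falls in the latest window.
        return [r for r in rows if window_start <= r["ts"] < window_end]
    raise ValueError(f"unknown query scope: {query_scope}")

rows = [
    {"id": 1, "ts": datetime(2024, 3, 4, 23, 30)},
    {"id": 2, "ts": datetime(2024, 3, 5, 0, 15)},
    {"id": 3, "ts": datetime(2024, 3, 5, 0, 45)},
]
window_start = datetime(2024, 3, 5, 0, 0)
window_end = window_start + timedelta(hours=1)

incremental = rows_in_scope(rows, "incremental", window_start, window_end)
full = rows_in_scope(rows, "full_table")
print([r["id"] for r in incremental])  # [2, 3]
print([r["id"] for r in full])         # [1, 2, 3]
```

The sketch shows why Full Table scope works for tables without time-based columns: it never looks at a timestamp, at the cost of reading every row each time.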
Set the default configuration for a table
- Specify an Aggregation Interval to set how often the metric's value is aggregated. For daily, metric values are calculated over the 24-hour period starting at 12AM. For hourly, metric values are calculated at the top of each hour. For weekly, metric values are calculated over the week starting at 12AM on Monday.
- Select the Aggregation Time Zone. This specifies the time zone that is used for metric aggregation. For example, if Aggregation Interval is daily, the data will be aggregated for the 24-hour period starting at 12AM in the Aggregation Time Zone.
- Evaluation Delay is a blackout period during which data is considered not stable or not ready for running data quality checks. For example, if a metric has an Evaluation Delay of one day, data with a timestamp in the past 24 hours is ignored. Set Evaluation Delay to a value which represents the time period required for your data to be guaranteed stable.
- Choose the timestamp column for your table. You can also create a virtual timestamp if no suitable column is ready to use. The aggregation period for a metric is based on the value in the timestamp column, translated into the time zone specified by the Aggregation Time Zone. As described above, only data with timestamps prior to [now] - Evaluation Delay are considered.
- Some timestamp columns don't include time zone information, so you might need to specify the Time Zone where the data was written.
- Under Partitions, if your table has time-based partitions, you can specify the partition column and its format so that Lightup can use the partition to improve the metric's query performance. Specify the format using Python format codes. If your table doesn't have any partitions, the Partitions section doesn't appear in the Manage Table Settings modal.
- Click Confirm and Save to save your settings. Now that you've activated the table, you can enable its autometrics.
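The way these settings interact can be sketched in a few lines of Python. This is only an illustration of the rules described above, not Lightup's code. The function names are hypothetical, and a fixed UTC-8 offset stands in for an Aggregation Time Zone.

```python
from datetime import datetime, timedelta, timezone, tzinfo

def daily_window(ts: datetime, aggregation_tz: tzinfo):
    """Return the [start, end) daily window containing ts, in the aggregation time zone."""
    local = ts.astimezone(aggregation_tz)
    start = local.replace(hour=0, minute=0, second=0, microsecond=0)  # 12AM local
    return start, start + timedelta(days=1)

def is_past_evaluation_delay(ts: datetime, now: datetime, evaluation_delay: timedelta) -> bool:
    """True when ts is earlier than now - Evaluation Delay, so the row is considered."""
    return ts < now - evaluation_delay

def partition_value(ts: datetime, partition_format: str) -> str:
    """Render a timestamp using Python format codes, as a partition format would."""
    return ts.strftime(partition_format)

# Assumption: a fixed offset stands in for a named Aggregation Time Zone.
aggregation_tz = timezone(timedelta(hours=-8))
event = datetime(2024, 3, 5, 2, 30, tzinfo=timezone.utc)

start, end = daily_window(event, aggregation_tz)
print(start.isoformat())  # 2024-03-04T00:00:00-08:00

now = datetime(2024, 3, 6, 12, 0, tzinfo=timezone.utc)
one_day = timedelta(days=1)
print(is_past_evaluation_delay(event, now, one_day))                      # True
print(is_past_evaluation_delay(now - timedelta(hours=11), now, one_day))  # False

print(partition_value(event, "%Y-%m-%d"))  # 2024-03-05
```

Note that the event written at 02:30 UTC on March 5 lands in the March 4 daily window once it is translated into the aggregation time zone. For a named zone with daylight saving time, a `zoneinfo.ZoneInfo` object can be passed in place of the fixed offset.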
Enable table autometrics
The first time you activate a specific table and close the Manage Table Settings modal, the schema's Manage metrics modal remains open so you can enable table autometrics (pre-built metrics for common data quality scenarios). Each autometric measures one dimension of data quality: accuracy, completeness, or timeliness. Lightup offers three autometrics for tables: Activity, Data Delay, and Data Volume. These autometrics provide valuable insights into your data's behavior that can help you understand table-level issues, such as late or missing data.
Manage schema metrics
- To bulk-select tables, select the checkbox just left of the Active column heading, and then toggle settings for all of them at once.
- Select the name of an active table to open the Manage metrics modal for that table so you can work with its columns' autometrics.
Activity is the only autometric available for all three kinds of data asset (schema, table, and column). It measures changes in the definition of the asset - tables added or removed from a schema; columns added, altered (data type), or deleted from a table; category values added or removed from a column. This measures the accuracy data quality dimension.
Here at the table level, Activity detects changes to the table's columns: columns being added, altered, or deleted. When there is no activity, the bars are blue. When the columns have changed from the previous measurement (for example, columns were added or dropped), the bars are orange. Hover over an orange bar to get more details.
Data Delay measures the time elapsed since the last data timestamp (which represents the last event time); this is also known as "lag" and is a measure of the timeliness data quality dimension. It's a good way to confirm that your pipeline is operating as expected. For example, if you have a data pipeline that loads data once a day and you look at Data Delay aggregated hourly, you will see its lag go to its lowest value at the hour that data is loaded, and gradually get larger and larger until the next load of data, as depicted in the following image.
The X axis of the Data Delay chart shows current time, in your own time zone. The Y axis reflects the time elapsed since the last data timestamp. In the above image, this lag is zero from 10AM to around 10PM, indicating that data is being loaded. Starting at about 10PM, the lag grows steadily until 10AM arrives again, when a new data load starts and the lag returns to zero. Note the Live indicators: one next to the metric name that indicates the metric is currently active, and another on the Monitor menu that indicates a monitor is currently active.
Most metrics show timestamp values on the X axis. Because the purpose of the Data Delay metric is to show lag with respect to real time, it shows real time on the X axis, with lag values on the Y axis. Also, this metric does not include the value of Evaluation Delay in its calculations: all data that is available is considered when calculating lag.
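As a rough illustration of the lag calculation described above, the following Python sketch computes the time elapsed since the latest data timestamp at a series of hourly checks. The function name and sample data are hypothetical, and this is not Lightup's implementation.

```python
from datetime import datetime, timedelta

def data_delay_seconds(event_times, evaluated_at):
    """Seconds elapsed between evaluated_at and the latest event time seen so far."""
    seen = [t for t in event_times if t <= evaluated_at]
    if not seen:
        return None  # no data yet, so lag is undefined
    return (evaluated_at - max(seen)).total_seconds()

# Hypothetical pipeline that loads data once a day at 10:00.
load = datetime(2024, 3, 5, 10, 0)
event_times = [load - timedelta(days=1), load]

# Hourly checks from 09:00 to 12:00 on the day of the second load.
checks = [load + timedelta(hours=h) for h in range(-1, 3)]
lags = [data_delay_seconds(event_times, t) for t in checks]
print(lags)  # [82800.0, 0.0, 3600.0, 7200.0]
```

The lag is largest just before the load (23 hours), drops to zero when the load arrives, and then grows again hour by hour, which is the sawtooth shape the chart shows.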
Data Volume measures the volume of data loaded into the table, measured by row count. For example, for an hourly metric, data volume is row count per hour. Measuring data volume enables you to confirm that the volume of data arriving is as expected. This measures the completeness data quality dimension.
The X axis of the Data Volume profile represents timestamp time (converted into your time zone) and the Y axis represents the number of rows loaded. The aggregation is hourly in this example. This profile shows data being loaded throughout the day, with some variation based on time of day.
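For an hourly metric, the row-count idea behind Data Volume can be sketched as follows. This is an illustration with a hypothetical function name and made-up timestamps, not the query Lightup runs.

```python
from collections import Counter
from datetime import datetime

def hourly_row_counts(timestamps):
    """Count rows per hour by truncating each row timestamp to the top of its hour."""
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return dict(sorted(counts.items()))

# Hypothetical row timestamps from the table's timestamp column.
timestamps = [
    datetime(2024, 3, 5, 9, 5),
    datetime(2024, 3, 5, 9, 40),
    datetime(2024, 3, 5, 10, 15),
]
volume = hourly_row_counts(timestamps)
for hour, row_count in volume.items():
    print(hour.isoformat(), row_count)
# 2024-03-05T09:00:00 2
# 2024-03-05T10:00:00 1
```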
Activate a column and enable its autometrics
After you activate a table, you can activate its columns and enable relevant autometrics. At the column level, the available types of autometrics vary according to the column's data type, but may include Activity, Distribution, and Null Percent. You use the Manage metrics modal to adjust these settings, as depicted below.
- To bulk-adjust columns, select the checkbox just left of the Active heading, and then toggle settings for all of them at once.
- Activity measures changes to the count of categories (unique column values). It is not available for numeric data types, and it can't be enabled until Distribution is enabled for the same column, because Distribution enumerates the column's categories.
- Distribution measures changes to the frequencies of unique column values. For example, a column of restaurant names in a table of food transactions likely has relatively few restaurant names, each appearing in numerous transactions. In a stable enterprise those restaurants probably have relatively predictable levels of transactions, which can be seen as a frequency for each restaurant name - how often it appears, or how likely a transaction is to come from that restaurant. You could set up a Distribution metric with defined normal variability, and then add a monitor to get alerts when the metric exceeds normal bounds.
- Null Percent measures changes to the percent of null values. For example, a column might not require a value for every row, and its data might have a fairly stable ratio or number of null values. In some cases the presence of null has meaning, similar to an actual value. For example, a transaction that didn't include a customerID could mean the customer isn't part of your loyalty program, or it might mean the cashier did not collect the information during the transaction. Analysis of a Null Percent autometric can help you determine which is the case.
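The quantities behind these three column autometrics can be sketched in Python. The function names and sample values are hypothetical, and excluding nulls from the frequency calculation is an assumption made for this example rather than documented Lightup behavior.

```python
from collections import Counter

def null_percent(values):
    """Percent of values that are null (None)."""
    if not values:
        return 0.0
    return 100.0 * sum(v is None for v in values) / len(values)

def distribution(values):
    """Relative frequency of each unique non-null value (assumption: nulls excluded)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return {}
    counts = Counter(non_null)
    return {value: count / len(non_null) for value, count in counts.items()}

# Hypothetical restaurant-name column from a table of food transactions.
restaurant = ["Luigi's", "Luigi's", "Taco Stop", None]

frequencies = distribution(restaurant)
print(null_percent(restaurant))  # 25.0
print(frequencies["Luigi's"])    # 0.6666666666666666
print(len(frequencies))          # 2 categories, the count that Activity tracks
```

Because the category count comes from the enumerated distribution, the sketch also shows why Distribution has to be enabled before Activity.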
Column autometrics inherit some configuration settings from their table.
- In the Explorer tree, browse to and select the table.
- In the right pane, on the Actions menu, select Manage Metrics. The Manage metrics modal opens and lists the table's columns.
- On the left side of the Manage metrics modal, slide the Active toggle to the right to activate the column in that row.
- To enable an autometric for an active column, slide the autometric's toggle to the right. Remember - the Activity autometric can't be enabled for a column unless the Distribution autometric is already enabled.
View metrics in Explorer
Now that you have set up a table with some autometrics, head over to Explorer and look at your data.
To search the Explorer tree, enter a search term in the Search Datasources box. The Explorer tree opens all nodes that contain matches for your search term. Searching for common terms in a workspace with many datasources may take a moment to get results.
- On the top menu bar, select the Explorer tab. The Explorer tree displays your datasources, with active datasources expanded to show their data assets.
- Select and expand the datasource to see the tables you activated and enabled autometrics for in the previous section.
- Choose one of the tables to view its autometrics charts in the pane to its right, as shown below.
Explorer always shows the past seven days of data. Viewing your metric in Explorer is a good way to manually monitor it to understand its behavior. If you want to see more than the seven days shown, preview the metric.
Edit a metric
- In the Explorer tree, select a table that has autometrics enabled. (These appear as charts in the right pane.)
- Find the chart for the metric you want to edit, select the vertical ellipsis in its top-right corner, and then select Edit. The next steps depend on the metric type; for more information, see Edit a metric.
Monitor your metrics
While you can easily view your metrics in Explorer, it's not a practical long-term solution for keeping an eye on things. For that, you set up monitors on your metrics. A monitor causes Lightup to generate an incident whenever the data for the monitored metric falls outside of expected values. You can specify these bounds explicitly or have Lightup's ML anomaly monitoring intelligence use historical trends to choose the bounds for you.
For example, let's set up a monitor for the Data Delay metric we reviewed in the preceding section.
- With the table selected in Explorer, on the Monitors menu for the metric select + Add. Note that the Monitors menu label always reflects the most relevant fact about the metric's monitors.
- The Add monitor modal opens so you can choose Threshold or Anomaly Detection. Choose Threshold to set thresholds manually, then set the Upper threshold to the highest acceptable value for your metric, and then set Lower threshold to the lowest acceptable value for your metric. In the example we've been using, imagine that we never want our lag to be greater than 1200. We'll set the upper threshold to 1200 and leave the lower threshold at 0.
If someone has added an alerting channel, you can choose it in the Select the channels to send notifications box, so that incidents will send notifications to that channel.
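The threshold check in this example can be sketched in a few lines of Python. The function name is hypothetical, and treating the thresholds as inclusive bounds is an assumption based on the wording "never greater than 1200".

```python
def find_out_of_bounds(values, lower, upper):
    """Return (index, value) pairs for metric values outside [lower, upper]."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical Data Delay values, checked against the thresholds from the example.
lag_values = [0, 600, 1200, 1500]
incidents = find_out_of_bounds(lag_values, lower=0, upper=1200)
print(incidents)  # [(3, 1500)]
```

Only the value 1500 falls outside the bounds. A value of exactly 1200 is still acceptable under the inclusive assumption.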
Training anomaly monitors
Monitors train to learn what metric values are in-bounds/normal. There are two monitor types that you can create:
- Manual threshold monitors where you specify the normal range of metric values.
- Anomaly monitors that learn thresholds from your data's historical behavior. You can optionally specify date ranges to use as training periods (periods where all metric values are within the normal ranges that the monitor learns).
There is additional basic configuration associated with both monitor types. For help, see Monitor a metric.
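This guide does not describe how Lightup's anomaly monitors learn their thresholds. Purely to illustrate the idea of deriving bounds from a training period, the sketch below swaps in a simple stand-in technique: the mean plus or minus `k` standard deviations. The function name, the value of `k`, and the training values are all assumptions.

```python
from statistics import mean, pstdev

def learn_bounds(training_values, k=3.0):
    """Derive (lower, upper) bounds as mean +/- k population standard deviations.

    A stand-in for illustration only; not Lightup's anomaly detection method.
    """
    center = mean(training_values)
    spread = pstdev(training_values)
    return center - k * spread, center + k * spread

# Hypothetical metric values from a training period considered normal.
training = [10, 10, 12, 8]
lower, upper = learn_bounds(training)
print(round(lower, 3), round(upper, 3))  # 5.757 14.243
```

With a manual threshold monitor you would type such bounds in yourself. With an anomaly monitor, the training period plays the role of `training` here.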
- Click OK to save. Your new monitor begins training, as reflected by the Monitors menu.
- When training completes, the new monitor will generate incidents whenever the metric's data falls outside the thresholds. Data is backfilled for seven days during monitor training so when training completes you will immediately see incidents that occurred during that time frame.
You may have to refresh your screen to see the chart update. Here's a snapshot of a Data Delay chart after the monitor is done training and has recorded some recent incidents.
- When incidents are generated, you will see a count of incidents (3 in the preceding image), displayed in red (high severity) or yellow (medium severity) font above the metric chart, to the left of the Monitors menu. Select the count to review an incident report with details about all of them. For more information about incidents, see the next section.
There are two ways to view incidents: from the Incident report of a metric opened in Explorer, and from the Incidents List.
Open the Incident report for a metric
Open the metric in Explorer. If it has incidents, you'll see a count and the word "Incidents" (the Incidents menu). Select Incidents to open the Incident report and review them.
When you open the Incident report for a metric, you see information about the data asset, the metric's chart including incidents, and a table of details for the displayed incidents, including a link to get more information.
Incident report - Top section: data asset, time period, and report filters
- At the top left edge, review basic Info about the metric's data asset, and check or change the time period for the report.
- On the right, set or adjust the report's filters.
Incident report - Middle section: A metric chart of recent history highlights any incidents
- Hover on the chart to see more information about specific metric values.
Incident report - Bottom section: A table listing the metric's recent incidents
The values of the Incident column are links to details about the corresponding incident.
With the Incidents List, you can view all incidents for all metrics over a particular timeframe. You can filter incidents based on attribute and can also pick an incident to review in detail.
- On the top bar, select Incidents. The Incidents List opens.
- To review a specific incident, in the list select the value of the Incident column for the incident you want to review.
- From the Incident Detail View you can review the incident in the context of the metric. You can expand the time period to see more of the metric's profile as well as additional incidents that have been triggered during that time period. Some incidents provide blue links at the top (Data Delay and datadelay_ManualThreshold in this example) that give you access to the metric and monitor associated with the incident.
As part of your analysis, you might want to correlate an incident with other incidents. The panel at the right helps you do this.
Add an alerting channel to a monitor
When a monitor begins generating incidents, you may want to automatically alert others on your team when incidents occur. The simplest option is to create an email list, which is supported directly by Lightup. Third-party integrations offer more choices for getting alerts to the right audiences.
In the left pane, on your workspace menu select Integrations.
Click + and choose an option.
The rest of the procedure depends on which alerting channel you chose. For steps, go to Choose a different alerting channel.
Open the monitor that you want to add the alerting channel to:
- Navigate to its metric in Explorer, select the Monitors menu, and then choose the option to edit your monitor.
- or -
- Select Monitors in the top menu bar, filter for your monitor, then select it and choose Edit.
In the monitor edit page, at the top right of the screen select the arrow just right of Notify (or Muted if notifications are muted for the monitor). Then in the Select the channels to send notifications modal, select the input box and then select the channel that you just created. Your monitor will immediately begin sending incident notifications to your new channel.