Profile your data

To better understand your data and help enhance your data quality analysis, enable data profiling. Data profiling provides a static analysis of your tables that you can view in the Explorer tree.

  • For a table with under a million rows, the data profile covers the entire table.
  • If Lightup can determine the timestamp column of a table with at least a million rows, the data profile covers the latest 30 days' worth of data.
  • If Lightup can't tell which column has timestamps and the table has at least a million rows, the data profile covers one million randomly selected rows.
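The three coverage rules above can be read as a single decision. The following Python sketch only restates those rules for illustration. It is not Lightup's implementation, and the function name `profile_scope`, its parameters, and the returned labels are hypothetical.

```python
from typing import Optional

# Row-count threshold taken from the rules above.
ONE_MILLION = 1_000_000


def profile_scope(row_count: int, timestamp_column: Optional[str]) -> str:
    """Return a label describing which rows a data profile would cover.

    timestamp_column is the name of the detected timestamp column,
    or None when no timestamp column could be determined.
    (Hypothetical helper; labels are illustrative only.)
    """
    if row_count < ONE_MILLION:
        # Small tables are profiled in full, with or without a timestamp column.
        return "entire table"
    if timestamp_column is not None:
        # Large table with a known timestamp column: most recent 30 days.
        return "latest 30 days"
    # Large table without a known timestamp column: random sample.
    return "random sample of 1,000,000 rows"


if __name__ == "__main__":
    print(profile_scope(250_000, None))
    print(profile_scope(5_000_000, "created_at"))
    print(profile_scope(5_000_000, None))
```

Note that the timestamp column only matters once a table reaches a million rows. Below that threshold the whole table is profiled either way.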

If you aren't sure what types of data quality checks you want, we recommend that you enable data profiling to learn about your data. Reviewing the resulting data profiles gives you the information you need to plan your data quality checks.

Enable data profiling

After you add your datasources, you add data assets, and you can enable data profiling as part of that process.

  • Add the schemas that contain your tables to your Explorer tree.
  • Add the tables that you want to profile. For each table, enable profiling by turning on its Profile toggle in the Manage tables modal. Data profile generation begins in the background. When you select the table in the Explorer tree, you'll see the data profile for the table, which includes both general information about the table and statistics for each column.
  • Optionally, add columns from your profiled tables to your Explorer tree. Though this doesn't change how the data profile is generated, you can select a column you've added to the Explorer tree to see column statistics that are part of its table's data profile. The same column statistics are also available when you view the data profile at the table level.

Columns detected during profiling

Lightup uses pattern detection to determine the contents of table columns. In addition to timestamps, Lightup detects UUIDs, phone numbers, US ZIP codes, SSNs, and URLs.
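To give a rough idea of what pattern detection on column contents can look like, here is a minimal Python sketch. Lightup's actual patterns and matching rules are not described here, so everything in it is an assumption made for illustration: the regular expressions, the 90% match threshold, and the names `PATTERNS` and `detect_column_type`.

```python
import re

# Illustrative patterns only; real detectors are usually more permissive.
PATTERNS = {
    "uuid": re.compile(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    ),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "us_zip": re.compile(r"\d{5}(-\d{4})?"),
    "url": re.compile(r"https?://\S+"),
    "phone": re.compile(r"\+?1?[ .-]?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}"),
}


def detect_column_type(values, threshold=0.9):
    """Guess a column's content type from a sample of its values.

    Returns the first pattern name matched by at least `threshold`
    of the non-empty values, or None if no pattern qualifies.
    """
    sample = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not sample:
        return None
    for name, pattern in PATTERNS.items():
        # fullmatch requires the whole value to fit the pattern.
        matches = sum(1 for v in sample if pattern.fullmatch(v))
        if matches / len(sample) >= threshold:
            return name
    return None


if __name__ == "__main__":
    print(detect_column_type(["94105", "10001-1234"]))
    print(detect_column_type(["hello", "world"]))
```

Requiring a share of the sampled values to match, rather than all of them, lets a column be classified even when a few values are malformed.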

Review data profiles

After data profiles are generated, you can review them to help you plan your data quality analysis. For example, reviewing a table's profile will clarify which columns can be used for timestamps in metrics.

If a table has a data profile but doesn't have any metrics yet, you'll see the table's profile on the right when you select the table in the Explorer tree. Select one of the table's columns to see the column's profile.

If a table has metrics, you won't see the data profile when you select the table in the Explorer tree; you'll see its metrics instead. To access the data profile, use the table's Actions menu.

  1. Select the table in the Explorer tree.
  2. On the Actions menu, select Data Profiling.

The table's data profile opens in a panel.