Data Quality overview
Data quality is often measured by setting expectations of the data and being alerted when the data doesn't meet those expectations. Lightup offers a full anomaly detection suite for automated data quality expectations, an absolute or dynamic thresholding option, and many customizable options in between.
In Lightup, the process of setting expectations is broken into three parts:
- Metric: A specific measurement on the data
- Monitor: A check on how a defined metric performs against its expected or defined range
- Incident: When the metric value breaches a monitor’s expected or defined range
By decoupling the metric and monitor, the user is able to deliver multiple desired outcomes without needing to create multiple metric and monitor pairings. For example, imagine you have a metric that measures the number of sales each hour. Any of the following might represent your expectation of how this metric behaves:
- I expect the hourly number of sales to be above 1000
- I expect the data to behave similarly every summer
- I expect the number of sales to increase week over week
In other tools, where metrics and monitors are tightly coupled, you would have to create three separate measurements of the same dataset. With Lightup, you measure your data with a single metric, and apply three different monitors, reducing the need to measure the same metric multiple times.
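The decoupling described above can be sketched in a few lines of Python. This is an illustration only, not Lightup's API: the `Monitor` class, the three check builders, `find_incidents`, and the sample values are all hypothetical names and numbers chosen for the example. One series of metric values is measured once, and three independent monitors evaluate it.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical types for illustration; this is not Lightup's API.
Check = Callable[[Sequence[float]], bool]


@dataclass
class Monitor:
    name: str
    is_met: Check  # returns True when the metric meets the expectation


def above_minimum(minimum: float) -> Check:
    """Absolute threshold: the latest value must exceed `minimum`."""
    return lambda values: values[-1] > minimum


def similar_to_last_season(season_length: int, tolerance: float) -> Check:
    """Seasonality: the latest value must be within `tolerance` (a fraction)
    of the value observed one season earlier."""
    def check(values: Sequence[float]) -> bool:
        if len(values) <= season_length:
            return True  # not enough history to compare
        reference = values[-1 - season_length]
        return abs(values[-1] - reference) <= tolerance * abs(reference)
    return check


def increasing_over(period: int) -> Check:
    """Trend: the latest `period` values must total more than the
    `period` values before them."""
    def check(values: Sequence[float]) -> bool:
        if len(values) < 2 * period:
            return True  # not enough history to compare
        return sum(values[-period:]) > sum(values[-2 * period:-period])
    return check


def find_incidents(values: Sequence[float], monitors: List[Monitor]) -> List[str]:
    """Return the names of the monitors whose expectation is breached."""
    return [m.name for m in monitors if not m.is_met(values)]


# One metric, measured once (made-up sample values)...
sales_counts = [1200, 1300, 1250, 1400, 1150, 1350, 1300, 900]

# ...and three monitors applied to that same metric.
monitors = [
    Monitor("above 1000", above_minimum(1000)),
    Monitor("similar to last season", similar_to_last_season(4, 0.2)),
    Monitor("increasing period over period", increasing_over(2)),
]

incidents = find_incidents(sales_counts, monitors)
```

Because each monitor is only a function over the same measured values, adding a fourth expectation means adding one more `Monitor` entry rather than measuring the data again.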
Data Quality Management process
Now that you know how Lightup is laid out, you may be wondering, "How do I use this, and how hard is it?"
Using Lightup is easy. The basic flow is listed below, with links to pages that walk you through each step of implementing an always-on observability platform that's customized for your data and business use cases.
- Connect to datasources that you want to analyze.
- Add data assets on which to perform data quality analysis.
- Profile your data to better understand the data quality checks that you should set up.
- Create metrics that run in the background and measure the behavior of your tables and columns.
- Add monitors to those metrics to detect whether they have deviated from expected values and generate incidents when they do.
- Manage any incidents that the monitors detect.