Lightup is a data quality observability and monitoring tool.
We consider data quality broken when data in a table, view, or stream does not match expectations. This could mean issues with data availability, data conformity, or data validity. Read more at https://blog.lightup.ai.
Lightup is designed for data engineers and analytics engineers building and operating ETL and ELT pipelines. Data quality is a shared responsibility of data producers and data consumers within an organization. Often, a data quality issue requires engineers, analysts, and product managers to collaborate. Lightup facilitates collaborative workflows within the platform.
Lightup can run data quality checks against SQL-speaking data stores including popular data warehouses -- Redshift, Snowflake, Databricks, Spark SQL, BigQuery, Athena, Postgres, MySQL, and more -- using built-in connectors. Lightup can also be integrated with streaming data sources including Kafka, Segment, and Rudder.
Lightup leverages the scalability of the data warehouse by issuing aggregation queries to run its data quality checks. Raw data is never pulled out of the data source.
No. Lightup only runs time-window aggregation queries against the data source, such as querying for the number of new rows inserted in the last hour or last day. If your data is already indexed on a timestamp column, Lightup does not cause full table scans.
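As an illustration of what a time-window aggregation query looks like, here is a minimal sketch in Python using an in-memory SQLite database. It is not Lightup's actual query or implementation. The table `events`, the column `created_at`, and the helper `count_rows_in_window` are hypothetical names chosen for the example. The point is that only a single aggregate value (a row count) leaves the data store, and the filter on an indexed timestamp column lets the database avoid scanning the whole table.

```python
import sqlite3
from datetime import datetime, timedelta


def count_rows_in_window(conn, table, ts_column, start, end):
    """Return the number of rows whose timestamp falls in [start, end)."""
    # Table and column names are interpolated directly, so they must come
    # from trusted configuration, never from user input.
    query = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {ts_column} >= ? AND {ts_column} < ?"
    )
    (count,) = conn.execute(query, (start.isoformat(), end.isoformat())).fetchone()
    return count


def build_demo_db(now):
    """Create a hypothetical events table with an index on its timestamp column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")
    conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")
    # Rows inserted 5, 20, 45, 90, and 180 minutes before `now`.
    offsets_minutes = [5, 20, 45, 90, 180]
    conn.executemany(
        "INSERT INTO events (created_at) VALUES (?)",
        [((now - timedelta(minutes=m)).isoformat(),) for m in offsets_minutes],
    )
    return conn


now = datetime(2021, 1, 1, 12, 0, 0)
conn = build_demo_db(now)

# Only the aggregate (a single integer) is returned, never the raw rows.
last_hour = count_rows_in_window(conn, "events", "created_at", now - timedelta(hours=1), now)
last_day = count_rows_in_window(conn, "events", "created_at", now - timedelta(days=1), now)
print(last_hour, last_day)
```

With the demo data above, three rows fall in the last hour and all five fall in the last day. The same query shape applies to any SQL-speaking store, although the timestamp type and index mechanics differ between warehouses.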
No. Lightup data quality checks can be configured to operate on a schedule that matches your duty cycles. For example, you could configure a check to run once every hour instead of every second or every minute. That way, the warehouse needs to wake up only once every hour. You could also synchronize Lightup checks with your data production jobs so that data quality checks are only executed when new data is produced and the warehouse is already awake.
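The scheduling idea can be sketched as a small decision function. This is an assumption-laden illustration, not Lightup's scheduler or configuration API: `should_run_check` and its parameters are hypothetical. It runs a check only when the configured interval has elapsed and, optionally, only when the data production job has reported new data.

```python
from datetime import datetime, timedelta


def should_run_check(now, last_run, interval, new_data_arrived=True):
    """Decide whether a check is due.

    now              -- current time
    last_run         -- time of the previous run, or None if it never ran
    interval         -- minimum time between runs (for example, one hour)
    new_data_arrived -- False when the production job has produced nothing new
    """
    if not new_data_arrived:
        # Skip the query entirely so an idle warehouse is not woken up.
        return False
    return last_run is None or now - last_run >= interval


hourly = timedelta(hours=1)
noon = datetime(2021, 1, 1, 12, 0, 0)

first_run = should_run_check(noon, None, hourly)
too_soon = should_run_check(noon + timedelta(minutes=10), noon, hourly)
due_again = should_run_check(noon + hourly, noon, hourly)
no_new_data = should_run_check(noon + hourly, noon, hourly, new_data_arrived=False)
print(first_run, too_soon, due_again, no_new_data)
```

Gating on `new_data_arrived` corresponds to synchronizing checks with data production jobs, while `interval` corresponds to a fixed schedule such as once per hour.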
Lightup does not pull raw data out of the warehouse or store it. Aggregate metrics pulled from the warehouse are cached by Lightup for a configurable retention period, which keeps the system efficient and avoids repeated queries to the warehouse. Lightup is not intended to be a system of record for customer data, raw or aggregate. Your own data warehouse is considered the source of truth.
Yes. Lightup takes security and privacy very seriously. We have invested heavily in our security program and follow best practices. We are on our way to achieving SOC 2 certification by May 2021. Documentation on our security posture is available upon request.
Yes. Lightup supports both SaaS and on-premise deployment models.
The on-premise deployment can be managed entirely by you, the customer, or by Lightup. We have invested heavily in making the on-prem deployment easy to get started with, so you can try it on your own in a couple of hours.
Absolutely! There is no one-size-fits-all data warehouse, data lake, or data streaming platform. Chances are you have a set of data sources in your stack. Reach out to us to discuss possibilities for adding support for additional data sources.
Yes! We support a free tier for both SaaS and on-premise deployment models. You can try out the entire feature set for free and keep using the system within the free usage capacity as long as you need to.
Sign up at https://signup.lightup.ai for either the SaaS model or the on-premise model.
Find setup instructions specific to the deployment model in the left navigation bar.