Add virtual tables (Amazon S3 and Azure blob storage)
Instead of regular tables, Amazon S3 and Azure blob storage datasources support virtual tables: collections of data files arranged in a time-ordered hierarchy. When you create a virtual table, you specify a file path pattern that identifies the file names and paths.
- Currently, data files must be in Parquet file format.
- All data files must have the same file name (for example, data.parquet); only the time-based part of the path differs from file to file.
- Each file becomes one record covering one aggregation interval with a timestamp derived from its path.
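To illustrate how a timestamp can be derived from a path, here is a minimal Python sketch. It assumes the pattern uses strftime-style directives (%Y, %m, %d, %H), as the example pattern in this guide suggests. The helper name timestamp_from_path is hypothetical, and this is not necessarily how the product parses paths internally.

```python
from datetime import datetime

# Example pattern from this guide; strftime-style directives are an assumption.
PATTERN = "events/%Y/%m/%d/%H/data.parquet"


def timestamp_from_path(path: str, pattern: str = PATTERN) -> datetime:
    """Parse the time components embedded in a file path (hypothetical helper).

    Raises ValueError if the path does not fit the pattern.
    """
    return datetime.strptime(path, pattern)


ts = timestamp_from_path("events/2024/03/15/07/data.parquet")
print(ts.isoformat())  # 2024-03-15T07:00:00
```

With an hourly pattern like this one, each file maps to the start of a one-hour interval.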
Create a virtual table
You can create a virtual table when you manage a schema's tables.
- On the Manage Tables tab, select Virtual table configuration +.
- In the Configuration section that appears:
  - Enter a Table Name.
  - Specify a file path pattern, for example, events/%Y/%m/%d/%H/data.parquet.
  - Select Find Matching files.
- When you've found the files you want, select Create Virtual Table.
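If you want to sanity-check a pattern before entering it, the following Python sketch filters a list of object keys against a file path pattern. It assumes only the four directives shown in this guide's example (%Y, %m, %d, %H) and uses a made-up list of keys; the function names are hypothetical, and the product's own matching rules may differ.

```python
import re

# Regex fragments for the directives used in this guide's example pattern.
DIRECTIVES = {"%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}", "%H": r"\d{2}"}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a file path pattern into a compiled regex (hypothetical helper)."""
    # Split on directives while keeping them, then escape the literal pieces.
    parts = re.split(r"(%[YmdH])", pattern)
    body = "".join(DIRECTIVES.get(part, re.escape(part)) for part in parts)
    return re.compile(body)


def matching_keys(keys, pattern):
    """Return the keys that match the whole pattern, in their original order."""
    rx = pattern_to_regex(pattern)
    return [key for key in keys if rx.fullmatch(key)]


# Illustrative keys only; in practice these would come from a bucket listing.
keys = [
    "events/2024/03/15/07/data.parquet",
    "events/2024/03/15/08/data.parquet",
    "events/2024/03/15/08/part-0.parquet",  # different file name
    "events/_tmp/data.parquet",             # no time hierarchy
]
matches = matching_keys(keys, "events/%Y/%m/%d/%H/data.parquet")
print(matches)
```

Only the first two keys match here: the third has a different file name, and the fourth lacks the year/month/day/hour directories.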
After you've created your virtual table, it appears on the Manage Tables tab.
- Turn on the Configure toggle if you want to create data quality metrics. You then need to set table-level defaults such as the timestamp column, the data lag (evaluation delay), and partition formats. See Table configuration for help with these settings. Your data quality metrics inherit these defaults, but you can override them per metric.
- If you turn on Configure, you can also turn on the table auto metrics: Data Delay and Data Volume. For more information, see Auto metrics.