Athena

Steps to prepare and connect to Athena

Overview

More details about Athena can be found on the AWS website: Interactive SQL - Amazon Athena

Athena needs to access S3 buckets where data is stored and also S3 buckets where the query results are stored.

Before you start, please note down the names of the buckets where data is stored as well as where the Athena result sets will be stored. Ideally, these should be 2 different buckets in order to keep the permissions separated. Please create a query results bucket if needed.

Create an IAM User

In this example we are creating a user named lightup-athena-read-only

Go to IAM/Users and click on Create User

Click: Next

Attach Policies

We will attaching 2 policies to the user, this will include:

  1. Access to S3 buckets
  2. Access to Athena, Glue and other services needed by Athena

Create and attach S3 bucket policy for data, query results

As discussed earlier, we need to create an S3 policy to allow user to access data bucket (read-only) as well as the query result set bucket (read-write)

Choose Attach existing policies directly

Then click on Create Policy, use JSON
This will launch a separate tab or window so, after this step is done, you will have to go back to the create user workflow.

Please paste in the JSON below (AFTER making the necessary edits). Please note that DATA_BUCKET and RESULTS_BUCKET keywords need to be replaced by the appropriate S3 bucket names that were described earlier in this doc:

{  
    "Version": "2012-10-17",  
    "Statement": \[  
        {  
            "Effect": "Allow",  
            "Action": [  
                "s3:GetObject",  
                "s3:ListBucket"  
            ],  
            "Resource": [  
                "arn:aws:s3:::DATA_BUCKET/*",  
                "arn:aws:s3:::DATA_BUCKET"  
            ]  
        },  
        {  
            "Effect": "Allow",  
            "Action": [  
                "s3:GetBucketLocation",  
                "s3:GetObject",  
                "s3:ListBucket",  
                "s3:ListBucketMultipartUploads",  
                "s3:ListMultipartUploadParts",  
                "s3:AbortMultipartUpload",  
                "s3:CreateBucket",  
                "s3:PutObject",  
                "s3:PutBucketPublicAccessBlock"  
            ],  
            "Resource": [  
                "arn:aws:s3:::RESULTS_BUCKET/*",  
                "arn:aws:s3:::RESULTS_BUCKET"  
            ]  
        }  
    ]  
}

Click Next and name the policy

Once the policy is created, go back to the tab or window where the user was being created.

Search for and attach this newly created policy. NOTE: You may need to refresh the policies to see the newly created policy.

Attach AWSQuicksightAthenaAccess policy

Once the S3 policy is attached, search for and attach the AWS managed AWSQuicksightAthenaAccess policy. This allows the access to Athena as well to Glue for metadata.


Verify that both policies have been added

Save the user with these permissions.

Generate access keys

Generate access keys for the user that was created.

Remember the access key ID as well as the secret access key as it will be needed when Athena is configured in Lightup in the next step. Click Download .csv file to keep track of these keys.

Configure Athena in Lightup

📘

Workgroup support

Lightup supports Athena workgroups, which can help you manage costs and performance. A workgroup must be in place before you connect to the Athena datasource in Lightup— you can't add a workgroup to an existing datasource.

  1. In the left pane, open a workspace menu and select Datasources.
  2. In the main page select Create Datasource +.
  1. Enter a Datasource Name, then for Connector Type select Athena.
  2. Under Configure connector, provide the following inputs:
  • Region - Specify the AWS Region where your data is hosted, e.g. "us-west-2". Read more about Athena Regions.
  • Access Key ID - Part of the access key for the lightup user. If you didn't set up the lightup user account and don't know its Access Key ID, ask your system administrator for this information.
  • Secret Access Key - Part of the access key for the lightup user. If you didn't set up the lightup user account and don't know its Secret Access Key, ask your system administrator for this information.
  • Staging Directory - Enter the S3 bucket for your Athena query results bucket. It should start with s3://....
  • Workgroup - If needed, enter a specific Athena workgroup to connect to.

Please note that Test Connection currently does not test the validity of the S3 staging directory as well as the workgroup so, please make sure that there are no typos.

  1. After entering the required settings and any optional settings that apply, below the Configure connector section select Test Connection.
  2. After a successful connection test, select Save.
  3. Your new datasource appears in the list of available datasources. By default, these are listed in alphabetical order, so you might have to scroll or change the sort order to see your new datasource.

Advanced/Schema scan frequency

You can adjust how often scans run for a datasource.

  • In section 3 - Advanced, select a value for Schema scan frequency: Hourly, Daily, or Weekly.

Query Governance

Athena datasources support the Scheduling, Enable data storage, Maximum backfill duration, and Maximum distinct values settings. For steps, see Set query governance settings for a datasource.

Date/time data types

These Athena date/time data types are supported:

  • DATE
  • TIMESTAMP

Object types

These Athena object types are supported:

  • Tables

Partitions

Athena datasources support partitions, multi-partitions, and partition time zones.

Supported Versions

Lightup currently supports:

  • Athena SQL