Amazon Athena adds cost details to query execution plans

Dhaval Soni
May 30, 2022

How often do you find that analyzing unstructured, semi-structured, or even highly structured data consumes a great deal of your time? Who doesn’t wish for a quicker and easier way to handle that analysis? If you have upcoming projects that involve data analysis, I’ve got something important for you.

Introduction to Amazon Athena

Amazon Athena enables users, especially data analysts, to run interactive queries directly against data stored in Amazon Simple Storage Service (S3), the AWS web-based cloud storage service. It is a good fit for cases that involve large-scale data sets.

Amazon Athena is a serverless query service, which brings several benefits. Since it is serverless, analysts do not need to manage any underlying compute infrastructure to use it. Users also don’t need to load S3 data into Athena or transform it before analysis. This makes gaining insights easier and faster.

As a data analyst, you can access this service through the AWS Management Console, an application programming interface (API), or a Java Database Connectivity (JDBC) driver. In the console, you can define the schema and start using the built-in query editor, which lets you execute SQL queries on S3 data very easily.

You can query encrypted data with the help of keys managed by AWS Key Management Service and encrypt query results as well. Athena also allows cross-account access to S3 buckets that are owned by another account when the need arises. Along with that, the service uses managed data catalogs to store information about your databases and tables, including the schemas for the Amazon S3 data you query.

In other words, Amazon Athena is an interactive query service that helps users and organizations analyze the data stored in Amazon S3. Athena can efficiently process unstructured, semi-structured, and structured data sets. This is useful for research purposes, log analysis, and Online Analytical Processing (OLAP).

One of the biggest benefits of using Amazon Athena is that there is no server to manage. As a user, you do not have to look after the underlying infrastructure because the service automatically handles configuration and software updates.

Amazon Athena integrates easily with many other services in the AWS portfolio. For example, AWS Glue integrates with Athena to offer a more reliable and easy-to-use data catalog. It provides features such as a metadata repository, automated schema and partition recognition, and data pipelines. The Glue jobs behind those pipelines can be scripted in Python or Scala.

You can also use Athena together with AWS CloudFormation, Elastic Load Balancing, Amazon QuickSight, AWS Step Functions, AWS Systems Manager Inventory, Amazon CloudFront, Amazon S3 Inventory, Amazon Virtual Private Cloud, AWS IAM, and AWS CloudTrail.

Announcement Details

Amazon has announced that Amazon Athena now adds cost details to query execution plans. Let’s find out what users should expect from it and how beneficial the new update is.

Importance of the Announcement

According to the recent announcement, Amazon Athena can now display the computational cost of your queries alongside their execution plans. It has also released a new EXPLAIN ANALYZE statement.

This statement executes your specified query and returns a detailed breakdown of its execution plan, along with the CPU usage of each stage and the number of rows processed.
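To make the statement concrete, here is a minimal Python sketch that prefixes an existing SQL query with EXPLAIN ANALYZE before you paste it into the query editor or send it through the API. The helper name explain_analyze and the access_logs table are hypothetical and used only for illustration.

```python
def explain_analyze(sql: str) -> str:
    """Prefix a SQL query with EXPLAIN ANALYZE (hypothetical helper)."""
    # Drop surrounding whitespace and a trailing semicolon so the
    # wrapped statement is a single clean query.
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        raise ValueError("query must not be empty")
    return f"EXPLAIN ANALYZE {statement}"


# access_logs is an assumed table name, not one from the announcement.
query = explain_analyze(
    "SELECT status, COUNT(*) FROM access_logs GROUP BY status;"
)
print(query)
```

Keep in mind that, unlike a plain EXPLAIN, this statement actually runs the query, so it is worth trying it first on a query that scans a modest amount of data.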

In addition to giving you a better understanding of a query’s execution plan, the output now shows the time spent within each operator. This helps you assess the performance profiles of query clauses and the order you choose for them. With the row input and output counts, you can validate the impact of query predicates, which is especially helpful over large datasets.
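One simple way to use the row input and output counts is to turn them into a selectivity ratio for a filter. The sketch below assumes you have read the two counts off the plan output by hand. The function name and the sample numbers are hypothetical, not values produced by Athena.

```python
def predicate_selectivity(input_rows: int, output_rows: int) -> float:
    """Fraction of input rows that survive a filter (hypothetical helper)."""
    if input_rows <= 0:
        raise ValueError("input_rows must be positive")
    if not 0 <= output_rows <= input_rows:
        raise ValueError("output_rows must be between 0 and input_rows")
    return output_rows / input_rows


# Made-up counts for illustration: 2,000,000 rows in, 50,000 rows out.
ratio = predicate_selectivity(2_000_000, 50_000)
print(f"{ratio:.1%} of rows pass the predicate")
```

A low ratio suggests the predicate removes most rows, so applying it as early as possible in the query is likely to pay off.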

If you are an administrator, you will also find the scanned data counts useful. They help you plan the financial impact of your users’ workloads and identify queries that could benefit from further optimization. You can then keep those costs in check using Athena’s data usage controls.
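As a rough illustration of how a scanned data count translates into money, the following Python sketch converts bytes scanned into an estimated charge. The price of 5 USD per terabyte, the 10 MB minimum per query, and the use of binary units are assumptions made for this example. Check the current Athena pricing page for your region before relying on the numbers.

```python
TB = 1024 ** 4                    # assumption: binary terabyte
PRICE_PER_TB_USD = 5.00           # assumption: example price, varies by region
MIN_BILLABLE_BYTES = 10 * 1024 ** 2  # assumption: 10 MB minimum per query


def estimate_cost_usd(bytes_scanned: int,
                      price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Estimate the charge for one query from its scanned byte count."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must not be negative")
    # Small queries are rounded up to the assumed minimum.
    billable = max(bytes_scanned, MIN_BILLABLE_BYTES)
    return billable / TB * price_per_tb


# Hypothetical workload: three queries with made-up scan sizes.
scans = [250 * 1024 ** 3, 40 * 1024 ** 3, 2 * 1024 ** 2]
total = sum(estimate_cost_usd(s) for s in scans)
print(f"Estimated total: ${total:.4f}")
```

Summing such estimates per user or per workgroup gives a first approximation of where optimization effort would save the most.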

Dhaval Soni

Dhaval is a seasoned Solutions Architect with expertise in designing, implementing, securing, and managing enterprise cloud computing solutions for customers.