How to Analyze VPC Flow Logs with Amazon Athena and Visualize with QuickSight

How to Analyze VPC Flow Logs with Amazon Athena and Visualize with QuickSight

Introduction

Understanding traffic inside your AWS VPC is critical for security, troubleshooting, and optimization. VPC Flow Logs capture detailed network activity, but in their raw form they are hard to analyze. With Amazon Athena, you can query these logs directly in S3 using SQL. Combined with Amazon QuickSight, you can then build dashboards that make traffic patterns and anomalies easy to visualize. In this guide, we’ll walk through the full process of enabling VPC Flow Logs, querying them with Athena, and creating interactive dashboards in QuickSight.

How to Analyze VPC Flow Logs with Amazon Athena and Visualize with QuickSight

What is VPC Flow Logs and Why Analyze it?

VPC Flow Logs is an AWS feature that captures detailed information about the IP traffic going to and from network interfaces within your VPC. Each log record includes metadata such as source and destination IP addresses, ports, protocols, number of packets, bytes transferred, and whether the traffic was accepted or rejected.

Analyzing VPC Flow Logs is important because it helps you:

  • Improve security: detect unusual traffic, port scans, or unauthorized access attempts.
  • Troubleshoot networking issues: identify dropped connections or misconfigured security groups.
  • Optimize performance and costs: understand traffic flows and bandwidth usage.
  • Maintain compliance: provide audit trails of network activity.

Instead of scrolling through millions of raw log lines, you can use Athena to query them efficiently and QuickSight to visualize patterns in a dashboard.

What is Amazon Athena?

Amazon Athena is a serverless query service that lets you analyze data stored in Amazon S3 using standard SQL. With Athena, you don’t need to manage infrastructure, you simply define a schema for your logs and run queries on demand.

For VPC Flow Logs, Athena turns raw traffic data into structured insights, making it much easier to explore and investigate.

What is Amazon QuickSight?

Amazon QuickSight is a fully managed business intelligence (BI) service that allows you to create interactive visualizations and dashboards. It connects directly to Athena, so you can transform your SQL query results into charts, graphs, and reports.

Prerequisites

Before you start, make sure you have the following in place:

  • AWS Account An active AWS account with access to the AWS Management Console.
  • VPC An existing Virtual Private Cloud (VPC) where you will enable Flow Logs.

Step 1: Enable VPC Flow Logs

To capture traffic activity inside your VPC, you first need to create a VPC Flow Log and deliver the data to Amazon S3.

Enable Flow Logs via the Console

  1. Open the VPC Console and select Your VPCs.
  2. Choose the VPC you want to monitor and go to the Flow Logs tab.
  3. Click Create Flow Log.
  4. Name Enter a descriptive name for your flow log.
  5. Traffic type Choose what traffic to capture:
    • ALL → Captures accepted and rejected traffic (recommended for full visibility).
  6. Maximum aggregation interval Choose how often logs are aggregated before delivery:

    • 1 minute → Higher granularity, more detail, higher cost.
    • 10 minutes → Lower granularity, fewer records, lower cost (recommended for most cases).

Enable VPC Flow Logs via the Console

  1. Destination type Select Send to S3 bucket.
  2. S3 bucket ARN Provide the ARN of your bucket (e.g., arn:aws:s3:::my-vpc-flow-logs-bucket).

Enable VPC Flow Logs via the Console

  1. Log format Keep the default format.
  2. Review the configuration and click Create Flow Log.

Once created, your flow logs will start streaming into the S3 bucket under the chosen prefix. You can verify delivery by checking the bucket for newly created log files.

AWS VPC Flow Logs Recording in Amazon S3

Step 2: Prepare Data in Athena

With your VPC Flow Logs now stored in Amazon S3, the next step is to make them queryable using Amazon Athena. Athena can query structured data directly from S3 once you define a table schema.

Amazon Athena Overview

Create the Athena Table

Open the Athena Console, select your database, and run the following CREATE EXTERNAL TABLE statement. Be sure to update the S3 path with the location of your logs:

CREATE EXTERNAL TABLE IF NOT EXISTS `vpc_flow_logs` (
  version int,
  account_id string,
  interface_id string,
  srcaddr string,
  dstaddr string,
  srcport int,
  dstport int,
  protocol bigint,
  packets bigint,
  bytes bigint,
  start bigint,
  `end` bigint,
  action string,
  log_status string,
  vpc_id string,
  subnet_id string,
  instance_id string,
  tcp_flags int,
  type string,
  pkt_srcaddr string,
  pkt_dstaddr string,
  region string,
  az_id string,
  sublocation_type string,
  sublocation_id string,
  pkt_src_aws_service string,
  pkt_dst_aws_service string,
  flow_direction string,
  traffic_path int
)
PARTITIONED BY (`date` date)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LOCATION 's3://kemanedonfack-vpc-flow-log-bucket/AWSLogs/xxxxxx/vpcflowlogs/eu-north-1/'
TBLPROPERTIES ("skip.header.line.count"="1");

Amazon Athena Create table Query

Add Partitions

If you use partitioned storage, you need to load the partitions into Athena.

ALTER TABLE vpc_flow_logs
ADD PARTITION (`date`='2025-08-30')
LOCATION 's3://kemanedonfack-vpc-flow-log-bucket/AWSLogs/xxxxxx/vpcflowlogs/eu-north-1/2025/08/30/';

Amazon Athena Add Partition Query

Validate the Table

Run a simple query to confirm that Athena can read the data:

SELECT *
FROM vpc_flow_logs
LIMIT 10;

You should now see traffic records with fields such as srcaddr, dstaddr, action, and bytes.

Amazon Athena Query Result

Step 3: Run Queries with Athena

Now that your VPC Flow Logs are available in Athena, you can start exploring the data with SQL. Below are some example queries to uncover valuable insights about your network traffic.

Top Source IPs

This query shows the IP addresses generating the most traffic in terms of bytes:

SELECT srcaddr, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
WHERE action = 'ACCEPT'
GROUP BY srcaddr
ORDER BY total_bytes DESC
LIMIT 10;

Amazon Athena AWS VPC Flow Query Example Top Source IPs

Top Destination IPs

Find out which destination IPs are receiving the most traffic:

SELECT dstaddr, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
WHERE action = 'ACCEPT'
GROUP BY dstaddr
ORDER BY total_bytes DESC
LIMIT 10;

Most Used Ports and Protocols

To understand which ports are most active:

SELECT dstport, COUNT(*) AS connection_count
FROM vpc_flow_logs
WHERE action = 'ACCEPT'
GROUP BY dstport
ORDER BY connection_count DESC
LIMIT 10;

To check protocol distribution (e.g., TCP=6, UDP=17, ICMP=1):

SELECT protocol, COUNT(*) AS connection_count
FROM vpc_flow_logs
GROUP BY protocol
ORDER BY connection_count DESC;

Amazon Athena AWS VPC Flow Query Example Most Used Protocol

Accepted vs Rejected Traffic

Quickly compare how much traffic is accepted versus rejected:

SELECT action, COUNT(*) AS request_count, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
GROUP BY action;

This helps detect if there are unusual spikes in REJECTED traffic.

Bandwidth by Subnet or VPC

If your flow logs include subnet IDs or VPC IDs, you can aggregate traffic at that level. Example by subnet:

SELECT interface_id, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
GROUP BY interface_id
ORDER BY total_bytes DESC;

This gives visibility into which subnets or ENIs (Elastic Network Interfaces) are consuming the most bandwidth.

Step 4: Connect Athena to QuickSight

Once you’ve run queries in Athena, the next step is to visualize the results with Amazon QuickSight. To do this, you need to connect Athena as a data source and configure the right permissions.

Amazon QuicSight Overview

Configure IAM and QuickSight Permissions

To allow QuickSight to query Athena and read data from S3:

  1. In the QuickSight Console, go to Manage QuickSight → Security & permissions.

Amazon QuicSight Configuring permissions

  1. Ensure that QuickSight has access to:

    • Athena to run queries.
    • Amazon S3 specifically the bucket(s) where VPC Flow Logs are stored and the Athena query results bucket.
  2. Attach the managed policy AmazonAthenaFullAccess and grant S3 read access via the IAM role associated with QuickSight.

Amazon QuicSight Configuring permissions

Add Athena as a Data Source

  1. Open the Amazon QuickSight Console.
  2. Go to Datasets → New dataset.
  3. Select Athena as the data source.
  4. Give your data source a name (e.g., VPCFlowLogsAthena).

Amazon QuicSight Creating DataSet

  1. Choose the correct Athena workgroup and click Create data source.
  2. Select the database and the vpc_flow_logs table you created earlier in Athena.

Amazon QuicSight Creating DataSet

  1. Choose Edit/Preview data. For start_time and end_time, set the data format as a date rather than a number.
  2. Now select Save and visualize.

Amazon QuicSight Creating DataSet

Step 5: Build Visualizations in QuickSight

With your dataset connected, you can now create interactive dashboards to better understand traffic patterns. QuickSight provides different chart types that map perfectly to VPC Flow Log insights. Below are some examples you can build:

Amazon QuicSight Graph Overview

Pie Chart – Traffic by Protocol

  • Goal: Quickly see which protocols (TCP, UDP, ICMP, etc.) dominate traffic.
  • How to build:

    1. Select Pie chart as the visualization type.
    2. Use protocol as the Group by field.
    3. Use COUNT(*) or SUM(bytes) as the metric.

This makes it easy to spot protocol usage distribution across your VPC.

Amazon QuicSight Pie Chart Traffic by Protocol

Heatmap – Accepted vs Rejected Traffic

  • Goal: Compare network traffic decisions (ACCEPT vs REJECT) across ports or IPs.
  • How to build:

    1. Choose a Heatmap visualization.
    2. Place action (ACCEPT/REJECT) on one axis.
    3. Place another dimension such as dstport or srcaddr on the other axis.
    4. Use COUNT(*) or SUM(bytes) as the metric for color intensity.

This visualization helps identify blocked attempts, security anomalies, or misconfigured security groups/NACLs.

Amazon QuicSight Heatmap Accepted vs Rejected Traffic

Bar Chart – Top Talkers (Source IPs)

  • Goal: Identify which source IPs generate the most traffic.
  • How to build:

    1. Choose a Bar chart visualization.
    2. Set srcaddr as the X-axis (Group by).
    3. Set SUM(bytes) as the Y-axis (Value).
    4. Sort the results by descending traffic volume and limit to the top 10.

Amazon QuicSight Bar Chart – Top Talkers

Conclusion

Converting VPC Flow Logs into queryable datasets with Amazon Athena and visual dashboards in QuickSight delivers practical network visibility for security, troubleshooting, and cost control. Apply core best practices partition and compress logs, enforce least-privilege IAM, and secure S3 storage to keep queries efficient and data protected. This approach makes it easier to detect anomalies, investigate incidents, and maintain reliable operational insight from your VPC traffic.

Leave a Reply

Your email address will not be published. Required fields are marked *

Press ESC to close