A Tour On Automated Athena Integration With AWS VPC Flow Log

Overview

Amazon Virtual Private Cloud (Amazon VPC) lets you launch and manage AWS resources in a logically isolated virtual network that you define. This virtual network closely resembles a traditional network that you would operate in your own data center, with the added benefit of AWS’s scalable infrastructure. It is common for a few EC2 instances within a VPC to be critical and to hold confidential data, so from a security perspective it is important to keep track of the IP traffic to those instances. Before VPC Flow Logs, AWS users had to deploy agents on their EC2 instances to collect network flow logs. This made collecting, preserving, and analyzing network flows difficult and gave only a limited picture of them. When AWS introduced VPC Flow Logs in 2015, security teams gained visibility into the network traffic entering and leaving their virtual infrastructure.

What Are VPC Flow Logs

With Amazon VPC Flow Logs, you can record data about the network traffic entering and leaving the network interfaces in your VPC. Flow logs give you a single, consolidated source of information for monitoring the network characteristics of your VPC. Security engineers can use them to view the history of high-level network traffic flows across whole VPCs, subnets, or individual network interfaces (ENIs). This makes VPC Flow Logs a valuable information source for security teams that need network instrumentation across large groups of instances. Flow log data can be published to Amazon CloudWatch Logs, Amazon S3, or Amazon Data Firehose. After you create a flow log, you can retrieve and view its records in the log group, bucket, or delivery stream that you configured. Because flow log data is collected outside the path of your network traffic, it does not affect latency or throughput, and creating or deleting flow logs carries no risk of degrading network performance.

Use cases of VPC Flow log

  1. Monitor remote login activity over SSH and RDP.
  2. Monitor lateral movement once a compromised host has been located.
  3. Generate statistics about network traffic, for example by reporting on high-risk activities or non-compliant protocols and by looking for novel threat patterns.

Setting up VPC Flow log that publishes to S3

Reference Architecture

Let’s Begin!

  1. Open the Amazon VPC console, go to Your VPCs, and select the default VPC.
  2. Choose Actions, Create flow log.
  3. For Filter, choose All.
  4. For Maximum aggregation interval, choose 1 minute.
  5. For Destination, choose Send to an Amazon S3 bucket.
  6. For S3 bucket ARN, specify the Amazon Resource Name (ARN) of an existing Amazon S3 bucket, for example arn:aws:s3:::test-vpc-flowlog.
  7. For Log record format, use the default.
  8. For Log file format, choose Text, which is the default plain-text format.
  9. Leave the other options at their defaults.
  10. Choose Create flow log.
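The same console choices can also be expressed with the AWS SDK for Python (boto3). The sketch below is one possible way to do it, not the author's method: it assumes boto3 is installed and credentials are configured, and the VPC ID and region shown in the usage comment are hypothetical placeholders. The request builder is kept separate from the API call so the parameters can be inspected without touching AWS.

```python
def build_flow_log_request(vpc_id, bucket_arn):
    """Return create_flow_logs arguments that mirror the console choices above."""
    return {
        "ResourceType": "VPC",
        "ResourceIds": [vpc_id],
        "TrafficType": "ALL",             # Filter: All
        "MaxAggregationInterval": 60,     # 1 minute, expressed in seconds
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,     # e.g. arn:aws:s3:::test-vpc-flowlog
        "DestinationOptions": {"FileFormat": "plain-text"},
        # LogFormat is omitted on purpose so the default record format is used.
    }


def create_flow_log(vpc_id, bucket_arn, region="us-east-1"):
    """Create the flow log and return the list of new flow log IDs."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_flow_logs(**build_flow_log_request(vpc_id, bucket_arn))
    return response["FlowLogIds"]


# Hypothetical usage (the VPC ID is a placeholder):
# create_flow_log("vpc-0123456789abcdef0", "arn:aws:s3:::test-vpc-flowlog")
```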

View VPC Flow log in S3

The Amazon S3 console lets you view your flow log records. After you create a flow log, it may take a few minutes before its records appear in the console.

To view flow log records that have been uploaded to Amazon S3, follow these steps:

  1. Open the Amazon S3 console.
  2. Select the name of the bucket in which VPC flow log data is collected.
  3. Navigate to the folder with the log files. For example, prefix/AWSLogs/account_id/vpcflowlogs/region/year/month/day/.
  4. Select the checkbox next to the file name, and then choose Download.

The log files are compressed. If you open them in the Amazon S3 console, they are decompressed and the flow log records are displayed. If you download the files, you must decompress them before you can view the records. Here is a sample flow log file in S3.

The log file above shows the IP traffic for a single date only, and viewing logs one file at a time is a lengthy process. Security teams look for a quick and easy way to query VPC flow log data. This can be achieved by integrating Athena with VPC Flow Logs, so let’s see how to set that up.
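For a one-off look at a single downloaded file, a short script can handle the decompression and parsing. The following is a minimal sketch that assumes the flow log uses the default record format and the plain-text file format chosen earlier; the function names are illustrative and not part of any AWS tooling.

```python
import gzip
from collections import Counter

# Field order of the default flow log record format.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one space-separated record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}")
    return dict(zip(DEFAULT_FIELDS, values))


def read_flow_log(path):
    """Yield parsed records from a downloaded .log.gz file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the header row that names the fields.
            if not line or line.startswith("version"):
                continue
            yield parse_record(line)


def rejected_sources(records):
    """Count REJECT records per source address."""
    return Counter(r["srcaddr"] for r in records if r["action"] == "REJECT")
```

This is workable for one file, but repeating it across many days of logs quickly becomes tedious, which is the gap Athena fills.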

Automate Athena with VPC Flow log

With Amazon Athena, an interactive query service, you can use standard SQL to analyze data in Amazon S3, including your flow logs. Using Athena with VPC Flow Logs, you can quickly get useful insights about the traffic passing through your VPC. For instance, you can determine which IP addresses have the most rejected TCP connections or which resources in your virtual private clouds (VPCs) are the top talkers.

You can simplify and speed up the integration of your VPC flow logs with Athena by generating a CloudFormation template that creates the required AWS resources along with predefined queries, which you can run to gain insights about the traffic passing through your VPC.

Once the first flow logs have been delivered to your S3 bucket, you can integrate with Athena by generating the CloudFormation template and using it to create a stack.

Generate template using console:

  1. Open the VPC console and select the VPC for which you created the flow log (the destination must be S3).
  2. On the Flow logs tab, select a flow log that publishes to Amazon S3, and then choose Actions, Generate Athena integration.
  3. Specify the partition load frequency. If you choose None, you must specify the partition start and end dates, using dates in the past. If you choose Daily, Weekly, or Monthly, the partition start and end dates are optional. If you don’t specify start and end dates, the CloudFormation template creates a Lambda function that loads new partitions on a recurring schedule.
  4. Specify the ARN of the S3 bucket for the generated template and for the query results.
  5. Choose Generate Athena integration.
  6. In the success message, choose Create CloudFormation stack to open the Create Stack wizard in the AWS CloudFormation console. The URL of the generated CloudFormation template is specified in the Template section. Complete the wizard to create the resources specified in the template.
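The template can also be generated programmatically. The sketch below uses the EC2 `get_flow_logs_integration_template` operation through boto3 and encodes the partition rule from the steps above: with a frequency of none, both dates are required. It is an illustration under assumptions, not the author's procedure: the helper names are made up, "previous dates" is interpreted as dates before today, and the date string format follows the pattern used in AWS CLI examples.

```python
from datetime import date

VALID_FREQUENCIES = {"none", "daily", "weekly", "monthly"}


def build_athena_integration_request(flow_log_id, template_bucket_arn,
                                     results_bucket_arn, frequency="daily",
                                     start=None, end=None, today=None):
    """Return arguments for get_flow_logs_integration_template."""
    frequency = frequency.lower()
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"unknown partition load frequency: {frequency}")
    if frequency == "none":
        # With no recurring load, the partition range must be given explicitly.
        if start is None or end is None:
            raise ValueError("start and end dates are required when frequency is none")
        today = today or date.today()
        if not (start <= end < today):
            raise ValueError("start and end must be past dates with start <= end")

    integration = {
        "IntegrationResultS3DestinationArn": results_bucket_arn,
        "PartitionLoadFrequency": frequency,
    }
    if start is not None:
        integration["PartitionStartDate"] = start.strftime("%Y-%m-%dT00:00:00")
    if end is not None:
        integration["PartitionEndDate"] = end.strftime("%Y-%m-%dT00:00:00")

    return {
        "FlowLogId": flow_log_id,
        "ConfigDeliveryS3DestinationArn": template_bucket_arn,
        "IntegrateServices": {"AthenaIntegrations": [integration]},
    }


def generate_template(request, region="us-east-1"):
    """Call EC2 and return the location of the generated template."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.get_flow_logs_integration_template(**request)["Result"]
```

The returned location can then be supplied to CloudFormation when creating the stack, in the same way the console wizard does.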

Query VPC flow log using Athena

  1. Open the Athena console and select the query editor.
  2. In the Data section, for Data source select AWS Data Catalog, for Database select vpcflowlogathenadatabase (created automatically by the CloudFormation stack), and for Table select the table whose name starts with fl.
  3. Select the three dots next to the table name and choose Preview table.
  4. A query is generated in the Query section and runs automatically.
  5. Scroll down to the query results to see the VPC flow log entries in table format.
  6. You can also try the queries in the Saved queries section of the Athena console. These queries are generated automatically by the CloudFormation stack.
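Beyond the saved queries, you can run your own SQL, for example to find the source addresses with the most rejected TCP connections mentioned earlier. The sketch below builds such a query and runs it through the Athena API with boto3. It rests on assumptions: the column names `srcaddr`, `action`, and `protocol` are taken to match the default flow log fields in the generated table, and the database name, table name, and results location are placeholders that you would replace with the values created by your stack.

```python
import time


def top_rejected_sources_sql(table, limit=10):
    """Build a query for the source addresses with the most rejected TCP flows."""
    return (
        "SELECT srcaddr, COUNT(*) AS rejected_flows\n"
        f'FROM "{table}"\n'
        "WHERE action = 'REJECT' AND protocol = 6\n"  # protocol 6 is TCP
        "GROUP BY srcaddr\n"
        "ORDER BY rejected_flows DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql, database, output_location, region="us-east-1", poll_seconds=2):
    """Run a query in Athena and return the data rows as lists of strings."""
    import boto3  # imported lazily so the SQL builder works without boto3

    athena = boto3.client("athena", region_name=region)
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if status["State"] != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {status['State']}")

    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    # The first row holds the column headers; the rest are data rows.
    return [[col.get("VarCharValue") for col in row["Data"]] for row in rows[1:]]


# Hypothetical usage (table name and results location are placeholders):
# sql = top_rejected_sources_sql("fl0123456789abcdef0daily")
# run_query(sql, "vpcflowlogathenadatabase", "s3://test-vpc-flowlog/athena-results/")
```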

 

WRITTEN BY Mahek Tamboli
