
Scaling Applications with Kubernetes: A Comprehensive Guide to Horizontal Pod Autoscaling


In today’s cloud-native environment, applications must be resilient and adaptable to varying loads. As user demand fluctuates, maintaining optimal performance while managing resource costs becomes a critical challenge. This is where Kubernetes shines, offering powerful tools for scaling applications based on real-time demand. One such feature is the Horizontal Pod Autoscaler (HPA), which allows you to dynamically adjust the number of pod replicas in your deployment based on CPU utilization or other select metrics. In this blog post, we will explore how HPA works and guide you through a hands-on lab to implement it with a simple web application.


What is Horizontal Pod Autoscaler (HPA)?

The Horizontal Pod Autoscaler automatically scales the number of pods in a Deployment, ReplicaSet, or StatefulSet based on observed metrics such as CPU utilization or memory consumption. When resource usage exceeds a defined threshold, HPA increases the number of replicas to maintain performance. Conversely, it reduces the number of replicas when demand decreases, optimizing resource usage and costs.
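In essence, the controller scales toward desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). For example, if 4 replicas average 100% CPU utilization against a 50% target, the HPA scales the workload to ceil(4 × 100 / 50) = 8 replicas.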

Key Features of HPA

  • Dynamic Scaling: HPA adjusts pod replicas automatically based on real-time metrics.
  • Custom Metrics: Besides CPU and memory (supplied by the Metrics Server), HPA can scale on custom or external metrics exposed through the custom and external metrics APIs, for example via an adapter such as the Prometheus Adapter.
  • Integration: HPA works seamlessly with other Kubernetes resources, such as Deployments and ReplicaSets.

Hands-On Lab: Implementing HPA with a Simple Web Application

In this section, we will set up a simple web application and configure the Horizontal Pod Autoscaler to automatically scale the number of replicas based on CPU utilization.

Prerequisites

  • A running Kubernetes cluster (kubeadm, minikube, GKE, EKS, etc.).
  • kubectl installed and configured to communicate with your cluster.
  • Basic knowledge of Kubernetes concepts.

Step 1: Deploy a Simple Web Application

First, we will create a simple web application that we can scale. For this lab, we will use a sample NGINX application.

  • Create a Deployment: Create a file named nginx-deployment.yaml with the manifest shown after this list.
  • Apply the Deployment: Apply the manifest with kubectl to create the deployment (command below).
  • Verify the Deployment: Check that the deployment and its pods are up and running (commands below).
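A minimal manifest along the lines below is enough for this lab. The image tag, replica count, and resource values are illustrative assumptions; the CPU request in particular matters, because HPA computes CPU utilization as a percentage of the requested value.

    # nginx-deployment.yaml (illustrative example)
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
            - name: nginx
              image: nginx:latest
              ports:
                - containerPort: 80
              resources:
                requests:
                  cpu: 100m        # HPA measures utilization against this request
                limits:
                  cpu: 200m

Apply the manifest and confirm the deployment and its pod are running:

    kubectl apply -f nginx-deployment.yaml
    kubectl get deployments
    kubectl get pods -l app=nginx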

Step 2: Expose the Application

Next, we will expose our application using a Kubernetes Service so we can access it.

  • Create a Service: Create a file named nginx-service.yaml with the manifest shown after this list.
  • Apply the Service: Apply the manifest with kubectl (command below).
  • Get the Service Information: After a few moments, retrieve the service details to find the address and port on which the application is exposed (commands below).
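A NodePort Service along these lines works for this lab. The fixed nodePort of 30080 is an assumption taken from the load-test example later in this post; omit it to let Kubernetes pick a port automatically, or use type: LoadBalancer on a managed cloud cluster.

    # nginx-service.yaml (illustrative example)
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
      type: NodePort
      selector:
        app: nginx
      ports:
        - port: 80
          targetPort: 80
          nodePort: 30080

Apply it and read back the details, noting the address and port you will load test against:

    kubectl apply -f nginx-service.yaml
    kubectl get service nginx-service
    kubectl get nodes -o wide    # the EXTERNAL-IP column shows addresses reachable on the NodePort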


Step 3: Install the Metrics Server

HPA relies on the Metrics Server to gather metrics. If you have not installed it yet, you can do so with the following commands:

  • Install Metrics Server: Apply the official Metrics Server components manifest (command shown after this list).
  • Verify the Metrics Server: Check that the Metrics Server is running and serving metrics (commands below).
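The commands below use the components manifest published by the metrics-server project. On some local clusters (for example, kubeadm or minikube setups with self-signed kubelet certificates) you may also need to add the --kubelet-insecure-tls flag to the Metrics Server container arguments.

    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    kubectl get deployment metrics-server -n kube-system
    kubectl top nodes    # returns node metrics once the Metrics Server is ready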


Step 4: Create the Horizontal Pod Autoscaler

Now that we have our application running and the Metrics Server in place, we can create the HPA.

  • Create an HPA Configuration: Create a file named hpa.yaml with the manifest shown after this list.
  • Apply the HPA Configuration: Apply the manifest with kubectl (command below).
  • Verify HPA Creation: Check that the HPA was created successfully and is reporting metrics (command below).
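A typical autoscaling/v2 definition for this lab looks like the following; the 50% CPU target and the 1 to 10 replica range are illustrative values you can tune.

    # hpa.yaml (illustrative example)
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50

Apply it and confirm the autoscaler can read the current CPU metric (the TARGETS column should show a percentage rather than <unknown>):

    kubectl apply -f hpa.yaml
    kubectl get hpa nginx-hpa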

Step 5: Simulate Load to Test Autoscaling

To see HPA in action, we need to simulate some load on our application.

  • Install Apache Benchmark (or any other load testing tool):

For example, if you are using a local machine with apt-get:
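    sudo apt-get update
    sudo apt-get install -y apache2-utils    # the apache2-utils package provides the ab tool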

  • Run Load Test: Replace <EXTERNAL_IP> with the actual IP address of your NGINX service obtained earlier:
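    ab -n 100000 -c 300 http://<EXTERNAL_IP>:30080/

Here -n is the total number of requests and -c is the number of concurrent requests; adjust the port if your service is exposed on a different one.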

For example: ab -n 100000 -c 300 http://3.129.8.95:30080/

Step 6: Monitor the Autoscaling

  • Check the HPA Status: After running the load test, monitor the HPA status:
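The watch flag keeps the output refreshing while the controller reacts; the HPA and label names below assume the example manifests used earlier in this lab.

    kubectl get hpa nginx-hpa --watch
    kubectl get pods -l app=nginx    # new replicas appear here as the HPA scales out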

You should see the number of replicas scaling up based on CPU usage.

Conclusion

In this blog post, we have explored how to leverage the Horizontal Pod Autoscaler in Kubernetes to dynamically scale applications based on demand. By deploying a simple NGINX application and configuring HPA, you can automatically adjust the number of replicas to maintain optimal performance under varying loads.

With HPA, Kubernetes offers a robust solution for managing application scalability, ensuring your applications remain responsive while optimizing resource usage and cost. As you delve deeper into Kubernetes, consider exploring more advanced features like custom metrics or external metrics providers for further scalability enhancements. Happy scaling!


About CloudThat

CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.

CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partner, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner, and many more.

To get started, go through our Consultancy page and Managed Services Package, CloudThat’s offerings.

WRITTEN BY Komal Singh
