In today’s cloud-native environment, applications must be resilient and adaptable to varying loads. As user demand fluctuates, maintaining optimal performance while managing resource costs becomes a critical challenge. This is where Kubernetes shines, offering powerful tools for scaling applications based on real-time demand. One such feature is the Horizontal Pod Autoscaler (HPA), which allows you to dynamically adjust the number of pod replicas in your deployment based on CPU utilization or other select metrics. In this blog post, we will explore how HPA works and guide you through a hands-on lab to implement it with a simple web application.
What is Horizontal Pod Autoscaler (HPA)?
The Horizontal Pod Autoscaler automatically scales the number of pods in a deployment, replica set, or stateful set based on observed metrics such as CPU utilization or memory consumption. When the resource usage exceeds a defined threshold, HPA increases the number of replicas to maintain performance. Conversely, it reduces the number of replicas when the demand decreases, optimizing resource usage and costs.
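Under the hood, the controller's scaling decision boils down to a simple ratio. The sketch below (plain Python, not actual Kubernetes code) mirrors the documented formula desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), including the roughly 10% tolerance band inside which HPA leaves the replica count unchanged:

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA scaling formula for a single metric."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band, HPA skips scaling to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

# 2 pods averaging 180% of their CPU request against a 70% target -> scale up to 6
print(desired_replicas(2, 180, 70))
# 4 pods averaging 35% against a 70% target -> scale down to 2
print(desired_replicas(4, 35, 70))
```

Note this is a simplification: the real controller also clamps the result between minReplicas and maxReplicas and applies stabilization windows to smooth out scale-downs.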
Key Features of HPA
- Dynamic Scaling: HPA adjusts pod replicas automatically based on real-time metrics.
- Custom Metrics: Besides CPU and memory, HPA can scale based on custom metrics using the Kubernetes Metrics Server or external metrics providers.
- Integration: HPA works seamlessly with other Kubernetes resources, such as Deployments and ReplicaSets.
Hands-On Lab: Implementing HPA with a Simple Web Application
In this section, we will set up a simple web application and configure the Horizontal Pod Autoscaler to automatically scale the number of replicas based on CPU utilization.
Prerequisites
- A running Kubernetes cluster (kubeadm, minikube, GKE, EKS, etc.).
- kubectl installed and configured to communicate with your cluster.
- Basic knowledge of Kubernetes concepts.
Step 1: Deploy a Simple Web Application
First, we will create a simple web application that we can scale. For this lab, we will use a sample NGINX application.
- Create a Deployment: Create a file named nginx-deployment.yaml with the following content:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          ports:
            - containerPort: 80
```
- Apply the Deployment: Run the following command to create the deployment:
```shell
kubectl apply -f nginx-deployment.yaml
```
- Verify the Deployment: Check if the deployment is up and running:
```shell
kubectl get deployments
```
Step 2: Expose the Application
Next, we will expose our application using a Kubernetes Service so we can access it.
- Create a Service: Create a file named nginx-service.yaml with the following content:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: NodePort
```
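With the manifest above, Kubernetes assigns a random NodePort from the cluster's NodePort range (30000–32767 by default). If you prefer a predictable port, you can pin it explicitly; this variant is optional and not required for the lab:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30080   # must fall within the cluster's NodePort range
  type: NodePort
```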
- Apply the Service: Run the following command:
```shell
kubectl apply -f nginx-service.yaml
```
- Get the Service Information: After a few moments, you can retrieve the service details:
```shell
kubectl get services
```
Step 3: Install the Metrics Server
HPA relies on the Metrics Server to gather metrics. If you have not installed it yet, you can do so with the following commands:
- Install Metrics Server: Run the following command to install the Metrics Server:
```shell
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
- Verify the Metrics Server: Check that the Metrics Server is running:
```shell
kubectl get pods -n kube-system
```
Step 4: Create the Horizontal Pod Autoscaler
Now that we have our application running and the Metrics Server in place, we can create the HPA.
- Create an HPA Configuration: Create a file named hpa.yaml with the following content:
```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```
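Note that the autoscaling/v1 API supports CPU utilization only. On clusters running Kubernetes 1.23 or later, the same autoscaler can be written against the stable autoscaling/v2 API, which additionally supports memory and custom metrics. The equivalent v2 manifest looks like this:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```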
- Apply the HPA Configuration: Run the following command:
```shell
kubectl apply -f hpa.yaml
```
- Verify HPA Creation: Check if the HPA is created successfully:
```shell
kubectl get hpa
```
Step 5: Simulate Load to Test Autoscaling
To see HPA in action, we need to simulate some load on our application.
- Install Apache Benchmark (or any other load testing tool):
For example, on a Debian or Ubuntu machine:

```shell
sudo apt-get install apache2-utils
```
- Run Load Test: Replace the hostname, port, and path placeholders with the address and NodePort of your NGINX service obtained earlier:

```shell
ab -n [total_requests] -c [concurrent_requests] http://[hostname]:[port]/[path]
```

For example:

```shell
ab -n 100000 -c 300 http://3.129.8.95:30080/
```
Step 6: Monitor the Autoscaling
- Check the HPA Status: After running the load test, monitor the HPA status:
```shell
kubectl get hpa -w
```
You should see the number of replicas scaling up based on CPU usage.
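As a rough sanity check on the numbers to expect: the deployment requests 200m CPU per pod but allows bursting to a 500m limit, so a fully saturated pod reports 250% utilization of its request. Plugging those manifest values into the scaling formula gives a back-of-the-envelope estimate (assuming the load test really pins every pod at its limit):

```python
import math

cpu_request_m = 200      # from the deployment's resources.requests.cpu
cpu_limit_m = 500        # from the deployment's resources.limits.cpu
target_pct = 70          # from the HPA's targetCPUUtilizationPercentage
current_replicas = 2     # the deployment's starting replica count
max_replicas = 10        # from the HPA's maxReplicas

# A pod pinned at its CPU limit reports limit/request as its utilization.
utilization_pct = cpu_limit_m / cpu_request_m * 100   # 250%
desired = math.ceil(current_replicas * utilization_pct / target_pct)
print(min(desired, max_replicas))
```

Under these assumptions the HPA would scale the deployment from 2 to about 8 replicas, comfortably under the cap of 10; your actual numbers will vary with the load pattern.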
Conclusion
In this blog post, we have explored how to leverage the Horizontal Pod Autoscaler in Kubernetes to dynamically scale applications based on demand. By deploying a simple NGINX application and configuring HPA, you can automatically adjust the number of replicas to maintain optimal performance under varying loads.
With HPA, Kubernetes offers a robust solution for managing application scalability, ensuring your applications remain responsive while optimizing resource usage and cost. As you delve deeper into Kubernetes, consider exploring more advanced features like custom metrics or external metrics providers for further scalability enhancements. Happy scaling!
WRITTEN BY Komal Singh