How Infrastructure Assessment Strengthened Scalability for a Major Government Financial Platform

Zulfi Al Hakim | 11th June 2026

Objective 

A major Indonesian government financial institution required a comprehensive assessment of the Kubernetes-based infrastructure behind a critical nationwide tax administration system. The goals were to evaluate the current operational condition, identify potential scalability and reliability risks, and establish a structured roadmap aligned with Kubernetes production best practices.

PT Boer Technology conducted a full infrastructure assessment covering Kubernetes cluster architecture, node utilization, storage configuration, RabbitMQ, Redis, and workload management to support long-term operational stability and modernization initiatives.

Context 

The critical platform operated on a large-scale Kubernetes environment consisting of: 

  • More than 20 Kubernetes nodes (including control plane and worker nodes)

  • Kubernetes platform 

  • Hundreds of deployments across multiple active namespaces

  • More than 200 HorizontalPodAutoscaler resources

  • Stateful workloads including RabbitMQ and Redis clusters 

  • Ubuntu LTS infrastructure environment 

  • Calico CNI and containerd runtime deployment 

The assessment identified several operational and architectural challenges, including: 

  • Kubernetes and container runtime versions reaching End-of-Life (EOL) 

  • Absence of Pod Security implementation 

  • Lack of ResourceQuota and LimitRange policies 

  • Shared Redis storage architecture creating Single Point of Failure (SPOF) 

  • RabbitMQ memory configuration operating close to OOM threshold 

  • Over-provisioned compute allocation reducing cluster efficiency 

  • Legacy autoscaling API usage with limited scaling flexibility 

  • Limited workload disruption protection mechanisms 

These conditions created potential risks for scalability, operational resilience, and future platform expansion. 
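
As an illustration of the RabbitMQ finding, the sketch below checks how much memory headroom remains between a broker's memory watermark and its container memory limit. The assessment's actual figures are not given here, so the 8 GiB limit, the watermark values, and the 20% minimum headroom are all hypothetical assumptions, and the function names are invented for this example.

```python
GIB = 1024 ** 3


def watermark_headroom(container_limit_bytes: int, watermark_bytes: int) -> float:
    """Fraction of the container memory limit that remains above the watermark."""
    if container_limit_bytes <= 0:
        raise ValueError("container_limit_bytes must be positive")
    return (container_limit_bytes - watermark_bytes) / container_limit_bytes


def is_near_oom(container_limit_bytes: int, watermark_bytes: int,
                min_headroom: float = 0.2) -> bool:
    """True when headroom above the watermark is below min_headroom (assumed 20%)."""
    return watermark_headroom(container_limit_bytes, watermark_bytes) < min_headroom


# Hypothetical values: an 8 GiB container limit with two candidate watermarks.
print(is_near_oom(8 * GIB, 7 * GIB))  # headroom is 12.5% of the limit
print(is_near_oom(8 * GIB, 4 * GIB))  # headroom is 50% of the limit
```

A watermark set too close to the container limit leaves little room for memory the broker uses beyond what the watermark accounts for, which is one way a broker can end up near the OOM threshold.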

Approach 

PT Boer Technology performed a structured Kubernetes infrastructure assessment and operational analysis focused on production readiness and platform optimization. 

The assessment activities included: 

  • Reviewing Kubernetes cluster architecture and infrastructure topology 

  • Collecting node-level CPU and memory utilization metrics 

  • Evaluating Kubernetes runtime, networking, and storage configurations 

  • Assessing RabbitMQ cluster sizing, persistence, quorum, and resource allocation 

  • Analyzing Redis persistence architecture and storage performance 

  • Reviewing autoscaling implementation and workload scheduling strategy 

  • Identifying security gaps and operational risks 

  • Benchmarking existing configurations against Kubernetes best practices 

Based on the findings, PT Boer Technology produced a prioritized recommendation roadmap covering critical, medium, and low-priority improvements. 

Key recommendations included: 

  • Kubernetes cluster upgrade to supported stable versions 

  • Migration from legacy HPA APIs to autoscaling/v2 

  • Implementation of Pod Security baseline policies 

  • etcd storage expansion for production-grade scalability

  • Deployment of PodDisruptionBudget protections 

  • Redis storage migration from shared RWX storage to dedicated RWO volumes 

  • RabbitMQ memory watermark optimization 

  • Compute resource right-sizing for improved scheduler efficiency 

  • Implementation of Descheduler and topology spread constraints

  • Adoption of Gateway API architecture for traffic management modernization 
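
To make the autoscaling recommendation concrete, here is a minimal sketch of how a CPU-only HorizontalPodAutoscaler manifest could be rewritten for autoscaling/v2. It assumes the legacy objects use autoscaling/v1 (the source only says "legacy HPA APIs"), and the workload name `example-api`, the namespace, and the replica counts are hypothetical. It works on plain dictionaries and does not talk to a cluster.

```python
import copy
import json


def hpa_v1_to_v2(v1: dict) -> dict:
    """Convert an autoscaling/v1 HPA manifest (as a dict) to autoscaling/v2."""
    spec = v1["spec"]
    v2_spec = {
        "scaleTargetRef": copy.deepcopy(spec["scaleTargetRef"]),
        "maxReplicas": spec["maxReplicas"],
    }
    if "minReplicas" in spec:
        v2_spec["minReplicas"] = spec["minReplicas"]
    cpu_target = spec.get("targetCPUUtilizationPercentage")
    if cpu_target is not None:
        # v2 expresses the single v1 CPU target as one entry in a metrics list,
        # which is what later allows memory or custom metrics to be added.
        v2_spec["metrics"] = [{
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": cpu_target},
            },
        }]
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": copy.deepcopy(v1["metadata"]),
        "spec": v2_spec,
    }


# Hypothetical legacy manifest used only for illustration.
legacy = {
    "apiVersion": "autoscaling/v1",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "example-api", "namespace": "example"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "example-api"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "targetCPUUtilizationPercentage": 70,
    },
}

print(json.dumps(hpa_v1_to_v2(legacy), indent=2))
```

Manifests exported from a live cluster typically carry server-populated metadata and status fields that would need to be stripped before re-applying; this sketch assumes clean, source-controlled manifests.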

Results 

The assessment provided the financial institution with a comprehensive operational visibility framework and a phased modernization roadmap for the critical system running on the Kubernetes platform.

Key outcomes included: 

  • Complete visibility into Kubernetes cluster health and workload utilization 

  • Identification of production stability and scalability bottlenecks 

  • Structured prioritization of infrastructure improvements based on operational impact 

  • Risk mitigation recommendations for RabbitMQ quorum and Redis persistence 

  • Identification of inefficient compute allocation across stateful workloads 

  • Security hardening recommendations aligned with Kubernetes best practices 

  • Scalability improvement roadmap for autoscaling and workload distribution 

  • Foundation for future Kubernetes platform modernization initiatives 
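
One way to reason about inefficient compute allocation is to compare a workload's current CPU request with a request derived from observed usage. The sketch below is an assumption-laden illustration, not the method used in the assessment: the nearest-rank 95th percentile, the 15% headroom, the sample values, and the function names are all hypothetical.

```python
def recommend_request_millicores(usage_samples, percent=95, headroom_percent=15):
    """Suggest a CPU request: nearest-rank percentile of usage plus headroom."""
    if not usage_samples:
        raise ValueError("usage_samples must not be empty")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile using integer ceiling division.
    rank = max(1, -(-percent * len(ordered) // 100))
    base = ordered[rank - 1]
    # Add headroom and round up to a whole millicore.
    return -(-base * (100 + headroom_percent) // 100)


def overprovision_ratio(current_request, usage_samples):
    """How many times larger the current request is than the suggested one."""
    return current_request / recommend_request_millicores(usage_samples)


# Hypothetical CPU usage samples in millicores: 100, 110, ..., 290.
samples = list(range(100, 300, 10))
print(recommend_request_millicores(samples))
print(round(overprovision_ratio(1000, samples), 2))
```

A ratio well above 1 suggests the scheduler is reserving capacity that the workload rarely uses, which reduces how many pods fit on each node.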

Before 

  • Kubernetes cluster operated on End-of-Life versions 

  • Redis workloads relied on shared NFS storage with SPOF risk 

  • RabbitMQ memory configuration operated near OOM threshold 

  • No Pod Security enforcement implemented 

  • Limited workload disruption safeguards 

  • Manual traffic management configuration using nginx.conf 

  • Cluster resource reservations not fully optimized 

  • HPA implementation limited to CPU-based autoscaling 
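
The limited disruption safeguards noted above are usually addressed with PodDisruptionBudget objects. The following sketch builds a policy/v1 PodDisruptionBudget manifest whose `maxUnavailable` keeps a majority of replicas running, which matters for quorum-based services such as a RabbitMQ cluster. The replica count, the `rabbitmq` name, the namespace, and the label selector are hypothetical, and the majority rule is an assumption about how the budget might be chosen.

```python
import json


def quorum_safe_max_unavailable(replicas: int) -> int:
    """Largest number of pods that can be down while a strict majority stays up."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    majority = replicas // 2 + 1
    return replicas - majority


def build_pdb(name: str, namespace: str, match_labels: dict, replicas: int) -> dict:
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxUnavailable": quorum_safe_max_unavailable(replicas),
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


# Hypothetical three-node broker cluster.
pdb = build_pdb("rabbitmq-pdb", "messaging",
                {"app.kubernetes.io/name": "rabbitmq"}, replicas=3)
print(json.dumps(pdb, indent=2))
```

With three replicas the budget allows one voluntary disruption at a time, so a node drain cannot evict two brokers simultaneously.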

After 

  • Clear modernization roadmap established for cluster upgrade and optimization 

  • Production risks identified with mitigation recommendations 

  • Redis persistence architecture improvement plan defined 

  • RabbitMQ operational resilience and quorum protection strategy established 

  • Security baseline implementation framework documented 

  • Autoscaling enhancement strategy prepared using autoscaling/v2 

  • Infrastructure scalability and workload balancing improvements defined 

  • Operational governance and Kubernetes best-practice alignment strengthened 
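
For the security baseline, Kubernetes Pod Security Admission is configured through namespace labels. The sketch below generates a Namespace manifest that enforces the `baseline` level while only warning and auditing against `restricted`. This staged combination is an assumption for illustration, not the documented framework from the assessment, and the namespace name is hypothetical.

```python
import json

# The three Pod Security Standards levels defined by Kubernetes.
VALID_LEVELS = {"privileged", "baseline", "restricted"}


def namespace_with_pod_security(name: str, enforce: str = "baseline",
                                warn: str = "restricted",
                                audit: str = "restricted") -> dict:
    """Build a Namespace manifest carrying Pod Security Admission labels."""
    for level in (enforce, warn, audit):
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown Pod Security level: {level}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                "pod-security.kubernetes.io/enforce": enforce,
                "pod-security.kubernetes.io/warn": warn,
                "pod-security.kubernetes.io/audit": audit,
            },
        },
    }


# Hypothetical namespace name used only for illustration.
ns = namespace_with_pod_security("example-apps")
print(json.dumps(ns, indent=2))
```

Enforcing `baseline` while warning on `restricted` lets teams see which workloads would fail a stricter policy before that policy is enforced.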

Takeaways 

  • Large-scale Kubernetes environments require periodic operational assessments to maintain production readiness and scalability. 

  • Stateful services such as RabbitMQ and Redis require dedicated architecture optimization to avoid performance bottlenecks and operational risks. 

  • Incremental modernization strategies allow organizations to improve infrastructure reliability while minimizing disruption to critical services.