How Infrastructure Assessment Strengthened Scalability for a Major Government Financial Platform

Zulfi Al Hakim | 11th June 2026

Objective

A major Indonesian government financial institution required a comprehensive assessment of the Kubernetes-based infrastructure behind a critical nationwide tax administration system. The goals were to evaluate the current operational condition, identify potential scalability and reliability risks, and establish a structured roadmap aligned with Kubernetes production best practices.

PT Boer Technology conducted a full infrastructure assessment covering Kubernetes cluster architecture, node utilization, storage configuration, RabbitMQ, Redis, and workload management to support long-term operational stability and modernization initiatives.

Context

The critical platform operated on a large-scale Kubernetes environment consisting of:

More than 20 Kubernetes nodes (including control plane and worker nodes)

Kubernetes platform

Hundreds of deployments across multiple active namespaces

More than 200 HorizontalPodAutoscaler resources

Stateful workloads including RabbitMQ and Redis clusters

Ubuntu LTS infrastructure environment

Calico CNI and containerd runtime deployment

The assessment identified several operational and architectural challenges, including:

Kubernetes and container runtime versions reaching End-of-Life (EOL)

Absence of Pod Security implementation

Lack of ResourceQuota and LimitRange policies

Shared Redis storage architecture creating a single point of failure (SPOF)

RabbitMQ memory configuration operating close to the out-of-memory (OOM) threshold

Over-provisioned compute allocation reducing cluster efficiency

Legacy autoscaling API usage with limited scaling flexibility

Limited workload disruption protection mechanisms

These conditions created potential risks for scalability, operational resilience, and future platform expansion.
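To illustrate what "operating close to the OOM threshold" can mean in practice, the following minimal sketch compares a RabbitMQ relative memory watermark with a container memory limit. The figures in the usage note (a 4 GiB limit, a 0.9 watermark, a 20% minimum headroom) are hypothetical and are not taken from the assessment. The sketch models the RabbitMQ setting `vm_memory_high_watermark.relative` and assumes the broker computes the watermark against the container limit.

```python
GIB = 1024 ** 3


def watermark_headroom(limit_bytes: int, relative_watermark: float) -> int:
    """Bytes left between the memory watermark and the container limit."""
    if not 0 < relative_watermark <= 1:
        raise ValueError("relative watermark must be in (0, 1]")
    watermark_bytes = int(limit_bytes * relative_watermark)
    return limit_bytes - watermark_bytes


def is_near_oom(limit_bytes: int, relative_watermark: float,
                min_headroom_ratio: float = 0.2) -> bool:
    """True when the headroom above the watermark is below the chosen ratio.

    min_headroom_ratio is an illustrative threshold, not a RabbitMQ default.
    """
    headroom = watermark_headroom(limit_bytes, relative_watermark)
    return headroom < limit_bytes * min_headroom_ratio
```

With these assumed values, `is_near_oom(4 * GIB, 0.9)` flags the configuration, while `is_near_oom(4 * GIB, 0.6)` does not. The appropriate headroom depends on the workload and should be validated against observed memory usage.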

Approach

PT Boer Technology performed a structured Kubernetes infrastructure assessment and operational analysis focused on production readiness and platform optimization.

The assessment activities included:

Reviewing Kubernetes cluster architecture and infrastructure topology

Collecting node-level CPU and memory utilization metrics

Evaluating Kubernetes runtime, networking, and storage configurations

Assessing RabbitMQ cluster sizing, persistence, quorum, and resource allocation

Analyzing Redis persistence architecture and storage performance

Reviewing autoscaling implementation and workload scheduling strategy

Identifying security gaps and operational risks

Benchmarking existing configurations against Kubernetes best practices
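As an illustration of how utilization metrics can feed a right-sizing review, the sketch below flags workloads whose CPU request is much larger than their observed usage. The `WorkloadUsage` type, the use of a 95th-percentile usage figure, and the 2.0 ratio threshold are assumptions made for this example and do not describe the tooling used in the assessment.

```python
from dataclasses import dataclass


@dataclass
class WorkloadUsage:
    name: str
    cpu_request_millicores: int
    cpu_p95_usage_millicores: int


def overprovisioned(workloads, max_ratio: float = 2.0):
    """Return (name, ratio) for workloads whose CPU request exceeds
    max_ratio times the observed p95 usage."""
    flagged = []
    for w in workloads:
        if w.cpu_p95_usage_millicores <= 0:
            # No measurable usage: treat the request as fully unused.
            flagged.append((w.name, float("inf")))
            continue
        ratio = w.cpu_request_millicores / w.cpu_p95_usage_millicores
        if ratio > max_ratio:
            flagged.append((w.name, round(ratio, 2)))
    return flagged
```

For example, a hypothetical workload requesting 2000m while using 400m at p95 would be flagged with a ratio of 5.0, whereas one requesting 500m and using 450m would not.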

Based on the findings, PT Boer Technology produced a prioritized recommendation roadmap covering critical, medium, and low-priority improvements.

Key recommendations included:

Kubernetes cluster upgrade to supported stable versions

Migration from legacy HPA APIs to autoscaling/v2

Implementation of Pod Security baseline policies

etcd storage expansion for production-grade scalability

Deployment of PodDisruptionBudget protections

Redis storage migration from shared RWX storage to dedicated RWO volumes

RabbitMQ memory watermark optimization

Compute resource right-sizing for improved scheduler efficiency

Implementation of Descheduler and topology spread constraints

Adoption of Gateway API architecture for traffic management modernization
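The HPA migration recommended above can be sketched as a manifest transformation. The example assumes the legacy objects use `autoscaling/v1`, which the assessment summary does not state explicitly, and it works on manifests already loaded as Python dictionaries. The function name `hpa_v1_to_v2` is hypothetical. In `autoscaling/v1` the CPU target is the single field `targetCPUUtilizationPercentage`; in `autoscaling/v2` it becomes an entry in the `metrics` list, which can later be extended with memory or custom metrics.

```python
import copy


def hpa_v1_to_v2(v1: dict) -> dict:
    """Convert an autoscaling/v1 HPA manifest (as a dict) to autoscaling/v2."""
    if v1.get("apiVersion") != "autoscaling/v1":
        raise ValueError("expected an autoscaling/v1 manifest")
    v2 = copy.deepcopy(v1)  # leave the caller's manifest untouched
    v2["apiVersion"] = "autoscaling/v2"
    spec = v2["spec"]
    cpu_target = spec.pop("targetCPUUtilizationPercentage", None)
    if cpu_target is not None:
        spec["metrics"] = [{
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {
                    "type": "Utilization",
                    "averageUtilization": cpu_target,
                },
            },
        }]
    return v2
```

The other fields (`scaleTargetRef`, `minReplicas`, `maxReplicas`) carry over unchanged. Converted manifests should still be reviewed before they are applied, since this sketch does not add scaling behavior policies or additional metrics.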

Results

The assessment provided the financial institution with a comprehensive operational visibility framework and a phased modernization roadmap for the critical system running on its Kubernetes platform.

Key outcomes included:

Complete visibility into Kubernetes cluster health and workload utilization

Identification of production stability and scalability bottlenecks

Structured prioritization of infrastructure improvements based on operational impact

Risk mitigation recommendations for RabbitMQ quorum and Redis persistence

Identification of inefficient compute allocation across stateful workloads

Security hardening recommendations aligned with Kubernetes best practices

Scalability improvement roadmap for autoscaling and workload distribution

Foundation for future Kubernetes platform modernization initiatives

Before

Kubernetes cluster operated on End-of-Life versions

Redis workloads relied on shared NFS storage with SPOF risk

RabbitMQ memory configuration operated near the OOM threshold

No Pod Security enforcement implemented

Limited workload disruption safeguards

Manual traffic management configuration using nginx.conf

Cluster resource reservations not fully optimized

HPA implementation limited to CPU-based autoscaling
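The shared-storage risk listed above can be detected with a simple grouping step. The sketch below takes (pod, claim) pairs, which in a real cluster would come from the pod specifications, and reports every PersistentVolumeClaim mounted by more than one pod. The function name and all pod and claim names are hypothetical.

```python
from collections import defaultdict


def shared_claims(mounts):
    """Given (pod_name, claim_name) pairs, return the claims that are
    mounted by more than one pod, mapped to the sorted pod names."""
    pods_by_claim = defaultdict(set)
    for pod, claim in mounts:
        pods_by_claim[claim].add(pod)
    return {
        claim: sorted(pods)
        for claim, pods in pods_by_claim.items()
        if len(pods) > 1
    }
```

A claim that appears in the result is shared by several replicas. If its backing storage fails, every listed pod is affected at once, which is the single-point-of-failure pattern that dedicated per-replica volumes avoid.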

After

Clear modernization roadmap established for cluster upgrade and optimization

Production risks identified with mitigation recommendations

Redis persistence architecture improvement plan defined

RabbitMQ operational resilience and quorum protection strategy established

Security baseline implementation framework documented

Autoscaling enhancement strategy prepared using autoscaling/v2

Infrastructure scalability and workload balancing improvements defined

Operational governance and Kubernetes best-practice alignment strengthened
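The relationship between quorum protection and disruption budgets can be shown with a little arithmetic. The sketch below is a simplified model of how a PodDisruptionBudget with `minAvailable` limits voluntary disruptions, combined with the majority rule used by quorum-based systems. The three-member cluster in the usage note is a hypothetical example, not a figure from the assessment.

```python
import math


def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return members // 2 + 1


def desired_healthy(replicas: int, min_available) -> int:
    """Pods that must stay available for a given minAvailable value.

    Percentages such as "50%" are rounded up, as Kubernetes does.
    """
    if isinstance(min_available, str) and min_available.endswith("%"):
        percent = int(min_available[:-1])
        return math.ceil(replicas * percent / 100)
    return int(min_available)


def disruptions_allowed(healthy: int, replicas: int, min_available) -> int:
    """Voluntary disruptions the budget would currently permit."""
    return max(0, healthy - desired_healthy(replicas, min_available))
```

For a three-member cluster, `quorum_size(3)` is 2, and a budget with `minAvailable` set to 2 permits one voluntary disruption while all three pods are healthy and none once a pod is already down. This keeps node drains and upgrades from removing the majority that quorum queues depend on.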

Takeaways

Large-scale Kubernetes environments require periodic operational assessments to maintain production readiness and scalability.

Stateful services such as RabbitMQ and Redis require dedicated architecture optimization to avoid performance bottlenecks and operational risks.

Incremental modernization strategies allow organizations to improve infrastructure reliability while minimizing disruption to critical services.

Category: Cloud Infrastructure, Infrastructure Assessment, IT Infrastructure