GKE Overview¶
Description¶
Google Kubernetes Engine (GKE) is a managed Kubernetes service that provides a platform for deploying, managing, and scaling containerized applications using Google’s infrastructure. GKE abstracts away the complexity of managing Kubernetes control planes and provides deep integration with Google Cloud services.
Architecture: Managed Kubernetes clusters consisting of a control plane (managed by Google) and worker nodes (managed by Google or you, depending on cluster type).
Key Features¶
Managed Control Plane¶
- Fully Managed: Google manages the Kubernetes API server, etcd, and other control plane components
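- Automatic Updates: Google upgrades the control plane automatically, following the cluster's release channel
- High Availability: Regional clusters replicate the control plane across multiple zones (99.95% SLA); zonal clusters run a single control plane replica (99.5% SLA)
- Cluster Management Fee: A flat $0.10/hour fee per cluster applies to both Standard and Autopilot; the GKE free tier credit offsets it for one zonal or Autopilot cluster per billing account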
Cluster Management¶
- Multiple Cluster Types: Standard (you configure and manage nodes) and Autopilot (Google provisions and manages nodes)
- Auto-Scaling: Cluster autoscaling for nodes, Horizontal/Vertical Pod Autoscaling
- Auto-Repair: Automatically repairs unhealthy nodes
- Auto-Upgrade: Automatically upgrades nodes to match control plane version
- Node Pools: Logical grouping of nodes with same configuration
Integration with Google Cloud¶
- Cloud Load Balancing: Automatic integration for Service type LoadBalancer
- Cloud Logging & Monitoring: Native integration with Cloud Operations
- Workload Identity: Securely access Google Cloud APIs from pods
- Binary Authorization: Ensure only trusted container images are deployed
- VPC-Native Clusters: Pods get IP addresses from VPC subnet ranges
- Private Clusters: Nodes use internal IP addresses only, and access to the control plane can be restricted to a private endpoint
Security¶
- Workload Identity: Map Kubernetes service accounts to Google Cloud service accounts
- Shielded GKE Nodes: Verifiable node integrity
- Binary Authorization: Deploy-time security policy enforcement
- GKE Sandbox: Run untrusted workloads using gVisor
- Security Posture Dashboard: Centralized security recommendations
- Network Policies: Control pod-to-pod communication
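
The Workload Identity mapping described above can be sketched as a Kubernetes ServiceAccount that carries an annotation naming the Google Cloud service account it is allowed to act as. This is a minimal sketch, not a complete setup: the names `app-ksa`, `my-app`, `app-gsa`, and `my-project` are placeholders, and the IAM binding is only outlined in comments.

```shell
# Write a ServiceAccount manifest annotated for Workload Identity.
# All names below (app-ksa, my-app, app-gsa, my-project) are hypothetical.
cat > workload-identity-sa.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa
  namespace: my-app
  annotations:
    iam.gke.io/gcp-service-account: app-gsa@my-project.iam.gserviceaccount.com
EOF

# The Google Cloud service account also needs an IAM binding that lets the
# Kubernetes service account impersonate it, for example:
#   gcloud iam service-accounts add-iam-policy-binding \
#     app-gsa@my-project.iam.gserviceaccount.com \
#     --role=roles/iam.workloadIdentityUser \
#     --member="serviceAccount:my-project.svc.id.goog[my-app/app-ksa]"
#
# Then apply the manifest:
#   kubectl apply -f workload-identity-sa.yaml
```

Pods that set `serviceAccountName: app-ksa` would then obtain credentials for the mapped Google Cloud service account rather than the node's service account.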
Developer Experience¶
- Cloud Code: IDE integration for development and debugging
- Config Connector: Manage GCP resources through Kubernetes
- kubectl: Standard Kubernetes CLI
- Cloud Console UI: Web-based cluster management
- Cloud Shell: Browser-based CLI with pre-installed tools
Important Limits¶
| Limit | Value | Notes |
|---|---|---|
| Max nodes per cluster | 15,000 (Standard); up to 1,000 nodes per node pool per zone | Autopilot scales automatically |
| Max pods per node | 110 (default), up to 256 | Configurable with --max-pods-per-node |
| Max pods per cluster | 200,000 (Standard) | Autopilot manages this automatically |
| Clusters per project | 100 per location | Soft limit, can be increased |
| Node pools per cluster | 1,000 | Each pool can have different configurations |
| Max attached disks per node | 256 PD, 128 local SSDs | Per-node disk limits; actual maximums depend on machine type |
| Services (LoadBalancer) | 5 per node, 300 per cluster | Network load balancer limits |
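
The pods-per-node limit also determines how much Pod IP space each node consumes in a VPC-native cluster. The sketch below assumes the documented GKE behavior of reserving at least twice the maximum Pods per node, rounded up to a power of two; the function name `pod_cidr_prefix` is made up for this illustration.

```shell
# Estimate the per-node Pod CIDR prefix length for a given max-pods-per-node.
# Assumption: the range holds at least 2x the max Pods, rounded up to a power of two.
pod_cidr_prefix() {
  needed=$(( $1 * 2 ))
  size=1
  bits=0
  while [ "$size" -lt "$needed" ]; do
    size=$(( size * 2 ))
    bits=$(( bits + 1 ))
  done
  echo $(( 32 - bits ))
}

for pods in 8 32 110 256; do
  echo "max pods ${pods} -> /$(pod_cidr_prefix "$pods") per node"
done
# max pods 110 -> /24 per node
# max pods 256 -> /23 per node
```

Lowering the per-node Pod limit shrinks the range each node reserves, so the same Pod subnet can support more nodes.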
Cluster Types Comparison¶
| Feature | Standard GKE | Autopilot GKE |
|---|---|---|
| Node Management | Manual | Fully automated |
| Pricing Model | Per node-hour | Per pod resource request |
| Scaling | Configure autoscaling | Automatic |
| Node Configuration | Full control | Google-managed |
| SSH Access to Nodes | Yes | No |
| Custom Machine Types | Yes | Predefined pod specs |
| Node Pools | Manual creation | Automatically managed |
| Security Baseline | Configure manually | Hardened by default |
| Best For | Custom requirements, full control | Simplicity, hands-off operations |
When to Use¶
✅ Use GKE When:¶
- Container Orchestration Needed
  - Running microservices architectures
  - Need automatic scaling, self-healing, and rolling updates
  - Managing multiple containerized applications
- Kubernetes Expertise Available
  - Team has Kubernetes knowledge
  - Want standard Kubernetes APIs and ecosystem
  - Need portability across clouds
- Google Cloud Integration Required
  - Leveraging Google Cloud services (Cloud SQL, Pub/Sub, BigQuery)
  - Using Workload Identity for secure GCP access
  - Need integration with Cloud Load Balancing
- High Availability and Scalability
  - Applications require 99.95%+ uptime
  - Need to scale from a handful to thousands of pods
  - Multi-region or multi-zone deployments
- Managed Infrastructure Preferred
  - Want Google to manage the control plane
  - Prefer automatic updates and patches
  - Need security hardening by default (Autopilot)
❌ Don’t Use GKE When:¶
- Simple Applications
  - Single container application better suited for Cloud Run
  - Serverless functions (use Cloud Functions)
  - Static websites (use Cloud Storage/Firebase Hosting)
- No Container Experience
  - Team lacks container and Kubernetes knowledge
  - Learning curve not justified by requirements
  - Simpler solutions available (App Engine, Cloud Run)
- Windows-Heavy Workloads
  - Primarily Windows containers (GKE supports Windows but consider GCE)
  - Legacy Windows applications not containerized
- Extremely Cost-Sensitive Small Workloads
  - Single small VM might be cheaper than minimum cluster
  - Very low traffic applications
  - Development environments (unless Autopilot)
- Complete Infrastructure Control Needed
  - Need to modify control plane configuration
  - Require kernel-level modifications
  - Custom network overlays incompatible with GKE
Common Use Cases¶
Microservices Architecture¶
GKE Cluster
├── Frontend Service (Deployment)
│   └── Pods: 3 replicas
├── API Service (Deployment)
│   └── Pods: 5 replicas
├── Auth Service (Deployment)
│   └── Pods: 2 replicas
└── Database (StatefulSet)
    └── Pods: 3 replicas (with persistent volumes)
Ingress: HTTPS load balancer
Service Mesh: Istio for service-to-service communication
CI/CD Pipeline¶
GitHub → Cloud Build → Artifact Registry → GKE
                                           ├── Dev Cluster
                                           ├── Staging Cluster
                                           └── Production Cluster
Batch Processing¶
GKE Autopilot
└── Jobs/CronJobs
    ├── Data processing jobs (scales to zero when complete)
    ├── ML training jobs
    └── ETL pipelines
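
One way to express a scheduled job from the tree above is a Kubernetes CronJob. The manifest below is a minimal sketch: the name `nightly-etl`, the schedule, the `busybox` image, and the resource requests are all placeholders chosen for illustration, not values from a real pipeline.

```shell
# Write a CronJob manifest for a hypothetical nightly ETL step.
cat > nightly-etl-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-etl
spec:
  schedule: "0 2 * * *"        # every day at 02:00
  concurrencyPolicy: Forbid    # skip a run if the previous one is still active
  jobTemplate:
    spec:
      backoffLimit: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: etl
            image: busybox:1.36
            command: ["sh", "-c", "echo running ETL step"]
            resources:
              requests:
                cpu: 500m
                memory: 1Gi
EOF

# Apply it to a cluster with:
#   kubectl apply -f nightly-etl-cronjob.yaml
```

On Autopilot, the pod's resource requests are what gets billed while the job runs, so nothing is charged for this workload between runs.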
GKE vs Other Google Services¶
GKE vs Cloud Run
- GKE: Full Kubernetes, more control, stateful workloads
- Cloud Run: Serverless containers, simpler, stateless HTTP services
GKE vs Compute Engine
- GKE: Container orchestration, automatic scaling/healing
- Compute Engine: Full VM control, traditional applications
GKE vs App Engine
- GKE: More flexibility, any language/runtime, complex apps
- App Engine: Simpler PaaS, limited languages, quick deployment
Pricing Considerations¶
Standard GKE
- Control Plane: $0.10/hour cluster management fee for both zonal and regional clusters; the GKE free tier credit covers one zonal cluster per billing account
- Nodes: Standard Compute Engine pricing for VMs
- Network: Egress charges apply
- Cost Optimization: Use Spot VMs, committed use discounts, right-size nodes
Autopilot GKE
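- No Node Charges: Pay for the CPU, memory, and ephemeral storage that pods request
- Control Plane: Same $0.10/hour cluster management fee as Standard, offset by the free tier credit for one cluster per billing account
- vCPU: approximately $0.0445/hour per vCPU requested (varies by region)
- Memory: approximately $0.00488/hour per GB requested (varies by region)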
- Cost Optimization: Right-size pod requests, use vertical pod autoscaler
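
To see how per-request pricing adds up, the sketch below multiplies the vCPU and memory rates listed above by a workload's requests. The rates are illustrative and region-dependent, the function name `estimate_monthly_cost` is hypothetical, and the estimate ignores the cluster management fee, ephemeral storage, and network egress.

```shell
# Illustrative Autopilot rates taken from the list above (USD per hour).
VCPU_RATE=0.0445
MEM_RATE=0.00488

# Usage: estimate_monthly_cost <vcpu_per_pod> <mem_gb_per_pod> <replicas> <hours>
estimate_monthly_cost() {
  awk -v c="$1" -v m="$2" -v r="$3" -v h="$4" \
      -v cr="$VCPU_RATE" -v mr="$MEM_RATE" \
      'BEGIN { printf "%.2f\n", (c * cr + m * mr) * r * h }'
}

# Three replicas requesting 0.5 vCPU and 2 GB each, for 730 hours (about a month).
estimate_monthly_cost 0.5 2 3 730
# prints 70.10
```

Because the bill follows requests rather than actual usage, over-requested pods cost more even when they sit idle.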
General Tips
- Use Autopilot for unpredictable workloads (pay for the resources pods request, not for idle node capacity)
- Use Standard with Spot VMs for batch/fault-tolerant workloads (up to 91% discount)
- Enable cluster autoscaling to scale down during off-hours
- Use resource quotas to prevent cost overruns
Getting Started¶
Create a GKE Cluster (Standard)¶
# Create zonal cluster
gcloud container clusters create my-cluster \
    --zone=us-central1-a \
    --num-nodes=3 \
    --machine-type=e2-medium \
    --enable-autoscaling \
    --min-nodes=1 \
    --max-nodes=10

# Get credentials
gcloud container clusters get-credentials my-cluster --zone=us-central1-a

# Verify connection
kubectl get nodes
Create a GKE Cluster (Autopilot)¶
# Create Autopilot cluster (Autopilot clusters are always regional)
gcloud container clusters create-auto my-autopilot-cluster \
    --region=us-central1

# Get credentials
gcloud container clusters get-credentials my-autopilot-cluster --region=us-central1

# Deploy application - nodes provisioned automatically
kubectl apply -f deployment.yaml
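
The `deployment.yaml` file applied above is not shown in the original. A minimal sketch of what it might contain follows; the name `hello-web`, the replica count, and the resource requests are placeholders, and the image is assumed to be Google's public `hello-app` sample.

```shell
# Write a small Deployment manifest to deployment.yaml.
# hello-web and the request sizes are hypothetical values for illustration.
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: hello-web
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:        # Autopilot sizes and bills nodes from these requests
            cpu: 250m
            memory: 512Mi
EOF
```

On Autopilot, explicit requests matter more than on Standard because they drive both node provisioning and billing.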
Best Practices¶
1. Cluster Configuration¶
- Use regional clusters for production (99.95% SLA)
- Enable Workload Identity for secure GCP access
- Use VPC-native clusters (alias IP ranges)
- Enable Binary Authorization for production
- Configure maintenance windows for upgrades
2. Security¶
- Enable Workload Identity so pods do not fall back to the node's service account credentials exposed by the Compute Engine metadata server
- Use least-privilege IAM roles
- Implement Network Policies
- Use Private clusters for sensitive workloads
- Enable GKE Dataplane V2 for improved networking
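
A common starting point for the Network Policies item above is a default-deny rule for a namespace, with explicit allow rules added afterwards. This is a sketch only; the namespace `my-app` is a placeholder, and the policy takes effect only on clusters where network policy enforcement (for example via GKE Dataplane V2) is enabled.

```shell
# Write a NetworkPolicy that denies all ingress to pods in the my-app namespace.
cat > default-deny-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app        # hypothetical namespace
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
EOF

# Apply it with:
#   kubectl apply -f default-deny-ingress.yaml
```

With this in place, each service needs its own policy that allows the specific traffic it should receive.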
3. Resource Management¶
- Set resource requests and limits on all pods
- Use resource quotas and limit ranges per namespace
- Enable Horizontal Pod Autoscaler for variable workloads
- Use Vertical Pod Autoscaler to right-size requests
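
The quota and limit-range advice above can be sketched as two namespace-scoped objects: a ResourceQuota that caps total requests and limits, and a LimitRange that supplies defaults for containers that omit them. The namespace `my-app`, the object names, and every number below are illustrative assumptions, not recommended values.

```shell
# Write a ResourceQuota and a LimitRange for a hypothetical my-app namespace.
cat > namespace-limits.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: my-app
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-app
spec:
  limits:
  - type: Container
    defaultRequest:        # applied when a container sets no requests
      cpu: 250m
      memory: 256Mi
    default:               # applied when a container sets no limits
      cpu: 500m
      memory: 512Mi
EOF

# Apply both objects with:
#   kubectl apply -f namespace-limits.yaml
```

The LimitRange matters once the quota exists, because a quota on CPU or memory causes pods without requests or limits for those resources to be rejected.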
4. Monitoring & Logging¶
- Enable GKE monitoring and logging (now default)
- Use Google Cloud Managed Service for Prometheus for application-level metrics (it replaces the deprecated GKE workload metrics feature)
- Set up alerts for cluster and pod health
- Use Cloud Trace for distributed tracing
5. Cost Optimization¶
- Use Spot VMs for fault-tolerant workloads
- Enable cluster autoscaling
- Right-size node pools and pod resources
- Use Autopilot for variable or unpredictable workloads
- Clean up unused resources (PVs, LoadBalancers)