Persistent Disks¶
Core Concepts¶
Persistent Disks are durable, network-attached block storage independent of VM lifecycle. Understanding disk types and their performance characteristics is crucial for designing performant and cost-effective solutions.
Key Principle: Performance scales with disk size; separate boot and data disks for flexibility.
Disk Types Comparison¶
| Type | IOPS/GB | Max IOPS | Throughput | Use Case | Cost |
|---|---|---|---|---|---|
| pd-standard | 0.75 read / 1.5 write | 7,500 read / 15,000 write | Low | Archives, backups | Lowest |
| pd-balanced | 6 | 80,000 | 0.28 MB/s per GB | General purpose | Medium |
| pd-ssd | 30 | 100,000 | 0.48 MB/s per GB | Databases, high I/O | High |
| pd-extreme | Custom | 120,000 | 2,400 MB/s | Mission-critical | Highest |
| hyperdisk-balanced | Configurable | 160,000 | Dynamic | Next-gen balanced | Medium-High |
| hyperdisk-extreme | Configurable | 350,000 | Dynamic | Highest performance | Highest |
| Local SSD | N/A | 2,400,000 | 9,360 MB/s | Ultra-high perf | Medium (ephemeral) |
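The maximum-IOPS column above can be encoded as a small lookup to sanity-check a disk choice against a workload's requirements. A rough sketch in Python (the ceilings and the ephemeral caveat for Local SSD come from the table; the function name is illustrative):

```python
# Rough per-disk maximum read IOPS, taken from the comparison table above
# (ordered as in the table).
MAX_READ_IOPS = {
    "pd-standard": 7_500,
    "pd-balanced": 80_000,
    "pd-ssd": 100_000,
    "pd-extreme": 120_000,
    "hyperdisk-balanced": 160_000,
    "hyperdisk-extreme": 350_000,
    "local-ssd": 2_400_000,  # ephemeral: data is lost on VM stop/delete
}

def candidate_disk_types(required_iops: int, needs_persistence: bool = True) -> list[str]:
    """Return disk types whose per-disk IOPS ceiling covers the requirement."""
    return [
        disk for disk, ceiling in MAX_READ_IOPS.items()
        if ceiling >= required_iops and (disk != "local-ssd" or not needs_persistence)
    ]

print(candidate_disk_types(50_000))   # pd-balanced and above
print(candidate_disk_types(200_000))  # only hyperdisk-extreme when persistence is required
```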
Architectural Decision Criteria¶
pd-standard (HDD)¶
Appropriate for:
- Sequential access patterns (logs, media files)
- Large datasets with infrequent access
- Backup target storage
- Cost-sensitive workloads
- Data archiving
Not appropriate for:
- Database storage
- Random I/O workloads
- Low-latency requirements
- Applications needing >15,000 IOPS
pd-balanced (SSD) - Recommended Default¶
Appropriate for:
- General-purpose workloads (should be default choice)
- Small to medium databases
- Boot disks
- Most enterprise applications
- Development environments
- Balance of price and performance
Architecture Pattern: Start with pd-balanced, upgrade to pd-ssd only if I/O bottleneck identified
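In practice the default choice can be provisioned explicitly. A minimal sketch using the google-cloud-compute Python client (the library must be installed separately; project, zone, and disk names are placeholders):

```python
from google.cloud import compute_v1

def create_balanced_disk(project: str, zone: str, name: str, size_gb: int) -> None:
    """Create a zonal pd-balanced disk (the recommended general-purpose default)."""
    disk = compute_v1.Disk(
        name=name,
        size_gb=size_gb,
        type_=f"zones/{zone}/diskTypes/pd-balanced",  # type_ because 'type' is reserved in Python
    )
    operation = compute_v1.DisksClient().insert(
        project=project, zone=zone, disk_resource=disk
    )
    operation.result()  # block until the disk is ready

# create_balanced_disk("my-project", "us-central1-a", "app-data", 500)
```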
pd-ssd¶
Appropriate for:
- High-performance databases
- I/O-intensive applications
- OLTP workloads
- Applications requiring low latency
- Random I/O patterns
Cost Consideration: Roughly 1.7x the cost of pd-balanced; verify I/O requirements justify the premium
pd-extreme¶
Appropriate for:
- Mission-critical databases (SAP HANA, Oracle)
- Very large databases requiring >100,000 IOPS
- Real-time analytics
- High-frequency trading
- When performance is more important than cost
Limitations:
- Requires N2, C2, or M-series VMs
- Minimum size: 500 GB
- Much higher cost than pd-ssd
- Custom IOPS configuration required
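Unlike the other types, pd-extreme requires its IOPS to be provisioned explicitly. A sketch under the same client-library assumptions as the earlier example (the provisioned_iops field carries the custom IOPS setting; the values shown are illustrative):

```python
from google.cloud import compute_v1

def create_extreme_disk(project: str, zone: str, name: str,
                        size_gb: int, provisioned_iops: int) -> None:
    """Create a pd-extreme disk with explicitly provisioned IOPS."""
    disk = compute_v1.Disk(
        name=name,
        size_gb=size_gb,                    # minimum 500 GB for pd-extreme
        type_=f"zones/{zone}/diskTypes/pd-extreme",
        provisioned_iops=provisioned_iops,  # custom IOPS, billed whether used or not
    )
    compute_v1.DisksClient().insert(
        project=project, zone=zone, disk_resource=disk
    ).result()

# create_extreme_disk("my-project", "us-central1-a", "hana-data", 2048, 60_000)
```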
Local SSDs¶
Characteristics:
- Physically attached to server
- Ephemeral (data lost on VM stop/delete)
- Ultra-high performance
- 375 GB per device, up to 24 devices (9 TB max)
- No persistent disk performance limits apply
Appropriate for:
- Temporary cache
- Scratch space for computation
- Data that can be rebuilt (replicated databases)
- High-performance computing
- When highest IOPS/throughput needed
Not appropriate for:
- Primary database storage without replication
- Data that must survive VM stop
- Critical data without backups
- Single point of failure scenarios
Architecture Pattern: Use for performance, replicate critical data externally
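Local SSDs are declared as part of the instance rather than created as standalone disks. A sketch of the scratch-disk portion of an instance definition using the google-cloud-compute client (device count and zone are illustrative):

```python
from google.cloud import compute_v1

def local_ssd_scratch_disks(zone: str, count: int) -> list[compute_v1.AttachedDisk]:
    """Build SCRATCH (Local SSD) attached-disk entries for an instance definition."""
    return [
        compute_v1.AttachedDisk(
            type_="SCRATCH",       # ephemeral local SSD, not a persistent disk
            auto_delete=True,
            interface="NVME",      # NVMe interface for the highest throughput
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                disk_type=f"zones/{zone}/diskTypes/local-ssd",
                disk_size_gb=375,  # fixed 375 GB per device
            ),
        )
        for _ in range(count)
    ]

# Pass the result in the `disks` list of a compute_v1.Instance alongside the boot disk.
```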
Performance Scaling¶
Size-Based Performance¶
pd-balanced Example:
- 100 GB: 600 read IOPS, 600 write IOPS
- 500 GB: 3,000 IOPS
- 13,334 GB: 80,000 IOPS (maximum)
Design Implication: May need larger disk for performance, not just capacity
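The size-to-performance relationship is linear up to the per-disk cap, so the numbers above can be reproduced with a one-line calculation. A quick sketch (the 6 IOPS/GB rate and 80,000-IOPS cap are the pd-balanced figures from the comparison table):

```python
def pd_balanced_read_iops(size_gb: int, iops_per_gb: int = 6, max_iops: int = 80_000) -> int:
    """Per-disk read IOPS limit for a pd-balanced disk of the given size."""
    return min(size_gb * iops_per_gb, max_iops)

for size in (100, 500, 13_334):
    print(size, "GB ->", pd_balanced_read_iops(size), "IOPS")
# 100 GB -> 600 IOPS, 500 GB -> 3000 IOPS, 13334 GB -> 80000 IOPS (cap)
```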
VM-Level Limits¶
Performance is limited by:
- Disk type and size
- VM machine type (CPU count)
- Number of vCPUs determines max I/O
Example:
- n2-standard-2 (2 vCPUs): Max 15,000 read IOPS
- n2-standard-32 (32 vCPUs): Max 100,000 read IOPS
Architecture Consideration: Right-size VM for I/O requirements, not just compute
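The disk-level and VM-level limits combine as a minimum, which is why an oversized disk on a small VM delivers no extra I/O. A small sketch (the VM limits are the n2-standard figures quoted above):

```python
def effective_read_iops(disk_limit: int, vm_limit: int) -> int:
    """Achievable read IOPS is capped by whichever limit is lower."""
    return min(disk_limit, vm_limit)

# A 13,334 GB pd-balanced disk (80,000 IOPS) on a 2-vCPU VM is throttled to the VM limit:
print(effective_read_iops(80_000, 15_000))   # 15000 on n2-standard-2
print(effective_read_iops(80_000, 100_000))  # 80000 on n2-standard-32 (disk is now the bottleneck)
```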
Regional Persistent Disks¶
Synchronous Replication¶
Characteristics:
- Replicated across two zones in same region
- Synchronous writes (both zones acknowledge)
- Failover on zone failure by force-attaching the disk to a VM in the other zone
- RPO: Near-zero (synchronous)
- RTO: Minutes (failover can be automated with a regional MIG)
- 2x cost of zonal disks
Appropriate for:
- High-availability databases
- Mission-critical applications
- Zone failure protection required
- Applications needing automatic failover
- When cost of 2x storage justified by availability
Not appropriate for:
- Cost-sensitive workloads
- Single-zone deployments
- Applications with external replication
- Development/testing
Architecture Pattern: Use for stateful tier in multi-zone deployment
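Regional disks are created through the regional disks API with two replica zones. A minimal sketch with the google-cloud-compute client, assuming the replica zones are passed as zone URLs (project, region, and names are placeholders):

```python
from google.cloud import compute_v1

def create_regional_disk(project: str, region: str, name: str,
                         size_gb: int, zone_a: str, zone_b: str) -> None:
    """Create a regional pd-ssd disk replicated synchronously across two zones."""
    disk = compute_v1.Disk(
        name=name,
        size_gb=size_gb,
        type_=f"regions/{region}/diskTypes/pd-ssd",
        replica_zones=[
            f"projects/{project}/zones/{zone_a}",
            f"projects/{project}/zones/{zone_b}",
        ],
    )
    compute_v1.RegionDisksClient().insert(
        project=project, region=region, disk_resource=disk
    ).result()

# create_regional_disk("my-project", "us-central1", "db-data",
#                      500, "us-central1-a", "us-central1-b")
```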
Failover Behavior¶
Failover Process:
- Force-attach the regional disk to a VM in the surviving zone
- Application must handle a brief I/O pause during failover
- No data loss (synchronous replication)
- Requires a regional MIG for automatic VM recreation
Limitations:
- Writes are acknowledged only after both replicas are written, so latency is slightly higher than zonal disks
- Cannot span regions
- Protects against zone failure, not VM failure (use MIG autohealing for VM failures)
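Failover itself is a force-attach of the regional disk to a VM in the surviving zone. A sketch of that step with the google-cloud-compute client, assuming the force_attach flag on the attach-disk request is the relevant knob (all names are placeholders):

```python
from google.cloud import compute_v1

def force_attach_regional_disk(project: str, region: str, disk: str,
                               zone: str, instance: str) -> None:
    """Force-attach a regional disk to a replacement VM in the surviving zone."""
    request = compute_v1.AttachDiskInstanceRequest(
        project=project,
        zone=zone,
        instance=instance,
        force_attach=True,  # detach from the unreachable VM in the failed zone
        attached_disk_resource=compute_v1.AttachedDisk(
            source=f"projects/{project}/regions/{region}/disks/{disk}",
            device_name=disk,
        ),
    )
    compute_v1.InstancesClient().attach_disk(request=request).result()

# force_attach_regional_disk("my-project", "us-central1", "db-data",
#                            "us-central1-b", "db-vm-replacement")
```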
Disk Architecture Patterns¶
Separate Boot and Data Disks¶
Benefits:
- Independent lifecycle management
- Different disk types (balanced boot, SSD data)
- Easier backups (snapshot data disk only)
- Flexibility to resize independently
- Replace boot disk without data loss
Pattern:
- Boot disk: 20-50 GB pd-balanced
- Data disk(s): Size and type based on workload
- Attach data disks to multiple VMs (read-only)
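A sketch of the pattern with the google-cloud-compute client: a small pd-balanced boot disk defined inline plus a pre-created pd-ssd data disk attached separately (image family, machine type, and all names are illustrative):

```python
from google.cloud import compute_v1

def create_vm_with_data_disk(project: str, zone: str, name: str, data_disk: str) -> None:
    """Create a VM with a small pd-balanced boot disk and an existing data disk."""
    boot = compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,  # boot disk is disposable; data lives elsewhere
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            disk_size_gb=50,
            disk_type=f"zones/{zone}/diskTypes/pd-balanced",
            source_image="projects/debian-cloud/global/images/family/debian-12",
        ),
    )
    data = compute_v1.AttachedDisk(
        source=f"projects/{project}/zones/{zone}/disks/{data_disk}",
        auto_delete=False,  # keep the data disk when the VM is deleted
        device_name="data",
    )
    instance = compute_v1.Instance(
        name=name,
        machine_type=f"zones/{zone}/machineTypes/n2-standard-4",
        disks=[boot, data],
        network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
    )
    compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    ).result()
```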
Multi-Disk for Performance¶
Stripe Multiple Disks:
- Combine multiple disks in RAID 0
- Aggregate IOPS and throughput
- Each disk contributes performance
- Better than single large disk for max performance
Consideration: Local SSDs better choice for highest performance
Disk Quotas and Limits¶
Per VM Limits:
- 128 persistent disks (including boot)
- 257 TB total persistent disk size
- 24 Local SSD devices (9 TB)
Architecture Impact:
- Plan data architecture within limits
- Consider object storage (Cloud Storage) when data would otherwise require more than 128 attached disks
- Use Filestore for NFS requirements
Snapshot Architecture¶
Incremental Backups¶
How It Works:
- First snapshot: Full copy of disk
- Subsequent snapshots: Only changed blocks
- Snapshot chains maintained automatically
- Deleting intermediate snapshots safe (blocks preserved if needed)
Cost Efficiency:
- Daily snapshots of 1 TB disk with 5% daily change
- Month 1: ~1,000 GB + 30 × 50 GB = ~2,500 GB storage
- Much cheaper than 30 full copies (30,000 GB)
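The arithmetic above generalizes to a simple running total. A quick sketch (the 1 TB disk, 5% daily change rate, and 30-day window are the figures from the example):

```python
def snapshot_storage_gb(disk_gb: float, daily_change: float, days: int) -> float:
    """Approximate total snapshot storage: one full copy plus daily incrementals."""
    full_copy = disk_gb
    incrementals = days * disk_gb * daily_change  # changed blocks only, per daily snapshot
    return full_copy + incrementals

print(snapshot_storage_gb(1000, 0.05, 30))  # ~2,500 GB vs ~30,000 GB for 30 full copies
```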
Snapshot Storage Locations¶
Regional:
- Stored in single region
- Lower cost
- Faster creation/restore (same region)
- Use for: Non-critical data, cost optimization
Multi-Regional:
- Stored across multiple regions
- Higher cost
- Disaster recovery protection
- Slower initial creation
- Use for: Critical data, DR requirements
Architecture Decision: Balance cost vs DR requirements
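The storage location is chosen per snapshot. A sketch with the google-cloud-compute client, assuming the location is passed via the snapshot's storage_locations field (a region such as "us-central1" or a multi-region such as "us"; names are placeholders):

```python
from google.cloud import compute_v1

def snapshot_disk(project: str, zone: str, disk: str,
                  snapshot_name: str, location: str) -> None:
    """Snapshot a zonal disk into a chosen storage location (region or multi-region)."""
    snapshot = compute_v1.Snapshot(
        name=snapshot_name,
        storage_locations=[location],  # e.g. "us-central1" (regional) or "us" (multi-regional)
    )
    compute_v1.DisksClient().create_snapshot(
        project=project, zone=zone, disk=disk, snapshot_resource=snapshot
    ).result()

# snapshot_disk("my-project", "us-central1-a", "db-data", "db-data-dr", "us")
```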
Encryption¶
Default Encryption¶
Google-Managed Keys:
- All persistent disks encrypted at rest
- Automatic, no configuration
- No performance impact
- Transparent to applications
Customer-Managed Encryption Keys (CMEK)¶
Benefits:
- Control key lifecycle
- Regulatory compliance
- Audit key usage
- Revoke access immediately
Considerations:
- Additional Cloud KMS costs
- Key management responsibility
- Availability dependency on Cloud KMS
- Slight performance impact
Use Cases:
- Regulatory requirements (HIPAA, PCI-DSS)
- Enhanced security posture
- Key rotation policies
- Multi-cloud key management
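With CMEK the disk references a Cloud KMS key at creation time. A sketch under the same client-library assumptions (the KMS key path is a placeholder and must already exist, with the Compute Engine service agent granted encrypt/decrypt on it):

```python
from google.cloud import compute_v1

def create_cmek_disk(project: str, zone: str, name: str,
                     size_gb: int, kms_key: str) -> None:
    """Create a disk encrypted with a customer-managed Cloud KMS key."""
    disk = compute_v1.Disk(
        name=name,
        size_gb=size_gb,
        type_=f"zones/{zone}/diskTypes/pd-balanced",
        disk_encryption_key=compute_v1.CustomerEncryptionKey(
            kms_key_name=kms_key,  # projects/.../locations/.../keyRings/.../cryptoKeys/...
        ),
    )
    compute_v1.DisksClient().insert(
        project=project, zone=zone, disk_resource=disk
    ).result()
```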
Cost Optimization Strategies¶
Right-Sizing¶
Approach:
- Start with pd-balanced (cost-effective default)
- Monitor I/O metrics
- Upgrade to pd-ssd only if bottleneck identified
- Downgrade to pd-standard for sequential-only workloads
Snapshot Management¶
Best Practices:
- Implement retention policies (delete old snapshots)
- Use snapshot schedules (automated lifecycle)
- Regional storage for non-critical data
- Delete disk but keep snapshots (cheaper long-term storage)
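Retention is normally enforced with built-in snapshot schedules, but the underlying idea is simple to express. A sketch that deletes snapshots older than a retention window using the google-cloud-compute client (filtering is done client-side for clarity; running it really deletes data, so treat it as illustrative):

```python
from datetime import datetime, timedelta, timezone
from google.cloud import compute_v1

def delete_old_snapshots(project: str, retention_days: int) -> None:
    """Delete snapshots whose creation timestamp is older than the retention window."""
    client = compute_v1.SnapshotsClient()
    cutoff = datetime.now(timezone.utc) - timedelta(days=retention_days)
    for snapshot in client.list(project=project):
        created = datetime.fromisoformat(
            snapshot.creation_timestamp.replace("Z", "+00:00")
        )
        if created < cutoff:
            print(f"deleting {snapshot.name} (created {created.date()})")
            client.delete(project=project, snapshot=snapshot.name).result()

# delete_old_snapshots("my-project", retention_days=30)
```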
Storage Tiering¶
Pattern:
- Hot data: pd-ssd (frequent access)
- Warm data: pd-balanced (occasional access)
- Cold data: pd-standard or Cloud Storage (rare access)
- Archive: Cloud Storage Archive class (long-term retention)
Disaster Recovery Considerations¶
Backup Strategy¶
Snapshot Frequency:
- Critical data: Hourly snapshots (pair with regional disks for near-zero RPO)
- Important data: Daily
- Normal data: Weekly
- Test data: Monthly or manual
Retention:
- Compliance requirements dictate minimum
- Balance cost vs recovery point options
- 30 days common for production
- 7 days for development
Cross-Region DR¶
Approach:
- Snapshots stored in multi-regional location
- Restore disks in DR region from snapshots
- Test DR procedures regularly
- Document recovery procedures
RTO Considerations:
- Snapshot restore: 30-60 minutes
- Regional disk failover: Minutes
- Multi-region failover: 1-2 hours (manual)
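Restoring in the DR region means creating a new disk from the snapshot in a zone of that region. A sketch under the same client-library assumptions (region, zone, and names are placeholders):

```python
from google.cloud import compute_v1

def restore_disk_from_snapshot(project: str, dr_zone: str,
                               disk_name: str, snapshot_name: str) -> None:
    """Create a disk in the DR region's zone from an existing snapshot."""
    disk = compute_v1.Disk(
        name=disk_name,
        type_=f"zones/{dr_zone}/diskTypes/pd-balanced",
        source_snapshot=f"projects/{project}/global/snapshots/{snapshot_name}",
        # size defaults to the snapshot's source disk size if omitted
    )
    compute_v1.DisksClient().insert(
        project=project, zone=dr_zone, disk_resource=disk
    ).result()

# restore_disk_from_snapshot("my-project", "us-east1-b", "db-data-restored", "db-data-dr")
```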
Exam Focus Areas¶
Design Decisions¶
- Disk type selection based on workload
- Performance sizing (IOPS/throughput requirements)
- Regional vs zonal disks for HA
- Cost optimization strategies
Architecture Patterns¶
- Separate boot and data disks
- Local SSD use cases and limitations
- Multi-disk configurations
- Snapshot strategies
Performance¶
- Size-based performance scaling
- VM-level performance limits
- When to use Local SSDs
- Multi-disk striping considerations
Disaster Recovery¶
- Snapshot frequency and retention
- Regional disk RPO/RTO
- Cross-region DR strategies
- Backup architecture patterns
Cost Management¶
- Disk type cost comparison
- Snapshot storage optimization
- Unused disk identification
- Storage tiering strategies