Your DevOps practices worked beautifully on-premises. Then you moved to the cloud and everything got complicated.
Deployment pipelines that took 20 minutes now take 90 minutes. Infrastructure provisioning that was predictable is now inconsistent. Costs that were fixed are now wildly variable. Your team is drowning in cloud-specific services, each with different patterns and best practices.
Here's what changed: The cloud isn't just "someone else's computer." AWS, Azure, and GCP have fundamentally different service models, architecture patterns, and operational best practices. What works brilliantly on AWS can be inefficient or expensive on Azure. Azure patterns don't translate directly to GCP.
The data confirms this: Organizations that use cloud-native DevOps patterns deploy 7.2x more frequently and recover from incidents 4.8x faster than those using generic on-premises patterns in the cloud (DORA State of DevOps Report, 2024). Yet 62% of organizations still run adapted on-premises DevOps practices in the cloud instead of platform-specific patterns.
The gap costs real money: 43% higher cloud bills, 3-5x longer deployment times, and 2-3x more production incidents.
This guide provides platform-specific DevOps best practices for AWS, Azure, and GCP—so you stop fighting the platform and start leveraging its strengths.
Why Generic Patterns Fail
The "Lift and Shift" DevOps Trap
Most organizations migrate infrastructure to cloud but keep on-premises DevOps patterns:
- Jenkins on EC2/VMs instead of cloud-native CI/CD
- Manual infrastructure provisioning instead of Infrastructure as Code
- Server-based deployments instead of containerized or serverless
- Traditional monitoring instead of cloud-native observability
Result: You pay cloud prices for on-premises performance. Worse, you miss cloud capabilities that would dramatically improve speed and reliability.
The Multi-Cloud Complexity
Many organizations pursue multi-cloud for:
- Risk mitigation (don't depend on one vendor)
- Cost optimization (use cheapest provider for each workload)
- Regulatory requirements (data residency)
- Acquisition integration (inherited platforms)
Problem: Each platform has somewhere between 100 and 240+ services with unique APIs, pricing models, and operational patterns. A "one size fits all" approach means you're optimizing for none of them.
The Service Explosion Problem
- AWS: 240+ services (and growing)
- Azure: 200+ services
- GCP: 100+ services
Each deployment could potentially use dozens of services. Which CI/CD service? Which container orchestration? Which database? Which monitoring? Each choice has downstream implications for architecture, cost, and operations.
Platform-Specific DevOps: The Foundation
Before diving into platform differences, understand the common cloud-native DevOps principles:
Universal Cloud DevOps Principles
1. Infrastructure as Code (IaC) is Non-Negotiable
- Every infrastructure component defined in code
- Version controlled alongside application code
- Automated provisioning and configuration
- Immutable infrastructure (replace, don't modify)
2. CI/CD is Platform-Native
- Use cloud provider's native CI/CD services (or integrate deeply)
- Deployment pipelines as code
- Automated testing at every stage
- Blue-green or canary deployments
3. Observability, Not Just Monitoring
- Logs, metrics, and traces correlated
- Cloud-native observability services
- Automated alerting and incident response
- Cost monitoring as operational metric
4. Security Integrated, Not Bolted On
- Identity and access management (IAM) as foundation
- Secrets management via cloud services
- Security scanning in pipeline
- Compliance as code
5. Cost Optimization by Design
- Right-sizing resources (not over-provisioning)
- Auto-scaling and spot/preemptible instances
- Reserved capacity for predictable workloads
- Cost monitoring and budgeting
Now, let's explore how these principles manifest differently on each platform.
AWS DevOps Best Practices
AWS Strengths for DevOps
- Maturity: Longest cloud history, most services
- Ecosystem: Largest third-party tool integration
- Flexibility: Widest range of service options
- Innovation: Fastest service releases
AWS Service Selection for DevOps
CI/CD Pipeline Stack
Option 1: AWS-Native (Recommended for AWS-Only)
Source Control: AWS CodeCommit (or GitHub/GitLab)
CI/CD: AWS CodePipeline + AWS CodeBuild
Artifact Storage: AWS CodeArtifact / Amazon S3
Deployment: AWS CodeDeploy
Advantages:
- Deep AWS integration (easy IAM, networking, service access)
- Pay-per-use pricing (no idle compute costs)
- Native support for Lambda, ECS, EC2 deployments
Disadvantages:
- Limited to AWS (multi-cloud requires different tooling)
- Less feature-rich than Jenkins/GitLab CI (but improving)
Option 2: Jenkins on ECS/EKS (Recommended for Multi-Cloud)
CI/CD: Jenkins on Amazon ECS or EKS
Source Control: GitHub / GitLab / Bitbucket
Artifact Storage: Amazon S3 / Nexus on EC2
Deployment: Custom scripts + AWS CLI/SDKs
Advantages:
- Flexibility and plugin ecosystem
- Multi-cloud capable
- Team familiarity (most common CI/CD tool)
Disadvantages:
- Infrastructure overhead (managing Jenkins)
- More expensive (always-on compute)
- Requires more operational expertise
Recommendation: Start with CodePipeline for AWS-specific workloads. Use Jenkins/GitLab CI if you need multi-cloud or complex workflows.
Infrastructure as Code
Tool Options:
- AWS CloudFormation (AWS-native)
- Terraform (multi-cloud)
- AWS CDK (code-based, generates CloudFormation)
- Pulumi (code-based, multi-cloud)
Best Practice: Hybrid Approach
Foundation/Networking: Terraform (shareable across cloud providers)
Application Infrastructure: AWS CDK (leverages programming languages)
Configuration: AWS Systems Manager Parameter Store / AWS Secrets Manager
Example AWS CDK Pattern (TypeScript):
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecs_patterns from 'aws-cdk-lib/aws-ecs-patterns';

export class MyAppStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // VPC with public and private subnets
    const vpc = new ec2.Vpc(this, 'MyVPC', {
      maxAzs: 3, // 3 availability zones for high availability
      natGateways: 1 // Cost optimization: 1 NAT gateway
    });

    // ECS Cluster
    const cluster = new ecs.Cluster(this, 'MyCluster', {
      vpc: vpc,
      containerInsights: true // Enable CloudWatch Container Insights
    });

    // Fargate Service with ALB
    const fargateService = new ecs_patterns.ApplicationLoadBalancedFargateService(
      this,
      'MyFargateService',
      {
        cluster: cluster,
        cpu: 512,
        memoryLimitMiB: 1024,
        desiredCount: 3, // HA across AZs
        taskImageOptions: {
          image: ecs.ContainerImage.fromRegistry('myapp:latest'),
          environment: {
            ENV: 'production'
          },
        },
        publicLoadBalancer: true,
      }
    );

    // Auto-scaling based on CPU
    const scaling = fargateService.service.autoScaleTaskCount({
      minCapacity: 3,
      maxCapacity: 10
    });
    scaling.scaleOnCpuUtilization('CpuScaling', {
      targetUtilizationPercent: 70
    });
  }
}
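The `scaleOnCpuUtilization` call above asks the service to hold average CPU near 70% while staying between 3 and 10 tasks. As a rough mental model only, the sketch below applies a proportional rule (the one documented for the Kubernetes Horizontal Pod Autoscaler). AWS target tracking is driven by CloudWatch alarms and does not publish an exact formula, so treat this as an approximation. The function and parameter names are illustrative.

```typescript
// Approximate proportional scaling rule (illustrative, not AWS's exact algorithm).
interface ScalingBounds {
  minCapacity: number;
  maxCapacity: number;
}

function desiredTaskCount(
  currentTasks: number,
  observedCpuPercent: number,
  targetCpuPercent: number,
  bounds: ScalingBounds,
): number {
  if (targetCpuPercent <= 0) {
    throw new Error("targetCpuPercent must be positive");
  }
  // Scale the task count in proportion to how far CPU is from the target.
  const proportional = Math.ceil(currentTasks * (observedCpuPercent / targetCpuPercent));
  // Clamp to the configured minimum and maximum capacity.
  return Math.min(bounds.maxCapacity, Math.max(bounds.minCapacity, proportional));
}
```

With the stack's settings, 3 tasks running at 95% CPU would be pushed toward 5 tasks, and the result never leaves the 3 to 10 range.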
Why AWS CDK for Application Infrastructure:
- Type safety and IDE support
- Reusable constructs (like functions)
- Generates CloudFormation (audit trail, rollback)
- Programming language familiarity
Container Orchestration
Option 1: Amazon ECS (Elastic Container Service)
- When: AWS-only, simpler use cases, tight AWS integration
- Advantages: No Kubernetes complexity, deep AWS integration, lower learning curve
- Cost: ~30% cheaper than EKS (no control plane costs)
Option 2: Amazon EKS (Elastic Kubernetes Service)
- When: Multi-cloud strategy, Kubernetes expertise, complex orchestration
- Advantages: Kubernetes standard, portable, rich ecosystem
- Cost: $0.10/hour for control plane + node costs
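Option 3: AWS Fargate (serverless compute for ECS and EKS)
- When: Don't want to manage servers at all; Fargate is a launch type that runs ECS tasks or EKS pods rather than a separate orchestrator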
- Advantages: True serverless containers, no node management
- Cost: 20-30% premium over EC2, but no idle capacity waste
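The "premium versus idle capacity" trade-off can be made concrete with a little arithmetic. If Fargate costs a fraction `p` more per unit of compute but you pay only for what tasks request, then EC2 is cheaper only when its average utilization exceeds 1 / (1 + p). The sketch below encodes that simplified model. It ignores Savings Plans, Spot pricing, and the labor cost of managing nodes, and the function names are illustrative.

```typescript
// Utilization above which EC2 becomes cheaper than Fargate, given the
// Fargate price premium expressed as a fraction (0.25 = 25% more expensive).
function breakEvenUtilization(fargatePremium: number): number {
  if (fargatePremium < 0) {
    throw new Error("fargatePremium must be non-negative");
  }
  return 1 / (1 + fargatePremium);
}

// Picks the cheaper option for a cluster with the given average EC2 utilization (0 to 1).
function cheaperOption(ec2Utilization: number, fargatePremium: number): "fargate" | "ec2" {
  return ec2Utilization < breakEvenUtilization(fargatePremium) ? "fargate" : "ec2";
}
```

At a 25% premium the break-even point is 80% utilization, so a cluster that averages 50% busy is better served by Fargate under this model.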
Best Practice Deployment Pattern:
Development: ECS on Fargate (simplicity, no infrastructure management)
Staging: ECS on Fargate (consistency with production)
Production: ECS on EC2 with Auto Scaling (cost optimization, control)
OR
Production: EKS if multi-cloud or complex orchestration needs
Observability Stack
AWS-Native Observability:
Logs: Amazon CloudWatch Logs
Metrics: Amazon CloudWatch Metrics
Traces: AWS X-Ray
Dashboards: Amazon CloudWatch Dashboards
Alerts: Amazon CloudWatch Alarms + Amazon SNS
Enhanced Observability (For Complex Environments):
Logs: CloudWatch Logs → Amazon OpenSearch Service
Metrics: CloudWatch Metrics + Prometheus on EKS
Traces: AWS X-Ray + Jaeger
Visualization: Grafana on ECS/EKS
Alerts: CloudWatch Alarms + PagerDuty/Opsgenie
Cost Optimization Pattern:
- Use CloudWatch for real-time operational metrics
- Stream to S3 for long-term storage and analysis (much cheaper)
- Use Athena for ad-hoc queries on S3 logs
- OpenSearch only for search-heavy use cases
Security & Compliance
AWS Security Best Practices:
1. IAM Roles, Not Access Keys
✅ Use IAM roles for EC2/ECS/Lambda (no embedded credentials)
✅ Use IRSA (IAM Roles for Service Accounts) for EKS
❌ Never embed AWS access keys in code or containers
2. Secrets Management:
Application Secrets: AWS Secrets Manager (auto-rotation)
Configuration: AWS Systems Manager Parameter Store (free tier available)
Encryption Keys: AWS KMS (centralized key management)
3. Network Security:
VPC Design: Public subnets (load balancers), private subnets (applications), isolated subnets (databases)
Security Groups: Least privilege (specific ports, specific sources)
NACLs: Additional layer for sensitive workloads
VPC Flow Logs: Network traffic monitoring
4. Compliance Automation:
Config Rules: AWS Config (continuous compliance monitoring)
Scanning: Amazon Inspector (vulnerability scanning)
Threat Detection: Amazon GuardDuty
Findings Aggregation: AWS Security Hub (centralized security findings)
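The "never embed access keys" rule in item 1 is easy to enforce with a pipeline check. The sketch below scans source text for strings shaped like AWS access key IDs (20 characters, starting with `AKIA` for long-term keys or `ASIA` for temporary ones). It is a minimal illustration, not a substitute for a dedicated secret scanner, and the function and type names are assumptions made for this example.

```typescript
// Matches strings shaped like AWS access key IDs.
const ACCESS_KEY_ID_PATTERN = /\b(?:AKIA|ASIA)[0-9A-Z]{16}\b/g;

interface KeyFinding {
  line: number; // 1-based line number in the scanned text
  match: string;
}

function findAccessKeyIds(source: string): KeyFinding[] {
  const findings: KeyFinding[] = [];
  source.split("\n").forEach((text, index) => {
    // String.match with a global regex returns every match on the line, or null.
    const matches = text.match(ACCESS_KEY_ID_PATTERN) || [];
    matches.forEach((match) => findings.push({ line: index + 1, match }));
  });
  return findings;
}
```

A build step could fail whenever `findAccessKeyIds` returns a non-empty list for any changed file.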
AWS DevOps Pipeline Example
Complete AWS-native pipeline for containerized application:
# buildspec.yml (CodeBuild)
# AWS_ACCOUNT_ID must be defined as a project environment variable
version: 0.2

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - REPOSITORY_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/my-app
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - echo Build started on `date`
      - docker build -t $REPOSITORY_URI:latest .
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - docker push $REPOSITORY_URI:latest
      - docker push $REPOSITORY_URI:$IMAGE_TAG
      - echo Writing image definitions file...
      - printf '[{"name":"my-app","imageUri":"%s"}]' $REPOSITORY_URI:$IMAGE_TAG > imagedefinitions.json

artifacts:
  files: imagedefinitions.json
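The tagging logic in the buildspec is compact shell: `cut -c 1-7` keeps the first seven characters of the commit hash, and `${COMMIT_HASH:=latest}` falls back to `latest` when the hash is empty. The same rule is sketched below as plain functions, which can help when porting the pipeline to another tool. The function names are illustrative.

```typescript
// Mirrors `cut -c 1-7` plus the `${COMMIT_HASH:=latest}` fallback from the buildspec.
function imageTag(resolvedSourceVersion: string | undefined): string {
  const shortHash = (resolvedSourceVersion || "").slice(0, 7);
  return shortHash.length > 0 ? shortHash : "latest";
}

// Builds the ECR image URI in the same shape as REPOSITORY_URI:IMAGE_TAG.
function imageUri(accountId: string, region: string, repository: string, tag: string): string {
  return `${accountId}.dkr.ecr.${region}.amazonaws.com/${repository}:${tag}`;
}
```

Tagging by commit hash keeps every deployed image traceable to a specific revision, which the `latest` tag alone cannot do.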
Deployment to ECS (Blue/Green via CodeDeploy):
- CodePipeline triggers on git push
- CodeBuild runs tests and builds Docker image
- Image pushed to Amazon ECR
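- CodeDeploy creates new task definition with new image (the ECS blue/green action reads taskdef.json, appspec.yaml, and imageDetail.json; imagedefinitions.json is the format used by the standard ECS deploy action)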
- Blue/Green deployment:
- Deploy to new ECS tasks (green)
- Shift traffic gradually to green
- Monitor CloudWatch metrics
- Automatic rollback if errors spike
- Terminate old tasks (blue) after success
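The shift, monitor, and rollback loop described above can be sketched as a small state machine. This is an illustration of the decision logic only: CodeDeploy implements traffic shifting itself through its deployment configurations, and the policy fields and function names here are assumptions made for the example.

```typescript
type ShiftDecision = "continue" | "rollback" | "complete";

interface ShiftPolicy {
  stepPercent: number;  // share of traffic moved to green at each step
  maxErrorRate: number; // e.g. 0.02 means roll back above 2% failed requests
}

interface ShiftResult {
  decision: ShiftDecision;
  greenPercent: number;
}

// Walks through one observed error rate per step and stops at the first
// rollback or once green receives all traffic.
function runShift(errorRates: number[], policy: ShiftPolicy): ShiftResult {
  let greenPercent = 0;
  for (const rate of errorRates) {
    greenPercent = Math.min(100, greenPercent + policy.stepPercent);
    if (rate > policy.maxErrorRate) {
      // Errors spiked: send all traffic back to blue.
      return { decision: "rollback", greenPercent: 0 };
    }
    if (greenPercent >= 100) {
      // Green is healthy at full traffic: blue can be terminated.
      return { decision: "complete", greenPercent: 100 };
    }
  }
  return { decision: "continue", greenPercent };
}
```

A smaller `stepPercent` lowers the blast radius of a bad release at the cost of a longer rollout.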
Azure DevOps Best Practices
Azure Strengths for DevOps
- Microsoft Integration: Best for .NET and Windows workloads
- Enterprise Features: Strong governance and compliance
- Hybrid Capability: Seamless on-premises + cloud
- Azure DevOps: Comprehensive, mature DevOps platform
Azure Service Selection for DevOps
CI/CD Pipeline Stack
Azure-Native (Recommended):
Source Control: Azure Repos (or GitHub)
CI/CD: Azure Pipelines
Artifact Storage: Azure Artifacts
Deployment: Azure Pipelines (built-in deployment)
Advantages over AWS:
- Single integrated platform (not separate services like AWS)
- Unlimited CI/CD minutes for self-hosted agents
- Native YAML pipelines (infrastructure as code)
- Strong Windows/.NET support
Azure Pipelines YAML Example:
trigger:
  branches:
    include:
      - main
      - develop

pool:
  vmImage: 'ubuntu-latest'

variables:
  dockerRegistryServiceConnection: 'myACRConnection'
  imageRepository: 'myapp'
  containerRegistry: 'myregistry.azurecr.io'
  dockerfilePath: '$(Build.SourcesDirectory)/Dockerfile'
  tag: '$(Build.BuildId)'

stages:
  - stage: Build
    displayName: Build and Push
    jobs:
      - job: Build
        displayName: Build and Push Docker Image
        steps:
          - task: Docker@2
            displayName: Build and push image
            inputs:
              command: buildAndPush
              repository: $(imageRepository)
              dockerfile: $(dockerfilePath)
              containerRegistry: $(dockerRegistryServiceConnection)
              tags: |
                $(tag)
                latest

  - stage: Deploy
    displayName: Deploy to AKS
    dependsOn: Build
    jobs:
      - deployment: Deploy
        displayName: Deploy to AKS
        # Assumes the 'production' environment has the AKS cluster attached as a Kubernetes resource
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                # Assumes the Build stage published the manifests as a pipeline artifact named 'manifests'
                - task: KubernetesManifest@0
                  displayName: Deploy to Kubernetes
                  inputs:
                    action: deploy
                    manifests: |
                      $(Pipeline.Workspace)/manifests/deployment.yml
                      $(Pipeline.Workspace)/manifests/service.yml
                    containers: |
                      $(containerRegistry)/$(imageRepository):$(tag)
Infrastructure as Code
Tool Options:
- Azure Resource Manager (ARM) Templates (JSON-based, verbose)
- Bicep (Azure-native, simpler than ARM)
- Terraform (multi-cloud)
Recommendation: Bicep for Azure-Specific
Bicep is Azure's answer to AWS CDK—simpler than ARM, Azure-native:
// Web App with App Service Plan
param location string = resourceGroup().location
param appName string = 'myapp-${uniqueString(resourceGroup().id)}'

// App Service Plan (Linux)
resource appServicePlan 'Microsoft.Web/serverfarms@2022-03-01' = {
  name: '${appName}-plan'
  location: location
  sku: {
    name: 'P1v3'
    tier: 'PremiumV3'
    capacity: 3
  }
  kind: 'linux'
  properties: {
    reserved: true
  }
}

// Web App
resource webApp 'Microsoft.Web/sites@2022-03-01' = {
  name: appName
  location: location
  properties: {
    serverFarmId: appServicePlan.id
    siteConfig: {
      linuxFxVersion: 'DOCKER|myregistry.azurecr.io/myapp:latest'
      appSettings: [
        {
          name: 'WEBSITES_ENABLE_APP_SERVICE_STORAGE'
          value: 'false'
        }
        {
          name: 'DOCKER_REGISTRY_SERVER_URL'
          value: 'https://myregistry.azurecr.io'
        }
      ]
      alwaysOn: true
      http20Enabled: true
    }
    httpsOnly: true
  }
}

// Auto-scaling
resource autoScaleSettings 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: '${appName}-autoscale'
  location: location
  properties: {
    enabled: true // autoscale settings are disabled unless this is set
    profiles: [
      {
        name: 'Default'
        capacity: {
          minimum: '3'
          maximum: '10'
          default: '3'
        }
        // Scale-out rule only; add a matching 'Decrease' rule so capacity can shrink again
        rules: [
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
        ]
      }
    ]
    targetResourceUri: appServicePlan.id
  }
}

// defaultHostName is a bare hostname, so prefix the scheme to output a URL
output webAppUrl string = 'https://${webApp.properties.defaultHostName}'
Why Bicep:
- Much simpler than ARM JSON
- Native Azure support (transpiles to ARM)
- Strong typing and IntelliSense
- Easier debugging than Terraform for Azure
Container Orchestration
Option 1: Azure Container Apps (Newest, Recommended for Many Use Cases)
- When: Microservices, event-driven, don't want Kubernetes complexity
- Advantages: Serverless containers, auto-scaling to zero, built-in KEDA, simpler than AKS
- Cost: Pay per second of execution, scale to zero
Option 2: Azure Kubernetes Service (AKS)
- When: Complex orchestration, Kubernetes expertise, large-scale
- Advantages: Managed Kubernetes, free control plane, excellent Azure integration
- Cost: Only pay for nodes on the Free tier (control plane free, unlike AWS EKS); the Standard tier adds a per-cluster hourly fee in exchange for an uptime SLA
Option 3: Azure App Service
- When: Simple web apps, .NET workloads, don't need containers
- Advantages: Fully managed, deployment slots (blue-green), easy scaling
- Cost: Fixed pricing tiers, predictable
Best Practice:
Simple Web Apps: Azure App Service
Microservices: Azure Container Apps (2023+)
Complex Kubernetes Needs: AKS
Azure Container Apps is Azure's competitive advantage—simpler than AWS ECS, more capable than AWS App Runner.
Observability Stack
Azure-Native:
Logs: Azure Monitor Logs (Log Analytics)
Metrics: Azure Monitor Metrics
Traces: Application Insights (auto-instrumentation for .NET/Java/Node/Python)
Dashboards: Azure Dashboards / Azure Workbooks
Alerts: Azure Monitor Alerts + Action Groups
Application Insights is Azure's killer feature:
- Automatic instrumentation (no code changes for basic telemetry)
- Application map (auto-discovers dependencies)
- Live metrics (real-time performance view)
- Integrated with Azure DevOps (correlate deployments with incidents)
Example: Application Insights in .NET:
// Add the Microsoft.ApplicationInsights.AspNetCore NuGet package, call
// builder.Services.AddApplicationInsightsTelemetry() at startup,
// and set the connection string in appsettings.json:
{
  "ApplicationInsights": {
    "ConnectionString": "InstrumentationKey=..."
  }
}
// Automatic collection of:
// - HTTP requests and dependencies
// - Exceptions
// - Performance counters
// - Custom events (only if you add tracking calls)
Security & Compliance
Azure Security Best Practices:
1. Managed Identities (Azure's IAM Roles):
✅ Use System-Assigned or User-Assigned Managed Identities
✅ No secrets in code or configuration
✅ Automatic credential rotation
2. Azure Key Vault (Secrets + Certificates + Keys):
Application Secrets: Azure Key Vault Secrets
Certificates: Azure Key Vault Certificates (auto-renewal)
Encryption Keys: Azure Key Vault Keys (HSM-backed)
// Reference in App Service:
@Microsoft.KeyVault(SecretUri=https://myvault.vault.azure.net/secrets/dbpassword/)
3. Network Security:
VNets: Similar to AWS VPC
NSGs: Similar to AWS Security Groups
Azure Firewall: Centralized network security
Private Link: Private connectivity to Azure services (no internet exposure)
4. Compliance:
Azure Policy: Enforce organizational standards (like AWS Config Rules)
Microsoft Defender for Cloud (formerly Azure Security Center and Azure Defender): Unified security management and workload protection
Microsoft Sentinel (formerly Azure Sentinel): SIEM and SOAR
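The Key Vault reference syntax shown in item 2 is just a structured string, so it is straightforward to validate in a pipeline before deployment. The sketch below extracts the vault name, secret name, and optional version from an app setting value. It handles only the `SecretUri=` form on the public `vault.azure.net` domain, and the type and function names are assumptions made for this example.

```typescript
interface KeyVaultReference {
  vaultName: string;
  secretName: string;
  version: string | null; // null means "use the latest version"
}

// Parses values such as
// @Microsoft.KeyVault(SecretUri=https://myvault.vault.azure.net/secrets/dbpassword/)
function parseKeyVaultReference(setting: string): KeyVaultReference | null {
  const pattern =
    /^@Microsoft\.KeyVault\(SecretUri=https:\/\/([a-zA-Z0-9-]+)\.vault\.azure\.net\/secrets\/([a-zA-Z0-9-]+)\/?([a-zA-Z0-9]*)\)$/;
  const match = pattern.exec(setting.trim());
  if (match === null) {
    return null; // plain value or unsupported reference form
  }
  const [, vaultName = "", secretName = "", version = ""] = match;
  return { vaultName, secretName, version: version !== "" ? version : null };
}
```

A check like this can flag app settings that still carry literal secrets instead of references.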
Azure DevOps Advantage: Work Item Integration
A distinctive Azure DevOps feature: work items linked to commits, builds, and releases:
# In commit message:
git commit -m "Fixes #1234: authentication bug"
# Azure DevOps automatically:
# - Links commit to work item #1234
# - Updates work item state (when commit mention resolution is enabled and the message uses a keyword such as "Fixes #1234")
# - Shows in release notes
# - Enables full traceability (requirement → code → build → deployment)
This traceability is harder to achieve in AWS (requires third-party tools).
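If you want comparable traceability outside Azure DevOps, the core of it is parsing `#<id>` mentions out of commit messages and handing them to your tracker. The sketch below shows only the parsing step. How IDs are then linked depends on the tracker's API, which is not covered here, and the function name is illustrative.

```typescript
// Returns the distinct work item IDs mentioned as #<number>, in order of first appearance.
function extractWorkItemIds(commitMessage: string): number[] {
  const ids: number[] = [];
  const pattern = /#(\d+)/g;
  let match: RegExpExecArray | null = pattern.exec(commitMessage);
  while (match !== null) {
    const id = Number(match[1]);
    if (ids.indexOf(id) === -1) {
      ids.push(id);
    }
    match = pattern.exec(commitMessage);
  }
  return ids;
}
```

Running this in a post-merge job gives each build a list of work items to annotate.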
GCP DevOps Best Practices
GCP Strengths for DevOps
- Simplicity: Cleaner APIs, easier to learn
- Innovation: Kubernetes origins, ML/AI integration
- Networking: Best global network performance
- Cost: Often 20-30% cheaper than AWS/Azure
GCP Service Selection for DevOps
CI/CD Pipeline Stack
GCP-Native:
Source Control: GitHub/GitLab (Cloud Source Repositories has been closed to new customers since June 2024)
CI/CD: Cloud Build
Artifact Storage: Artifact Registry
Deployment: Cloud Build + GKE/Cloud Run
Cloud Build YAML Example:
# cloudbuild.yaml
# Note: gcr.io is the legacy Container Registry host; new projects should push to
# Artifact Registry (LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE)
steps:
  # Build Docker image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/myapp:$SHORT_SHA', '.']

  # Push the image to the registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/myapp:$SHORT_SHA']

  # Deploy to Cloud Run
  - name: 'gcr.io/cloud-builders/gcloud'
    args:
      - 'run'
      - 'deploy'
      - 'myapp'
      - '--image=gcr.io/$PROJECT_ID/myapp:$SHORT_SHA'
      - '--region=us-central1'
      - '--platform=managed'
      - '--allow-unauthenticated'

images:
  - 'gcr.io/$PROJECT_ID/myapp:$SHORT_SHA'
Cloud Build advantages:
- Simplest of the three platforms
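- Pay-per-minute with a free tier (originally 120 build-minutes/day; check current pricing, since the allowance has moved to a monthly quota)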
- Native Docker support
- Tight GKE/Cloud Run integration
Infrastructure as Code
Tool Options:
- Google Cloud Deployment Manager (YAML/Python, less popular)
- Terraform (recommended, best GCP support among multi-cloud tools)
Terraform is the de facto standard for GCP:
# Project ID used for the Workload Identity pool
variable "project_id" {
  type = string
}

# GKE Cluster with Node Pool
resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"

  # Regional cluster for HA
  node_locations = [
    "us-central1-a",
    "us-central1-b",
    "us-central1-c"
  ]

  # Remove default node pool (we'll create custom one)
  remove_default_node_pool = true
  initial_node_count       = 1

  # Workload Identity for secure pod authentication
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  # Enable binary authorization for security
  binary_authorization {
    evaluation_mode = "PROJECT_SINGLETON_POLICY_ENFORCE"
  }
}

# Custom node pool with autoscaling
resource "google_container_node_pool" "primary_nodes" {
  name     = "my-node-pool"
  location = "us-central1"
  cluster  = google_container_cluster.primary.name

  # Limits are per zone: with three zones this pool runs 9 to 30 nodes in total
  autoscaling {
    min_node_count = 3
    max_node_count = 10
  }

  node_config {
    machine_type = "n1-standard-4"

    # Use Spot VMs for cost savings (like AWS Spot/Azure Spot).
    # Spot supersedes the older preemptible option, which stays disabled.
    preemptible = false
    spot        = true

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    # Workload Identity
    workload_metadata_config {
      mode = "GKE_METADATA"
    }
  }
}
Why Terraform for GCP:
- Better than Deployment Manager (more mature, better docs)
- GCP's official recommendation
- Multi-cloud portability if needed
Container Orchestration
Option 1: Cloud Run (GCP's Unique Advantage)
- When: Stateless services, HTTP-triggered, want simplest deployment
- Advantages: True serverless containers, scale to zero, pay per request, no Kubernetes complexity
- Cost: Most cost-effective for variable traffic
Cloud Run Example:
# Deploy container to Cloud Run (that's it!)
gcloud run deploy myapp \
  --image gcr.io/myproject/myapp:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --min-instances 1 \
  --max-instances 100 \
  --cpu 2 \
  --memory 4Gi
# Automatic:
# - HTTPS endpoint
# - TLS certificate
# - Auto-scaling (--min-instances 1 keeps one warm instance; set it to 0 to allow scale to zero)
# - Load balancing
# - Logging and monitoring
Option 2: Google Kubernetes Engine (GKE)
- When: Complex orchestration, stateful workloads, Kubernetes standard
- Advantages: Best Kubernetes (Google created Kubernetes), Autopilot mode (fully managed nodes)
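- Cost: Competitive pricing; a $0.10/hour cluster management fee applies per cluster, with a monthly free-tier credit that covers one zonal or Autopilot cluster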
GKE Autopilot Mode (2021+):
- Google manages nodes entirely
- Pay only for pod resources (not node resources)
- Auto-scaling, auto-upgrades, auto-repairs
- Simpler than AWS EKS/Azure AKS
Best Practice:
Most Workloads: Cloud Run (simplest, cheapest for variable traffic)
Kubernetes Need: GKE Autopilot (managed nodes)
Complex Orchestration: GKE Standard (full control)
Cloud Run is GCP's killer app for DevOps—unmatched simplicity.
Observability Stack
GCP-Native (Operations Suite, formerly Stackdriver):
Logs: Cloud Logging
Metrics: Cloud Monitoring
Traces: Cloud Trace
Profiling: Cloud Profiler
Debugging: Cloud Debugger (retired; Google shut the service down in May 2023)
Error Reporting: Cloud Error Reporting (automatic error aggregation)
GCP's observability advantage:
- Unified Operations Suite (one place, not separate services)
- Cloud Debugger: Offered production snapshots without redeploying until its May 2023 shutdown; plan on logs, traces, and profiling instead
- Cloud Profiler: Continuous profiling (find performance bottlenecks in prod)
- Auto-instrumentation: For GKE, Cloud Run, App Engine
Example: Cloud Logging Query:
-- Structured logging query language (easier than AWS CloudWatch Insights)
resource.type="cloud_run_revision"
severity="ERROR"
timestamp>"2025-01-01T00:00:00Z"
jsonPayload.user_id="12345"
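Filters like the one above are plain text, so dashboards and runbooks often generate them from structured input. The sketch below builds the same four-line filter, relying on the Logging query language treating comparisons without an explicit operator as joined by AND. The interface and function names are illustrative, and only simple equality on `jsonPayload` fields is covered.

```typescript
interface LogFilter {
  resourceType: string;
  severity: "DEBUG" | "INFO" | "NOTICE" | "WARNING" | "ERROR" | "CRITICAL" | "ALERT" | "EMERGENCY";
  since: string; // RFC 3339 timestamp
  jsonPayload?: { [field: string]: string };
}

// Quotes a value and escapes backslashes and double quotes inside it.
function quoteFilterValue(value: string): string {
  return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

function buildLogFilter(filter: LogFilter): string {
  const clauses: string[] = [
    `resource.type=${quoteFilterValue(filter.resourceType)}`,
    `severity=${quoteFilterValue(filter.severity)}`,
    `timestamp>${quoteFilterValue(filter.since)}`,
  ];
  const payload = filter.jsonPayload || {};
  Object.keys(payload).forEach((field) => {
    const value = payload[field];
    if (value !== undefined) {
      clauses.push(`jsonPayload.${field}=${quoteFilterValue(value)}`);
    }
  });
  // One comparison per line; adjacent comparisons are implicitly ANDed.
  return clauses.join("\n");
}
```

Escaping values in one place avoids broken filters when a field contains quotes.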
Security & Compliance
GCP Security Best Practices:
1. Service Accounts (GCP's IAM for services):
✅ Each service has dedicated service account
✅ Workload Identity for GKE (similar to AWS IRSA)
✅ Short-lived tokens, automatic rotation
2. Secret Manager:
Secrets: Secret Manager (versioned, auto-rotation capable)
Encryption: Cloud KMS (integrated with all services)
# Access secret in Cloud Run:
gcloud run deploy myapp \
--set-secrets="DB_PASSWORD=db-password:latest"
3. VPC Security:
VPC: Google Virtual Private Cloud
Firewall Rules: Stateful, like AWS Security Groups, and targeted by network tag or service account
Private Google Access: Access GCP services without internet
VPC Service Controls: Perimeter security (prevent data exfiltration)
4. Compliance:
Policy Intelligence: Recommendations for over-permissioned IAM
Security Command Center: Centralized security view
Binary Authorization: Only verified container images can run
GCP DevOps Simplicity: The Differentiator
Example: Deploy containerized app to production:
AWS (15+ steps):
- Create VPC, subnets, NAT gateways
- Create ECS cluster
- Create task definition
- Create service
- Create Application Load Balancer
- Configure target groups
- Configure security groups
- Create ECR repository
- Build and push image
- Set up CodePipeline
- Configure CodeBuild
- Configure IAM roles (multiple)
- Set up CloudWatch logging
- Configure auto-scaling
- Set up Route 53 DNS
GCP (3 commands):
# Build and push image
gcloud builds submit --tag gcr.io/myproject/myapp
# Deploy to Cloud Run
gcloud run deploy myapp --image gcr.io/myproject/myapp --allow-unauthenticated
# (Optional) Map custom domain
gcloud run domain-mappings create --service myapp --domain myapp.com
This simplicity is GCP's competitive advantage. For many workloads, GCP is objectively easier to operationalize.
Platform Comparison: Quick Decision Matrix
| Criterion | AWS | Azure | GCP |
|---|---|---|---|
| Maturity | Highest (2006) | High (2010) | Medium (2011) |
| Service Breadth | Widest (240+) | Wide (200+) | Focused (100+) |
| Simplicity | Complex | Medium | Simplest |
| Cost | Baseline | 5-10% more | 10-20% less |
| CI/CD | CodePipeline (AWS-only) | Azure Pipelines (best integrated) | Cloud Build (simplest) |
| IaC | CloudFormation/CDK | Bicep (simplest for Azure) | Terraform (standard) |
| Containers (Managed) | ECS/Fargate | Container Apps | Cloud Run (best) |
| Kubernetes | EKS ($0.10/hr control plane) | AKS (free control plane) | GKE Autopilot (best) |
| Observability | CloudWatch (separate services) | Application Insights (auto-instrumentation) | Operations Suite (unified) |
| ML/AI Integration | SageMaker (comprehensive) | Azure ML (strong) | Vertex AI (best) |
| Windows/.NET | Supported | Best | Supported |
| Learning Curve | Steepest | Medium | Easiest |
| Multi-Cloud | Challenging | Hybrid-friendly | Challenging |
| Best For | Maximum flexibility, innovation | Enterprise, Microsoft shops | Simplicity, Kubernetes, ML |
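The container rows of the matrix, combined with the per-platform "Best Practice" lists earlier in this guide, reduce to a small decision function. The sketch below encodes this article's recommendations only. It is not an official selection guide, and the two workload flags are a deliberate simplification.

```typescript
type Cloud = "aws" | "azure" | "gcp";

interface WorkloadProfile {
  needsKubernetes: boolean; // Kubernetes API, portability, or complex orchestration
  simpleWebApp: boolean;    // single web app with no container requirement
}

// Maps a workload to the container runtime this guide recommends on each cloud.
function recommendRuntime(cloud: Cloud, workload: WorkloadProfile): string {
  if (cloud === "aws") {
    return workload.needsKubernetes ? "Amazon EKS" : "Amazon ECS";
  }
  if (cloud === "azure") {
    if (workload.needsKubernetes) {
      return "AKS";
    }
    return workload.simpleWebApp ? "Azure App Service" : "Azure Container Apps";
  }
  // Remaining case: GCP
  return workload.needsKubernetes ? "GKE Autopilot" : "Cloud Run";
}
```

Writing the rule down this way makes it easy to review and adjust as a team's constraints change.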
Multi-Cloud DevOps Strategy
If you must support multiple clouds (and think carefully if you really must):
Abstraction Layers
Infrastructure: Terraform
- Write Terraform modules abstracting cloud differences
- Use Terraform workspaces for different environments
- Accept some platform-specific code (full abstraction impossible)
CI/CD: GitLab CI or Jenkins
- Works across all platforms
- Use cloud-specific deployment scripts
- Consistent pipeline structure
Containers: Kubernetes
- Kubernetes abstracts cloud differences (mostly)
- Use AWS EKS, Azure AKS, GCP GKE
- Application deployment is portable
- Platform services (databases, storage) still differ
Observability: OpenTelemetry + Grafana
- OpenTelemetry for instrumentation (cloud-agnostic)
- Grafana for visualization
- Loki for logs, Prometheus for metrics, Tempo for traces
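For the CI/CD layer, "consistent pipeline structure with cloud-specific deployment scripts" can look like one function that accepts a per-cloud target and returns the CLI invocation to run. The sketch below uses a discriminated union so each cloud keeps its own required fields, which is one way to accept platform-specific code without scattering it. The type and field names are assumptions made for this example; the CLI commands and flags are the standard `aws`, `az`, and `gcloud` ones.

```typescript
type DeployTarget =
  | { cloud: "aws"; cluster: string; service: string; taskDefinition: string; region: string }
  | { cloud: "azure"; app: string; resourceGroup: string; image: string }
  | { cloud: "gcp"; service: string; image: string; region: string };

// Returns the command as an argument list, ready to hand to a process runner.
function deployCommand(target: DeployTarget): string[] {
  if (target.cloud === "aws") {
    // ECS takes a registered task definition rather than an image reference.
    return [
      "aws", "ecs", "update-service",
      "--cluster", target.cluster,
      "--service", target.service,
      "--task-definition", target.taskDefinition,
      "--region", target.region,
    ];
  }
  if (target.cloud === "azure") {
    return [
      "az", "containerapp", "update",
      "--name", target.app,
      "--resource-group", target.resourceGroup,
      "--image", target.image,
    ];
  }
  // Remaining case: Cloud Run on GCP
  return ["gcloud", "run", "deploy", target.service, "--image", target.image, "--region", target.region];
}
```

The pipeline stages stay identical across clouds, and only this one function knows the platform differences.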
Multi-Cloud Anti-Patterns to Avoid
❌ Lowest common denominator: Using only features available on all clouds (you underutilize each platform)
❌ Perfect portability: Trying to make everything 100% portable (expensive, limits innovation)
❌ Multi-cloud by default: Adopting several clouds without a compelling reason (assume one cloud unless a specific requirement says otherwise)
✅ Strategic multi-cloud: Different clouds for different purposes
- AWS for core applications (maturity, breadth)
- GCP for ML/AI workloads (Vertex AI, BigQuery)
- Azure for Microsoft workloads (.NET, Office 365 integration)
Your Cloud DevOps Action Plan
This Week:
Assess current cloud usage (2 hours)
- Which clouds are you using?
- Which services?
- What's working, what's painful?
Identify one quick win (1 hour)
- Platform-native CI/CD migration?
- IaC adoption?
- Observability improvement?
Next 30 Days:
Implement platform-native CI/CD (2-3 weeks)
- AWS: Migrate to CodePipeline or modernize Jenkins
- Azure: Implement Azure Pipelines
- GCP: Set up Cloud Build
Adopt Infrastructure as Code (2-4 weeks)
- Choose tool (CDK for AWS, Bicep for Azure, Terraform for GCP/multi-cloud)
- Start with one application stack
- Expand incrementally
Next 90 Days:
Optimize container orchestration (4-8 weeks)
- AWS: Evaluate ECS vs. EKS vs. Fargate
- Azure: Consider Container Apps for new workloads
- GCP: Migrate simple services to Cloud Run
Enhance observability (4-6 weeks)
- Implement platform-native observability
- Create operational dashboards
- Set up automated alerting
Measure improvements (ongoing)
- Deployment frequency
- Lead time for changes
- MTTR (mean time to recovery)
- Change failure rate
- Cloud costs
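The first four measures in this list are the DORA delivery metrics, and they can be computed from a simple log of deployments. The sketch below assumes a hypothetical record shape with millisecond timestamps. Real data would come from your CI/CD and incident tools, and the field names are illustrative.

```typescript
interface Deployment {
  committedAt: number; // epoch ms of the first commit in the change
  deployedAt: number;  // epoch ms when the change reached production
  failed: boolean;     // caused an incident or rollback
  restoredAt?: number; // epoch ms when service was restored (failed deployments only)
}

interface DeliveryMetrics {
  deploymentsPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRate: number;
  meanTimeToRestoreHours: number | null; // null when no failure has a restore time
}

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? upper;
  return sorted.length % 2 === 1 ? upper : (lower + upper) / 2;
}

function deliveryMetrics(deployments: Deployment[], periodDays: number): DeliveryMetrics {
  if (deployments.length === 0 || periodDays <= 0) {
    throw new Error("need at least one deployment and a positive period");
  }
  const HOUR_MS = 3600000;
  const leadTimes = deployments.map((d) => (d.deployedAt - d.committedAt) / HOUR_MS);
  const failures = deployments.filter((d) => d.failed);
  const restoreTimes: number[] = [];
  for (const d of failures) {
    if (d.restoredAt !== undefined) {
      restoreTimes.push((d.restoredAt - d.deployedAt) / HOUR_MS);
    }
  }
  const restoreTotal = restoreTimes.reduce((sum, hours) => sum + hours, 0);
  return {
    deploymentsPerDay: deployments.length / periodDays,
    medianLeadTimeHours: median(leadTimes),
    changeFailureRate: failures.length / deployments.length,
    meanTimeToRestoreHours: restoreTimes.length > 0 ? restoreTotal / restoreTimes.length : null,
  };
}
```

Capturing a baseline with these numbers before changing tooling makes later improvements measurable.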
The Bottom Line
Cloud DevOps requires platform-specific practices, not generic on-premises patterns adapted to cloud.
Each platform has unique strengths:
- AWS: Breadth, maturity, flexibility (but complexity)
- Azure: Enterprise features, Microsoft integration, hybrid capability
- GCP: Simplicity, Kubernetes excellence, ML/AI leadership
Don't fight the platform—leverage its strengths.
The organizations deploying roughly 7x more often, with markedly lower cloud bills, are using platform-native services and patterns, not trying to make every cloud work the same way.
Your cloud strategy should optimize for each platform's strengths, not pursue perfect portability that limits innovation.