Opella DevOps Challenge — Azure Infrastructure with Terraform

Production-grade Terraform infrastructure for Azure, featuring a reusable VNET module, multi-environment deployments (dev / prod), and a GitHub Actions CI/CD pipeline.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                        GitHub Actions CI/CD                         │
│  PR: lint → validate → plan   │  Merge: apply dev → approve → prod │
└─────────────────────────────┬───────────────────────────────────────┘
                              │
              ┌───────────────┴───────────────┐
              ▼                               ▼
    ┌─── dev (eastus) ───┐        ┌─── prod (westeurope) ──┐
    │  Resource Group     │        │  Resource Group          │
    │  ├── VNET           │        │  ├── VNET                │
    │  │   ├── compute    │        │  │   ├── compute (NSG)   │
    │  │   │   └── NSG    │        │  │   └── storage (NSG)   │
    │  │   └── storage    │        │  ├── Linux VM (no PIP)   │
    │  │       └── NSG    │        │  ├── Storage Account     │
    │  ├── Linux VM + PIP │        │  │   └── Blob Container  │
    │  ├── Storage Account│        │  └── Key Vault           │
    │  │   └── Blob       │        └──────────────────────────┘
    │  └── Key Vault      │
    └─────────────────────┘

Key Design Decisions

Resource Groups vs. Subscriptions for Environment Isolation

This project uses resource groups per environment rather than separate subscriptions. The rationale:

| Consideration | Resource Groups | Subscriptions |
| --- | --- | --- |
| Setup complexity | Low (single subscription) | High (cross-sub IAM, billing) |
| Cost tracking | Tags + RG-level cost analysis | Native per-sub billing |
| Blast radius | Shared subscription limits | Full isolation |
| When to choose | Small-to-medium teams, PoCs | Enterprise, strict compliance |

For this project's scale, resource groups provide sufficient isolation with lower operational overhead. An enterprise deployment could move to subscription-per-environment mainly by pointing each environment's provider configuration at its own subscription and updating the backend; cross-subscription IAM and billing would still need to be set up separately.
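
As a rough sketch of what a subscription-per-environment setup might look like, each environment's provider block could take its subscription from a variable. The variable name `subscription_id` is an assumption for illustration and is not taken from this repository.

```hcl
# Hypothetical sketch: pin each environment to its own subscription.
# `subscription_id` is an illustrative variable name, not one defined in this repo.
variable "subscription_id" {
  type        = string
  description = "ID of the Azure subscription that hosts this environment."
}

provider "azurerm" {
  features {}

  # Set a different value in each environment's terraform.tfvars.
  subscription_id = var.subscription_id
}
```

Backend blocks cannot reference variables, so a per-subscription state location would have to be supplied separately, for example through `terraform init -backend-config=<file>`.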

Naming Convention

All resources follow: {project}-{environment}-{region}-{resource_type}

Example: opella-dev-eastus-vnet, opella-prod-westeurope-vm
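
One way to keep names consistent is to build a shared prefix once in `locals`. This is a minimal sketch; the variable and local names are assumptions, and the repository's own code may structure this differently.

```hcl
# Illustrative only: these variable and local names are assumptions.
variable "project" {
  type    = string
  default = "opella"
}

variable "environment" {
  type    = string
  default = "dev"
}

variable "location" {
  type    = string
  default = "eastus"
}

locals {
  # {project}-{environment}-{region}
  name_prefix = "${var.project}-${var.environment}-${var.location}"

  # {project}-{environment}-{region}-{resource_type}
  vnet_name = "${local.name_prefix}-vnet"
  vm_name   = "${local.name_prefix}-vm"

  # Storage account names allow only lowercase letters and digits,
  # so a hyphen-free variant is needed for that resource type.
  storage_account_name = replace("${local.name_prefix}st", "-", "")
}
```

With the defaults above, `local.vnet_name` evaluates to `opella-dev-eastus-vnet`, matching the example in this section.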

Tagging Strategy

Every resource receives these mandatory tags (enforced via local.common_tags):

| Tag | Purpose |
| --- | --- |
| `environment` | Distinguish dev/staging/prod |
| `project` | Cost allocation and filtering |
| `region` | Multi-region clarity |
| `managed_by` | Identify IaC-managed resources |

Additional tags can be injected per environment via extra_tags. Because local.common_tags only covers resources created by this code, enforcing tags at the Azure level would require an Azure Policy with a deny effect for resources missing the required tags.
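
A minimal sketch of how `local.common_tags` and `extra_tags` could fit together. Only those two names come from this README; the other variable names and the `managed_by` value are assumptions.

```hcl
# Illustrative inputs; names other than extra_tags are assumptions.
variable "project" {
  type = string
}

variable "environment" {
  type = string
}

variable "location" {
  type = string
}

variable "extra_tags" {
  type        = map(string)
  default     = {}
  description = "Optional per-environment tags."
}

locals {
  # merge() lets later maps win on key conflicts, so the mandatory tags
  # are listed last and cannot be overridden by extra_tags.
  common_tags = merge(
    var.extra_tags,
    {
      environment = var.environment
      project     = var.project
      region      = var.location
      managed_by  = "terraform" # assumed value
    },
  )
}
```

Each resource would then set `tags = local.common_tags`.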

Security Highlights

NSGs per subnet with explicit deny-all catch rules
SSH only for VMs — password auth disabled, keys stored in Key Vault
Storage accounts locked to VNET via service endpoints + default deny
Key Vault with RBAC authorization, network ACLs, and (in prod) purge protection
Prod VM has no public IP — accessible only within the VNET
TLS 1.2 minimum on storage accounts
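
The deny-all catch rule from the first item can be written in the same rule shape the VNET module accepts (see VNET Module Usage). The local name and rule name below are illustrative assumptions, not the repository's actual definitions.

```hcl
locals {
  # Hypothetical catch-all rule. 4096 is the lowest-precedence priority
  # Azure allows, so every Allow rule with a smaller number is evaluated first.
  deny_all_inbound = {
    name                       = "deny-all-inbound"
    priority                   = 4096
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}
```

Appending this object to a subnet's `nsg_rules` list makes the default behaviour explicit in the plan output instead of relying on Azure's built-in default rules.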

Repository Structure

.
├── modules/
│   └── vnet/                        # Reusable VNET module
│       ├── main.tf                  # VNET, subnets, NSGs, DDoS
│       ├── variables.tf             # Input variables with validation
│       ├── outputs.tf               # VNET/subnet/NSG IDs
│       ├── versions.tf              # Provider constraints
│       ├── README.md                # Auto-generated docs (terraform-docs)
│       └── tests/
│           ├── vnet_test.go         # Terratest integration tests
│           └── fixtures/            # Test configurations
├── environments/
│   ├── dev/                         # Dev environment (eastus)
│   │   ├── main.tf                  # Resources: VNET, VM, Storage, KV
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── terraform.tfvars         # Dev-specific values
│   │   └── versions.tf
│   └── prod/                        # Prod environment (westeurope)
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       ├── terraform.tfvars
│       └── versions.tf
├── tests/
│   ├── static/
│   │   └── validate.sh              # 39 offline validation checks
│   ├── policy/
│   │   ├── terraform.rego           # OPA security policies
│   │   └── conftest.sh              # Policy test runner
│   └── integration/
│       ├── plan_test.go             # Plan-level Terratest tests
│       └── go.mod
├── scripts/
│   ├── infra-up.sh                  # Deploy or resume environments
│   ├── infra-down.sh                # Deallocate VMs (save costs)
│   └── infra-status.sh              # Show environment status
├── testing-results/
│   ├── terraform-plan-dev.txt       # Dev plan output (17 resources)
│   └── terraform-plan-prod.txt      # Prod plan output (16 resources)
├── docs/
│   └── screenshots/                 # Azure Portal proof screenshots
├── .github/workflows/
│   └── terraform.yml                # CI/CD pipeline
├── .pre-commit-config.yaml          # Pre-commit hooks config
├── .tflint.hcl                      # TFLint rules
├── .terraform-docs.yml              # Auto-doc generation config
├── Makefile                         # Developer workflow shortcuts
└── README.md

Getting Started

Prerequisites

Terraform >= 1.5.0
Azure CLI
An Azure subscription (Free tier works)
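
The Terraform version requirement would typically be pinned in each `versions.tf`. The sketch below shows the general shape; the `azurerm` version constraint is an assumption, since this README does not state which provider version the repository uses.

```hcl
terraform {
  # Matches the prerequisite listed above.
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0" # assumed constraint; check versions.tf for the real one
    }
  }
}
```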

Authentication

az login
az account set --subscription "<subscription-id>"

Deploy Dev Environment

cd environments/dev
terraform init
terraform plan -out=dev.tfplan
terraform apply dev.tfplan

Deploy Prod Environment

cd environments/prod
terraform init
terraform plan -out=prod.tfplan
terraform apply prod.tfplan

Quick Scripts

./scripts/infra-up.sh dev       # Deploy or resume dev (starts deallocated VMs)
./scripts/infra-up.sh prod      # Deploy or resume prod
./scripts/infra-down.sh dev     # Deallocate VMs to save costs (no destroy)
./scripts/infra-down.sh all     # Stop both environments
./scripts/infra-status.sh       # Show status of all environments

Using the Makefile

make help          # Show all available commands
make fmt           # Format all Terraform files
make init-dev      # Initialize dev environment
make plan-dev      # Plan dev environment
make apply-dev     # Apply dev environment
make test          # Run Terratest module tests
make docs          # Regenerate module documentation
make clean         # Remove .terraform dirs and plan files

VNET Module Usage

The module is designed to be reusable in any context:

module "vnet" {
  source = "../../modules/vnet"

  vnet_name           = "my-app-vnet"
  resource_group_name = azurerm_resource_group.example.name
  location            = "eastus"
  address_space       = ["10.0.0.0/16"]

  subnets = {
    web = {
      address_prefixes  = ["10.0.1.0/24"]
      service_endpoints = ["Microsoft.Storage"]
      nsg_rules = [
        {
          name                       = "allow-https"
          priority                   = 100
          direction                  = "Inbound"
          access                     = "Allow"
          protocol                   = "Tcp"
          source_port_range          = "*"
          destination_port_range     = "443"
          source_address_prefix      = "*"
          destination_address_prefix = "*"
        },
      ]
    }
    db = {
      address_prefixes = ["10.0.2.0/24"]
      delegation = {
        name = "mysql-delegation"
        service_delegation = {
          name    = "Microsoft.DBforMySQL/flexibleServers"
          actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
        }
      }
    }
  }

  enable_ddos_protection = false

  tags = {
    environment = "dev"
    project     = "my-app"
  }
}

See modules/vnet/README.md for full input/output documentation (auto-generated with terraform-docs).
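
The call above implies an input shape for `subnets` roughly like the following. This is a sketch inferred from the example, not a copy of `modules/vnet/variables.tf`; the defaults and the validation rule are assumptions.

```hcl
# Hypothetical sketch of the module's `subnets` input, inferred from the usage example.
variable "subnets" {
  description = "Map of subnet name to subnet settings."
  type = map(object({
    address_prefixes  = list(string)
    service_endpoints = optional(list(string), [])
    nsg_rules = optional(list(object({
      name                       = string
      priority                   = number
      direction                  = string
      access                     = string
      protocol                   = string
      source_port_range          = string
      destination_port_range     = string
      source_address_prefix      = string
      destination_address_prefix = string
    })), [])
    delegation = optional(object({
      name = string
      service_delegation = object({
        name    = string
        actions = list(string)
      })
    }))
  }))

  validation {
    # can() turns a cidrhost() error on a malformed prefix into false.
    condition = alltrue([
      for subnet in values(var.subnets) :
      alltrue([for prefix in subnet.address_prefixes : can(cidrhost(prefix, 0))])
    ])
    error_message = "Every subnet address prefix must be a valid CIDR block."
  }
}
```

Optional attributes with defaults let callers omit `service_endpoints`, `nsg_rules`, and `delegation`, as the `db` subnet does in the example.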

CI/CD Pipeline & Release Lifecycle

The GitHub Actions workflow (.github/workflows/terraform.yml) implements a promote-through-environments strategy across six jobs (lint, security scan, plan for dev, plan for prod, apply dev, apply prod), plus a cost estimation job on PRs:

Pipeline Architecture

  ┌──────────────────────────────────────────────────────────────────────────────┐
  │                        ON PULL REQUEST                                       │
  │                                                                              │
  │  ┌───────────┐   ┌─────────────┐   ┌─────────┐  ┌─────────┐  ┌──────────┐ │
  │  │  Stage 1   │──▶│   Stage 2   │──▶│ Stage 3 │  │ Stage 3 │  │Stage 3b  │ │
  │  │  Lint &    │   │  Security   │   │ Plan    │  │ Plan    │  │ Infracost│ │
  │  │  Format    │   │  (Checkov)  │   │  DEV    │  │  PROD   │  │ Cost Est │ │
  │  └───────────┘   └─────────────┘   └────┬────┘  └────┬────┘  └─────┬────┘ │
  │                                          │            │             │       │
  │                              ┌───────────▼────────────▼─────────────▼──┐    │
  │                              │ Plan + Cost posted as PR comments       │    │
  │                              └─────────────────────────────────────────┘    │
  └──────────────────────────────────────────────────────────────────────────────┘

  ┌────────────────────────────────────────────────────────────────────────┐
  │                       ON MERGE TO MAIN                                 │
  │                                                                        │
  │  Stages 1-3 run as above, then:                                        │
  │                                                                        │
  │  ┌───────────┐         ┌────────────────┐         ┌───────────┐       │
  │  │  Stage 4   │────────▶│  Manual Gate   │────────▶│  Stage 5   │      │
  │  │  Apply     │         │  (GitHub Env   │         │  Apply     │      │
  │  │  DEV       │         │  "production") │         │  PROD      │      │
  │  │ (automatic)│         └────────────────┘         │ (approved) │      │
  │  └───────────┘                                     └───────────┘       │
  └────────────────────────────────────────────────────────────────────────┘

Stage Details

| Stage | Job Name | Trigger | What It Does |
| --- | --- | --- | --- |
| 1. Lint & Format | `lint` | PR + push | Runs `terraform fmt -check` and TFLint on the module and both environments |
| 2. Security Scan | `security` | PR + push | Runs Checkov static analysis on all Terraform code; uploads SARIF results to the GitHub Security tab |
| 3. Plan | `plan` (matrix) | PR + push | Runs `terraform plan` for dev and prod in parallel; posts plan output as a PR comment |
| 3b. Cost Estimate | `cost` | PR only | Runs Infracost to show the cost impact of changes as a PR comment |
| 4. Apply Dev | `apply-dev` | merge only | Auto-applies to the dev environment (no manual gate) |
| 5. Apply Prod | `apply-prod` | merge only | Applies to prod after manual approval via GitHub Environment protection rules |

Additional Workflows

| Workflow | Trigger | Purpose |
| --- | --- | --- |
| Drift Detection | Weekdays 6 AM UTC (cron) | Runs `terraform plan` to detect infrastructure drift; opens GitHub Issue on drift |
| Dependabot | Weekly | Automatically proposes PRs to update GitHub Actions versions |

Checkov Security Scanning

Checkov is an open-source static analysis tool that scans Terraform files for security misconfigurations. Our pipeline checks for:

Storage accounts without encryption or network restrictions
VMs with password authentication enabled
Key Vaults without purge protection or RBAC
Missing TLS version enforcement
Overly permissive NSG rules
Missing tags on resources

Checkov runs in soft-fail mode so findings are reported in the GitHub Security tab but don't block deployment. This allows teams to review and remediate findings progressively.

Release Lifecycle (Step by Step)

1. A developer creates a feature branch and opens a PR.
2. The lint job validates formatting and runs TFLint rules.
3. Checkov scans for security issues (results appear in the GitHub Security tab).
4. Plan jobs run in parallel for dev and prod; the output is posted as a PR comment so reviewers can see exactly what will change.
5. Infracost posts a cost estimation comment showing the monthly cost impact.
6. The team reviews the PR: code changes, plan output, cost impact, and security findings.
7. The PR is merged to main.
8. Dev auto-applies, giving immediate feedback on whether the changes work.
9. Prod waits for manual approval via GitHub Environment protection.
10. A reviewer approves in the GitHub Actions UI.
11. Prod applies, and the changes are live in production.

Path Filtering

The workflow only triggers when relevant files change:

paths:
  - "modules/**"
  - "environments/**"
  - ".github/workflows/terraform.yml"

Changes to README, docs, or scripts won't trigger unnecessary pipeline runs.

Setting Up the Pipeline

1. Create an Azure Service Principal:

az ad sp create-for-rbac --name "github-terraform" \
  --role Contributor \
  --scopes /subscriptions/<SUBSCRIPTION_ID>

2. Add GitHub repository secrets:

| Secret | Value |
| --- | --- |
| `ARM_CLIENT_ID` | Service Principal App ID |
| `ARM_CLIENT_SECRET` | Service Principal Password |
| `ARM_SUBSCRIPTION_ID` | Azure Subscription ID |
| `ARM_TENANT_ID` | Azure AD Tenant ID |
| `INFRACOST_API_KEY` | (Optional) Infracost API key for cost estimates on PRs |

3. Create GitHub Environments:

| Environment | Protection Rules |
| --- | --- |
| `dev` | None (auto-deploy) |
| `production` | Required reviewers + optional wait timer |

GitHub Actions Screenshots

Pipeline Overview — All 6 Stages Visible

All stages green: Lint, Checkov, Plan-dev, Plan-prod, Apply-dev, Apply-prod pass end-to-end. Apply-prod uses manual approval via GitHub Environment protection rules.

Checkov Security Scan — Job Steps

Checkov runs against the VNET module, dev environment, and prod environment separately. Results are uploaded as SARIF to the GitHub Security tab.

Checkov Findings — Security Annotations

The initial Checkov scan surfaced 23 findings, all of which have been addressed: secrets now have content_type and expiration_date, storage accounts have soft-delete, a SAS policy, and queue logging, and infrastructure-level checks (private endpoints, VM extensions, VNET NSGs) are suppressed with justifications in .checkov.yml.

PR Plan Comments — Dev & Prod Plans

On every PR, the pipeline automatically posts Terraform plan output as comments for both dev and prod environments. Reviewers see exactly what will change before approving.

Lint & Format Job — All Checks Passing

The lint job validates formatting with terraform fmt and runs TFLint on the VNET module and both environment configurations.

Note: In a hardened production environment you would tighten the storage and Key Vault firewalls to Deny and run the pipeline from self-hosted runners inside the VNET or through Private Endpoints.

Code Quality Tools & Processes

| Tool | Purpose | How |
| --- | --- | --- |
| `terraform fmt` | Consistent formatting | Pre-commit hook + CI check |
| `terraform validate` | Syntax & config validation | CI on every PR |
| TFLint | Linting & best practices | Pre-commit hook + CI |
| Checkov | Security static analysis | CI pipeline (SARIF -> GitHub Security tab) |
| terraform-docs | Auto-generate module docs | Pre-commit hook + `make docs` |
| pre-commit | Git hook automation | `.pre-commit-config.yaml` |
| Terratest | Integration testing | `make test` |
| OPA/Conftest | Policy-as-code (Rego) | `make test-policy` |
| Infracost | Cost estimation on PRs | CI pipeline (PR comment) |
| Drift Detection | Scheduled plan to detect config drift | Cron workflow (weekdays 6 AM UTC) |
| Dependabot | Automated dependency updates | `.github/dependabot.yml` |

Install Pre-commit Hooks

pip install pre-commit
pre-commit install

Testing

The project includes a comprehensive test suite at multiple levels:

Static Validation (no cloud credentials needed)

make test-static

Runs 39 checks, including formatting, module structure, variable/output documentation, secret detection, provider constraints, naming conventions, tag enforcement, and security configuration.

OPA Policy Tests

make test-policy

Validates Terraform plans against security policies written in Rego — checks for required tags, TLS 1.2, private blob access, password-disabled VMs, and RBAC-enabled Key Vaults.

Integration Tests (Terratest)

make test-integration   # Plan-level tests (no deploy)
make test-module        # Full deploy/destroy tests

Plan-level tests validate resource counts, naming conventions, security settings, tag presence, and environment-specific rules (e.g., prod has no public IP, restricted SSH).

Azure Portal Screenshots

Proof of successful deployment of the dev environment in Azure:

Resource Group Overview

VNET with Subnets and NSGs

Virtual Machine (Running)

Tags (environment, project, region, managed_by)

Test Results Summary

All test reports are stored in testing-results/. Every test suite passes with zero failures.

| Report | Test Type | Result |
| --- | --- | --- |
| `static-validation.txt` | 39 static checks (format, structure, docs, secrets, constraints, naming, tags, security) | 39/39 passed |
| `terraform-fmt.txt` | Terraform formatting | All formatted |
| `terraform-validate.txt` | Terraform validate (dev + prod) | Both valid |
| `tflint.txt` | TFLint (module + dev + prod) | 0 warnings |
| `checkov-dev.txt` | Checkov security scan (dev) | 24 passed, 0 failed |
| `checkov-prod.txt` | Checkov security scan (prod) | 24 passed, 0 failed |
| `checkov-module.txt` | Checkov security scan (VNET module) | 6 passed, 0 failed |
| `conftest-dev.txt` | OPA/Rego policy tests (dev) | 0 violations |
| `conftest-prod.txt` | OPA/Rego policy tests (prod) | 0 violations |
| `integration-tests.txt` | Terratest plan-level (resource count, naming, security, tags, prod no public IP, restricted SSH) | 6/6 passed |
| `terratest-vnet-module.txt` | Terratest deploy/destroy (basic + NSG fixtures) | 2/2 passed |
| `terraform-plan-dev.txt` | Terraform plan, dev (eastus2) | No changes (in sync) |
| `terraform-plan-prod.txt` | Terraform plan, prod (westeurope) | 3 to add |

To regenerate:

make test-static                    # Static validation (39 checks)
make test-integration               # Plan-level integration tests
make test-module                    # Terratest deploy/destroy (VNET)
make test-policy                    # OPA/Conftest policy tests
checkov -d environments/dev --config-file .checkov.yml   # Checkov scan

Future Improvements

Remote state (already in place): an Azure Storage backend (opellatfstate0930) holds shared state for local and CI runs
Azure Policy: Enforce tagging and allowed resource types at the subscription level
VNET Peering: Add peering between dev and prod if cross-env communication is needed
Bastion Host: Replace public IPs with Azure Bastion for secure VM access
Monitoring: Add Azure Monitor + Log Analytics workspace
tfsec: Add tfsec as a complementary security scanner alongside Checkov
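
For the remote state item, a backend block of roughly this shape would point Terraform at that storage account. Only the storage account name appears in this README; the resource group, container, and key values are placeholders chosen for illustration.

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "example-tfstate-rg"    # placeholder
    storage_account_name = "opellatfstate0930"     # from this README
    container_name       = "tfstate"               # placeholder
    key                  = "dev.terraform.tfstate" # placeholder; one key per environment
  }
}
```

Using a separate `key` per environment keeps the dev and prod state files independent while sharing one storage account.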
