diff --git a/README.md b/README.md
index 965e525..d49f608 100644
--- a/README.md
+++ b/README.md
@@ -118,7 +118,7 @@ For the full source structure, build process, and extension guide, see [DEVELOPM
 | Component | Monthly Cost |
 |-----------|-------------|
 | Lambda — Auto-Tagger (100–1,000 invocations/day) | $0.10 – $2.00 |
-| Lambda — Reconciliation (1/day) + Preflight (1 at deploy) | < $0.01 |
+| Lambda — Preflight (1 at deploy) | < $0.01 |
 | EventBridge + SQS + SSM | $0.01 – $0.20 |
 | **Total per account** | **< $2/month** |
 
@@ -129,7 +129,7 @@ For the full source structure, build process, and extension guide, see [DEVELOPM
 | Document | Description |
 |----------|-------------|
 | [OVERVIEW.md](docs/OVERVIEW.md) | How it works — architecture, deployment, auto-deployment, SSM scope, cost |
-| [INSTRUCTIONS.md](docs/INSTRUCTIONS.md) | Deployment steps, day-2 operations (update.sh), monitoring, upgrade path, FAQ |
+| [INSTRUCTIONS.md](docs/INSTRUCTIONS.md) | Deployment steps, day-2 operations, monitoring, upgrade path, FAQ |
 | [COVERAGE.md](docs/COVERAGE.md) | Supported services (154 resource types) and E2E test coverage matrix |
 | [LIMITATIONS.md](docs/LIMITATIONS.md) | Hard constraints — management account, SCPs, latency, upgrade gotcha |
 | [MAP_TAGGING_GAP_ANALYSIS.md](docs/MAP_TAGGING_GAP_ANALYSIS.md) | What can't be tagged and why (AWS API limitations, customer-side config) |
diff --git a/VERSIONING.md b/VERSIONING.md
index 04ea891..376eaa2 100644
--- a/VERSIONING.md
+++ b/VERSIONING.md
@@ -28,7 +28,7 @@ The build script generates both `configurator.html` and `configurator.yaml` from
 ## Where customers see it
 
 - **CFN Output `TemplateVersion`** — surfaces in the CloudFormation console after deploy.
-- **SSM Parameter `/auto-map-tagger/${MpeId}/version`** — readable via `aws ssm get-parameter`. Used by `upgrade.sh` for version-guard checks.
+- **SSM Parameter `/auto-map-tagger/${MpeId}/version`** — readable via `aws ssm get-parameter`. 
 - **CloudWatch Logs** — every Lambda cold start prints `auto-map-tagger vN.N.N cold start`.
 
 ## Release tagging
@@ -47,4 +47,4 @@ Customers who "Watch → Releases only" on the GitHub repository get an email wh
 
 - Lambda runtime does **not** branch on version. The version string is metadata for humans and external tooling only.
 - Per-resource versioning (Lambda versions, Lambda Layers, etc.) is never used.
-- Cross-version compatibility checks in `upgrade.sh` are out of scope for this policy document — see `upgrade.sh` documentation when that PR lands.
+- Cross-version compatibility checks are out of scope for this policy document.
diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md
index 125aa92..4bce232 100644
--- a/docs/DEVELOPMENT.md
+++ b/docs/DEVELOPMENT.md
@@ -29,9 +29,9 @@ src/
 │ │ ├── script-deploy.js generateDeployScript() — deploy.sh generator
 │ │ └── instructions.js generateInstructions()
 │ ├── editor/
-│ │ └── editor-flow.js Editor mode — add/remove accounts, generates update.sh
+│ │ └── editor-flow.js Editor mode (disabled — kept for reference)
 │ ├── upgrade/
-│ │ └── upgrade-flow.js Upgrade mode — template version upgrades, generates upgrade.sh
+│ │ └── upgrade-flow.js Upgrade mode (disabled — kept for reference)
 │ └── delete/
 │ └── delete-flow.js Delete mode — removal flow, generates delete.sh
 └── templates/
diff --git a/docs/INSTRUCTIONS.md b/docs/INSTRUCTIONS.md
index 5e7be0b..60d86b5 100644
--- a/docs/INSTRUCTIONS.md
+++ b/docs/INSTRUCTIONS.md
@@ -111,44 +111,37 @@ aws cloudformation list-stack-instances \
 
 ---
 
-## Day-2: Add or Remove Accounts (update.sh)
+## Day-2: Add or Remove Accounts
 
-After the initial deployment, use the **Editor** tab in `configurator.html` to add or remove accounts from scope without redeploying.
-
-### Generate update.sh
+To modify which accounts are tagged, update the `ScopedAccountIds` CloudFormation parameter directly via CloudShell. No configurator UI or script download needed.
 
-1. Open `configurator.html` and switch to the **Editor** tab
-2. Enter the MPE ID and region of the existing deployment
-3. Choose **Add accounts** or **Remove accounts**
-4. Enter the account IDs to add or remove
-5. Click **Generate update.sh** → downloads `update.sh`
+### View current scope (optional)
 
-### Run update.sh
+```bash
+aws cloudformation describe-stack-set --stack-set-name map-auto-tagger-mig --region --query "StackSet.Parameters[?ParameterKey=='ScopedAccountIds'].ParameterValue" --output text
+```
 
-**Option 1 — AWS CloudShell:**
+### Update account scope
 
-1. Log into the AWS Console for your **management account**
-2. Open **CloudShell**
-3. Upload `update.sh`
-4. Run:
+Run as a **single line** in CloudShell from the management account. List **all** accounts that should be in scope (this is a full replacement — any account not listed will be removed from scope):
 
 ```bash
-bash update.sh
+aws cloudformation update-stack-set --stack-set-name map-auto-tagger-mig --use-previous-template --parameters 'ParameterKey=ScopedAccountIds,ParameterValue="[\"111111111111\",\"222222222222\",\"333333333333\"]"' 'ParameterKey=MpeId,UsePreviousValue=true' 'ParameterKey=AgreementStartDate,UsePreviousValue=true' 'ParameterKey=AgreementEndDate,UsePreviousValue=true' 'ParameterKey=ScopeMode,UsePreviousValue=true' 'ParameterKey=ScopedVpcIds,UsePreviousValue=true' 'ParameterKey=TagNonVpcServices,UsePreviousValue=true' 'ParameterKey=AlertEmail,UsePreviousValue=true' --capabilities CAPABILITY_NAMED_IAM --region 
 ```
 
-**Option 2 — Local AWS CLI:**
+**Format:** Each account ID must be wrapped in `\"...\"` and separated by commas. Replace the example IDs with your actual account IDs. To tag all accounts in the org, use `"[\"ALL\"]"`.
 
-```bash
-bash update.sh
-```
+**To remove an account:** run the same command but omit that account ID from the list. The Lambda remains deployed but stops tagging in that account.
+
+### Single-account deployments
 
-The script:
-- Verifies the existing StackSet deployment
-- Updates the account scope in the SSM parameter and S3 template
-- Pushes the update to all accounts in the org
-- Optionally re-runs backfill for newly added accounts
+For single-account stacks, update the SSM parameter directly:
 
-> Only resources created **after** the update will be tagged in newly added accounts (unless backfill is enabled). Existing tags on removed accounts are not affected.
+```bash
+aws ssm get-parameter --name "/auto-map-tagger/mig/config" --region --query Parameter.Value --output text
+# Edit the scoped_account_ids array, then:
+aws ssm put-parameter --name "/auto-map-tagger/mig/config" --type String --overwrite --value '' --region 
+```
 
 ---
 
@@ -341,7 +334,7 @@ Enable backfill to catch resources created during the brief gap (~2-5 minutes).
 
 ### Migrating from the pre-v19 un-namespaced layout
 
-Prior versions used fixed resource names (`map-auto-tagger`, `/auto-map-tagger/config`). The current version uses MPE-ID-namespaced names (`map-auto-tagger-mig111`, `/auto-map-tagger/mig111/config`). `upgrade.sh` refuses to touch the un-namespaced layout — it must be migrated manually.
+Prior versions used fixed resource names (`map-auto-tagger`, `/auto-map-tagger/config`). The current version uses MPE-ID-namespaced names (`map-auto-tagger-mig111`, `/auto-map-tagger/mig111/config`). The un-namespaced layout must be migrated manually.
 
 > **⚠️ Dual-Lambda concurrent-tagging window.** During the migration, the old Lambda will still be processing events from its SQS queue while the new Lambda is being created. If a new resource is created during this window, both Lambdas receive the event and race to tag it. They'll write the same `map-migrated` tag value (SSM config is shared per MPE ID), so the race is a no-op in the single-MPE case — but if the new deployment uses a **different** MPE ID, the last writer wins and you get non-deterministic tag values. Mitigation: pause resource creation during the migration window (typically 2-5 minutes).
 
diff --git a/docs/LIMITATIONS.md b/docs/LIMITATIONS.md
index b9b89ec..a00a794 100644
--- a/docs/LIMITATIONS.md
+++ b/docs/LIMITATIONS.md
@@ -115,18 +115,6 @@ The configurator requires the end date to be set explicitly. Use the configurato
 
 ---
 
-## Reconciliation VPC Scope Limitation
-
-The daily reconciliation Lambda enumerates resources via the Resource Groups Tagging API (RGTA), which does not return VPC association context. When reconciliation re-enqueues a missing tag, the auto-tagger Lambda cannot determine VPC membership from the synthetic event.
-
-**Behavior with `tag_non_vpc_services=true` (default):** Non-VPC services (S3, DynamoDB, Lambda, SQS, SNS, etc.) are re-tagged by reconciliation as expected. VPC-bound services (EC2, RDS, ElastiCache, ELB, etc.) are **not** re-tagged because the Lambda detects them as VPC-bound but cannot resolve their VPC — it fails closed to prevent scope leaks.
-
-**Behavior with `tag_non_vpc_services=false`:** No resources without resolved VPC context are re-tagged by reconciliation. Only resources whose VPC can be determined from the live event (not the synthetic reconciliation event) are tagged.
-
-Customers using VPC scope should monitor DLQ depth as a proxy for missed tags. Resources that exhaust SQS retries land in the DLQ and trigger the SNS alert — these are the ones most likely to need manual remediation.
-
----
-
 ## SSM Parameter Store Advanced tier (very large scopes)
 
 When `ScopedAccountIds` contains more than approximately 235 explicitly-named AWS account IDs, the serialized config payload exceeds the 4 KB Standard-tier SSM parameter limit. This template declares the config parameter with `Tier: Intelligent-Tiering`, so SSM automatically promotes the parameter to Advanced tier when the payload crosses the threshold.
diff --git a/docs/OVERVIEW.md b/docs/OVERVIEW.md
index 2a076ff..8a900ae 100644
--- a/docs/OVERVIEW.md
+++ b/docs/OVERVIEW.md
@@ -131,7 +131,6 @@ The script handles everything automatically:
 |-----------|---------|
 | **Auto-Tagger Lambda** | Extracts resource ARN and applies `map-migrated` tag |
 | **Preflight Lambda** | Runs once at deploy time to detect peer-tagger scope conflicts |
-| **Reconciliation Lambda** | Runs daily via EventBridge schedule; scans RGTA for missing/wrong tags and re-enqueues corrections |
 | **EventBridge rule** | Catches resource creation events from CloudTrail |
 | **SQS queue** | Buffers events with 14-day retention and 5 retries |
 | **Dead Letter Queue** | Captures events that fail after all retries |
@@ -163,7 +162,7 @@ Typically 60–90 seconds from resource creation to tagged. Up to 15 minutes dur
 
 ### Day-2 operations
 
-Use the **Editor** tab in `configurator.html` to add or remove accounts from scope without redeploying. It generates an `update.sh` script.
+To add or remove accounts from scope, update the `ScopedAccountIds` CloudFormation parameter via CloudShell. See [INSTRUCTIONS.md](INSTRUCTIONS.md#day-2-add-or-remove-accounts) for commands.
 
 ---
 
@@ -202,7 +201,7 @@ Only after passing all checks does the Lambda apply the `map-migrated` tag.
 StackSet auto-deployment is enabled — when a new account joins the org, CloudFormation automatically deploys the stack. The Lambda starts running but defers to SSM for whether to act:
 
 - **`ALL` scoping (default):** New accounts are tagged immediately with zero intervention.
-- **Specific account scoping:** The Lambda deploys but no-ops. Use `update.sh` to add the account to scope when ready.
+- **Specific account scoping:** The Lambda deploys but no-ops. Update the `ScopedAccountIds` parameter via CloudShell to add the account to scope when ready (see [INSTRUCTIONS.md](INSTRUCTIONS.md#day-2-add-or-remove-accounts)).
 
 ### Multiple MAP engagements
 
@@ -227,7 +226,7 @@ Account 333333333333 — resource created
 | Tool | What it does |
 |------|-------------|
 | `deploy.sh` | Creates StackSet + stack instances. Sets initial SSM parameter. Deploys Lambda to all accounts. |
-| `update.sh` | Updates the SSM parameter (adds/removes accounts from scope). Does not deploy or remove Lambdas. |
+| `update-stack-set` (CloudShell) | Updates the `ScopedAccountIds` CFN parameter → rewrites SSM config in all accounts. Does not deploy or remove Lambdas. |
 | Auto-deployment | CloudFormation deploys the stack when a new account joins the org. Lambda defers to SSM for behavior. |
 
 A Lambda in an out-of-scope account has negligible cost — it fires, reads SSM, determines the account is out of scope, and returns in ~100ms.
@@ -239,7 +238,6 @@ A Lambda in an out-of-scope account has negligible cost — it fires, reads SSM,
 | Component | Monthly Cost |
 |-----------|-------------|
 | Lambda — Auto-Tagger (100–1,000 invocations/day) | $0.10 – $2.00 |
-| Lambda — Reconciliation (1 invocation/day, ~200ms) | < $0.01 |
 | Lambda — Preflight (1 invocation at deploy time) | $0.00 |
 | EventBridge events | $0.01 – $0.20 |
 | SQS (event buffer, ~1M requests/account/month) | $0.00 (within free tier) |