fix(telemetry): drop profile field [skip-runtime-e2e] #162

Merged
saurabhjain1592 merged 2 commits into main from fix/drop-profile-field on May 8, 2026

saurabhjain1592 commented on May 8, 2026

Summary

Drops the v1 telemetry profile field that was added in v8.0.0 (#161 / commit 15a448e). The AXONFLOW_PROFILE env var was already in use by the agent for governance enforcement (allowlist dev|default|strict|compliance per ADR-036), so emitting telemetry from a customer's correctly-configured governance profile (e.g. strict) would have produced HTTP 400 silent drops on the heartbeat path — the v1 telemetry validator accepts only dev|prod|unknown.

Refs getaxonflow/axonflow-enterprise#2033.

Why drop instead of rename

  1. Env-var collision. AXONFLOW_PROFILE is owned by the governance-enforcement layer (platform/agent/profile.go, EnvProfile = "AXONFLOW_PROFILE") and has been since ADR-036. The v1 telemetry schema (#2004) picked the same name without a grep for existing uses; the two allowlists share only dev (dev|default|strict|compliance vs dev|prod|unknown), so any customer who set the env var to default, strict, or compliance for governance would have started emitting 400-rejected pings.
  2. Zero analytics consumers. A grep across platform/, ee/, and the analytics workflows for Profile.S | .Profile\b | profile.*dimension (excluding _test.go and the checkpoint-service) returns nothing — no aggregator, no daily-report dimension, no dashboard reads the field. Removing it loses no signal in use today.
  3. deployment_mode already covers the topology dimension the field was meant to add: self_hosted | community_saas | unknown, single canonical form, no env-var collision, has analytics callers. Untouched by this PR.

Renaming would force a 9-repo coordinated bump just to allowlist-split a value nothing currently reads. Drop is the cleanest fix.
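
To make the collision concrete, here is a minimal Go sketch that compares the two allowlists quoted above. The variable and function names are illustrative only and are not the identifiers used in the agent or the telemetry validator.

```go
package main

import "fmt"

// Allowlists as quoted in this PR description. The names below are
// illustrative, not the real identifiers in either codebase.
var governanceProfiles = []string{"dev", "default", "strict", "compliance"}

var telemetryProfiles = map[string]bool{"dev": true, "prod": true, "unknown": true}

// rejectedByTelemetry returns the governance profile values that the v1
// telemetry validator would not accept.
func rejectedByTelemetry() []string {
	var rejected []string
	for _, p := range governanceProfiles {
		if !telemetryProfiles[p] {
			rejected = append(rejected, p)
		}
	}
	return rejected
}

func main() {
	fmt.Println(rejectedByTelemetry()) // prints [default strict compliance]
}
```

Three of the four valid governance values fall outside the telemetry allowlist, which is why a correctly configured agent would have had its heartbeats rejected.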

Validator-vs-caller audit

$ grep -rn 'AXONFLOW_PROFILE\|Profile\b' --include='*.go' .
(no output)

Verified zero AXONFLOW_PROFILE / Profile references remain across the 4 changed files. find . -name 'baseline*' returned testdata/wire_shape_baseline.json; grep -i 'profile' against it: empty.

Hunk-by-hunk self-review (HARD RULE #1)

For each hunk (Hunks 1 through 6 below), the 5-question protocol from feedback_self_review_is_mandatory_every_pr.md:

Hunk 1 — telemetry.go struct field removal (3 lines doc + 1 line field).

  1. Does this match the intent in the PR title? Yes — drops the Profile field from telemetryPayload.
  2. Are there any references that are now broken? grep confirms no — the only callers were the assignment site (Hunk 3) and tests (Hunks 5-6).
  3. Backward-compat consideration? Server-side ignores unknown JSON fields per json.Unmarshal semantics; an old server paired with a new client causes no client crashes. The reverse direction (new server, old client still emitting profile) is also fine: the server will either validate-and-drop the field or accept it under the JSON unmarshaller.
  4. Anything load-bearing about the doc comment? No — purely descriptive, references the field that no longer exists.
  5. Does this leave any dead code? No — Hunks 2-3 remove the read + assignment.
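
The backward-compat point in question 3 relies on encoding/json's default behaviour of ignoring keys that match no struct field. The sketch below shows that behaviour in isolation. The payload type is a hypothetical, trimmed stand-in (the "os" JSON tag in particular is an assumption), and it says nothing about what the real server-side validator does before or after decoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical, trimmed stand-in for a receiving struct that no
// longer has a profile field; the real struct has more fields.
type payload struct {
	DeploymentMode string `json:"deployment_mode"`
	OS             string `json:"os"` // tag name assumed for illustration
}

// decode parses a ping body with encoding/json's default settings, under
// which keys that match no struct field are ignored without error.
func decode(body string) (payload, error) {
	var p payload
	err := json.Unmarshal([]byte(body), &p)
	return p, err
}

func main() {
	// An "old client" body that still carries the removed field.
	p, err := decode(`{"deployment_mode":"self_hosted","os":"linux","profile":"strict"}`)
	fmt.Println(p.DeploymentMode, p.OS, err) // prints self_hosted linux <nil>
}
```

This holds only for the default decoder; a json.Decoder with DisallowUnknownFields enabled would return an error for the same body.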

Hunk 2 — telemetry.go os.Getenv("AXONFLOW_PROFILE") read removal (8 lines).

  1. Intent match? Yes — removes the env-var read that was the collision source.
  2. Broken refs? profile local var was used only in Hunk 3.
  3. Imports unused? strings still used (TrimPrefix on runtime.Version, EqualFold + TrimSpace + ToLower elsewhere); os still used (Getenv for AXONFLOW_TELEMETRY etc). No orphan imports — go build confirms.
  4. Side effects? None — pure read of env, no caller behavior change beyond not setting that one struct field.
  5. Tests still cover the surrounding paths? Yes — TestSendTelemetryPing_Success + TestBuildPayload deployment_mode + stream subtests still run.

Hunk 3 — telemetry.go payload literal Profile: profile removal (1 line).

  1. Intent match? Yes — removes the payload-side use site.
  2. Compiles? go build ./... passes.
  3. Tests? go test ./... passes in 7.6s.
  4. Field ordering / struct literal validity? Unchanged — Go allows omitting any field in a struct literal.
  5. Any other emission site? No — grep -rn 'telemetryPayload{' --include='*.go' returns just this one (and tests that decode into the struct).

Hunk 4 — telemetry_test.go removal of received.Profile assertion in TestSendTelemetryPing_Success (3 lines).

  1. Intent match? Yes — drops the assertion for the removed field.
  2. Test still meaningful? Yes — TestSendTelemetryPing_Success retains assertions on TelemetryType, DeploymentMode, OS, Arch, RuntimeVersion, Features, InstanceID. Coverage of the heartbeat happy-path remains.
  3. Other call sites for the same assertion? Hunks 5-6 cover the dedicated subtest.
  4. Helper utilities deleted? None — pure assertion removal.
  5. Race / setup change? None — same t.Setenv setup.

Hunks 5-6 — telemetry_test.go removal of t.Run("profile from AXONFLOW_PROFILE; unknown when unset", ...) subtest (33 lines).

  1. Intent match? Yes — drops the dedicated profile-sourcing subtest entirely.
  2. Other coverage? TestBuildPayload retains its deployment_mode classifies from endpoint host and stream=sandbox tag emitted only for sandbox mode subtests; both run independently of the dropped subtest.
  3. Compiles? go test -c ./... passes.
  4. No shared state? The dropped subtest used only t.Setenv (per-subtest scoped); no fixture cleanup needed.
  5. Closing brace alignment? Verified — the next subtest (stream=sandbox) is indented correctly.

Test results

$ go build ./... && go vet ./... && go test ./...
ok  	github.com/getaxonflow/axonflow-sdk-go/v8	7.642s
ok  	github.com/getaxonflow/axonflow-sdk-go/v8/interceptors	12.335s
ok  	github.com/getaxonflow/axonflow-sdk-go/v8/internal/wireshape	0.514s

All green. Wire-shape baseline (testdata/wire_shape_baseline.json) had no profile reference, so no baseline refresh needed.
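
As a programmatic counterpart to the grep check on the baseline, the following Go sketch walks a decoded JSON document and reports whether any object key equals a given name, ignoring case. It is narrower than grep -i, which also matches substrings and values. The sample document is made up, since the real baseline file's shape is not shown in this PR, and the function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hasKey reports whether any object inside the decoded JSON value v has a
// key equal to name, ignoring case.
func hasKey(v any, name string) bool {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			if strings.EqualFold(k, name) || hasKey(child, name) {
				return true
			}
		}
	case []any:
		for _, child := range t {
			if hasKey(child, name) {
				return true
			}
		}
	}
	return false
}

// containsKey decodes raw JSON and searches it for name at any depth.
func containsKey(raw []byte, name string) (bool, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return false, err
	}
	return hasKey(v, name), nil
}

func main() {
	// Made-up sample; not the contents of testdata/wire_shape_baseline.json.
	sample := []byte(`{"ping":{"deployment_mode":"unknown","features":["a"]}}`)
	found, err := containsKey(sample, "profile")
	fmt.Println(found, err) // prints false <nil>
}
```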

Migration notes

  • Customers who set AXONFLOW_PROFILE for governance enforcement on the agent (the original ADR-036 use case): no change. The env var is read by the agent for that purpose unchanged.
  • Customers who set AXONFLOW_PROFILE solely to influence telemetry: no longer needed — the SDK no longer reads it for telemetry. Telemetry's topology dimension lives on deployment_mode (auto-derived from endpoint host).
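
For readers unfamiliar with host-based classification, here is a rough Go sketch of how a deployment_mode value could be derived from an endpoint URL. The matching rule and the saasSuffix constant are assumptions made purely for illustration; the SDK's actual rule is not described in this PR. Only the three output values come from the text above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// saasSuffix is a HYPOTHETICAL hosted-domain suffix used only in this sketch.
const saasSuffix = ".saas.example.com"

// classifyDeploymentMode maps an endpoint URL to one of the three
// deployment_mode values named in this PR. The rule is an assumption.
func classifyDeploymentMode(endpoint string) string {
	u, err := url.Parse(endpoint)
	if err != nil || u.Hostname() == "" {
		return "unknown" // unparseable or host-less endpoint
	}
	host := strings.ToLower(u.Hostname())
	if strings.HasSuffix(host, saasSuffix) {
		return "community_saas"
	}
	return "self_hosted"
}

func main() {
	fmt.Println(classifyDeploymentMode("https://try.saas.example.com"))
	fmt.Println(classifyDeploymentMode("http://localhost:8080"))
	fmt.Println(classifyDeploymentMode(""))
}
```

The point of the sketch is that the value is computed from configuration the SDK already has, so no extra env var is needed.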

DO NOT TAG / PUBLISH

This is part of a coordinated train (9 client PRs + server PR per the session-2033 brief). Tagging is operator-gated and happens only after the full train merges. Do not tag v8.0.1 from this PR.

Test plan

  • go build ./...
  • go vet ./...
  • go test ./... — all green
  • grep -rn 'AXONFLOW_PROFILE\|Profile\b' --include='*.go' . — zero hits
  • CI green
  • Coordinated merge with the rest of the session-2033 train
  • Runtime proof against staging-checkpoint.getaxonflow.com (cross-train step, not gated by this PR alone)

Skip-runtime-e2e justification

This PR is part of the #2033 coordinated train across 1 server + 9 client repos. Per the session-2033 brief, runtime proof is deferred to the post-server-merge staging-checkpoint deploy:

  1. axonflow-enterprise#2035 (server) merges + gh workflow run deploy-checkpoint.yml -f environment=staging deploys the new code to staging-checkpoint.getaxonflow.com.
  2. Each of the 9 client builds is then driven against staging-checkpoint with verification that (a) POST returns 200 (no validator complaint about missing profile) and (b) the resulting DDB row has no profile attribute.
  3. EVIDENCE lands at runtime-e2e/profile_field_removal/EVIDENCE/<utc-ts>/ post-deploy.

Adding a same-PR runtime-e2e/ test here would either:

  • exercise a fake checkpoint server (forbidden by lint-no-mocks-in-runtime-e2e.sh), or
  • run against the live deployed Lambda BEFORE the server PR has deployed, which would either silently pass (ignored field) or fail (depending on deploy ordering) — neither outcome is informative.

The post-deploy proof against staging-checkpoint is the only meaningful runtime test for this train.

The v8.0.0 telemetry payload added a `profile` field sourced from
AXONFLOW_PROFILE, but that env var was already used by the agent for
governance enforcement (platform/agent/profile.go EnvProfile, allowlist
dev|default|strict|compliance per ADR-036).

A customer with `AXONFLOW_PROFILE=strict` (a perfectly valid governance
config) would have produced HTTP 400 silent drops on the heartbeat path
because the v1 telemetry validator only accepts dev|prod|unknown.

Drop the field rather than rename it: there are zero analytics consumers
of `profile` in production today, and `deployment_mode`
(self_hosted | community_saas | unknown) covers the topology dimension
the field was intended to add. Renaming would force a 9-repo coordination
on a value-allowlist split that nothing reads.

This reverts only the profile portion of #161 (15a448e); deployment_mode,
endpoint_type, telemetry_type discriminator, and stream tag are left
intact.

Refs #2033.

Signed-off-by: Saurabh Jain <saurabhjain1592@gmail.com>
saurabhjain1592 changed the title from "fix(telemetry): drop profile field (collides with governance env var)" to "fix(telemetry): drop profile field [skip-runtime-e2e]" on May 8, 2026
Signed-off-by: Saurabh Jain <saurabhjain1592@gmail.com>
@saurabhjain1592 saurabhjain1592 merged commit ef418ac into main May 8, 2026
18 checks passed