ci: bump Elixir to 1.19.5 + OTP 28, drop yes-pipe from install-smoke#5
Merged
ci: bump Elixir to 1.19.5 + OTP 28, drop yes-pipe from install-smoke#5
Conversation
…moke Two root causes broke all five CI jobs on the initial push to GitHub: 1. Elixir 1.18.4 pin silently ignored `test_load_filters` (mix.exs) and rejected the `~r"..."E` regex modifier used by the Phoenix 1.8 phx.new dev.exs live-reload watcher. Both features are Elixir 1.19+. Library tests swept into test/example/** and failed to compile; example_*_smoke jobs died at `mix deps.get` with Regex.CompileError. 2. `yes Y | mix phx.new --no-install` in install-smoke.sh: --no-install means mix never reads stdin, so `yes` gets SIGPIPE. With `set -o pipefail` that propagates as exit 1 before phx.new even finishes. The yes-pipe was precautionary and not needed once --no-install is set. Bumping CI to match local dev (Elixir 1.19.5 / OTP 28.0) fixes root cause #1 across all five jobs. Dropping the yes-pipe fixes #2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…uto delivery
Addresses three remaining failures on the first GitHub CI run after the
initial Elixir 1.19 bump. Each is a structural/correctness fix rather
than a CI workaround.
1. .tool-versions as single source of truth
Replace duplicated `otp-version: '28.0'` + `elixir-version: '1.19.5'`
pairs across five jobs with `version-file: .tool-versions` +
`version-type: strict`. Local dev and CI now read from one file —
standard erlef/setup-beam pattern.
2. Guard Oban workers behind Code.ensure_loaded?/1
Oban is declared `optional: true` in mix.exs, but the four worker
modules (account_deletion, audit_cleanup, email_delivery, token_cleanup)
were defined unconditionally with `use Oban.Worker`. When a consuming
app pulls sigra as a dep without adding oban to its own deps,
compilation fails with `module Oban.Worker is not loaded`. This was
blocking the install_smoke job's `mix compile` inside a fresh phx.new
project without Oban.
Fix: wrap each module in `if Code.ensure_loaded?(Oban.Worker) do ... end`,
the standard Elixir optional-dep pattern used by phoenix_live_view,
swoosh, etc. Modules exist when the consumer pulls in Oban and simply
aren't defined otherwise — matching the intent of `optional: true`.
3. Sigra.Delivery :auto mode detects supervised Oban, not loadable Oban
`delivery_mode: :auto` routed to `:async` whenever Oban was loadable as
a module. That's the wrong check — it only tells us the dep is present,
not that the supervisor is running. Apps that add `{:oban, "~> 2.17"}`
to mix.exs without wiring the supervisor tree (common during onboarding,
and the state the test/example app is in) crashed at `oban.insert/1`.
The Playwright golden-path smoke's register step reproduced this: the
LiveView `save` handler crashed silently inside `deliver_user_
confirmation_instructions`, leaving the form on /users/register and
failing the `expect(page).not.toHaveURL(/register/)` assertion.
Fix: `oban_running?/0` now also checks `Process.whereis(Oban) != nil`.
:auto → :sync whenever the supervisor isn't running. Tests updated to
cover both branches (dummy registered-name process for the :async case).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs uncovered on the PR's second CI run: 1. Example cache key was keyed only on `test/example/mix.lock`, but sigra is a path dep — library source changes don't bump that lock file. Cached `test/example/_build` was serving old compiled sigra artifacts across commits, so the previous `Sigra.Delivery.deliver` fix never actually ran in CI. Add `lib/**/*.ex` + `mix.exs` to the cache key for all three example jobs. Library source changes now force a fresh example compile. 2. `install-smoke.sh` passed `--no-gettext` to `mix phx.new`, but nine installer templates (reset_password, confirmation, API token emails, etc.) call `dgettext/2`. The generated controller then failed to compile with `undefined function dgettext/2`. Gettext is a soft requirement for `mix sigra.install` — drop the flag so the fresh app ships with gettext wired. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two unrelated issues uncovered after the previous fixes cleared:
1. install_smoke: warnings-as-errors on optional-dep references
Downstream consumers (fresh phx.new apps) compile sigra as a path dep
without pulling in sigra's optional deps. The compiler then emits
"module X is not available" warnings for every unguarded reference to
Oban, Assent.Strategy.*, Joken, EQRCode, Bcrypt, etc — and `mix compile
--warnings-as-errors` in the consuming app fails on them.
Fix: add project-wide `elixirc_options: [no_warn_undefined: [...]]` in
mix.exs listing every optional-dep module sigra calls plus the four
conditionally-compiled worker modules. This is the standard library
pattern — applies whether sigra is compiled standalone or as a dep.
2. example_playwright_smoke: registration LiveView crashed at mailer
The registration `save` handler consistently crashed with:
** (KeyError) key :adapter not found in: [otp_app: :example]
(swoosh) lib/swoosh/mailer.ex:207: Swoosh.Mailer.deliver/2
(example) lib/example/mailer.ex:11
The example app has two mailer modules: `Example.Mailer` (the raw
`use Swoosh.Mailer`) and `Example.Accounts.Mailer` (the `Sigra.Mailer`
behaviour wrapper that delegates to `Example.Mailer.deliver/1`). Only
the raw one actually calls `Swoosh.Mailer.deliver/2`, so only the raw
one needs the `:adapter` config. `test/example/config/dev.exs` had
the adapter set on the wrong module (the wrapper), so Swoosh never
found one on the raw mailer and crashed.
Fix the example dev config, and fix `sigra.install`'s inject_swoosh_
config helper to target `<AppModule>.Mailer` instead of
`<ContextModule>.Mailer` so fresh installs don't hit the same bug.
Reproduced and verified locally via playwright-mcp: register now
redirects to `/` on submit (post-registration auto-login), and the
dev mailbox receives the confirmation email through the Swoosh.Local
adapter. Full library suite (1253 tests) + example suite (46 tests)
still green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e templates Two installer templates defined their LiveView modules under a `<web_module>.Auth.<Name>` namespace: defmodule <%= web_module %>.Auth.SettingsLive do defmodule <%= web_module %>.Auth.ReactivationLive do But the router injection in `sigra.install` writes plain route names: live "/settings", SettingsLive, :edit live "/reactivation", ReactivationLive Phoenix resolves those relative to the router's scope alias (the web module itself, e.g. `TmpAppWeb`), so they look for `TmpAppWeb.SettingsLive` and `TmpAppWeb.ReactivationLive` — not `TmpAppWeb.Auth.*`. With `mix compile --warnings-as-errors` in the consuming app, the undefined-module warnings become compile errors. test/example already uses the flat `ExampleWeb.SettingsLive` / `ExampleWeb.ReactivationLive` shape that matches its router, so this was a drift between the templates and the shipped example. `MFASettingsLive` is already flat in both places. Fix: drop `.Auth.` from both template defmodule lines so fresh installs match the router injection and the example app. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Second round of cache-staleness debugging: the example cache key missed two important source trees, letting stale compiled artifacts linger across commits. Missing from the hash: - test/example/config/** — config/dev.exs changes (like the Swoosh mailer adapter fix) never bumped the key, so the cached `_build` retained an `example.app` with the old compile-time config. Phoenix then booted with the stale config, Swoosh couldn't find `:adapter` on `Example.Mailer`, and the registration LiveView crashed. - test/example/lib/**/*.ex — example source changes (templates mirrored into the example app during development) would also have been masked by cache hits. Expand the key to hash test/example/config/**, test/example/lib/**/*.ex, and the existing library sources. Three example jobs (example, example_http_smoke, example_playwright_smoke) all updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iframe
Phoenix Live Reload injects a hidden `iframe[src="/phoenix/live_reload/frame"]`
in MIX_ENV=dev. The mailbox scraper was using `frameLocator('iframe')`
which matched both that and Swoosh's `iframe#html-mail`, failing strict
mode. Tighten the selector to `iframe#html-mail` — an ID Swoosh has used
since its MailboxPreview plug was introduced.
Register → confirm → login flow now reaches the mailbox step cleanly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two follow-up fixes after the previous run: 1. Cold-start race on /users/register The example app runs with `plug_init_mode: :runtime` in dev (Phoenix 1.8's default), so each route pays a compile-on-demand cost on first request. The wait-for-app loop only hit `/`, so when Playwright clicked "Create an account" the LiveView was still compiling and the URL- change assertion timed out at 5s. Retries succeeded because the second request to the same route was cached. Add a warmup loop after the health check that curls every route the golden-path test will touch (/users/register, /users/log_in, /users/confirm, /dev/mailbox, /users/sessions, /users/sudo, /users/settings/mfa). Warmup failures are non-fatal so a broken route still surfaces via the real Playwright assertion, not an opaque curl. 2. Flash-text assertion on confirm page was brittle The test asserted `getByText(/confirmed|confirmation/i)` on the page after following the email link. ConfirmationLive auto-confirms in handle_params and immediately `redirect`s to `/` with a flash — but the flash is a toast component whose visibility lifecycle has shifted across Phoenix / daisyUI versions, and the page snapshot on failure showed zero flash elements even though the redirect succeeded. Switch to a URL-change assertion instead: `expect(page).not.toHaveURL(/\/users\/confirm\//)`. We care that the user got past the confirmation token URL, not about the exact rendering of the flash toast. If the user isn't actually confirmed, the later login/sessions steps will still catch it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CI server logs reveal the actual root cause of the flaky register step: LiveView is connecting via :longpoll transport in CI, not WebSocket. The full validate→save→trigger_submit chain becomes N sequential HTTP round-trips and can exceed the default 5s expect timeout, which is why retries sometimes passed and sometimes didn't. Two adjustments: 1. Wait for `body.phx-connected` before filling the register form. If the page loads faster than the LV channel joins, Playwright's fill fires against a DOM that LiveView hasn't attached its bindings to yet — the resulting phx-submit gets queued and may lose state. 2. Bump the post-click `toHaveURL` timeout from the 5s default to 15s. Longpoll + validate + save + trigger_submit + full HTTP POST can take 6-10s on a cold CI worker. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous commit waited for `body.phx-connected` but Phoenix LiveView
attaches the `.phx-connected` class to the LV root element (the
`<div data-phx-session>`), not `<body>`. The selector never matched
and the test hit its 15s timeout before ever clicking register.
Verified locally via playwright-mcp's page.evaluate:
{ bodyClass: '', rootClass: 'phx-connected',
connectedSelector: 'DIV',
phxHooks: [{ tag: 'DIV', cls: 'phx-connected' }] }
Switch to `[data-phx-session].phx-connected` with `state: 'attached'`
— we only care that the element exists in the DOM, not that it's in
the viewport. Also replaces expect() with the more direct
waitForSelector.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three related fixes surfaced by running the golden-path Playwright
smoke locally and tracing every crash in the dev server log.
1. Sudo controller crash: `conn.private[:sigra_session]` was nil
SudoController.create/2 reads `conn.private[:sigra_session]` to get
the session's hashed_token and stamp sudo_at, but nothing in the plug
chain was actually populating that private key. The form submit
crashed with `(BadMapError) expected a map, got: nil`.
Fix: introduce `Accounts.get_user_and_session_by_token/1` that returns
`{user, session}` and have `UserAuth.fetch_current_scope/2` stash the
session record into `conn.private[:sigra_session]`. Mirrored into the
installer templates so fresh installs land with the same wiring.
2. Sigra.MFA.confirm_enrollment bulk insert passed updated_at
The library hardcoded `updated_at: now` in the backup-code entries
map, but the shipped schemas use `timestamps(updated_at: false)` so
the DB doesn't have that column. insert_all failed with "unknown
field `:updated_at`". Drop the field — backup codes are effectively
write-once (only `used_at` changes on consumption), so updated_at is
meaningless anyway.
3. Playwright config + golden-path rewrite
- playwright.config.ts: global `expect.timeout: 15_000`,
`actionTimeout`, `navigationTimeout`, and test-level `timeout` so
longpoll-transport LV events have room to complete without
sprinkling per-call `{ timeout }` options everywhere.
- waitForLiveViewReady helper: waits for
`[data-phx-session].phx-connected` (verified via browser inspect —
the class is on the LV root div, NOT <body>).
- Add waits on every LiveView navigation: register, sessions,
settings/mfa, mfa challenge.
- Confirm step: don't wait for phx-connected (ConfirmationLive
redirects to `/` during handle_params, so by load time we're
already on a non-LV page).
- MFA enroll: submit via Enter keypress on the code input to avoid a
DOM-detach race — phx-change re-renders the form on each keystroke
which detaches the submit button between fill and click.
- MFA "save backup codes" step: click the phx-click checkbox and
wait for the Done button to become enabled before clicking.
- Mailer config: removed reliance on brittle flash text assertions;
URL-change assertions are more stable across Phoenix/daisy versions.
Also adds `.actrc` for `act` (local GitHub Actions runner) — enables
iterating on the full CI workflow in Docker instead of push-and-wait.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
act (github.com/nektos/act) lets us run .github/workflows/ci.yml in Docker locally, mirroring the real GitHub Actions runner. This is the fastest way to iterate on CI changes — the previous push → wait loop takes ~3 minutes per cycle, act takes ~90s after the first warm-up. .actrc pins the Ubuntu image to `catthehacker/ubuntu:act-20.04`. This is load-bearing and well-documented inline: erlef/setup-beam's arm64 Erlang/OTP prebuilds on builds.hex.pm are ONLY built against Ubuntu 20.04 (libssl1.1). Any newer image (22/24) breaks the :crypto NIF with `libcrypto.so.1.1: cannot open shared object file`, which in turn breaks `mix local.rebar`. 20.04 has libssl1.1 natively. Also: `--container-options --user=0:0` forces root so setup-beam can write to /opt/hostedtoolcache. Added a full "Running CI locally with `act`" section to the UAT runbook covering: - one-time setup (brew install act, docker pull) - port 5432 collision diagnostics (Homebrew Postgres, stale containers) - common commands (-j, -l, --reuse, --graph, --verbose) - troubleshooting the three failure modes we hit during setup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The golden-path Playwright smoke is now GREEN end-to-end locally (2.8s on warm DB). Act reproduces the same flow on arm64 Linux and catches the same bugs, so local iteration is now the tight loop. Root causes fixed along the way: 1. `Sigra.MFA.status/2` read credential schemas from `config.mfa`, but `Sigra.Config`'s NimbleOptions schema for `:mfa` doesn't accept `mfa_credential_schema` / `backup_code_schema` (those are per-call opts, same pattern as `confirm_enrollment/3` and `verify/4`). So `mfa_status/1` always returned `enabled: false` even for enrolled users, and `MFASettingsLive.mount/3` always rendered the pre- enrollment "Set up" surface on remount. Fix: accept `opts` in `Sigra.MFA.status/3` with fallback to `config.mfa` for back-compat, and have both the example and the installer template's `Accounts.mfa_status/1` wrapper pass the schemas explicitly. 2. The enrollment form uses `phx-submit="confirm_enrollment"` but `validate_enroll` also auto-calls `do_confirm_enrollment` as soon as the code hits 6 digits. Pressing Enter after `page.fill` fired confirm_enrollment a SECOND time against a socket whose raw_secret had just been nil'd by the successful first call — crashing the LV with `verify_totp(nil, ...)`. Remove the Enter press; the auto- confirm handles it. 3. The logout step used `page.request.fetch` for `DELETE /users/log_out`, which uses a separate cookie jar from the browser context — the browser session survived, and re-login silently succeeded on the existing authentication. Replace with `page.context().clearCookies()` which is simpler and matches the test's actual intent (force a fresh login), without exercising the server-side delete path (which has its own ConnTest coverage). 4. The example app uses MFA as step-up auth (sudo mode), not as a login challenge — `UserAuth.log_in_user/3` does not route through `MFAChallengeLive`. The test previously expected a `/users/mfa` redirect after re-login, which never happened. Rewrite step 8 to verify: (a) re-login works and (b) MFA state persists across the logout/login round-trip by asserting the "Disable" button is still visible on `/users/settings/mfa`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
szTheory
added a commit
that referenced
this pull request
Apr 11, 2026
Phase 10.1.1 (example-app repair + CI smoke harness) is now fully green on `main` after PR #5 (3d24be8) merged with all five CI jobs passing and branch protection active. This commit closes out plan 10.1.1-08 specifically: - Creates 10.1.1-08-SUMMARY.md documenting the rename + runbook work and the long tail of latent bugs that the first cold CI run against a fresh GitHub repo surfaced - Advances STATE.md to status: awaiting-next-phase and updates the progress counters to 60/60 plans, 12/12 phases, 100% - Updates ROADMAP.md to show phase 10.1.1 as Complete The human-verify checkpoint on plan 08 (GitHub branch protection configured with 5 required checks) is verified by the existence of ruleset 14941512 on the szTheory/sigra repo, enforced by the fact that PR #5 could not merge until all 5 checks reported green. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5 tasks
szTheory
added a commit
that referenced
this pull request
Apr 11, 2026
Phase 10.1.1 (example-app repair + CI smoke harness) is now fully green on `main` after PR #5 (3d24be8) merged with all five CI jobs passing and branch protection active. This commit closes out plan 10.1.1-08 specifically: - Creates 10.1.1-08-SUMMARY.md documenting the rename + runbook work and the long tail of latent bugs that the first cold CI run against a fresh GitHub repo surfaced - Advances STATE.md to status: awaiting-next-phase and updates the progress counters to 60/60 plans, 12/12 phases, 100% - Updates ROADMAP.md to show phase 10.1.1 as Complete The human-verify checkpoint on plan 08 (GitHub branch protection configured with 5 required checks) is verified by the existence of ruleset 14941512 on the szTheory/sigra repo, enforced by the fact that PR #5 could not merge until all 5 checks reported green. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
szTheory
added a commit
that referenced
this pull request
May 2, 2026
…arer_test.exs - Add describe "call/2 service-account JWT path" block with 4 new tests - Tests: SA scope built (actor_type, service_account_id, user=nil, active_organization), no :membership (ROADMAP SC #5), expired JWT yields nil scope, user-path parity guard - Inline SAMockRepo + SATestOrganizations for organization loading without Postgres dep - All 16 tests (12 existing + 4 new) pass; Gap #2 FetchBearer layer closed
szTheory
added a commit
that referenced
this pull request
May 2, 2026
- Documents gap #5 closure, ROADMAP SC#4 proof, all 6 deviations - Records actual verify-failure reason atom (:epoch_mismatch) - Notes assert_patched_or_navigated_to_sa_detail! removal rationale (Plan 93-09 not yet executed, unused fn would block warnings-as-errors) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
First real CI run on GitHub exposed two root causes that broke all five jobs on the initial push. Both fixed here.
1. Elixir 1.18.4 → 1.19.5 (OTP 27.3 → 28.0)
Local dev runs Elixir 1.19.5 / OTP 28, and the codebase has absorbed two 1.19-only features without noticing:
test_load_filters: [~r"^test/(?!example/)"]inmix.exs— added to keep the outer library'smix testfrom sweeping into the nested `test/example/` Phoenix app. On 1.18.4 this option is silently ignored, so the library job compiled `test/example/test/**/*_test.exs` against the library's own elixirc_paths and failed with:```
error: module ExampleWeb.ConnCase is not loaded and could not be found
└─ test/example/test/example_web/controllers/error_html_test.exs:2
```
`~r"..."E` regex modifier in `test/example/config/dev.exs` — the live-reload file watcher patterns that phx.new 1.8 generates. The `E` flag is Elixir 1.19+. On 1.18.4, every job that evaluates the example app's config died at `mix deps.get` with:
```
** (Regex.CompileError) invalid_option at position E
(elixir 1.18.4) expanding macro: Kernel.sigil_r/2
test/example/config/dev.exs:58
```
Bumping to 1.19.5 / OTP 28.0 matches the local dev environment and fixes both.
2. `yes Y | mix phx.new --no-install` SIGPIPE in install-smoke.sh
`--no-install` skips the dep install prompt, so `mix phx.new` never reads stdin. `yes` then gets SIGPIPE, and with `set -euo pipefail` that propagates as exit 1 right after the last `* creating` line — before phx.new's output even finishes. Removing the pipe fixes it.
Test plan
All five should flip from red → green on this PR.
🤖 Generated with Claude Code