Skip to content

feat(ClickHouse): Add ClickHouse dialect with native JSON traits#8

Closed
khvn26 wants to merge 52 commits into
initial-scaffoldfrom
feat/clickhouse-dialect
Closed

feat(ClickHouse): Add ClickHouse dialect with native JSON traits#8
khvn26 wants to merge 52 commits into
initial-scaffoldfrom
feat/clickhouse-dialect

Conversation

@khvn26
Copy link
Copy Markdown
Member

@khvn26 khvn26 commented May 12, 2026

Closes #7.

Summary

New ClickHouseDialect implementing the full Dialect protocol against ClickHouse's native JSON type (24+, GA in 25.x). Trait reads are typed columnar subcolumn scans rather than per-row JSON parses, mirroring Snowflake VARIANT's cost shape.

Engine parity: 599 engine-test-data segments pass against the harness. Three xfails — same set Snowflake has (two semver-prerelease cases, one $.identity-shaped trait shadowing case).

Performance assessment

Benched against the same engine-test-data shape at 10M / 100M / 870M rows on a ClickHouse Cloud trial. Full numbers in the Notion RFC follow-up and the iteration history is captured in the commit log. Headline at 870M:

predicate dialect Snowflake XS
simple (IN + IS_SET) 17.02s 4.11s
multi (4 conditions) 34.28s 4.27s
% Split over identifier 26.32s 0.49s

At 870M we're 4-8× behind Snowflake on trait predicates and ~54× on % Split. The cost-shape conclusion in the RFC doesn't move — every read still fits under CH Cloud's 60s minimum-billing slot, even at 34s p95 — so this is fine to ship.

Translator-level levers to close more of the gap are scoped in the RFC follow-up section and out of scope here.

% Split is also pending an upstream ClickHouse contribution to vectorise MD5: ClickHouse/ClickHouse#104887. Independent of this PR.

Test plan

  • CI green on feat/clickhouse-dialect against initial-scaffold
  • docker compose up --detach --wait clickhouse && make test locally — 599 passed, 3 xfailed
  • make lint && make typecheck locally — clean

khvn26 added 30 commits May 7, 2026 15:56
uv-managed Python package mirroring the structure of flagsmith-flag-engine
(Python) and flagsmith-rust-flag-engine (Rust).

Translates `SegmentContext` predicate trees into SQL WHERE expressions.
The translator emits a predicate that goes against an `IDENTITIES` table
(aliased `i`) directly; trait-bound conditions become `EXISTS` subqueries
against `TRAITS`. `PERCENTAGE_SPLIT` and `:semver`-marked comparators
compile to inline pure-SQL — no UDF call required at runtime.

Public API takes the engine's own TypedDicts:

  TranslateContext(environment: EnvironmentContext, ...)
  translate_segment(segment: SegmentContext, ctx: TranslateContext)

`EnvironmentContext.key` is used directly as the `environment_id` literal
in emitted SQL — no separate integer PK; `IDENTITIES` and `TRAITS` hold
a STRING `environment_id` column matching the engine's vocabulary.

Includes:
- engine-test-data as git submodule
- Dialect protocol with SnowflakeDialect implementation
- 11 unit tests (translator SQL-shape regression, no Snowflake required)
- 510 parity tests parametrised over engine-test-data, gated behind a
  Snowflake test account via SNOWFLAKE_* env vars
- CI workflow: lint+unit on every PR; parity job gated on secrets

beep boop
Ruff (lint + format) now runs as a pre-commit hook on every commit;
mypy runs as a `make typecheck` local hook on staged Python files.
CI calls `make typecheck` and `make test` instead of running the tools
directly. Pattern follows github-webhook-handler.

Adds:
- .pre-commit-config.yaml with ruff-pre-commit, uv-pre-commit (uv-lock),
  pre-commit-hooks (check-yaml/json/toml), and a local python-typecheck
  hook calling `make typecheck`. Pre-commit.ci skips the heavy ones.
- Makefile with install/install-pre-commit/lint/test/typecheck/help.
- pre-commit added to dev deps; ruff dropped (provided by the pre-commit
  hook with its own pinned version).
- README dev section rewritten around the make targets.

Translator type cleanups picked up by mypy --strict on the new path:
private helpers take dict[str, Any], threshold extracted from val with
a None guard, and translate_segment coerces SegmentRule to dict before
recursing.

beep boop
prek (https://github.com/j178/prek) is a Rust rewrite of pre-commit,
drop-in compatible with .pre-commit-config.yaml. PyPI-distributed so
it slots into the dev group cleanly.

Also drops the pre-commit.ci `ci:` block from .pre-commit-config.yaml
(pre-commit.ci runs the standard Python pre-commit, not prek; instead
the CI workflow gains a `make lint` step that runs prek directly).

beep boop
Node 20 deprecation; v5 runs on Node 24.

beep boop
Earlier proposal was column-per-trait wide-form. Doesn't scale: Snowflake
caps tables at ~3,000 columns, ALTER TABLE per new trait key on the write
path needs elevated grants and locks, and SaaS-aggregated trait
vocabularies cross those limits.

Pivoting to a single IDENTITIES table with a `traits` VARIANT column.
Trait keys live as object keys inside the VARIANT — data, not schema.
New trait keys never require DDL; the CDC pipeline just merges them into
the existing VARIANT.

Translator emits VARIANT path-extraction (`i.traits:"<key>"`) for
trait-bound predicates. The slow PoC shape this is sometimes confused
with — OBJECT_AGG at query time over a long-form TRAITS table, 245s at
100M — is a different operation: that one joined and aggregated per
query. Here the VARIANT is pre-materialised; query time is a single
columnar read with subkey extraction.

Bench follow-up to validate VARIANT path-extraction perf is in the same
ballpark as typed columns. If it isn't, consider hybrid (typed columns
for hot traits + VARIANT for cold), but ship VARIANT-only first.

Translator changes:
- TranslateContext takes `traits_col` (default "i.traits"), drops
  `traits_table` and `identity_id_col`.
- `trait_path(key)` produces `<traits_col>:"<key>"`.
- Trait-bound predicates emit VARIANT path access + cast, not EXISTS.

Tests + conftest mirror the new shape: single transient IDENTITIES table
with VARIANT traits column; cases load via PARSE_JSON of the trait map.

beep boop
Numbers from the PoC bench against 870M identities × 50 segments on
X-Small with the result cache disabled. VARIANT pays a 47-73% overhead
for trait-bound queries (simple 1.5s→2.6s, multi 2.1s→3.1s) but stays
sub-4s on the smallest warehouse. Pure %Split is unaffected since it
doesn't touch traits.

beep boop
One job, one pytest invocation that runs both unit and parity tests
with -n auto for parallelism. Drops the two-job separation that was
mostly artificial — the conftest already skips parity tests when
SNOWFLAKE_ACCOUNT is unset, so forked-PR runs (no secrets) get a clean
"X passed, 510 skipped" without needing the workflow's `if:` guard.

beep boop
Previous shape ran 102 INSERTs (one per case) plus 510 SELECTs (one per
parametrised test) — ~612 Snowflake round-trips. With xdist on 4 CPUs
the parity job took ~2m24s in CI, dominated by per-query latency, not
compute.

Two batches:
  - `loaded_cases` does one multi-row INSERT for all 102 cases.
  - `parity_results` runs a single UNION-ALL'd SELECT that evaluates
    every translated segment against every case's identity row, returns
    a `(case_idx, seg_key) -> bool | None` dict. Per-test invocation is
    a dict lookup; no Snowflake call in the parametrised body.

Local bench: 31s wall time for 510 passing tests, serial. Down from
~2m24s. Drops `-n auto` from the CI workflow — the bottleneck is one
Snowflake query, not pytest concurrency, so xdist would just multiply
the mega-query setup across workers without speedup.

beep boop
The numeric comparator family (GREATER_THAN / LESS_THAN / *_INCLUSIVE)
was interpolating segment-supplied values raw into the WHERE clause —
a SQL-injection path for any user with `MANAGE_SEGMENTS` permission on
the project. Same risk class as the MODULO branch already guarded
against. Fix is symmetrical: float() the value in Python first, return
None on ValueError so the caller falls back.

Three new unit tests assert the guard:
  - injection-shaped value on each comparator returns None
  - clean numeric value interpolates the parsed float
  - injection in MODULO's divisor returns None

Also adds the flagsmith-lint-tests pre-commit hook (FT003 / FT004 from
flagsmith-common). Renames the parity test and reshapes every unit
test to the `test_subject__condition__expected` template plus
Given/When/Then comments enforced by the hook.

beep boop
Hardens against SQL injection without adding a query-builder dependency.
Every place a segment-supplied value (operand, trait key, segment key,
env constants) crosses into emitted SQL now routes through one of:

  - Sanitiser.escape_string / Sanitiser.string_literal — single-quote
    SQL string literal escaping
  - Sanitiser.variant_path_key — double-quote VARIANT path key escaping
  - Sanitiser.numeric_literal — float-validates before interpolation;
    rejects non-numeric, None, and bool (bool would diverge from engine
    semantics)
  - Sanitiser.modulo_literal — `divisor|remainder` operand parser; both
    sides float-validated

19 unit tests document the contract, including injection-shaped inputs
on every primitive. Translator now only interpolates Sanitiser outputs
or known-safe substrings. Future reviewers can grep for `Sanitiser.` to
audit the full surface.

beep boop
Sanitiser class was just a static-method namespace; collapse to module-level
functions in utils.py. Move Snowflake-specific VARIANT path syntax
(`traits:"key"`) off the translator and onto the dialect as
`trait_path(traits_col, key)`, where Postgres JSONB / long-form-table
variants will also live.
The dialect already owns the IDENTITIES schema (via `schema_ddl`), so
column references for `$.identity.identifier` / `$.identity.key` and the
trait path now come from dialect methods (`identifier_expr`,
`identity_key_expr`, `trait_path(alias, key)`). `TranslateContext` only
keeps `identities_alias`.

`EvaluationContext.identity` is `NotRequired`, so accept the full eval
context instead of just the environment. Non-identity JSONPath properties
are resolved via `jsonpath_rfc9535.find_one` against the eval context
rather than special-casing `$.environment.{name,key}`.

`dialect` is now required — no implicit Snowflake default.
Imported by the translator for non-identity property resolution; was
resolving transitively via flagsmith-flag-engine.
Carry the JSON shape (`context: EvaluationContext`, `result: EvaluationResult`)
through `load_test_cases`, `loaded_cases`, and the parity test signature so
mypy can type-check the field accesses instead of letting them resolve to Any.
Loading: switch the test-case loader from `*.json` to both `.json` and
`.jsonc` via `json5` (the dataset half-uses `.jsonc` for newer cases —
the old glob was silently dropping ~half of them).

Backslashes in trait JSON: the SQL literal carrying `PARSE_JSON(...)`
needs `\\` for any `\` in the JSON, otherwise Snowflake strips it before
the JSON parser sees it.

Numeric casts: switch `cast_float`/`cast_number` from `(...)::FLOAT/
NUMBER` to `TRY_TO_DOUBLE/TRY_TO_NUMBER`. A non-numeric variant value
(e.g. a string trait used in MODULO) now NULL-propagates through the
predicate instead of erroring the whole UNION ALL.

Regex literal: Snowflake regex flavour treats `\d` in `'\d'` as the
character `d`, not a digit class — needs `\\d`. Pushed regex-pattern
escaping into the dialect (`_regex_literal`) and changed the protocol
so callers pass raw Python regex strings.

Type-aware EQUAL/NOT_EQUAL/IN on traits: mirrors flag_engine's per-type
coercion (segment value cast to the trait's runtime type, no match on
cast failure). Emitted as `(TYPEOF=... AND ...) OR (...)` with each
branch gated — one CASE-WHEN form triggered Snowflake to evaluate all
arms eagerly and choke on `(VARCHAR variant)::BOOLEAN`. The bool branch
compares lowercase string forms to dodge the same trap.

PERCENTAGE_SPLIT bails to FALSE at translate time when the eval
context's identity lacks the referenced field — engine returns False
when the looked-up value is None, but our row-bound SQL would otherwise
hash a present-but-irrelevant column.

Acknowledged divergence: two `:semver` cases with prerelease tags
(`1.0.0-rc.2 < 1.0.0-rc.3`) — the SQL semver-sort-key is major.minor.
patch only. Marked xfail by filename in `XFAIL_CASE_FILENAMES`.

Result: 625 passed / 7 skipped (untranslatable operators) / 2 xfailed.
Audit of the 7 previously-skipped cases (`pytest.skip` because
`translate_segment` returned None) found all 7 are translator gaps,
not genuinely unsupported operators:

- 2× MODULO with bad operand (`""`, `"invalid|value"`): now emits
  `FALSE`. Engine catches the cast error and returns False.
- 4× JSONPath that resolves to None or a non-primitive object
  (`$.leads.to.nowhere`, `$.identity` against operators that don't
  cast): now pre-evaluated against the eval context via
  `is_context_in_segment` and collapsed to `TRUE`/`FALSE` (env values
  and other static jsonpath subjects are constant for every row).
- 1× `$.`-prefixed property that doesn't parse as JSONPath: engine
  treats this as a trait-key lookup, the translator now does too
  (falls through the JSONPath branch into the trait branch).

`pytest.skip` for the untranslatable case dropped — the dataset has
no genuinely-unsupported conditions (no RE2-backref/lookahead regexes
either). Replaced with an assertion + a comment pointing future
unsupported additions at `XFAIL_CASE_FILENAMES` for explicit listing.

Result: 632 passed / 2 xfailed (semver prerelease) / 0 skipped.
EQUAL on a trait used to expand to four `TYPEOF`-gated `OR` branches with
casts in every arm — fine for correctness, but ~2× the runtime of the
old string-only form on a 870M-row scan.

Refactored: emit the cheap string compare unconditionally, and only OR
in the typed branches that catch cases the string compare would miss:

  - BOOLEAN: variant booleans stringify lowercase (`true`/`false`) so
    string compare against `"True"`/`"False"` misses them.
  - DECIMAL/DOUBLE: variant floats stringify to scientific/expanded
    forms that don't equal the segment value's literal text.
  - INTEGER: skipped entirely. Snowflake stringifies INTEGER variants
    canonically (`(1)::STRING = '1'`), so the unconditional string
    compare already handles them.

Branches are emitted only when the segment value can coerce to that
type at translate time — `value="enterprise"` collapses to a single
`(traits:"k")::STRING = 'enterprise'` (the OLD shape).

NOT_EQUAL keeps `TYPEOF` dispatch — its semantics (returns True only
when cast succeeded *and* values differ) can't be expressed as an OR
of positives without over-matching.

IN collapses to a single `TYPEOF IN ('VARCHAR', 'INTEGER')` gate plus
one string IN compare; bool/float/array traits never match per engine
semantics so they fall outside the gate.

Bool branch compares via the variant's `::STRING` form rather than
`::BOOLEAN` — Snowflake's optimiser eagerly evaluates the BOOLEAN cast
even when the `IS_BOOLEAN` guard would have short-circuited
(`100037: Boolean value 'red' is not recognized`). Float branch uses
`TRY_TO_DOUBLE` for the same reason.

Bench (X-Small, 870M rows, multi-condition segment, cache off):
  - typed CASE:  ~6.3s
  - option 2:    ~3.65s
  - old string:  ~2.8s

Parity: 600 passed + 2 xfailed (semver prerelease).

Narrow theoretical divergence on segment-value formats not in the test
data (e.g. `"001"` against int trait `1` — engine matches via `int()`,
SQL doesn't). Acknowledged trade-off for the perf gain.
Adds pytest-cov, configures branch coverage on `src` and `tests`, and
gates CI via 5monkeys/cobertura-action with `minimum_coverage: 100` —
matching the github-webhook-handler setup.

Audit of uncovered lines turned up dead code: `TranslateContext.jsonpath_expr`
had a fallback path for resolving non-identity JSONPath against the eval
context, but `translate_condition` now routes those through
`_engine_static_verdict` before jsonpath_expr is reached. Trimmed.

Defensive paths marked `# pragma: no cover[/branch]` with a one-line why:

  - `_engine_static_verdict` engine-exception fallback
  - conftest skip paths (env-config-dependent, missing submodule)
  - `_engine_in_values`'s `[`-prefixed-but-not-list branch (unreachable —
    valid JSON starting with `[` is always an array)
  - parity-result loop branches that fire only when the dataset has an
    untranslatable case (none today)

New unit tests cover the testable gaps: unknown operator, MODULO/IN with
bad operands, PERCENTAGE_SPLIT with absent identity fields, identity-bound
JSONPath comparators (CONTAINS/NOT_CONTAINS/NOT_EQUAL/IS_SET/IS_NOT_SET),
trait CONTAINS/NOT_CONTAINS, semver with an unsupported operator, rule
composition (NONE, empty rule, untranslatable nested rule, unknown type),
and the helper-level `_engine_in_values` / `jsonpath_expr` edges.
…returns False

Previously the translator returned None for any condition shape it
couldn't compile to "interesting" SQL — including cases the engine
itself evaluates to a deterministic False (bad operands, missing
properties, non-numeric thresholds). That muddied the contract for
callers: "None means fall back to the engine" was right for some Nones
and wrong for others.

Now:

- Engine-deterministic-False inputs compile to `"FALSE"`. Examples:
  numeric comparator with a non-numeric operand (`"100; DROP TABLE …"`,
  empty MODULO operand), `EQUAL` with no `value` field, IN with a
  non-iterable value, semver-suffixed segment value paired with an
  operator the engine treats as plain string compare.
- Internal-misuse contracts use `assert`, not `return None`. Examples:
  `jsonpath_expr` called with a non-identity property (the translator
  routes those through static eval before reaching here), unknown
  segment rule type (Flagsmith's segment validation rejects upstream),
  empty rule (same), `_comparison` exiting without matching any op.
- `None` is now reserved for genuinely untranslatable inputs: RE2-unsafe
  regex (backref / lookaround), an operator outside
  `TRANSLATABLE_OPERATORS`, and PERCENTAGE_SPLIT keyed on a non-identity
  JSONPath (TODO: bake the resolved value as a hash literal). Callers
  fall back to the engine for these.

`return None` site count: 19 → 8. Of the remaining 8, 2 are internal
signals between helpers (caller turns them into `"FALSE"`), 3 are pure
propagation up the rule/segment recursion, 3 are the genuine
untranslatable cases above.

Test cleanup at the same time: removed 8 unit tests that existed only
to colour branch-coverage lines green (asserting on internal helpers,
asserts that the segment_key auto-injection works, parsing
`[{"a":"b"}]` through `_engine_in_values`, etc.) Updated the tests that
previously asserted None to assert `"((FALSE))"`.

Coverage stays at 100% line + branch. Parity stays at 653 + 2 xfailed.
…c eval

$.identity.* JSONPaths look up against the eval-context identity in the
engine. In the parity-test setup that's the same identity as the row, so
static evaluation gave the right answer; in production, where the SQL
scans many identities per query, statically pre-computing
$.identity.traits.foo against the eval-ctx identity returns the wrong
trait value for every row but the eval-ctx one. Latent bug.

Added _classify_jsonpath to inspect the parsed path structure and route
each shape correctly:

  - $.identity.identifier (any syntax) → i.identifier column
  - $.identity.key (any syntax)        → i.identity_key column
  - $.identity.traits.<X> (any syntax) → i.traits:"<X>" per-row
  - $.identity (the whole object)      → static eval (the engine's
    fail-cast on a dict reliably returns False per row, same as static)
  - $.identity.<other>                  → None (no row mapping; caller
    falls back to engine — three engine-test-data cases hit this)
  - non-identity ($.environment.*, etc.) → static eval (correct)

Bracket-notation aliases now work too: $.identity['identifier'],
$.identity['traits']['foo'], etc. all resolve to the same row references
as their dot-notation counterparts.

PERCENTAGE_SPLIT picks up the same routing — $.identity.traits.<X>
hashes the per-row trait, identity columns hash the per-row column.

Parity: 656 passed / 2 xfailed (semver prerelease). Coverage: 100%
line + branch. Three new unit tests cover the new untranslatable
paths ($.identity.foo, $.identity.traits.*, PERCENTAGE_SPLIT on
unmapped identity field).
Translator was emitting Snowflake-specific SQL — TYPEOF(), IS_BOOLEAN(),
IS_DECIMAL(), TRY_TO_DOUBLE(), and ::FLOAT casts — directly from
_trait_typed_eq / _trait_typed_in, which broke the dialect-agnostic
abstraction. Moved both functions onto SnowflakeDialect as `trait_eq` and
`trait_in`, added the corresponding methods to the Dialect protocol,
and have the translator delegate.

Translator now only knows that traits live behind dialect.trait_path()
and that EQUAL/NOT_EQUAL/IN need a type-aware predicate that the dialect
emits — no leaking knowledge of how Snowflake stores or discriminates
trait types. The remaining VARIANT mentions in translator.py are clearly
labelled "Snowflake's implementation" rather than implied invariants.

Engine-faithful IN value parsing (`_engine_in_values`) stays in the
translator since it mirrors flag_engine's parsing rules and is purely
Python-side, dialect-independent. The translator hands the parsed list
to dialect.trait_in so the dialect doesn't redo the parsing.

Parity: 656 passed / 2 xfailed. Coverage: 100% line + branch.
- Result type is now `JsonpathClassification(kind, trait_key)` — kind is
  a Literal of the five classification names; trait_key is set only when
  kind is "trait".
- The "invalid jsonpath syntax" case is folded into the trait kind: a
  prop that doesn't parse as JSONPath becomes `("trait", prop)`, which
  is what the engine does when its jsonpath compile fails. Removes the
  None return entirely.
- Translator always classifies — non-`$`-prefixed and empty-string props
  flow through the same code path; the `if classification is not None`
  guards drop out.

Parity: 656 passed / 2 xfailed. Coverage: 100% line + branch.
Compile a JSONPath only when the property actually starts with `$` —
non-prefixed props are bare trait keys, classified as `("trait", prop)`
without the parse roundtrip.
Splitting the previous "static" classification: $.identity (the whole
identity dict) gets its own kind that encodes the row-based answer
directly, since every IDENTITIES row IS an identity by construction.

Per engine semantics (cf. test_context_value_jsonpath_pointing_to_object
note: "paths leading to non-primitive values are treated as not set"):

  - IS_SET($.identity)     → FALSE
  - IS_NOT_SET($.identity) → TRUE
  - any scalar comparator  → FALSE (engine fail-casts on the dict)

The previous "static" classification went through _engine_static_verdict
against the eval context, which silently assumed the eval ctx carries an
identity. A production caller scanning rows without binding a specific
identity would have got the wrong answer. Encoding the row-truth removes
that assumption.

Also xfail one engine-test-data case that needs trait-first dispatch
(engine resolves `$.identity` as a literal trait key first; the dataset
includes a trait literally named `$.identity`). Replicating per-row
trait fallback in SQL would bloat every JSONPath predicate; defer.
Engine's `_get_context_value_getter` tries the property as a literal
trait key first, falling back to JSONPath only if no such trait exists.
A row that happens to carry a trait literally named `$.identity` (or
`$.environment.name`, etc.) wins over the JSONPath resolution in the
engine; we now mirror that in SQL.

Every non-trait JSONPath classification is wrapped in
`IFF(traits:"<prop>" IS NOT NULL, <trait_predicate>, <fallback>)`.
The fallback is the per-classification SQL we already had (column ref
for identity columns, static verdict for env paths, etc.); the trait
branch goes through the same `_translate_trait_op` path that bare trait
keys use.

PERCENTAGE_SPLIT skips the wrapper — its hash subject can't be split
across an IFF, and trait shadowing on PS is a vanishingly rare shape.
Untranslatable identity-bound paths (`$.identity.foo`,
`$.identity.traits.*`) still return None — without a fallback SQL the
shadow check alone can't be row-correct, so the caller's engine
fallback handles the whole condition (including shadowing).

Cost: one extra `traits:"<prop>" IS NOT NULL` probe per JSONPath
condition per row. Trait-heavy workloads (typical) pay nothing — bare
trait keys bypass the wrapper. JSONPath-heavy workloads see roughly
+15-30% per condition at X-Small on 870M rows.

Removes the trait-shadow xfail (case 139). Now: 658 passed, 2 xfailed
(semver-prerelease only). 100% line + branch coverage.
Bench result on the 870M-row table at X-Small:
  - trait-only segment (4 conditions): unchanged ~3.7s
  - same segment with one wrapped JSONPath: ~7.5s — Snowflake evaluates
    both arms of the IFF rather than short-circuiting, so each wrapped
    condition roughly doubles in cost.

Engine-faithful trait-first dispatch matters only for an esoteric shape
(trait literally named `$.identity` or similar) — exactly one
engine-test-data case. Pay the perf cost on every JSONPath condition in
every production query for that, no thanks. Reverted.

Case 139 returns to XFAIL_CASE_FILENAMES with a comment explaining the
trade.
khvn26 added 6 commits May 8, 2026 13:30
Coverage.py records a fall-through branch from the last `case` of a
`match` to function exit even when the match is exhaustive over a
`Literal` type — it can't see mypy's exhaustiveness check. That
showed up as `114->exit` and `544->exit` partials in CI, dropping
total coverage to 99% and failing the 100% Cobertura gate.

Add `case .+:` to `[tool.coverage.report] partial_branches` so any
`case` line is treated as an intentionally-partial branch. Both
existing match statements are exhaustive and already verified by
mypy; this just stops coverage.py from double-counting against us.

beep boop
Renovate's git-submodules manager only opens PRs when the submodule
declares a tracked branch in `.gitmodules`. We were pinned a single
infra commit ahead of v3.7.0; rolling back to the tag and adding
`branch = v3.7.0` matches the Python flag-engine's pattern and lets
Renovate raise a PR whenever a new release is tagged on engine-test-
data. The infra commit we drop is a pre-commit setup in the upstream
repo with no test-data changes — engine parity stays green.

`renovate.json` mirrors the Python engine's structure but tightens
the policy: all proactive update types are disabled by default, so
non-submodule deps only get PRs through Renovate's vulnerability-
alert path; the git-submodules manager is the only re-enabled lane,
which makes engine-test-data releases the one channel that bumps
without manual intervention.

beep boop
Mirrors the Python flag-engine: every path owned by the
flagsmith-back-end team.

beep boop
The wide-form comparison column gets dropped along the way — we don't
ship a typed-column shape, so contrasting against it was misleading
prose.

beep boop
Implements the `Dialect` protocol against ClickHouse — `MergeTree`
schema with `Nullable(String)` traits read via `JSONExtract*`, RE2
anchored regex via `match`, hex-chunk parse via
`reinterpretAsUInt32(reverse(unhex(...)))`, type dispatch via
`JSONType`.

Adds a `ClickHouseHarness` for the engine-parity suite (clickhouse-
connect HTTP client, per-run Memory table, batched UNION ALL of
`count() > 0`). All 599 engine-test-data segments pass; the three
xfails match Snowflake's set (translator-level divergences shared by
every dialect).

beep boop
@khvn26 khvn26 changed the title ClickHouse dialect feat: ClickHouse dialect May 12, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 12, 2026

File Coverage Missing
All files 100%

Minimum allowed coverage is 100%

Generated by 🐒 cobertura-action against f89c656

khvn26 added 3 commits May 13, 2026 19:30
Replaces the `Nullable(String)` traits column + `JSONExtract*` access
with `JSON` (CH 24+) + typed-variant subcolumn access. Trait reads
become direct columnar scans rather than per-row JSON parses.

Wide-String was 14-30x slower than Snowflake at 870M; native JSON
closes that to 2.5-4x.

Also drops the hex round-trip in `parse_hex_chunk`: `md5_hex` now
returns the raw 16-byte digest and `parse_hex_chunk` reads bytes
directly via `reinterpretAsUInt32(reverse(substring(...)))`.

Engine parity unchanged (599 passed, 3 xfailed — same set as Snowflake).

beep boop
The first cut of the JSON-native dialect emitted an OR across five
typed-variant subcolumns per trait condition (`.:String`, `.:Int64`,
`.:UInt64`, `.:Float64`, `.:Bool`). `toString(<sub>)` returns the JSON
value's canonical string form in a single subcolumn read — matches
Snowflake's `v::STRING` fast path — so the typical case collapses to
one comparison.

`trait_eq` (positive) becomes:
  - fast path: `toString(<sub>) = <str_lit>`
  - bool branch: `<sub>.:Bool = <target>` (covers Python-repr "True" /
    "False" coercions)
  - float branch: `toFloat64OrNull(toString(<sub>)) = <float_lit>`
    (covers floats whose toString integer-trims, e.g. 1.0 → '1')

`trait_in` collapses three typed branches into one `toString(<sub>) IN
(...)` gated on `.:Bool IS NULL AND .:Float64 IS NULL` to exclude bool
and float traits per engine semantics.

`NOT_EQUAL` keeps explicit per-type dispatch (engine wants cast-failure
→ False, which `.:Type IS NOT NULL AND ...` encodes directly).

Measured at 100M on the Cloud trial: simple 2.53s → 2.15s (-15%),
multi 4.61s → 4.06s (-12%), % split unchanged. Modest gain — CH's JSON
engine already skips empty subcolumns cheaply, so the OR-of-typed
shape wasn't 5× worse, just 15%. Worth keeping because the resulting
SQL is also shorter.

Engine parity unchanged (599 passed, 3 xfailed).

beep boop
- Drop a stale `bench` reference (`bench/` was an exploratory artefact,
  not part of the merged dialect).
- In the Snowflake-vs-CH comparison table, replace the under-spec'd
  "Type discrimination" row to lead with the `toString(<sub>)` fast
  path (the actual main path through `trait_eq` / `trait_in`) with
  typed-variant subcolumns as the fallback shape.
- Note in the "Hex chunk parse" row that CH reads raw bytes — no hex
  round-trip — to match what `parse_hex_chunk` actually emits.

beep boop
@khvn26 khvn26 changed the title feat: ClickHouse dialect feat(ClickHouse): Add ClickHouse dialect with native JSON traits May 13, 2026
@khvn26 khvn26 marked this pull request as ready for review May 13, 2026 18:46
@khvn26 khvn26 requested a review from a team as a code owner May 13, 2026 18:46
@khvn26 khvn26 requested review from Zaimwa9 and gagantrivedi and removed request for a team May 13, 2026 18:46
@khvn26
Copy link
Copy Markdown
Member Author

khvn26 commented May 15, 2026

Closing — the ClickHouse work has been folded into the rebuilt #1. Snowflake is now its own follow-up at #9.

@khvn26 khvn26 closed this May 15, 2026
@khvn26 khvn26 deleted the feat/clickhouse-dialect branch May 15, 2026 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants