feat(ai): add ROCm and MIGraphX execution providers for AMD GPUs by Benehiko · Pull Request #335 · deven96/ahnlich

Benehiko · 2026-05-18T07:52:43Z

Note: This PR was generated with Claude. The patch has been compile-checked, unit-tested, regenerated for Go/Node/Python SDKs, and smoke-tested end-to-end via ahnlich-cli — but ROCm/MIGraphX on AMD hardware has not been validated because the current upstream Dockerfile ships a CUDA-flavour ONNX Runtime bundle. Treat the runtime side of this PR as enabling the wire-up; an AMD-hardware validation is a follow-up that depends on packaging a ROCm-enabled or MIGraphX-enabled ORT image.

Summary

Adds two opt-in execution providers so ahnlich-ai can target AMD GPUs:

ROCM — ONNX Runtime's ROCm execution provider (works on onnxruntime < 1.23)
MIGRAPHX: AMD's recommended replacement after onnxruntime removed the ROCm provider in 1.23 (see the ROCm Execution Provider docs under References for the removal note)

Both wire through the same path the existing CUDA, TENSOR_RT, DIRECT_ML, and CORE_ML providers use — no new dependencies, no behaviour change for existing providers.

The ort crate (2.0.0-rc.5, already pinned in ahnlich/ai/Cargo.toml) exposes both as ROCmExecutionProvider and MIGraphXExecutionProvider.

Why two providers

ahnlich currently pins ort to 2.0.0-rc.5 (against ONNX Runtime 1.19), where the ROCm execution provider still ships. AMD removed the ROCm provider from onnxruntime in 1.23 and recommends MIGraphX for new builds. To stay useful both today and after an ORT bump, this PR adds both variants. Maintainers can keep both or drop ROCM once ahnlich bumps the ORT pin past 1.23.
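The version boundary described above can be written down as a small decision function. This is only an illustrative sketch: `AmdProvider` and `amd_provider_for_ort` are hypothetical names that do not appear in the patch, and the PR itself does not select a provider automatically. It simply exposes both variants and leaves the choice to the caller.

```rust
/// AMD-capable execution providers discussed in this PR.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AmdProvider {
    Rocm,
    MiGraphX,
}

/// Hypothetical helper: given an ONNX Runtime release as (major, minor),
/// return the AMD provider this PR expects to be usable with it.
/// Assumption: releases before 1.23 still ship the ROCm EP, and 1.23 or
/// later only offer MIGraphX, as stated in the PR description.
pub fn amd_provider_for_ort(major: u32, minor: u32) -> AmdProvider {
    // Tuple comparison is lexicographic, so (1, 19) < (1, 23) < (2, 0).
    if (major, minor) < (1, 23) {
        AmdProvider::Rocm
    } else {
        AmdProvider::MiGraphX
    }
}
```

With the current pin (ORT 1.19) this yields `Rocm`; after a bump to 1.23 or later it yields `MiGraphX`, which is the point at which maintainers could drop the ROCM variant.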

Changes

Protocol + Rust core

protos/ai/execution_provider.proto — add ROCM = 4 and MIGRAPHX = 5 with doc comments that flow through to every language binding via the generators
protos/README.md — mention ROCM and MIGRAPHX alongside CUDA / TENSOR_RT
ahnlich/types/src/ai/execution_provider.rs — regenerated Rust enum (mirrors what build.rs produces from the updated .proto)
ahnlich/ai/src/engine/ai/providers/ort/mod.rs
- import ROCmExecutionProvider and MIGraphXExecutionProvider from ort
- add InnerAIExecutionProvider::ROCm and InnerAIExecutionProvider::MIGraphX
- extend the From<AIExecutionProvider> impl for both variants
- extend register_provider to call ROCmExecutionProvider::default().register(...) and MIGraphXExecutionProvider::default().register(...)
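The shape of the enum and `From` changes can be sketched without pulling in `ort` or the generated types. The code below uses stand-in definitions, not the real ahnlich code: the numbering of the four existing variants (0 to 3) and the Rust variant spellings on the wire enum are assumptions, and only `ROCM = 4` / `MIGRAPHX = 5` and the `InnerAIExecutionProvider::ROCm` / `::MIGraphX` / `::CPU` names come from this PR.

```rust
/// Stand-in for the generated wire enum. Values 4 and 5 come from the PR;
/// the values and order of the first four variants are assumed.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum AIExecutionProvider {
    TensorRt = 0,
    Cuda = 1,
    DirectMl = 2,
    CoreMl = 3,
    Rocm = 4,
    Migraphx = 5,
}

/// Stand-in for the internal enum in providers/ort/mod.rs.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InnerAIExecutionProvider {
    TensorRT,
    CUDA,
    DirectML,
    CoreML,
    ROCm,
    MIGraphX,
    CPU,
}

impl From<AIExecutionProvider> for InnerAIExecutionProvider {
    fn from(value: AIExecutionProvider) -> Self {
        match value {
            AIExecutionProvider::TensorRt => Self::TensorRT,
            AIExecutionProvider::Cuda => Self::CUDA,
            AIExecutionProvider::DirectMl => Self::DirectML,
            AIExecutionProvider::CoreMl => Self::CoreML,
            // The two arms added by this PR.
            AIExecutionProvider::Rocm => Self::ROCm,
            AIExecutionProvider::Migraphx => Self::MIGraphX,
        }
    }
}
```

Because the `match` is exhaustive, adding a variant to the wire enum without extending the `From` impl is a compile error, which is one reason the two changes travel together.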

DSL grammar + parser + tests

ahnlich/dsl/src/syntax/syntax.pest — extend the execution_provider rule so the pest tokeniser actually emits rocm / migraphx. Without this the DSL rejects the new keywords at the parser layer before parse_to_execution_provider ever sees them.
ahnlich/dsl/src/ai.rs — accept "rocm" and "migraphx" in parse_to_execution_provider
ahnlich/dsl/src/tests/ai.rs — add test_get_sim_n_parse_rocm_execution_provider and test_get_sim_n_parse_migraphx_execution_provider, mirroring the existing TensorRT / CUDA round-trips
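A minimal sketch of the string-to-enum step follows, assuming the function matches on a normalised keyword. The enum, the `UnknownProvider` error type, the case and whitespace handling, and the spellings of the pre-existing keywords are all stand-ins or assumptions. Only the `"rocm"` and `"migraphx"` keywords and the function name come from the PR, and the real function's signature may differ.

```rust
/// Stand-in for the execution provider enum used by the DSL.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionProvider {
    TensorRt,
    Cuda,
    DirectMl,
    CoreMl,
    Rocm,
    Migraphx,
}

/// Hypothetical error type; the real DSL has its own error enum.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownProvider(pub String);

/// Sketch of parse_to_execution_provider with the two new keywords.
pub fn parse_to_execution_provider(input: &str) -> Result<ExecutionProvider, UnknownProvider> {
    match input.trim().to_lowercase().as_str() {
        // Spellings of the four existing keywords are assumed.
        "tensorrt" => Ok(ExecutionProvider::TensorRt),
        "cuda" => Ok(ExecutionProvider::Cuda),
        "directml" => Ok(ExecutionProvider::DirectMl),
        "coreml" => Ok(ExecutionProvider::CoreMl),
        // Keywords added by this PR.
        "rocm" => Ok(ExecutionProvider::Rocm),
        "migraphx" => Ok(ExecutionProvider::Migraphx),
        other => Err(UnknownProvider(other.to_string())),
    }
}
```

In the real DSL this function only runs once the pest grammar has accepted the token, which is why the `syntax.pest` change is required as well: an unknown keyword is rejected by the grammar before a function like this one is reached.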

SDK regeneration

sdk/ahnlich-client-go/grpc/ai/execution_provider/execution_provider.pb.go — regenerated via buf generate
sdk/ahnlich-client-node/grpc/ai/execution_provider_pb.ts — regenerated via buf generate
sdk/ahnlich-client-py/ahnlich_client_py/grpc/ai/execution_provider/__init__.py — ROCM + MIGRAPHX variants added in the shape betterproto emits. A full make grpc-update-python run in the maintainer environment is recommended so any incidental codegen drift (formatter / generator version) is captured properly.

Docs

README.md — document ROCm and MIGraphX prerequisites under "Execution Providers", including the note that upstream ORT removed the ROCm provider in 1.23

Verification done locally

cargo check -p ahnlich_types -p dsl — passes
cargo test -p dsl — 30 / 30 passing (the two new tests round-trip "rocm" and "migraphx" end-to-end through parse_ai_query)
cargo check -p ai — passes against the actual libonnxruntime.so 1.19.0 shipped in the official ahnlich-ai image (used as ORT_LIB_LOCATION), confirming the patched ort/mod.rs compiles cleanly
End-to-end smoke test: built the patched binary into a local Docker image, ran it under a container runtime, and verified via ahnlich-cli:
- executionprovider rocm and executionprovider migraphx parse successfully (server responds with the expected store-lookup error against an empty store)
- executionprovider bogus is rejected at the parser layer (the negative test that surfaced the missing syntax.pest rule in the first place)
Proto regen: confirmed build.rs regenerates ahnlich/types/src/ai/execution_provider.rs from the updated .proto to the same Rust shape this PR ships; confirmed buf generate reproduces the committed Go and Node stubs

What is intentionally not included

Dockerfile / image work. The existing Dockerfile builds against onnxruntime-linux-x64-gpu-1.19.0.tgz, which is the CUDA flavour. To actually exercise ROCm or MIGraphX at runtime, ahnlich-ai needs an image built against a ROCm-enabled or MIGraphX-enabled ORT (either built from source with --use_rocm / --use_migraphx, or pulled from AMD's rocm-onnxruntime package). That is a packaging concern that deserves its own PR + maintainer input — probably Dockerfile.rocm and/or Dockerfile.migraphx plus a CI matrix entry.
AMD-hardware validation. Without a ROCm/MIGraphX-enabled ORT image the new register_provider arms can only be exercised through the DSL → gRPC layer (which is done above), not through actual model inference.
A full make grpc-update-python run. The Python __init__.py change is a minimal manual mirror of what betterproto would emit. A maintainer-side poetry run generate_from_protos is recommended to catch any formatter drift.

Manual checklist for whoever picks this up

Run make grpc-update-python from a clean environment to confirm the Python stub matches the patch
Decide whether to keep both ROCM and MIGRAPHX or drop ROCM after the next ORT bump past 1.23
Add Dockerfile.rocm (or Dockerfile.migraphx) bundling a ROCm-enabled / MIGraphX-enabled ORT, plus a CI matrix entry
On an AMD host with the appropriate runtime installed: confirm a model loads through each new provider
Confirm CPU fallback still works when the AMD runtime is absent (InnerAIExecutionProvider::CPU path is unchanged, so this should be free, but worth a sanity run)
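For the last checklist item, the behaviour to confirm can be modelled roughly as follows. This is a hypothetical sketch and not the ahnlich implementation: `Provider`, `register_or_cpu`, and the closure standing in for provider registration are illustrative names, and whether the server falls back silently or surfaces the error is exactly what the sanity run should establish.

```rust
/// Stand-in for the providers relevant to the fallback check.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    ROCm,
    MIGraphX,
    CPU,
}

/// Hypothetical model of the fallback: try to register the requested
/// provider and report which provider ends up in use.
/// `try_register` stands in for the real registration call and returns
/// Err when the matching runtime is missing on the host.
pub fn register_or_cpu<F>(requested: Provider, try_register: F) -> Provider
where
    F: FnOnce(Provider) -> Result<(), String>,
{
    match try_register(requested) {
        Ok(()) => requested,
        // Registration failed (e.g. no AMD runtime): use the CPU path.
        Err(_) => Provider::CPU,
    }
}
```

A check on a host without the AMD runtime would then expect the CPU outcome for both new variants, while a host with the runtime installed should keep the requested provider.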

References

ONNX Runtime ROCm Execution Provider docs (note removal in 1.23): https://onnxruntime.ai/docs/execution-providers/ROCm-ExecutionProvider.html
ONNX Runtime MIGraphX Execution Provider docs: https://onnxruntime.ai/docs/execution-providers/MIGraphX-ExecutionProvider.html
ort crate ROCmExecutionProvider: https://docs.rs/ort/2.0.0-rc.5/ort/struct.ROCmExecutionProvider.html
ort crate MIGraphXExecutionProvider: https://docs.rs/ort/2.0.0-rc.5/ort/struct.MIGraphXExecutionProvider.html
AMD ROCm compatibility matrix: https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html

Iamdavidonuh · 2026-05-18T10:36:03Z

This looks interesting.

CC: @Ayobami-00

Wires ort's ROCmExecutionProvider and MIGraphXExecutionProvider through the same path as the existing CUDA / TensorRT / DirectML / CoreML providers so ahnlich-ai can target AMD GPUs on Linux when the host has a matching ROCm or MIGraphX runtime and an ORT build that supports either provider.

Two variants are included because upstream onnxruntime removed the ROCm execution provider in release 1.23 and recommends MIGraphX as its replacement. ahnlich currently pins ort to 2.0.0-rc.5 (against ORT 1.19, which still ships ROCm), so both variants stay useful until the ORT pin moves past 1.23.

- protos/ai/execution_provider.proto: add ROCM = 4 and MIGRAPHX = 5
- ahnlich/types/src/ai/execution_provider.rs: regenerated Rust enum
- ahnlich/ai/src/engine/ai/providers/ort/mod.rs: register ROCmExecutionProvider and MIGraphXExecutionProvider via InnerAIExecutionProvider::ROCm and ::MIGraphX
- ahnlich/dsl/src/ai.rs: accept "rocm" and "migraphx" in parse_to_execution_provider
- ahnlich/dsl/src/syntax/syntax.pest: extend the execution_provider rule to tokenise "rocm" and "migraphx" (without this the DSL rejects the new keywords at the parser layer before parse_to_execution_provider runs)
- ahnlich/dsl/src/tests/ai.rs: add round-trip tests for "rocm" and "migraphx" through parse_ai_query (mirrors the existing TensorRT and CUDA tests)
- sdk/ahnlich-client-go/grpc/ai/execution_provider/execution_provider.pb.go: regenerate via `buf generate`
- sdk/ahnlich-client-node/grpc/ai/execution_provider_pb.ts: regenerate via `buf generate`
- sdk/ahnlich-client-py/ahnlich_client_py/grpc/ai/execution_provider/__init__.py: add ROCM and MIGRAPHX variants (matches betterproto's emit; full `make grpc-update-python` regen left to the maintainer's environment)
- README.md / protos/README.md: document ROCm and MIGraphX prerequisites and the ORT 1.23 ROCm removal

Generated as a reference patch with Claude; not validated against AMD hardware.

Verified locally with:

- `cargo test -p dsl` (30 / 30 passing, +2 new tests)
- `cargo check -p ai` against the actual libonnxruntime.so 1.19.0 bundled in the official ahnlich-ai image
- end-to-end DSL smoke test via ahnlich-cli: the patched binary boots, accepts `executionprovider rocm` and `executionprovider migraphx` through the gRPC layer, and rejects unknown tokens at the parser
- proto regen confirmed reproducible (build.rs round-trips execution_provider.rs cleanly, `buf generate` produces the same Go/Node stubs committed here)

Benehiko force-pushed the feat/rocm-execution-provider branch 2 times, most recently from 750bffe to 27ce036 · May 19, 2026 18:28

Benehiko changed the title from ~~feat(ai): add ROCm execution provider for AMD GPUs~~ to feat(ai): add ROCm and MIGraphX execution providers for AMD GPUs · May 19, 2026

Benehiko force-pushed the feat/rocm-execution-provider branch 2 times, most recently from 517c297 to 5cb85ec · May 19, 2026 19:43

Benehiko force-pushed the feat/rocm-execution-provider branch from 5cb85ec to e63d12e · May 19, 2026 19:54

Benehiko wants to merge 1 commit into deven96:main from Benehiko:feat/rocm-execution-provider
