MINIFICPP-2719 - Add multimodal capability to llama.cpp processor by adamdebreceni · Pull Request #2107 · apache/nifi-minifi-cpp

adamdebreceni · 2026-02-17T14:08:52Z

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit?

For code changes:

If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
If applicable, have you updated the LICENSE file?
If applicable, have you updated the NOTICE file?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

Copilot

Pull request overview

This PR updates the MiNiFi C++ llama.cpp extension to support multimodal (mtmd) inference, including wiring FlowFile content as “files” into the llama.cpp mtmd pipeline and optionally writing model output to a FlowFile attribute instead of overwriting content.

Changes:

Bump vendored llama.cpp to b8944 and apply a new patch to build mtmd support and fix missing includes.
Extend RunLlamaCppInference with multimodal model configuration + optional “output to attribute” behavior.
Update the LlamaContext interface and DefaultLlamaContext implementation to accept file buffers and perform mtmd tokenization/eval.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.

Show a summary per file

| File | Description |
| --- | --- |
| thirdparty/llamacpp/mtmd-fix.patch | Adds mtmd subdirectory to llama.cpp build, fixes an include, and removes some mtmd tool executables. |
| thirdparty/llamacpp/lu8_macro_fix.patch | Removes an older llama.cpp patch no longer applied after the version bump. |
| thirdparty/llamacpp/cpp-23-fixes.patch | Removes an older llama.cpp patch no longer applied after the version bump. |
| cmake/LlamaCpp.cmake | Bumps llama.cpp tag, enables `LLAMA_BUILD_COMMON`, applies mtmd patch, and extends include dirs for common/tools/vendor headers. |
| extensions/llamacpp/CMakeLists.txt | Links the extension against `mtmd` and `llama-common` in addition to `llama`. |
| extensions/llamacpp/processors/LlamaContext.h | Extends `generate()` to accept a list of binary “files” (e.g., images/audio). |
| extensions/llamacpp/processors/DefaultLlamaContext.h | Adds mtmd/chat-template state and updates constructor/generate signature for multimodal support. |
| extensions/llamacpp/processors/DefaultLlamaContext.cpp | Implements mtmd initialization, multimodal tokenization/eval, and updated decode loop. |
| extensions/llamacpp/processors/RunLlamaCppInference.h | Adds `MultiModal Model Path` and `Output Attribute Name` properties and stores them in member state. |
| extensions/llamacpp/processors/RunLlamaCppInference.cpp | Passes FlowFile bytes as multimodal files, inserts mtmd marker, and optionally writes output to an attribute. |
| extensions/llamacpp/tests/RunLlamaCppInferenceTests.cpp | Updates the mock context to match the new `generate()` signature. |


lordgamez · 2026-05-18T11:53:36Z

+  if (!multimodal_model_path) {
+    logger->log_info("No multimodal model path provided");
+    return;
+  }
+
+  mtmd_context_params mparams = mtmd_context_params_default();
+  mparams.use_gpu = false;
+  mparams.flash_attn_type  = LLAMA_FLASH_ATTN_TYPE_DISABLED;
+
+  multimodal_ctx_ = mtmd_init_from_file(multimodal_model_path->string().c_str(), llama_model_, mparams);
+  if (!multimodal_ctx_) {
+    throw Exception(ExceptionType::PROCESS_SCHEDULE_EXCEPTION, fmt::format("Failed to load multimodal model from '{}'", multimodal_model_path->string()));
+  }
+
+  logger->log_info("Successfully loaded multimodal model from '{}'", multimodal_model_path->string());


I would extract this to a separate function and have something like

if (multimodal_model_path) { initializeMultimodalContext(); }
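One way the suggested extraction might look, as a minimal sketch only. To keep the snippet compilable on its own, the llama.cpp pieces are replaced with hypothetical stand-ins (`FakeMtmdContext`, `fakeMtmdInitFromFile`), and the class name `SketchLlamaContext`, the `isMultimodal()` accessor, and the use of `std::runtime_error` are assumptions rather than the PR's actual code.

```cpp
#include <filesystem>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for mtmd_context; not the real llama.cpp type.
struct FakeMtmdContext {
  std::string source;
};

// Hypothetical stand-in for mtmd_init_from_file: returns nullptr on failure.
inline std::unique_ptr<FakeMtmdContext> fakeMtmdInitFromFile(const std::string& path) {
  if (path.empty()) {
    return nullptr;
  }
  return std::make_unique<FakeMtmdContext>(FakeMtmdContext{path});
}

class SketchLlamaContext {
 public:
  explicit SketchLlamaContext(const std::optional<std::filesystem::path>& multimodal_model_path) {
    // ... text model and llama context setup would come first ...
    if (multimodal_model_path) {
      initializeMultimodalContext(*multimodal_model_path);
    }
  }

  bool isMultimodal() const { return multimodal_ctx_ != nullptr; }

 private:
  // All multimodal setup lives here, so the constructor no longer needs an early return.
  void initializeMultimodalContext(const std::filesystem::path& path) {
    multimodal_ctx_ = fakeMtmdInitFromFile(path.string());
    if (!multimodal_ctx_) {
      throw std::runtime_error("Failed to load multimodal model from '" + path.string() + "'");
    }
  }

  std::unique_ptr<FakeMtmdContext> multimodal_ctx_;
};
```

Besides shortening the constructor, this shape removes the early `return`, so any setup added after the multimodal block later would still run for text-only models.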

lordgamez · 2026-05-18T12:46:12Z

+  auto batch_deleter = gsl::finally([&] {llama_batch_free(batch);});
+  batch.n_tokens = 1;
+  batch.n_seq_id[0] = 1;
+  batch.seq_id[0][0] = 0;
+  batch.logits[0] = true;


This can be moved to just before the `while (decode_status == 0) {` line, as it is only used in the loop. Also, it might be better to use a wrapper object for automatic initialization and destruction.
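A wrapper of the kind described could be sketched roughly as follows. `FakeBatch`, `fakeBatchInit`, and `fakeBatchFree` are hypothetical stand-ins for `llama_batch` and its init/free functions, reduced to the fields touched in the snippet above, and `SingleTokenBatch` and `live_batches` are illustrative names that do not appear in the PR.

```cpp
#include <cstdint>

// Hypothetical stand-in for llama_batch, reduced to the fields used in the snippet.
struct FakeBatch {
  std::int32_t capacity = 0;
  std::int32_t n_tokens = 0;
  std::int32_t* n_seq_id = nullptr;
  std::int32_t** seq_id = nullptr;
  std::int8_t* logits = nullptr;
};

// Counts batches that have been allocated but not yet freed (for illustration only).
inline int live_batches = 0;

// Hypothetical stand-in for llama_batch_init: allocates zeroed per-token arrays.
inline FakeBatch fakeBatchInit(std::int32_t capacity) {
  FakeBatch batch;
  batch.capacity = capacity;
  batch.n_seq_id = new std::int32_t[capacity]();
  batch.seq_id = new std::int32_t*[capacity];
  for (std::int32_t i = 0; i < capacity; ++i) {
    batch.seq_id[i] = new std::int32_t[1]();
  }
  batch.logits = new std::int8_t[capacity]();
  ++live_batches;
  return batch;
}

// Hypothetical stand-in for llama_batch_free.
inline void fakeBatchFree(FakeBatch& batch) {
  for (std::int32_t i = 0; i < batch.capacity; ++i) {
    delete[] batch.seq_id[i];
  }
  delete[] batch.seq_id;
  delete[] batch.n_seq_id;
  delete[] batch.logits;
  --live_batches;
}

// RAII wrapper: allocates and configures a one-token batch, frees it on scope exit.
class SingleTokenBatch {
 public:
  SingleTokenBatch() : batch_(fakeBatchInit(1)) {
    batch_.n_tokens = 1;
    batch_.n_seq_id[0] = 1;
    batch_.seq_id[0][0] = 0;
    batch_.logits[0] = 1;
  }
  ~SingleTokenBatch() { fakeBatchFree(batch_); }

  // Non-copyable, so the underlying arrays are freed exactly once.
  SingleTokenBatch(const SingleTokenBatch&) = delete;
  SingleTokenBatch& operator=(const SingleTokenBatch&) = delete;

  FakeBatch& get() { return batch_; }

 private:
  FakeBatch batch_;
};
```

With a wrapper like this, the batch can be declared immediately before the loop and is released on every exit path without a separate `gsl::finally`.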

lordgamez · 2026-05-18T12:51:44Z

+    if (files.empty()) {
+      return std::unexpected{"Multimodal input requires at least one file"};
+    }
+    std::vector<unique_bitmap_ptr> bitmaps;
+    for (auto& file : files) {
+      unique_bitmap_ptr bitmap{mtmd_helper_bitmap_init_from_buf(multimodal_ctx_, reinterpret_cast<const unsigned char*>(file.data()), file.size())};
+      if (!bitmap) {
+        throw Exception(PROCESSOR_EXCEPTION, "Failed to create multimodal bitmap from buffer");
+      }
+      bitmaps.push_back(std::move(bitmap));
+    }
+    mtmd_input_text inp_txt = {
+      .text = prompt.c_str(),
+      .add_special = true,
+      .parse_special = true,
+    };
+    unique_mtmd_input_chunks_ptr chunks{mtmd_input_chunks_init()};
+    auto bitmap_c_ptrs = bitmaps | ranges::views::transform([] (auto& ptr) {return static_cast<const mtmd_bitmap*>(ptr.get());}) | ranges::to<std::vector>();
+    auto tokenized = mtmd_tokenize(multimodal_ctx_, chunks.get(), &inp_txt, bitmap_c_ptrs.data(), bitmap_c_ptrs.size());
+    if (tokenized != 0) {
+      throw Exception(PROCESSOR_EXCEPTION, fmt::format("Failed to tokenize multimodal prompt, error: {}", tokenized));
+    }
+    auto status = mtmd_helper_eval_chunks(multimodal_ctx_, llama_ctx_, chunks.get(), 0, 0, 1, true, &n_past);
+    if (status != 0) {
+      throw Exception(PROCESSOR_EXCEPTION, fmt::format("Failed to eval multimodal chunks, error: {}", status));
+    }


I would extract this to a separate function. Additionally, why is llama_decode run in the case of string tokenization, but not in the multimodal use case?
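As a rough illustration of the extraction only (it does not address the llama_decode question), the bitmap-loading part could be pulled into a helper along these lines. `FakeBitmap` and `fakeBitmapFromBuffer` are hypothetical stand-ins for `mtmd_bitmap` and `mtmd_helper_bitmap_init_from_buf`, and the helper names `loadBitmaps` and `borrowPointers` are assumptions, not names from the PR.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for mtmd_bitmap; not the real llama.cpp type.
struct FakeBitmap {
  std::size_t byte_count;
};

// Hypothetical stand-in for mtmd_helper_bitmap_init_from_buf: nullptr signals failure.
inline FakeBitmap* fakeBitmapFromBuffer(const unsigned char* data, std::size_t size) {
  if (data == nullptr || size == 0) {
    return nullptr;
  }
  return new FakeBitmap{size};
}

struct FakeBitmapDeleter {
  void operator()(FakeBitmap* bitmap) const { delete bitmap; }
};
using UniqueFakeBitmap = std::unique_ptr<FakeBitmap, FakeBitmapDeleter>;

// Extracted helper: turns raw file buffers into owned bitmaps, or throws on the first failure.
inline std::vector<UniqueFakeBitmap> loadBitmaps(const std::vector<std::vector<std::byte>>& files) {
  std::vector<UniqueFakeBitmap> bitmaps;
  bitmaps.reserve(files.size());
  for (const auto& file : files) {
    UniqueFakeBitmap bitmap{
        fakeBitmapFromBuffer(reinterpret_cast<const unsigned char*>(file.data()), file.size())};
    if (!bitmap) {
      throw std::runtime_error("Failed to create multimodal bitmap from buffer");
    }
    bitmaps.push_back(std::move(bitmap));
  }
  return bitmaps;
}

// Non-owning pointers in the array shape a C tokenizer API would take.
inline std::vector<const FakeBitmap*> borrowPointers(const std::vector<UniqueFakeBitmap>& bitmaps) {
  std::vector<const FakeBitmap*> raw;
  raw.reserve(bitmaps.size());
  for (const auto& bitmap : bitmaps) {
    raw.push_back(bitmap.get());
  }
  return raw;
}
```

Splitting ownership (`loadBitmaps`) from the borrowed pointer array (`borrowPointers`) keeps the lifetime rule visible: the raw pointers are only valid while the owning vector is alive.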

lordgamez · 2026-05-18T12:55:00Z

+    if (multimodal_) {
+      if (files.empty()) {


This could be merged into `if (multimodal_ && files.empty())`.

fgerlits · 2026-05-21T08:16:49Z


+@step('a directory at "{directory}" has a file with the content from "{path}"')
+@step("a directory at '{directory}' has a file with the content from '{path}'")
+def create_file_with_content_in_directory(context: MinifiTestContext, directory: str, path: str):


we should keep function names unique:

Suggested change

-def create_file_with_content_in_directory(context: MinifiTestContext, directory: str, path: str):
+def create_file_with_content_from_path_in_directory(context: MinifiTestContext, directory: str, path: str):

fgerlits · 2026-05-21T11:01:43Z

  EXTENSIONAPI static constexpr auto Properties = std::to_array<core::PropertyReference>({
    ModelPath,
+    OutputAttributeName,
+    MultiModalModelPath,


please add the new properties to PROCESSORS.md, too

adamdebreceni force-pushed the multimodal_llama branch from 8cb0923 to b207ec4 Compare February 18, 2026 12:39

adamdebreceni added 5 commits April 30, 2026 10:03

MINIFICPP-2719 - Add multimodal capability to llama.cpp processor

31ddfff

MINIFICPP-2719 - Do not build executable tools

6e5d52c

MINIFICPP-2719 - Fix build

d2ce276

MINIFICPP-2719 - Fix rebase

9dc1f90

MINIFICPP-2719 - Fix template use

09c3416

adamdebreceni force-pushed the multimodal_llama branch from b207ec4 to 09c3416 Compare April 30, 2026 08:03

adamdebreceni added 4 commits April 30, 2026 11:20

MINIFICPP-2719 - Fix build

efb65a9

MINIFICPP-2719 - Clang tidy fix, win fix

f96ef0b

MINIFICPP-2719 - Linter fix

e111c62

MINIFICPP-2719 - gcc 13 fix

841edc6

adamdebreceni marked this pull request as ready for review May 4, 2026 12:18

lordgamez self-requested a review May 5, 2026 11:32

martinzink requested review from Copilot and martinzink May 5, 2026 12:27

Copilot started reviewing on behalf of martinzink May 5, 2026 12:28 View session

Copilot AI reviewed May 5, 2026

View reviewed changes

adamdebreceni and others added 10 commits May 13, 2026 17:12

MINIFICPP-2719 - Fix review

97c02ac

MINIFICPP-2719 - Add output attribute test

049df46

MINIFICPP-2719 - Add multimodal test

09fe599

MINIFICPP-2719 - Copy test file

b90ec86

MINIFICPP-2719 - Fix typo

c0bf227

MINIFICPP-2719 - Fix docker

263fa3e

MINIFICPP-2719 - Fix test

15ca989

Fix file content assignment in core_steps.py

1a85882

Increase timeout for output file placement to 300 seconds

d059ed1

Increase timeout for output file placement

3bfaffb

lordgamez reviewed May 18, 2026

View reviewed changes

martinzink approved these changes May 20, 2026

View reviewed changes

fgerlits reviewed May 21, 2026

View reviewed changes

adamdebreceni wants to merge 19 commits into apache:main from adamdebreceni:multimodal_llama


5 participants
