Adds NVIDIA PixelDiT and PiD support #1393

Open
jtreminio wants to merge 10 commits into mcmonkeyprojects:master from jtreminio:pixeldit-pid-support

@jtreminio commented May 26, 2026

Depends on Comfy-Org/ComfyUI#14103

Not included: docs updates.

PixelDiT is an image model. Not that great.

The interesting part of this PR is PiD, a 4x-locked upscaler that now replaces the refiner stage's upscaler. The upscale runs after the refiner's SwarmKSampler node.

PixelDiT workflow:
[Screenshot: CleanShot 2026-05-26 at 12 30 51]

PiD upscale workflow:
[Screenshot: CleanShot 2026-05-27 at 18 28 36]

@jtreminio marked this pull request as ready for review May 27, 2026 23:29
@jtreminio
Contributor Author

[Image: 1839001-A high-quality, cinematic portrait featu]

Comment thread docs/Model Support.md Outdated
# PixelDiT

- NVIDIA's [PixelDiT](<https://huggingface.co/Comfy-Org/PixelDiT>) is supported in SwarmUI!
- Or the smaller FP8 version: [Comfy-Org/PixelDiT - mxfp8](<https://huggingface.co/Comfy-Org/PixelDiT/resolve/main/diffusion_models/pixeldit_1300m_1024px_mxfp8.safetensors>)
Member

wonktext

if (doUpscale && upscaleMethod.StartsWith("pidmodel-"))
{
    string pidModelName = upscaleMethod.After("pidmodel-");
    T2IModel pidModel = Program.MainSDModels.GetModel(pidModelName);
Member

check t2iprompthandling for "lora", there's a weird special case pattern for how indirectly specified models are read that accommodates both white/blacklisting of models and user-typing issues (eg excluding the .safetensors or not)
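
To illustrate the kind of lenient lookup this comment describes, here is a minimal sketch. It is not the pattern used in T2IPromptHandling, which is not shown in this thread. `ModelNameLookup`, `Resolve`, and the `isAllowed` predicate are hypothetical names, and the list of known models stands in for whatever registry the real code consults. The sketch accepts a name typed with or without `.safetensors` and skips any model the allow/deny check rejects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not SwarmUI's actual API.
public static class ModelNameLookup
{
    private const string Extension = ".safetensors";

    /// Returns the first permitted model whose name matches the typed name,
    /// ignoring a trailing ".safetensors" on either side, or null if none match.
    public static string Resolve(string typed, IEnumerable<string> knownModels, Func<string, bool> isAllowed)
    {
        string wanted = StripExtension(typed.Trim());
        return knownModels.FirstOrDefault(m => isAllowed(m) && StripExtension(m) == wanted);
    }

    // Drops the ".safetensors" suffix if present so both spellings compare equal.
    private static string StripExtension(string name)
    {
        return name.EndsWith(Extension, StringComparison.Ordinal)
            ? name[..^Extension.Length]
            : name;
    }
}
```

Under these assumptions, `Resolve("upscalers/pid", models, m => true)` and `Resolve("upscalers/pid.safetensors", models, m => true)` both return the same entry when `models` contains `upscalers/pid.safetensors`, and a predicate that rejects that entry makes both calls return null.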

string pidSampled = g.CreateKSampler(g.CurrentModel.Path, [pidCond, 0], pidNeg, [pidEmptyLatent, 0], pidCfg, pidSteps, 0, 10000,
g.UserInput.Get(T2IParamTypes.Seed) + 2, false, true, defsampler: "lcm", defscheduler: "simple", explicitSampler: pidSampler, explicitScheduler: pidScheduler, sectionId: T2IParamInput.SectionID_PixelDecoder);
g.CurrentMedia = g.CurrentMedia.WithPath([pidSampled, 0], WGNodeData.DT_LATENT_IMAGE, pidModel.ModelClass?.CompatClass);
g.CurrentMedia.Width = pidWidth;
Member

for the Refiner Upscale, since target size is user-specified, follow user specified size by way of doing a post-rescale in pixel space, see how ImageUpscaleWithModel does it above
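
The size arithmetic behind this request can be sketched as follows. This is an illustration only, not the code from the PR or from the ImageUpscaleWithModel path. `PidSizing` and `Plan` are hypothetical names, and the target size is taken as an input because the thread does not show how it is derived. The only fact used from the PR description is that PiD is locked to 4x.

```csharp
using System;

// Hypothetical helper for illustration only; not part of SwarmUI.
public static class PidSizing
{
    // The PR description calls PiD a 4x-locked upscaler.
    public const int PidScale = 4;

    /// Computes the size PiD would emit for a given input, and reports whether
    /// a pixel-space resize is still needed to reach the user's target size.
    public static (int PidWidth, int PidHeight, bool NeedsRescale) Plan(
        int inputWidth, int inputHeight, int targetWidth, int targetHeight)
    {
        int pidWidth = inputWidth * PidScale;
        int pidHeight = inputHeight * PidScale;
        bool needsRescale = pidWidth != targetWidth || pidHeight != targetHeight;
        return (pidWidth, pidHeight, needsRescale);
    }
}
```

For a 1024x1024 input with a user target of 2048x2048, `Plan` reports a 4096x4096 PiD output and `NeedsRescale == true`, so a resize in pixel space would follow the upscale. When the target is exactly 4x the input, no resize is needed.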

Comment thread src/Text2Image/T2IModelClassSorter.cs Outdated
bool isHiDreamO1Lora(JObject h) => hasLoraKey(h, "final_layer2.linear") && hasLoraKey(h, "language_model.layers.0.self_attn.q_proj");
bool isChroma(JObject h) => h.ContainsKey("distilled_guidance_layer.in_proj.bias") && h.ContainsKey("double_blocks.0.img_attn.proj.bias");
bool isChromaRadiance(JObject h) => h.ContainsKey("nerf_image_embedder.embedder.0.bias");
bool isPiD(JObject h) => h.ContainsKey("net.lq_proj.latent_proj.0.weight");
Member

could you pick another key or two each just to narrow it? The list is getting long enough that we're getting occasional surprise overlaps.

Contributor Author

Added net.pixel_blocks.0.attn.q_norm.weight for isPiD() and core.pixel_blocks.0.attn.q_norm.weight for isPixelDiT(). I figure keys with pixel_ in them aren't very common (yet). Clearing metadata is clean.
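
A rough sketch of what the narrowed checks might look like, using only the keys named in this thread. The updated diff is not shown here, so this is an assumption about its shape rather than the actual change. `HeaderChecks` is a hypothetical class, and a plain collection of key names stands in for the `JObject` header used in T2IModelClassSorter.cs. The original first key for PixelDiT does not appear in the thread, so that check uses only the newly added key.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the detection lambdas; not the code from the PR.
public static class HeaderChecks
{
    // Requires both the original key and the added one, so a model that
    // happens to share a single key name is not misclassified as PiD.
    public static bool IsPiD(ICollection<string> keys) =>
        keys.Contains("net.lq_proj.latent_proj.0.weight")
        && keys.Contains("net.pixel_blocks.0.attn.q_norm.weight");

    // Only the key named in the comment above is checked here; the real
    // check presumably pairs it with at least one more key.
    public static bool IsPixelDiT(ICollection<string> keys) =>
        keys.Contains("core.pixel_blocks.0.attn.q_norm.weight");
}
```

Requiring every key with `&&` makes each added key strictly narrow the match, which is the point of the reviewer's request.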
