
KeyError: 'default' for discrete diffusion language model LLaDA2 #13357

@ksasi

Description


Describe the bug

Hi,

The following code block from the documentation (https://huggingface.co/docs/diffusers/main/api/pipelines/llada2#diffusers.LLaDA2PipelineOutput) raises a KeyError:

model_id = "inclusionAI/LLaDA2.1-mini"

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
)

Reproduction

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

from diffusers import BlockRefinementScheduler, LLaDA2Pipeline

model_id = "inclusionAI/LLaDA2.1-mini"

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
scheduler = BlockRefinementScheduler()
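One quick check before re-running the snippet above: the traceback indexes `ROPE_INIT_FUNCTIONS["default"]`, so it may help to inspect which `rope_type` keys the installed transformers build actually registers. This is a hedged diagnostic sketch, not a fix; the guarded import is there only so it degrades gracefully if transformers is missing.

```python
# Diagnostic sketch: list the rope_type keys registered by the installed
# transformers build. The remote LLaDA2 modeling code does
# ROPE_INIT_FUNCTIONS[self.rope_type] with rope_type == "default", so if
# "default" is not in this registry, the KeyError above is expected.
try:
    from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS

    keys = sorted(ROPE_INIT_FUNCTIONS)
    print("registered rope_type keys:", keys)
    print("'default' registered:", "default" in ROPE_INIT_FUNCTIONS)
except ImportError:
    # transformers not installed, or this build predates modeling_rope_utils
    keys = None
    print("could not import transformers.modeling_rope_utils")
```

If `'default'` is missing from the printed keys, the failure is a transformers/remote-code version mismatch rather than a diffusers bug per se.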

Logs

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_1800/3647447919.py in <cell line: 0>()
      7 model_id = "inclusionAI/LLaDA2.1-mini"
      8 
----> 9 model = AutoModelForCausalLM.from_pretrained(
     10     model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto"
     11 )

/usr/local/lib/python3.12/dist-packages/transformers/models/auto/auto_factory.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    363                 model_class.register_for_auto_class(auto_class=cls)
    364             model_class = add_generation_mixin_to_remote_model(model_class)
--> 365             return model_class.from_pretrained(
    366                 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    367             )

/usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, weights_only, *model_args, **kwargs)
   4070         with ContextManagers(model_init_context):
   4071             # Let's make sure we don't run the init function of buffer modules
-> 4072             model = cls(config, *model_args, **model_kwargs)
   4073 
   4074             if hf_quantizer is not None:  # replace module with quantized modules (does not touch weights)

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config)
    960     def __init__(self, config: LLaDA2MoeConfig):
    961         super().__init__(config)
--> 962         self.model = LLaDA2MoeModel(config)
    963         self.vocab_size = config.vocab_size
    964         self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config)
    781         self._use_flex_attention = config._attn_implementation == "flex_attention"
    782         self.norm = LLaDA2MoeRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
--> 783         self.rotary_emb = LLaDA2MoeRotaryEmbedding(config=config)
    784         self.gradient_checkpointing = False
    785         # Initialize weights and apply final processing

~/.cache/huggingface/modules/transformers_modules/inclusionAI/LLaDA2_dot_1_hyphen_mini/f21be037104f6e044e1a86b6d8864a6b85cc868e/modeling_llada2_moe.py in __init__(self, config, device)
    106 
    107         self.config = config
--> 108         self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type]
    109 
    110         inv_freq, self.attention_scaling = self.rope_init_fn(self.config, device)

KeyError: 'default'
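For reviewers skimming the traceback: the failure is a plain dict lookup into a registry that lacks the requested key. A minimal illustration of that pattern (hypothetical registry, not the actual library code) is:

```python
# Hypothetical registry missing the "default" entry, mirroring the shape of
# the lookup at modeling_llada2_moe.py line 108 in the traceback above.
ROPE_INIT_FUNCTIONS = {"linear": lambda config, device: (None, 1.0)}

rope_type = "default"  # what the LLaDA2 remote code requests
try:
    rope_init_fn = ROPE_INIT_FUNCTIONS[rope_type]
except KeyError as exc:
    # Reproduces the same failure mode: KeyError: 'default'
    print(f"KeyError: {exc}")
```

The lookup itself is correct code; the KeyError simply means the installed transformers version and the model's remote code disagree about which `rope_type` entries exist.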

System Info

diffusers 0.38.0.dev0 (installed from git+https://github.com/huggingface/diffusers@f2be8bd6b3dc4035bd989dc467f15d86bf3c9c12)

Who can help?

No response

Labels: bug (Something isn't working)