gguf
Here are 421 public repositories matching this topic...
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
Updated Mar 26, 2026 - Rust
Maid is a free and open source application for interfacing with llama.cpp models locally, and with Anthropic, DeepSeek, Ollama, Mistral and OpenAI models remotely.
Updated Mar 10, 2026 - TypeScript
Learn Ollama hands-on: deploy large language models on a CPU. Read online at https://datawhalechina.github.io/handy-ollama/
Updated Jan 15, 2026 - Jupyter Notebook
LLM Agent Framework in ComfyUI. Includes MCP server, Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes; connects to Feishu and Discord; and adapts to any LLM with an OpenAI- or aisuite-style interface, such as o1, Ollama, Gemini, Grok, Qwen, GLM, DeepSeek, Kimi, and Doubao. Also supports local LLMs, VLMs, and GGUF models such as Llama-3.3 and Janus-Pro, plus Linkage graphRAG.
Updated Mar 8, 2026 - Python
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
Updated Mar 17, 2026 - TypeScript
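Generation-level schema enforcement works by masking, at each decoding step, every token that would make the partial output invalid, so only schema-conforming text can ever be produced. A minimal hedged sketch of that idea (not the library's actual implementation; the toy "grammar" here is just a fixed set of allowed complete outputs, and greedy token choice stands in for model probabilities):

```python
import json

def valid_prefix(prefix: str, candidates: list[str]) -> bool:
    # A partial output is valid if some complete candidate starts with it.
    return any(c.startswith(prefix) for c in candidates)

def constrained_generate(vocab: list[str], candidates: list[str],
                         max_steps: int = 20) -> str:
    # At each step, keep only tokens whose addition leaves the output
    # a prefix of some valid candidate, then pick one of them.
    out = ""
    for _ in range(max_steps):
        allowed = [t for t in vocab if valid_prefix(out + t, candidates)]
        if not allowed:
            break
        out += allowed[0]  # a real decoder would rank these by model probability
        if out in candidates:
            json.loads(out)  # sanity check: the finished output parses as JSON
            return out
    return out

vocab = ['{', '}', '"ok"', ':', 'true', 'false']
candidates = ['{"ok":true}', '{"ok":false}']
result = constrained_generate(vocab, candidates)
```

The same masking principle scales up to real JSON-schema grammars, where prefix validity is tracked by a parser state machine instead of string matching.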
Interface for OuteTTS models.
Updated Mar 23, 2026 - Python
An open source DevOps tool from the CNCF for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI Artifact.
Updated Mar 26, 2026 - Go
The Swiss Army Knife of Offline AI. Chat, Speak, and Generate Images - Privacy First, Zero Internet. Download an LLM and use it on your mobile device. No data ever leaves your phone. Supports text-to-text, vision, text-to-image
Updated Mar 28, 2026 - TypeScript
SOTA rounding-based quantization for high-accuracy low-bit LLM inference, seamlessly optimized for CPU, Intel GPU, and CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.
Updated Mar 28, 2026 - Python
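Rounding-based weight quantization, in its simplest round-to-nearest form, maps float weights onto a small signed-integer grid with one scale per tensor. A minimal sketch of that baseline (the project above layers accuracy-preserving optimizations on top of plain rounding; the function names here are illustrative):

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int = 4):
    # Symmetric per-tensor round-to-nearest: q = round(w / scale),
    # clipped to the signed range for the given bit width.
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = float(np.abs(w).max()) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.50, 0.33, 0.07], dtype=np.float32)
q, scale = quantize_rtn(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())  # bounded by scale / 2
```

The per-element error is at most half a quantization step, which is why more sophisticated rounding schemes focus on choosing scales and rounding directions that minimize the end-to-end model error rather than the per-weight error.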
A CLI to estimate inference memory requirements for Hugging Face models, written in Python.
Updated Mar 21, 2026 - Python
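A back-of-the-envelope version of such an estimate is parameter count times bytes per parameter, plus headroom for the KV cache and activations. A hedged sketch under that rule of thumb (the 20% overhead factor is an assumption of this sketch, not the tool's actual formula):

```python
# Approximate storage cost per parameter for common inference datatypes.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "q4": 0.5}

def estimate_inference_memory_gb(n_params_billion: float, dtype: str,
                                 overhead: float = 1.2) -> float:
    # weights = params * bytes/param; the overhead multiplier covers
    # KV cache and activation buffers (assumed, not measured)
    weights_gb = n_params_billion * BYTES_PER_PARAM[dtype]
    return weights_gb * overhead

mem = estimate_inference_memory_gb(7, "fp16")  # a 7B fp16 model: ~16.8 GB
```

Real KV-cache cost grows with context length, batch size, and layer count, so a production estimator needs those as inputs too.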
Practical Llama 3 inference in Java
Updated Feb 8, 2026 - Java
Go library for embedded vector search and semantic embeddings using llama.cpp
Updated Mar 6, 2026 - Go
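Embedded vector search of this kind boils down to embedding each document once, then ranking by cosine similarity against the query embedding at lookup time. A minimal sketch of the retrieval step (the toy 2-D vectors are stand-ins; a real setup would obtain embeddings from a model via llama.cpp):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    # index: list of (doc_id, embedding); return the k most similar doc ids.
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

index = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
result = top_k([1.0, 0.05], index, k=2)
```

Brute-force scoring like this is fine for small embedded indexes; larger corpora typically switch to an approximate nearest-neighbor structure.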
Go with your own intelligence - Go applications that directly integrate llama.cpp for local inference using hardware acceleration.
Updated Mar 24, 2026 - Go