这是一个关于免费的大模型api的合集,并精选了一部分模型
This is a collection of free LLM APIs, with a curated selection of models
我会尽可能更新维护这个项目(目前只有我一个人)
I will keep maintaining and updating this project to the best of my ability (currently it is just me)
入选原则是:限制请求速率而不是token > 尽可能多的来源 > 尽可能新且好的模型 > 足够用的请求速率
The selection criteria, in order of priority: per-request rate limits rather than token limits > as many sources as possible > models as new and capable as possible > sufficient request rates
主要是有一定热度的文本模型
Primarily text models that have gained some popularity
目前只接受提供了OpenAI格式的API
At present, only APIs that offer the OpenAI format are accepted
欢迎大家分享更多api
You are welcome to share more APIs
这个表格是由Deepseek V4 Flash Thinking生成的,由Taple渲染
This table was generated by Deepseek V4 Flash Thinking and rendered by Taple
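Every entry below exposes the same OpenAI-compatible chat-completions format, so one request builder works against any of them. A minimal sketch using only the standard library (the API key is a placeholder; the base URL and model name are taken from the first entry below, but any pair from the list works the same way):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-format /chat/completions request for any listed API."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers)

# Usage (placeholder key; swap in any base URL / model from the list):
req = build_chat_request(
    "https://api.siliconflow.cn/v1", "sk-...", "Qwen/Qwen3-8B", "Hello",
)
# response = urllib.request.urlopen(req)  # actually sends the request
# print(json.load(response)["choices"][0]["message"]["content"])
```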
- API: https://api.siliconflow.cn/v1
- Rate Limits: 1000 RPM (each model)
- Models:
  - deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
  - Qwen/Qwen3-8B
  - Qwen/Qwen3.5-4B
  - THUDM/glm-4-9b-chat
  - THUDM/GLM-4-9B-0414
  - THUDM/GLM-Z1-9B-0414
  - THUDM/GLM-4.1V-9B-Thinking
- API: https://openrouter.ai/api/v1
- Rate Limits: 20 RPM / 200 RPD (each model)
- Models:
  - qwen/qwen3-coder:free
  - qwen/qwen3-next-80b-a3b-instruct:free
  - openai/gpt-oss-120b:free
  - nvidia/nemotron-3-nano-30b-a3b:free
  - nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free
  - nvidia/nemotron-3-super-120b-a12b:free
  - arcee-ai/trinity-large-preview:free
  - stepfun/step-3.5-flash:free
  - minimax/minimax-m2.5:free
  - poolside/laguna-m.1:free
  - poolside/laguna-xs.2:free
  - google/gemma-4-26b-a4b-it:free
  - google/gemma-4-31b-it:free
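Per-minute caps like the 20 RPM above are easy to trip from a loop, so a client-side pacer that enforces the minimum interval between calls is a simple guard. A generic sketch, not part of any provider's SDK (`send_request` is a hypothetical stand-in for your own request function):

```python
import time

def min_interval(rpm: int) -> float:
    """Smallest allowed gap in seconds between requests for a given RPM limit."""
    return 60.0 / rpm

class Pacer:
    """Sleep just long enough between calls to stay under an RPM cap."""
    def __init__(self, rpm: int):
        self.interval = min_interval(rpm)
        self.last = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        now = time.monotonic()
        sleep_for = self.last + self.interval - now
        if sleep_for > 0:
            time.sleep(sleep_for)
        self.last = time.monotonic()

# e.g. a 20 RPM limit means at most one request every 3 seconds:
pacer = Pacer(rpm=20)
# for prompt in prompts:
#     pacer.wait()
#     send_request(prompt)  # hypothetical request function
```

Note this only spaces out requests; daily caps (RPD) still need their own counter.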
- API: https://chat.intern-ai.org.cn/api/v1
- Rate Limits: 10 RPM
- Tip: 密钥有效期6个月 / The key is valid for 6 months
- Models:
  - intern-latest
  - intern-s1-mini
  - intern-s1
  - intern-s1-pro
  - internvl3.5-latest
  - internvl3.5-241b-a28b
- API: https://generativelanguage.googleapis.com/v1beta/openai
- Rate Limits: 5 RPM / 20 RPD
- Models:
  - gemini-3-flash-preview
- Rate Limits: 5 RPM / 20 RPD
- Models:
  - gemini-2.5-flash
- Rate Limits: 15 RPM / 500 RPD
- Models:
  - gemini-3.1-flash-lite-preview
- Rate Limits: 10 RPM / 20 RPD
- Models:
  - gemini-2.5-flash-lite
- Rate Limits: 15 RPM / 1500 RPD
- Models:
  - gemma-4-26b-a4b-it
  - gemma-4-31b-it
- API: https://api.cohere.ai/compatibility/v1
- Rate Limits: 20 RPM
- Tip:
  - 绑定支付方式可以使用速率限制更宽松的 Production Key / Adding a payment method lets you use a Production Key with more relaxed rate limits
- Models:
  - command-a-reasoning-08-2025
  - command-a-vision-07-2025
- API: https://open.bigmodel.cn/api/paas/v4/
- Rate Limits: 只有并发数限制(均为30) / Only concurrency is limited (30 concurrent requests for each model)
- Models:
  - GLM-4-Flash-250414
  - GLM-4V-Flash
  - GLM-4.1V-Thinking-Flash
  - GLM-4.6V-Flash
  - GLM-4.7-Flash
- API: https://models.github.ai/inference
- Rate Limits: 15 RPM / 150 RPD
- Tip:
  - 如果使用 Azure API,可以使用更多模型 / If you use the Azure API, more models are available
- Models:
  - openai/gpt-4.1-nano
  - openai/gpt-4.1-mini
  - openai/gpt-4.1
  - openai/gpt-4o
  - openai/gpt-4o-mini
  - openai/gpt-5-nano
  - openai/gpt-5-mini
  - openai/gpt-5-chat
  - openai/gpt-5
- API: https://api520.pro/v1
- Rate Limits: Unknown
- Tip:
  - 赠送¥100额度 / Comes with ¥100 of free credit
- Models:
  太多了自己看 / Too many to list; check for yourself
- API: https://integrate.api.nvidia.com/v1
- Rate Limits: 40 RPM
- Models:
  - deepseek-ai/deepseek-v3.2
  - deepseek-ai/deepseek-v3.1-terminus
  - z-ai/glm4.7
  - moonshotai/kimi-k2-thinking
  - moonshotai/kimi-k2-instruct-0905
  - qwen/qwen3-coder-480b-a35b-instruct
  - qwen/qwen3.5-122b-a10b
  - stepfun-ai/step-3.5-flash
- API: https://api.llm7.io/v1
- Rate Limits: 2 RPS / 20 RPM / 100 RPH
- Models:
  - gpt-oss-20b
  - GLM-4.6V-Flash
- API: https://api-inference.modelscope.cn/v1/
- Rate Limits: Unknown
- Tip:
  - 每天2000次 / 2,000 requests per day
- Models:
  - inclusionAI/Ling-2.6-1T
  - deepseek-ai/DeepSeek-V4-Pro
  - deepseek-ai/DeepSeek-V4-Flash
  - deepseek-ai/DeepSeek-V3.2
  - ZhipuAI/GLM-5.1
  - ZhipuAI/GLM-5
  - MiniMax/MiniMax-M2.5
  - Qwen/Qwen3-235B-A22B-Instruct-2507
  - Qwen/Qwen3-Coder-480B-A35B-Instruct
- API: https://api.kilo.ai/api/gateway
- Rate Limits: 200 RPH
- Models:
  - google/gemma-4-26b-a4b-it:free
  - inclusionai/ling-2.6-1t:free
  - inclusionai/ling-2.6-flash:free
  - nvidia/nemotron-3-super-120b-a12b:free
  - tencent/hy3-preview:free
  - openrouter/free
- API: https://router.huggingface.co/v1
- Rate Limits: 300 RPH
- Models:
  - deepseek-ai/DeepSeek-V4-Pro:fastest
  - moonshotai/Kimi-K2.6:fastest
  - google/gemma-4-31B-it:fastest
  - zai-org/GLM-5.1:fastest
  - inclusionAI/Ling-2.6-1T:fastest
  - MiniMaxAI/MiniMax-M2.7:fastest
  - deepseek-ai/DeepSeek-V3.2:fastest
  - zai-org/GLM-5:fastest
  - nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16:fastest
- API: https://api.groq.com/openai/v1/
- Rate Limits: 30 RPM / 1000 RPD
- Models:
  - openai/gpt-oss-120b
  - openai/gpt-oss-20b
  - qwen/qwen3-32b
- Rate Limits: 30 RPM / 900 RPH / 1440 RPD
- Models:
  - gpt-oss-120b
  - qwen-3-235b-a22b-instruct-2507
- Rate Limits: 10 RPM / 100 RPH / 100 RPD
- Models:
  - zai-glm-4.7
- API: https://api.mistral.ai/v1
- Rate Limits: Unknown
- Models:
  - mistral-large-2512
  - mistral-small-2603
  - mistral-medium-3.5
- API: https://api.longcat.chat/openai/v1
- Rate Limits: Unknown
- Tip:
  - 500,000 Tokens Per Day: LongCat-Flash-Chat, LongCat-Flash-Thinking, LongCat-Flash-Thinking-2601, LongCat-Flash-Omni-2603, LongCat-Flash-Chat-2602-Exp
  - 50,000,000 Tokens Per Day: LongCat-Flash-Lite
  - 5,000,000 Tokens Per Day: LongCat-2.0-Preview
- Models:
  - LongCat-Flash-Chat
  - LongCat-Flash-Thinking
  - LongCat-Flash-Thinking-2601
  - LongCat-Flash-Lite
  - LongCat-Flash-Omni-2603
  - LongCat-Flash-Chat-2602-Exp
  - LongCat-2.0-Preview
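Since LongCat meters by tokens per day rather than requests, a small budget counter helps avoid mid-day cutoffs. A sketch with a hypothetical usage number; in practice the spend figure comes from the `usage.total_tokens` field of each API response:

```python
class DailyTokenBudget:
    """Track tokens spent against a per-day quota.

    Reset it once per day; the quota values come from the provider's tip
    above (e.g. 500,000/day for LongCat-Flash-Chat).
    """
    def __init__(self, quota: int):
        self.quota = quota
        self.used = 0

    def spend(self, tokens: int) -> bool:
        """Record usage; return False if the quota would be exceeded."""
        if self.used + tokens > self.quota:
            return False
        self.used += tokens
        return True

budget = DailyTokenBudget(quota=500_000)
ok = budget.spend(1_200)  # hypothetical usage["total_tokens"] from a response
remaining = budget.quota - budget.used
```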
- API: https://opencode.ai/zen/v1
- Rate Limits: Unknown
- Models:
  - minimax-m2.5-free
  - nemotron-3-super-free
- 自行收集 / Self-collected
- 投稿(B站等) / Contributed by others (Bilibili, etc.)
- llm_benchmark:个人评测榜单,可信度高,而且收录更全 / A personal benchmark leaderboard; highly credible and more comprehensive
- Artificial Analysis
- LMArena
