This project is a complete RAG (Retrieval-Augmented Generation) microservice. The service is based on FastAPI, PGVector, Redis, Celery, and LlamaIndex.
The service is built with extensibility in mind and provides a flexible configuration that allows you to easily connect to an arbitrary number of data sources with pre-defined ingestion schedules.
- Ingestion from S3 buckets with Everything-to-Markdown conversion via MarkItDown
- Ingestion from local directories via LlamaIndex SimpleDirectoryReader
- Ingestion from MediaWiki with Wiki-to-Markdown conversion via html2text
- SerpAPI ingestion from Google Search results with customizable queries
- Jira ingestion from Cloud and on-premise instances via JQL queries, with optional comment loading
- Flexible configuration supporting an arbitrary number of connectors
- Built with extensibility in mind, making it easy to add custom connectors
Supported connectors:

- S3
- Directory
- MediaWiki
- SerpAPI
- Jira
- Web

Embedding providers:

- Local (running arbitrary embedding models from HuggingFace)
- OpenRouter
- OpenAI

Inference providers:

- OpenRouter
- OpenAI

Tech stack:
- FastAPI
- Vector search with pgvector
- Celery-based ingestion pipeline
- OpenAI/OpenRouter support for inference and embeddings
- Local LLM support for inference and embeddings
- LlamaIndex-powered RAG Query Engine
- Docker Compose for deployment
Getting started:

- Create a `.env` file based on the `.env.example` file.
  - The defaults are good enough; you just need to put your OpenRouter key into `OPENROUTER_API_KEY`.
  - If you use OpenAI or a different OpenAI-compatible endpoint, also update the `OPENROUTER_API_BASE` variable.
  - By default, a single S3 connector is configured; specify your S3 bucket credentials in the `S3_ACCOUNT1_` variables.
- Create a `config.yaml` file based on the `config.yaml.example` file.
  - The defaults are good enough, with `openai/gpt-oss-120b:free` used for inference and `sentence-transformers/all-mpnet-base-v2` for embeddings.
  - If you would like to use different models, update the `embedding` and `inference` sections accordingly.
- Run `docker compose up -d --build` to start the service.
- Access the API at `:8000`.
- Access the API docs at `:8000/docs`.
The service includes an optional MCP (Model Context Protocol) server at /mcp/ (trailing slash required).
It is disabled by default and can be enabled via environment variables.
Set the following in your .env file:
```
MCP_ENABLE=1
MCP_API_KEY=your-strong-api-key
```

All MCP requests require a Bearer token in the Authorization header:

```
Authorization: Bearer <MCP_API_KEY>
```

The MCP server uses stateless HTTP mode (Streamable HTTP transport), so no `mcp-session-id` header is required. The endpoint accepts JSON-RPC requests.
Available tools:

- `retrieve_chunks` - top-k retrieval from the vector store with optional metadata filters
- `rephrase_chunks` - LLM-based answer generation over top-k retrieved chunks (requires `inference` to be configured)
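For instance, a `retrieve_chunks` call can be issued as a plain JSON-RPC `tools/call` request. The sketch below uses Python's `requests` library and assumes the service is reachable at `localhost:8000`; the `query`/`top_k` argument names are assumptions, so confirm the exact schema with a `tools/list` request first:

```python
import requests

MCP_URL = "http://localhost:8000/mcp/"  # trailing slash required
API_KEY = "your-strong-api-key"         # the value of MCP_API_KEY from .env

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
    # Streamable HTTP endpoints generally expect both MIME types here
    "Accept": "application/json, text/event-stream",
}

# JSON-RPC 2.0 envelope; the argument names below are assumptions --
# issue {"method": "tools/list"} first to confirm the tool's schema.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "retrieve_chunks",
        "arguments": {"query": "AWS Services", "top_k": 5},
    },
}

resp = requests.post(MCP_URL, json=payload, headers=headers, timeout=30)
resp.raise_for_status()
print(resp.text)  # JSON-RPC response with the retrieved chunks
```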
You can use the MCP Inspector to test the MCP endpoint:
```bash
./mcp_inspector.sh
```

This starts the inspector in Docker and prints a URL with pre-filled connection settings.
The service supports multiple data sources, including multiple data sources of the same type, each with its own
ingestion schedule. The connectors to enable are defined via `config.yaml`, and their secrets are defined in the `.env` file.
The S3 connector ingests documents from S3 buckets and converts them to Markdown format. The connector has the following configuration options:
```yaml
# config.yaml
sources:
  - type: "s3"                                # must be s3
    name: "account1"                          # arbitrary name for the connector, will be stored in metadata
    config:
      endpoint: "${S3_ACCOUNT1_ENDPOINT}"     # S3 endpoint
      access_key: "${S3_ACCOUNT1_ACCESS_KEY}" # S3 access key
      secret_key: "${S3_ACCOUNT1_SECRET_KEY}" # S3 secret key
      region: "${S3_ACCOUNT1_REGION}"         # S3 region
      use_ssl: "${S3_ACCOUNT1_USE_SSL}"       # use SSL for the S3 connection, can be True or False
      buckets: "${S3_ACCOUNT1_BUCKETS}"       # single entry or comma-separated list, e.g. bucket1,bucket2
      schedules: "${S3_ACCOUNT1_SCHEDULES}"   # single entry or comma-separated list, e.g. 3600,60
  - type: "s3"
    name: "account2"
    config:
      ...
  - type: "s3"
    name: "account3"
    config:
      ...
```

```
# .env
S3_ACCOUNT1_ENDPOINT=https://s3.amazonaws.com
S3_ACCOUNT1_ACCESS_KEY=xxx
S3_ACCOUNT1_SECRET_KEY=xxx
S3_ACCOUNT1_REGION=us-east-1
S3_ACCOUNT1_USE_SSL=True
S3_ACCOUNT1_BUCKETS=bucket1,bucket2
S3_ACCOUNT1_SCHEDULES=3600,60
```

The directory connector ingests files from a local filesystem directory using LlamaIndex SimpleDirectoryReader.
The connector has the following configuration options:
```yaml
# config.yaml
sources:
  - type: "directory"
    name: "local_docs"
    config:
      path: "/data/docs"          # required, path to the directory
      recursive: true             # optional, default true
      required_exts: "txt,md,pdf" # optional, comma-separated extensions
      exclude_hidden: true        # optional, default true
      exclude_empty: false        # optional, default false
      num_files_limit: 1000       # optional, positive integer
      schedules: "3600"
```

The MediaWiki connector ingests documents from MediaWiki sites and converts them to Markdown format. The connector has the following configuration options:
```yaml
# config.yaml
sources:
  - type: "mediawiki"
    name: "wiki1"
    config:
      api_url: "${MEDIAWIKI1_API_URL}"
      request_delay: 0.1
      schedules: "${MEDIAWIKI1_SCHEDULES}"
  - type: "mediawiki"
    name: "wiki2"
    config:
      ...
  - type: "mediawiki"
    name: "wiki3"
    config:
      ...
```

```
# .env
MEDIAWIKI1_API_URL=https://en.wikipedia.org/w/api.php
MEDIAWIKI1_SCHEDULES=3600
```

The SerpAPI connector ingests documents from Google Search results and converts them to Markdown format. The connector has the following configuration options:
```yaml
# config.yaml
sources:
  - type: "serpapi"
    name: "serp_ingestion1"
    config:
      api_key: "${SERPAPI1_KEY}"
      queries: "${SERPAPI1_QUERIES}"
      schedules: "${SERPAPI1_SCHEDULES}"
  - type: "serpapi"
    name: "serp_ingestion2"
    config:
      ...
  - type: "serpapi"
    name: "serp_ingestion3"
    config:
      ...
```

```
# .env
SERPAPI1_KEY=xxxx
SERPAPI1_QUERIES=aaa
SERPAPI1_SCHEDULES=3600
```

The Web connector ingests content from web pages using the LlamaIndex BeautifulSoupWebReader (URLs mode) or SitemapReader (sitemap mode). The two modes are mutually exclusive.
URLs mode — scrape a fixed list of pages:
```yaml
- type: web
  name: web1
  config:
    urls:
      - https://example.com/page1
      - https://example.com/page2
    html_to_text: true # optional, default true
    schedules: "${WEB1_SCHEDULES}"
```

Sitemap mode — discover and scrape URLs from a sitemap.xml:
```yaml
- type: web
  name: web2
  config:
    sitemap_url: https://example.com/sitemap.xml
    include_prefix: "/wiki/" # optional: only ingest URLs containing this string
    html_to_text: true       # optional, default true
    schedules: "${WEB2_SCHEDULES}"
```

Note: `exclude_prefix` and sitemap index (`<sitemapindex>`) are not supported in this iteration — the underlying `SitemapReader` only supports include-style filtering and flat sitemaps.

`.env` variables:

```
WEB1_SCHEDULES=60
WEB2_SCHEDULES=60
```

No other credentials are required for public web pages.
The Jira connector ingests issues from Jira Cloud or on-premise (Server/Data Center) instances using a JQL query. Issue content (summary + description) is converted to Markdown. Metadata collected per issue includes: id, title, url, status, assignee, reporter, labels, project, priority, and issue type.
Supports two authentication modes:
- Basic auth (`auth_type: basic`) — email + API token, for Jira Cloud
- Personal Access Token (`auth_type: token`) — PAT as Bearer header, for Jira Server / Data Center
```yaml
# config.yaml
sources:
  - type: "jira"
    name: "jira1"
    config:
      server_url: "${JIRA1_SERVER_URL}"
      auth_type: "basic"              # "basic" or "token"
      email: "${JIRA1_EMAIL}"         # required for auth_type=basic
      api_token: "${JIRA1_API_TOKEN}"
      jql: "${JIRA1_JQL}"
      max_results: 50                 # optional, default 50
      schedules: "${JIRA1_SCHEDULES}"
      # Optional: load top N comments per issue
      load_comments: false            # optional, default false
      max_comments: 10                # optional, default 10
```

```
# .env
# Jira Cloud (basic auth)
JIRA1_SERVER_URL=https://your-org.atlassian.net
JIRA1_EMAIL=your-email@example.com
JIRA1_API_TOKEN=your-api-token
JIRA1_JQL=project = MYPROJECT ORDER BY updated DESC
JIRA1_SCHEDULES=3600

# Jira Server / Data Center (Personal Access Token)
# JIRA1_SERVER_URL=https://jira.your-company.com
# JIRA1_API_TOKEN=your-personal-access-token
# (set auth_type: "token" in config.yaml; email is not needed)
```

The `config.yaml` file contains the main configuration of the service.
The following parameters are supported by all connector types:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `schedules` | string | — | Cron expression or interval (in seconds) defining how often the connector runs. |
| `request_delay` | float | `0` | Delay in seconds between processing each item. Useful for rate-limiting requests to external APIs. |
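Since ingestion runs on Celery, a `schedules` value of either shape maps naturally onto a Celery beat schedule: plain seconds stay numeric, cron strings become `celery.schedules.crontab` objects. The helper below is a hypothetical sketch of that mapping, not the service's actual scheduling code:

```python
from celery.schedules import crontab

def parse_schedule(value: str):
    """Hypothetical: map a `schedules` entry to a Celery beat schedule."""
    value = value.strip()
    if value.isdigit():
        return float(value)  # Celery beat accepts plain seconds as an interval
    # otherwise treat it as a standard five-field cron expression
    minute, hour, day_of_month, month_of_year, day_of_week = value.split()
    return crontab(
        minute=minute,
        hour=hour,
        day_of_month=day_of_month,
        month_of_year=month_of_year,
        day_of_week=day_of_week,
    )

print(parse_schedule("3600"))       # 3600.0 -> run every hour
print(parse_schedule("0 6 * * 1"))  # crontab -> Mondays at 06:00
```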
Environment variables (`${...}`) in the config file are evaluated at runtime.
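Python's `string.Template` uses the same `${VAR}` syntax, so the substitution step can be pictured like this (an illustrative sketch, not necessarily the service's implementation):

```python
import os
from string import Template

raw = 'endpoint: "${S3_ACCOUNT1_ENDPOINT}"'

# safe_substitute leaves unknown placeholders untouched instead of raising
resolved = Template(raw).safe_substitute(os.environ)
print(resolved)  # endpoint: "https://s3.amazonaws.com" once the variable is set
```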
```yaml
sources:               # holds the list of sources to ingest from (connectors)
  - type:              # type of the connector (s3, directory, mediawiki, serpapi, jira, etc.)
    name:              # arbitrary name for the connector, will be stored in metadata
    config:
      # connector-specific configuration
      schedules: "${S3_ACCOUNT1_SCHEDULES}"
      request_delay: 0 # optional, delay in seconds between items (default: 0)

# configures models and dimensions for embeddings
embedding:
  provider: openrouter                 # `openrouter`/`openai`, or `local` for local HuggingFace embeddings
  model_config: text-embedding-3-small # model to use
  embedding_dim: 1536                  # dimensions (check the model docs)

# configures the LLM provider and model
inference:
  provider: openrouter # `openrouter`/`openai`
  model_config: gpt-4o # model to use

# vector store configuration
vector_store:
  table_name: embeddings
  hybrid_search: true # whether to use hybrid search or not
  chunk_size: 512     # chunk size for vector indexing
  chunk_overlap: 50   # overlap between chunks
  # HNSW index settings
  hnsw:
    hnsw_m: 16                          # number of neighbors
    hnsw_ef_construction: 64            # ef construction parameter for HNSW
    hnsw_ef_search: 40                  # ef search parameter for HNSW
    hnsw_dist_method: vector_cosine_ops # distance metric for HNSW
```

You can configure the service to use local embeddings only. In this mode you can use any embedding model supported by HuggingFace. Inference is disabled in this mode, so you won't be able to use the rephrase endpoint.
```yaml
# config.yaml
embedding:
  provider: local
  # you can use any embedding model supported by HuggingFace
  model_config: sentence-transformers/all-MiniLM-L6-v2
  embedding_dim: 384
inference:
  provider: None
  model_config: None
```

You can configure the service to use remote embeddings. In this mode you can use any embedding model supported by OpenRouter/OpenAI. Inference is disabled in this mode, so you won't be able to use the rephrase endpoint.
```yaml
# config.yaml
embedding:
  provider: openrouter
  model_config: text-embedding-3-small
  embedding_dim: 1536
inference:
  provider: None
  model_config: None
```

You must set `OPENROUTER_API_KEY` and `OPENROUTER_API_BASE` in the `.env` file.
You can configure the service to use remote embeddings and remote inference. In this mode you can use any embedding and inference models supported by OpenRouter/OpenAI.
```yaml
# config.yaml
embedding:
  provider: openrouter
  model_config: text-embedding-3-small
  embedding_dim: 1536
inference:
  provider: openrouter
  model_config: gpt-4o
```

You must set `OPENROUTER_API_KEY` and `OPENROUTER_API_BASE` in the `.env` file.
The following API endpoints are available:
`POST /api/v1/query/` performs a query against the vector store:
```bash
curl -X 'POST' \
  'http://localhost:8000/api/v1/query/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "AWS Services",
  "top_k": 5
}'
```

Response example:
```json
{
  "references": [
    {
      "source_name": null,
      "source_type": null,
      "url": null,
      "score": 0.6172290216224814,
      "title": null,
      "text": "You can also\n\nrequire WAF Captcha challenges for suspicious...",
      "extras": {
        "source": "s3",
        "key": "aws-overview.pdf",
        "checksum": "5b4da9267b0b861792d1163fcc9f0550",
        "version": 1,
        "format": "markdown"
      }
    },
    {...},
    {...}
  ],
  "raw": [
    "Score: 0.6172 | Text: You can also\n\nrequire WAF Captcha challenges for suspicious...",
    "Score: 0.5172 | Text: You can also\n\nrequire WAF Captcha challenges for suspicious...",
    "Score: 0.3172 | Text: You can also\n\nrequire WAF Captcha challenges for suspicious..."
  ]
}
```
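The same query from Python, as a minimal client sketch using `requests` (field names taken from the response example above):

```python
import requests

resp = requests.post(
    "http://localhost:8000/api/v1/query/",
    json={"query": "AWS Services", "top_k": 5},
    timeout=30,
)
resp.raise_for_status()

for ref in resp.json()["references"]:
    # `extras` carries connector-specific metadata such as the source and key
    print(f"{ref['score']:.4f}", ref["extras"].get("key"), ref["text"][:80])
```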
`POST /api/v1/rephrase/` rephrases the query and provides the best answer.
This endpoint requires `inference` to be configured in `config.yaml`.
```bash
curl -X 'POST' \
  'http://localhost:8000/api/v1/rephrase/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "WAF Captcha challenges for suspicious requests"
}'
```

Response format:
```json
{
  "answer": "You can configure AWS WAF to require Captcha challenges for suspicious requests based on:\n- Request rate and attributes",
  "references": [
    {
      "source_name": null,
      "source_type": null,
      "url": null,
      "score": 0.5415070280718167,
      "title": null,
      "extras": {
        "source": "s3",
        "key": "aws-overview.pdf",
        "checksum": "5b4da9267b0b861792d1163fcc9f0550",
        "version": 1,
        "format": "markdown"
      }
    }
  ]
}
```

`GET /health` checks the health of the service.
```bash
curl -X 'GET' \
  'http://localhost:8000/health' \
  -H 'accept: application/json'
```

Response example:
```json
{
  "status": "ok",
  "vector_store_loaded": true,
  "celery_healthy": true
}
```

TODO
This project uses prek (a fast, drop-in alternative to pre-commit) to enforce
formatting and linting on every commit.
Install prek (once, globally):
```bash
# Using pip
pip install prek

# Or using the standalone installer (Linux/macOS)
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/j178/prek/releases/latest/download/prek-installer.sh | sh
```

Install the hooks (once, per clone):

```bash
prek install
```

From that point on, every `git commit` will automatically run:
| Hook | What it does |
|---|---|
| `trailing-whitespace` | Removes trailing whitespace |
| `end-of-file-fixer` | Ensures files end with a newline |
| `check-yaml` | Validates YAML syntax |
| `check-merge-conflict` | Detects unresolved merge conflict markers |
| `ruff` (lint) | Lints Python with auto-fix (pycodestyle, pyflakes, isort, pyupgrade) |
| `ruff-format` | Formats Python code (replaces black) |
Run hooks manually (without committing):
```bash
prek run --all-files
```

Ruff configuration is in `pyproject.toml` under `[tool.ruff]`.
Contributions, suggestions, bug reports, and fixes are welcome!