FastAPI + Streamlit RAG demo for small consultancies. Upload PDFs/Docs, chunk/encode, store in Qdrant, and ask grounded questions with citations.
- Document Upload: Upload PDFs, Word docs, PowerPoint, and text files
- RAG Chat: Ask questions grounded in your uploaded documents with source citations
- Section Generator: Generate proposal sections (Executive Summary, Methodology, etc.) using context from past projects
- Staff Resume Generator: Generate tailored experience summaries for project proposals
  - Manage a staff directory in Settings
  - Select team members for each project
  - Customize roles and generate 4-6 sentence experience paragraphs
- Multilingual UI: Interface available in English, German, and French
- Configurable Profiles: Industry-specific configurations for engineering, consulting, etc.
- Backend: FastAPI (backend/app.py) for upload, chat, section generation, and resume generation.
- LLM Provider: Supports OpenAI (default) or Google Gemini, configurable via the LLM_PROVIDER env var. Provider abstraction in backend/llm/.
- Embeddings + retrieval: Provider-specific embeddings stored in Qdrant (backend/embeddings/embedder.py). Each backend start recreates the collection.
- Vector store: Qdrant with a named volume qdrant_storage (data persists while the volume exists).
- Frontend: Streamlit (frontend/streamlit_app.py) with pages for Upload, Library, Search, Chat, Section Generator, Staff Resumes, and Settings.
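The provider switch described above can be sketched as a small config check. This is a minimal illustration, not the code in backend/llm/: the function name `resolve_provider` and the `REQUIRED_KEYS` table are hypothetical, and only the variable names LLM_PROVIDER, OPENAI_API_KEY, and GEMINI_API_KEY come from this README.

```python
import os
from typing import Mapping

# Hypothetical lookup table: provider name -> env var that must hold its API key.
REQUIRED_KEYS = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def resolve_provider(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, api_key), or raise ValueError if the config is unusable."""
    # OpenAI is the documented default when LLM_PROVIDER is unset or empty.
    provider = (env.get("LLM_PROVIDER") or "openai").strip().lower()
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")

    key_name = REQUIRED_KEYS[provider]
    api_key = env.get(key_name, "").strip()
    if not api_key:
        raise ValueError(f"{key_name} must be set when LLM_PROVIDER={provider}")
    return provider, api_key
```

Failing at startup with a clear message is usually easier to debug than a provider error on the first chat request.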
Prereqs: Docker + Docker Compose, and an API key for your chosen provider (OpenAI or Gemini).
- Create .env in repo root:
# LLM Provider: "openai" (default) or "gemini"
LLM_PROVIDER=openai
# OpenAI settings (required if LLM_PROVIDER=openai)
OPENAI_API_KEY=sk-...
# Gemini settings (required if LLM_PROVIDER=gemini)
GEMINI_API_KEY=...
# Override if running qdrant elsewhere
# QDRANT_URL=http://localhost:6333
- Build and run:
# Using OpenAI (default)
docker compose up --build
# Using Gemini
LLM_PROVIDER=gemini docker compose up --build
- Backend: http://localhost:8000
- Frontend (Streamlit): http://localhost:8501
Data Droid includes example configurations for different industries:
# Use the default config
docker compose up --build
# or explicitly:
FRONTEND_CONFIG=example_configs/default.json docker compose up --build
# Use bridge engineering config with English/French support
FRONTEND_CONFIG=example_configs/bridge_eng.json docker compose up --build
# Seed example documents
uv run python examples/bridge_eng/seed_docs.py
# Use AlpenBau config with German/English support
FRONTEND_CONFIG=example_configs/alpenbau.json docker compose up --build
# Seed example documents
uv run python examples/alpenbau/seed_docs.py
See examples/*/README.md for demo questions and document descriptions.
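For orientation, here is a rough sketch of how a frontend could pick up the profile selected via FRONTEND_CONFIG. It assumes the file is a JSON object and that example_configs/default.json is the fallback, as the commands above suggest. The function name `load_frontend_config` is hypothetical, and the actual schema of the profile files is not described here.

```python
import json
import os
from pathlib import Path
from typing import Any, Mapping

# Assumed fallback, inferred from the "or explicitly" command above.
DEFAULT_CONFIG = "example_configs/default.json"


def load_frontend_config(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Load the JSON profile named by FRONTEND_CONFIG (or the assumed default)."""
    path = Path(env.get("FRONTEND_CONFIG") or DEFAULT_CONFIG)
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    # The schema is unknown; only check that the top level is an object.
    if not isinstance(config, dict):
        raise ValueError(f"{path} must contain a JSON object at the top level")
    return config
```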
- Stop containers (keep data):
docker compose down
- Stop and remove Qdrant data:
docker compose down -v
- Qdrant data lives in the Docker volume qdrant_storage (e.g., /var/lib/docker/volumes/qdrant_storage/_data).
- Note: backend/embeddings/embedder.py currently calls qdrant.recreate_collection(...) at startup, which clears the collection on each backend container start.
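If you want indexed documents to survive backend restarts, one option is to create the collection only when it is missing instead of recreating it. The sketch below is an assumption about how that could look, not the current embedder.py code. The helper name `ensure_collection` is hypothetical. It relies on `collection_exists` and `create_collection`, which recent qdrant-client releases provide, so check your installed version.

```python
from typing import Any


def ensure_collection(client: Any, name: str, **create_kwargs: Any) -> bool:
    """Create `name` only if it is missing; return True if it was created."""
    if client.collection_exists(name):
        # Existing vectors are left untouched.
        return False
    client.create_collection(collection_name=name, **create_kwargs)
    return True


# Usage with a real client (collection name and vector size are placeholders;
# the size must match the embedding model in use):
#
#   from qdrant_client import QdrantClient
#   from qdrant_client.models import Distance, VectorParams
#
#   client = QdrantClient(url="http://localhost:6333")
#   ensure_collection(
#       client,
#       "docs",
#       vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
#   )
```

Note that switching embedding providers changes the vector size, so an existing collection may still need to be rebuilt in that case.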
- FastAPI: https://fastapi.tiangolo.com
- Streamlit: https://docs.streamlit.io
- Qdrant: https://qdrant.tech/documentation
- OpenAI SDK: https://platform.openai.com/docs/api-reference
- Google GenAI SDK: https://googleapis.github.io/python-genai/
- Unstructured: https://unstructured.io
- PyPDF: https://pypdf.readthedocs.io
- Docker Compose: https://docs.docker.com/compose
- Private knowledge: answers come from their project docs/proposals/wiki; data stays in their environment vs. pasting into public ChatGPT/Gemini.
- Grounded, auditable answers: every reply cites source docs (“Project_X_Report_2023.pdf, section 3.2”) so they can click/verify.
- Firm-specific workflows: pre-built flows like “draft proposal,” “summarize site visit,” “answer client question based on project history” (1–2 clicks, no prompt engineering).
- Consistent, up-to-date internal knowledge: indexes their latest file shares; avoids stale web data.
- Multilingual/local context (Zurich/Paris): German/French/English in one workspace; uses their terminology and templates.
- A) Project history co-pilot: “For the Bahnhofstrasse upgrade project, what were the main geotechnical risks and mitigations?” → concise answer + 2–3 citations. You say: “ChatGPT can’t know this; it’s from your internal reports.”
- B) Proposal drafting assistant: “Draft a 1-page project description in German… emphasizing sustainability and noise mitigation.” → finds similar past projects, drafts in German, cites sources; highlight tone reuse and multilingual capability.
- C) Internal handbook Q&A: “What’s our standard approval process for change orders above CHF 100k?” → 3–5 steps with links to the handbook; stress instant, reliable answers with sources.
- D) Precedent finder: “Shallow foundations on soft clay near the Seine—show similar past projects and what solutions we used.” → returns 2–3 precedents with links; emphasize surfacing institutional precedent.
- E) Norms navigator: “For retaining walls >4 m, what do our guidelines say about safety factors and monitoring?” → synthesizes internal guidelines + Eurocode snippets with citations; generic GPT can’t cite their adaptations.
- F) Site report summarizer (FR → EN): paste a rough French site note/email → English client-ready summary with actions/risks; keeps technical meaning.
- Side-by-side: ask the same question (“key open issues on project X”) in ChatGPT/Gemini and in your app; show the citations and grounded answer your app returns.
- 5-minute live script: current workflow (SharePoint hunting) → new workflow (ask, draft, clarify) with speed/trust/privacy emphasis.
- Visible “RAG-ness”: show sources sidebar, clickable documents, filters (project/client/date) so it feels like an internal copilot.
- ChatGPT/Gemini uploads: great for 1–20 small PDFs; limited by context window; no persistence or metadata filters; uploads are gone when the chat closes; privacy depends on provider.
- This RAG: indexes once, searches many; persistent knowledge base; filters/metadata possible; enforced citations/grounding; only top-k snippets per query go to the LLM. Embeddings use OpenAI or Gemini APIs (document text leaves your infrastructure at embed time); you can swap to local embeddings if needed.
- Scale guidance: tens of docs (fast pilots); hundreds (sweet spot for most firms); thousands (needs durable vector store + ingestion pipeline). Qdrant already covers persistence—no extra SQLite/Postgres needed for 1k–10k chunks; consider heavier DB only when you need multi-tenant corpora or >100k chunks.
- Limitations: bad scans/images need OCR; chunking trade-offs (too big → overflow, too small → weak answers); bulk reindexing of 10k PDFs takes time; if content isn’t in the docs, the model can’t invent it (good for trust).
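The chunking trade-off mentioned above is easier to see with a concrete splitter. This is a minimal character-based sketch with overlapping windows. The function name and default sizes are illustrative assumptions, and the chunker this project actually uses may differ (for example, splitting by tokens or by document structure).

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Larger values of `chunk_size` keep more context per snippet but fit fewer snippets into the prompt; smaller values retrieve more precisely but can cut an answer in half, which the overlap partly offsets.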