Technical review

SmartDocs AI Enterprise RAG SaaS

This page is written for a reviewer evaluating product depth and engineering execution. It summarizes the live demo, architecture, security boundaries, billing behavior, limitations, and source files worth reading.

Architecture

flowchart LR
  Web[Next.js App Router] --> API[FastAPI API]
  API --> Auth[JWT Auth + Workspace RBAC]
  API --> Docs[Document Service]
  Docs --> Storage["/uploads volume"]
  Docs --> Worker[Indexing path]
  Worker --> Chunks[(PostgreSQL document chunks)]
  API --> Chat[Streaming Chat Route]
  Chat --> Graph[LangGraph RAG Nodes]
  Graph --> Retrieval[pgvector + Full-text + RRF]
  Graph --> Gateway[DeepSeek/Qwen/OpenAI-compatible or demo-local]
  Graph --> Langfuse[Optional Langfuse Trace]
  Chat --> Billing[Credits + Usage Logs]
  Billing --> DB[(PostgreSQL)]

What to verify live

Guest demo login finds the seeded SmartDocs workspace and opens the dashboard.
Four seeded documents are indexed and available for cited RAG answers.
Chat streams tokens, returns citations, exposes retrieval debug data, and records provider metadata.
Successful calls deduct credits and create usage records; failed calls do not deduct credits.
Viewers can read documents and ask questions; write controls are hidden in the UI.

Document Indexing

Uploads are stored per workspace, parsed into text, split into chunks, and persisted with document metadata. The demo seed endpoint creates review-ready documents so the public deployment can be tested without manual setup.
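The chunking step can be illustrated with a minimal sketch. The page does not state the real strategy, so the fixed-size character windows, the overlap, the default sizes, and the function name split_into_chunks are all assumptions made for illustration.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows (hypothetical strategy)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            # Skip windows that contain only whitespace.
            chunks.append(piece)
        if start + chunk_size >= len(text):
            # The current window already reaches the end of the text.
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval at the cost of some duplicated storage.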

RAG Retrieval

The chat flow uses a RetrievalService with pgvector similarity SQL, PostgreSQL full-text search, and reciprocal rank fusion. If a local database lacks vector support, the service falls back to deterministic Python ranking for demo continuity.
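Reciprocal rank fusion can be sketched in a few lines. This is not the code in retrieval_service.py; the function name, the chunk ids, and the constant k = 60 (a commonly used default) are assumptions. Each ranked list contributes 1 / (k + rank) to a chunk's score, and chunks are ordered by the summed score.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked lists of chunk ids into one fused ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk near the top of any list receives a larger contribution.
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["c1", "c2", "c3"]      # hypothetical pgvector result order
fulltext_hits = ["c2", "c4", "c1"]    # hypothetical full-text result order
fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
```

Because only ranks are used, the vector distances and full-text scores never need to be normalized onto a common scale.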

Model Gateway

The no-key production path is explicitly labeled demo-local. The backend is structured so real DeepSeek, Qwen, or OpenAI-compatible model calls can replace deterministic demo answers through environment configuration.
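A minimal sketch of environment-driven provider selection follows, using the order listed under Engineering details (DeepSeek, then Qwen, then OpenAI-compatible, then demo-local). The environment variable names and the function name provider_chain are hypothetical; the real model_gateway.py may use different names and may also fall back on runtime errors.

```python
# Hypothetical variable names; the real configuration keys may differ.
PROVIDER_KEYS = [
    ("deepseek", "DEEPSEEK_API_KEY"),
    ("qwen", "QWEN_API_KEY"),
    ("openai-compatible", "OPENAI_COMPATIBLE_API_KEY"),
]


def provider_chain(env: dict[str, str]) -> list[str]:
    """Return configured providers in priority order, or demo-local if none."""
    configured = [name for name, key_var in PROVIDER_KEYS if env.get(key_var, "").strip()]
    # With no keys at all, fall back to the deterministic demo provider.
    return configured or ["demo-local"]
```

In a real process the function would be called as provider_chain(dict(os.environ)); passing the mapping explicitly keeps the sketch easy to test.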

RBAC

Workspace roles are returned with dashboard data and used by the UI to hide upload, re-index, invite, and settings actions from guest/viewer sessions.
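The role-to-action mapping might look like the sketch below. Only the guest and viewer roles and the four hidden actions come from this page; the owner and admin role names and both function names are assumptions. Unknown roles default to read-only so that a misconfigured role never gains write access.

```python
READ_ACTIONS = {"read", "ask"}
WRITE_ACTIONS = {"upload", "reindex", "invite", "settings"}
WRITE_ROLES = {"owner", "admin"}  # hypothetical role names


def visible_actions(role: str) -> set[str]:
    """Actions the UI should show for a role; anything unknown is read-only."""
    if role in WRITE_ROLES:
        return READ_ACTIONS | WRITE_ACTIONS
    return set(READ_ACTIONS)


def can_perform(role: str, action: str) -> bool:
    """The same rule, usable as a server-side check before a mutation."""
    return action in visible_actions(role)
```

Hiding a control in the UI is a convenience only; the same check should also run on the API side before any write is accepted.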

Credits

Credit deduction happens after a successful AI response and is written alongside usage logs. This keeps billing visible and prevents failed model calls from consuming balance.
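The deduct-after-success rule can be sketched with an in-memory ledger. The Ledger class, the run_billed_call function, and the one-credit cost are illustrative assumptions; the real billing_service.py writes to PostgreSQL.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Ledger:
    balance: int
    usage_logs: list[dict] = field(default_factory=list)


def run_billed_call(ledger: Ledger, generate: Callable[[], str], cost: int = 1) -> Optional[str]:
    """Run a model call and deduct credits only if it succeeds."""
    if ledger.balance < cost:
        # Precheck: refuse before any provider call is made.
        raise PermissionError("insufficient credits")
    try:
        answer = generate()
    except Exception as exc:
        # Log the failure with zero credits; keep only the exception type so
        # provider messages that may contain secrets are not stored.
        ledger.usage_logs.append({"status": "failed", "credits": 0, "error": type(exc).__name__})
        return None
    ledger.balance -= cost
    ledger.usage_logs.append({"status": "success", "credits": cost})
    return answer
```

In a database-backed version the deduction and the usage log would normally share one transaction, so a crash between the two cannot leave them out of step.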

LangGraph Flow

The RAG request moves through validate_access, check_credits, rewrite_query, retrieve, build_context, generate, and finalize nodes. Finalize deducts credits and writes the usage log after generation succeeds.
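The node sequence can be illustrated without the LangGraph library as a plain pipeline over a state dictionary. The node names match the list above; every node body, the state keys, and the run_rag function are stand-in assumptions, not the contents of rag_graph.py.

```python
def validate_access(state: dict) -> dict:
    if not state.get("is_member"):
        raise PermissionError("not a workspace member")
    return {}

def check_credits(state: dict) -> dict:
    if state["credits"] < 1:
        raise RuntimeError("insufficient credits")
    return {}

def rewrite_query(state: dict) -> dict:
    return {"query": state["question"].strip().lower()}

def retrieve(state: dict) -> dict:
    # Toy keyword match standing in for hybrid retrieval.
    words = state["query"].split()
    return {"chunks": [c for c in state["corpus"] if any(w in c.lower() for w in words)]}

def build_context(state: dict) -> dict:
    return {"context": "\n".join(state["chunks"])}

def generate(state: dict) -> dict:
    return {"answer": f"Based on {len(state['chunks'])} chunk(s): {state['context'][:60]}"}

def finalize(state: dict) -> dict:
    # Runs last, so credits change only after generation has succeeded.
    return {"credits": state["credits"] - 1, "usage_log": {"status": "success", "credits": 1}}

NODES = [validate_access, check_credits, rewrite_query, retrieve, build_context, generate, finalize]

def run_rag(initial: dict) -> dict:
    state = dict(initial)  # copy so the caller's state is never mutated
    for node in NODES:
        state.update(node(state))  # an exception stops the run before finalize
    return state
```

Because finalize is the last node, any exception raised earlier leaves the credit balance untouched.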

Tenant Isolation

Every workspace route and API request is scoped by workspace id. Documents, chunks, credits, members, settings, and usage records are tenant-owned surfaces.
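Workspace scoping at the query level can be sketched as follows. SQLite stands in for PostgreSQL so the example runs anywhere; the table layout, the workspace ids, and the function names are assumptions for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with documents in two workspaces."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, workspace_id TEXT NOT NULL, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO documents (workspace_id, title) VALUES (?, ?)",
        [("ws_a", "Handbook"), ("ws_a", "Pricing"), ("ws_b", "Roadmap")],
    )
    return conn


def list_documents(conn: sqlite3.Connection, workspace_id: str) -> list[str]:
    """Return titles for one workspace only; the filter is never optional."""
    rows = conn.execute(
        "SELECT title FROM documents WHERE workspace_id = ? ORDER BY title",
        (workspace_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Making workspace_id a required parameter of every repository function means a caller cannot issue an unscoped query by accident.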

Current public demo mode

The public demo may use demo-local provider mode for stability and cost control.

Real providers supported by the backend include DeepSeek, Qwen, and OpenAI-compatible APIs when keys are configured.

EmbeddingGateway supports Qwen embeddings when configured and deterministic demo-local embeddings for the public no-key demo.
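A deterministic, key-free embedding can be built with the hashing trick. This is a sketch under assumptions: the function name, the 64-dimension size, and the SHA-256 bucketing are illustrative and may not match embedding_gateway.py.

```python
import hashlib
import math


def hash_embedding(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size unit vector using token hashes only."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0        # signed to reduce collision bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize so cosine similarity reduces to a dot product; empty text stays all zeros.
    return [v / norm for v in vec] if norm else vec
```

The same text always yields the same vector, which keeps demo answers reproducible, but such vectors capture token overlap rather than meaning.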

China-market contact

WeChat: mgamal012

Email: mohamed.gamalj8@gmail.com

For HR review, technical discussion, collaboration, or hiring discussion.

China contact: WeChat mgamal012. Suitable for HR communication, technical review, project collaboration, or interview discussion.

Request flow

1. The browser authenticates with JWT and sends workspace-scoped API requests.

2. FastAPI checks the workspace membership and role before returning tenant data.

3. LangGraph runs access, credit, retrieval, context, generation, and finalize nodes.

4. Credits and usage logs are written only after the answer is complete.

Known limitations

The public demo uses a deterministic demo-local provider when external model keys are absent.

Langfuse traces are disabled in the public demo unless LANGFUSE keys are configured.

Settings and invites are intentionally read-only in the public demo to avoid mutation risk on the shared deployment.

Engineering details

Document indexing flow: upload, type validation, extraction, chunking, embedding generation, chunk persistence, and indexed status update.

RAG chat flow: JWT workspace request, RBAC check, credit precheck, retrieval, context build, ModelGateway generation, citations, billing, and usage log.

Hybrid retrieval algorithm: pgvector distance and PostgreSQL full-text search are merged by reciprocal rank fusion.

ModelGateway provider strategy: DeepSeek first, Qwen fallback, OpenAI-compatible optional fallback, then demo-local when no keys are configured.

Embedding provider strategy: Qwen embeddings when configured, deterministic hash embeddings for public demo fallback.

Failed-call no-deduction logic: exceptions roll back pending writes, add failed usage logs with zero credits, and avoid exposing provider secrets.

Usage logs and audit trail: status, provider, model, tokens, latency, credits, trace id, and error details are recorded.

RBAC and tenant isolation: workspace membership gates routes; guest/viewer users see read-only UI for risky actions.

Langfuse observability: tracing code is implemented and safely disabled unless Langfuse keys are configured.

Conversation history: each RAG call can persist user and assistant messages with citations, provider, model, tokens, credits, latency, and trace id.

Security guardrails: document content is treated as untrusted context and workspace membership gates protected routes.

Test coverage summary: backend gateway, retrieval, security, and RRF tests run in CI alongside frontend type-check, lint, and build.

Production deployment notes: the live Vercel deployment serves frontend routes and FastAPI under the /_/api prefix.

Source files to review

services/api/app/services/chat_service.py
services/api/app/services/retrieval_service.py
services/api/app/repositories/retrieval_repository.py
services/api/app/ai/model_gateway.py
services/api/app/ai/embedding_gateway.py
services/api/app/rag/rag_graph.py
services/api/app/models/conversation.py
services/api/app/observability/tracing.py
services/api/app/services/document_service.py
services/api/app/services/billing_service.py
services/api/app/api/v1/chat.py
services/api/app/api/v1/admin.py
apps/web/app/demo/page.tsx
apps/web/app/workspaces/[workspaceId]/chat/page.tsx
apps/web/app/workspaces/[workspaceId]/usage/page.tsx