RAG Playground

Interactive RAG Pipeline

Educational RAG Platform

Interactive web application that demonstrates and visualizes the complete Retrieval-Augmented Generation (RAG) pipeline. Experiment with document ingestion, preprocessing, chunking strategies, embeddings, vector search, and AI-powered generation across multiple providers.

RAG Pipeline Stages

1. Ingestion

Upload PDF, TXT, MD, HTML files. Supports local filesystem and Supabase S3 cloud storage with automatic fallback. Uses pdfplumber for PDF extraction with PyPDF2 fallback.
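
The pdfplumber-to-PyPDF2 fallback described above can be expressed as a small wrapper. This is a minimal sketch: `extract_with_fallback` and the stub extractors are hypothetical names for illustration, not the project's actual API.

```python
from typing import Callable

def extract_with_fallback(path: str,
                          primary: Callable[[str], str],
                          fallback: Callable[[str], str]) -> str:
    """Try the primary extractor (e.g. pdfplumber); fall back (e.g. to PyPDF2)
    if it raises or returns empty text."""
    try:
        text = primary(path)
        if text and text.strip():
            return text
    except Exception:
        pass  # swallow the primary extractor's failure and try the fallback
    return fallback(path)
```

In the real pipeline the two callables would wrap `pdfplumber.open(...)` and `PyPDF2.PdfReader(...)`; injecting them as parameters keeps the fallback logic independently testable.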

2. Preprocessing

Configurable text cleaning: lowercase conversion, whitespace normalization, URL/email removal, OCR artifact fixes, and Unicode normalization with detailed statistics.
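
A minimal sketch of such a configurable cleaning pass (function name and statistics keys are illustrative assumptions):

```python
import re
import unicodedata

def preprocess(text: str, lowercase: bool = True,
               strip_urls: bool = True) -> tuple[str, dict]:
    """Apply configurable cleaning steps; return cleaned text plus statistics."""
    original_len = len(text)
    text = unicodedata.normalize("NFKC", text)  # Unicode normalization
    if strip_urls:
        # remove URLs and email addresses
        text = re.sub(r"https?://\S+|\S+@\S+\.\S+", "", text)
    if lowercase:
        text = text.lower()
    text = re.sub(r"\s+", " ", text).strip()    # whitespace normalization
    stats = {"chars_in": original_len, "chars_out": len(text)}
    return text, stats
```

Returning the statistics alongside the text is what lets the UI report how much each cleaning step removed.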

3. Chunking

Three strategies: Fixed-size (1000 characters with 200-character overlap), Recursive (paragraph-aware), and Semantic (sentence-boundary). All produce chunks with token counts and metadata.
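
The fixed-size strategy is the simplest of the three; a sketch of it (function name and metadata fields are assumptions):

```python
def fixed_size_chunks(text: str, size: int = 1000,
                      overlap: int = 200) -> list[dict]:
    """Split text into fixed-size character windows with overlap,
    attaching positional metadata to each chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start, idx = [], 0, 0
    while start < len(text):
        piece = text[start:start + size]
        chunks.append({"index": idx, "start": start, "text": piece})
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += size - overlap  # slide forward, keeping `overlap` chars
        idx += 1
    return chunks
```

The recursive and semantic variants differ only in where they place split points (paragraph breaks or sentence boundaries) rather than in this sliding-window mechanic.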

4. Embedding

Generate vectors using Google Gemini (768-dim), OpenAI text-embedding-3-small/large (1536/3072-dim), or ada-002. Batch processing with error handling.
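
Dimension auto-detection can be as simple as a lookup table keyed by model name. The table below mirrors the dimensions listed above; the function name and the Gemini model identifier are assumptions for illustration:

```python
# Model -> vector dimension, matching the providers listed above.
EMBEDDING_DIMS = {
    "models/embedding-001": 768,      # Google Gemini (assumed identifier)
    "text-embedding-3-small": 1536,   # OpenAI
    "text-embedding-3-large": 3072,   # OpenAI
    "text-embedding-ada-002": 1536,   # OpenAI legacy
}

def embedding_dim(model: str) -> int:
    """Resolve the vector dimension for a known embedding model."""
    try:
        return EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model}") from None
```

Knowing the dimension up front is what allows later stages to create correctly sized vector collections and to catch mismatches before indexing.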

5. Vector Indexing

Index to Qdrant vector database with HNSW algorithm. Auto-creates collections, handles dimension mismatches, stores full chunk content in payloads for reconstruction.
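
Storing the full chunk text in each point's payload is what makes reconstruction possible at retrieval time. A provider-agnostic sketch of that point-building step, with dimension checking (names are hypothetical; the real code would hand these dicts to `qdrant-client`):

```python
def build_points(chunks: list[dict], vectors: list[list[float]],
                 expected_dim: int) -> list[dict]:
    """Pair chunks with their vectors as point records, keeping the full
    chunk text in the payload so results need no second lookup."""
    if len(chunks) != len(vectors):
        raise ValueError("chunk/vector count mismatch")
    points = []
    for i, (chunk, vec) in enumerate(zip(chunks, vectors)):
        if len(vec) != expected_dim:  # catch dimension mismatches early
            raise ValueError(
                f"vector {i} has dim {len(vec)}, expected {expected_dim}")
        points.append({
            "id": i,
            "vector": vec,
            "payload": {"text": chunk["text"], **chunk.get("metadata", {})},
        })
    return points
```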

6. Retrieval

Similarity search with query embedding, model auto-detection, and metadata filtering. Returns top-k chunks with scores and latency metrics.
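
Under the hood, similarity search reduces to scoring stored vectors against the embedded query and keeping the best k. A pure-Python cosine-similarity sketch of that core (in production Qdrant's HNSW index does this approximately and far faster):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], points: list[dict], k: int = 5):
    """Score every stored point against the query; return the best k
    as (score, point) pairs, highest first."""
    scored = [(cosine(query_vec, p["vector"]), p) for p in points]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]
```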

7. Reranking

Re-ranks retrieved chunks using keyword-overlap scoring. Production systems would typically use cross-encoder models or the Cohere Rerank API instead.
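
Keyword-overlap scoring of the kind described above can be sketched in a few lines (function names are illustrative):

```python
def keyword_overlap_score(query: str, chunk_text: str) -> float:
    """Fraction of query terms that also appear in the chunk (0.0 to 1.0)."""
    q_terms = set(query.lower().split())
    c_terms = set(chunk_text.lower().split())
    return len(q_terms & c_terms) / len(q_terms) if q_terms else 0.0

def rerank(query: str, chunks: list[str]) -> list[str]:
    """Order chunks by overlap with the query, best match first."""
    return sorted(chunks,
                  key=lambda c: keyword_overlap_score(query, c),
                  reverse=True)
```

A cross-encoder would replace `keyword_overlap_score` with a learned relevance model while keeping the same sort-by-score shape.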

8. Generation

Generate responses using OpenAI GPT-3.5/4, Google Gemini, or Groq Llama. Multiple prompt templates: default, detailed, concise, step-by-step reasoning.
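
Prompt templates are essentially format strings that wrap retrieved context around the question. A sketch assuming a simple registry (template texts and the `step_by_step` key are hypothetical, named after the four styles listed above):

```python
# Hypothetical template registry mirroring the four styles above.
PROMPT_TEMPLATES = {
    "default": ("Answer using only the context below.\n\n"
                "Context:\n{context}\n\nQuestion: {question}"),
    "detailed": ("Give a thorough, well-structured answer grounded in the "
                 "context.\n\nContext:\n{context}\n\nQuestion: {question}"),
    "concise": ("Answer in one or two sentences using the context.\n\n"
                "Context:\n{context}\n\nQuestion: {question}"),
    "step_by_step": ("Reason step by step from the context before answering."
                     "\n\nContext:\n{context}\n\nQuestion: {question}"),
}

def build_prompt(question: str, chunks: list[str],
                 template: str = "default") -> str:
    """Join retrieved chunks into a context block and fill the template."""
    context = "\n---\n".join(chunks)
    return PROMPT_TEMPLATES[template].format(context=context,
                                             question=question)
```

The resulting string is what gets sent to whichever LLM provider is configured; only the client call differs between OpenAI, Gemini, and Groq.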

9. Evaluation

Track latency, token counts, estimated costs, faithfulness, relevance, context utilization, and response quality metrics for pipeline optimization.
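
Cost estimation from token counts reduces to a price-table lookup. A sketch with deliberately illustrative prices (real per-token rates change frequently, so treat the numbers as placeholders):

```python
# Illustrative (input, output) USD prices per 1K tokens -- placeholders only.
PRICES_PER_1K = {
    "gpt-3.5-turbo": (0.0005, 0.0015),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough request cost in USD from token counts and a price table."""
    p_in, p_out = PRICES_PER_1K[model]
    return input_tokens / 1000 * p_in + output_tokens / 1000 * p_out
```

Token counts themselves come from tiktoken (for OpenAI models), which is why it appears in the pipeline's dependency list.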

Architecture Highlights

Modular Pipeline Design

Each RAG stage is independently testable with clear input/output contracts. Stages can be swapped or upgraded without affecting others.
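
One way to express such a stage contract in Python is a `Protocol`; this sketch assumes a dict-in/dict-out shape for the contract (the real project's interfaces may differ):

```python
from typing import Any, Protocol

class Stage(Protocol):
    """Contract each pipeline stage satisfies: one payload dict in, one out."""
    name: str
    def run(self, payload: dict[str, Any]) -> dict[str, Any]: ...

def run_pipeline(stages: list[Stage], payload: dict[str, Any]) -> dict[str, Any]:
    """Feed each stage's output into the next.

    Because every stage only sees a payload dict, any stage can be
    swapped or unit-tested in isolation."""
    for stage in stages:
        payload = stage.run(payload)
    return payload
```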

Provider Abstraction

Unified interfaces for embeddings (Google/OpenAI) and LLMs (OpenAI/Gemini/Groq) enable easy provider switching via configuration.

Storage Flexibility

Environment-based switching between local filesystem and Supabase S3 with automatic fallback ensures development and production compatibility.

Observability First

Structured logging with request IDs, stage tracking, and error traces. Frontend log viewer provides real-time pipeline visibility.
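
Request-ID and stage tagging can be done with a standard-library logging filter; this is a minimal sketch (class and function names are assumptions, not the project's actual logging module):

```python
import logging
import uuid

class RequestContextFilter(logging.Filter):
    """Injects a request_id and stage name into every log record, so the
    formatter can emit them as structured fields."""
    def __init__(self, request_id: str, stage: str):
        super().__init__()
        self.request_id, self.stage = request_id, stage

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = self.request_id
        record.stage = self.stage
        return True  # never drop records, only annotate them

def make_logger(stage: str) -> logging.Logger:
    """Build a stage-scoped logger with a fresh short request ID."""
    logger = logging.getLogger(f"rag.{stage}")
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(
        "%(request_id)s %(stage)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.addFilter(RequestContextFilter(uuid.uuid4().hex[:8], stage))
    logger.setLevel(logging.INFO)
    return logger
```

Filters attached to the logger run before any handler, so every handler (console, file, or the frontend log stream) sees the same annotated records.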

Technology Stack

Frontend

Next.js 16 · React 19 · TypeScript 5 · Tailwind 4 · shadcn/ui · Zustand · React Flow · Recharts

Backend

FastAPI 0.135 · Python 3.11 · Uvicorn · Pydantic 2

RAG Pipeline

pdfplumber · qdrant-client · google-generativeai · openai · groq · tiktoken · scikit-learn

Infrastructure

Supabase S3 · Qdrant Cloud · Netlify · Docker

Key Features

  • Interactive visual pipeline with React Flow DAG visualization
  • Multi-provider embedding support (Google, OpenAI) with dimension auto-detection
  • Dual vector store support: Local FAISS and Qdrant Cloud with automatic failover
  • Three chunking strategies with configurable parameters and token counting
  • Real-time logging with color-coded levels and stage filtering
  • Multiple LLM providers: OpenAI, Google Gemini, Groq with prompt templates
  • Comprehensive evaluation metrics: latency, tokens, cost, faithfulness, relevance
  • Cloud-ready architecture with environment-based configuration switching

External Services

Supabase S3

Cloud file storage with automatic fallback to local filesystem

Qdrant Cloud

Vector database with HNSW indexing and payload storage

LLM APIs

OpenAI, Google Gemini, Groq for embeddings and generation