Reflex deep dive
Reflex is a Software Development Kit (SDK) that wraps document loading, parsing, chunking, vector stores, and multiple agents.
Modular architecture
Reflex follows a modular architecture with a clear separation of concerns, making it easy to understand, extend, and maintain.
- Core Layer: Logging, tracing, error handling, and base types used across the entire SDK.
- Data Layer: Database connectors, vector stores, and storage providers for data persistence.
- Processing Layer: Document loaders, parsers, chunking strategies, and embedding generation.
- AI Layer: LLM providers, agents, prompts, and AI orchestration components.
- Engine Layer: High-level orchestration with Text2SQL and Retrieval Augmented Generation (RAG) engines for end-to-end workflows.
- Evaluation Layer: Metrics, test harness, and datasets for quality assurance and monitoring.
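As an illustrative sketch of the layering (class names here are hypothetical, not the Reflex API), each layer depends only on the layers below it:

```python
class CoreLogger:
    """Core Layer: logging shared by every other layer."""
    def log(self, msg: str) -> None:
        print(f"[reflex] {msg}")

class VectorStore:
    """Data Layer: depends only on the Core Layer."""
    def __init__(self, logger: CoreLogger):
        self.logger = logger
        self.records: list[tuple[str, list[float]]] = []

class Chunker:
    """Processing Layer: splits text on blank lines (toy stand-in)."""
    def split(self, text: str) -> list[str]:
        return [p for p in text.split("\n\n") if p.strip()]

class RAGEngine:
    """Engine Layer: orchestrates the layers below it."""
    def __init__(self, store: VectorStore, chunker: Chunker):
        self.store = store
        self.chunker = chunker
```

Because dependencies point in one direction only, a layer can be swapped or tested in isolation without touching the layers above it.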
Core capabilities
Follow the data
RAG processing flow
From document upload to cited, structured answer.
1 Document upload → PDF, Excel, images
2 Loader → PDFLoader, VLMPDFLoader, ImageLoader, ExcelLoader
3 Parsing → VLMParser or PDFPlumberParser
4 Chunking & metadata extraction → paragraph, semantic, tokens, context-aware, AI-driven
5 Embedding & indexing → Azure OpenAI embeddings
6 Retrieval agent → streaming structured output with source citations
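The six steps above can be sketched as a pipeline of stubbed components. Every function here is an illustrative stand-in (the real SDK would use its loaders, parsers, embeddings, and retrieval agent), but the data flow matches the list:

```python
def load(path: str) -> list[str]:
    return ["page one text", "page two text"]      # 1-2: upload + loader (stub)

def parse(pages: list[str]) -> str:
    return "\n\n".join(pages)                      # 3: parsing (stub)

def chunk(text: str) -> list[str]:
    return text.split("\n\n")                      # 4: paragraph chunking

def embed_and_index(chunks: list[str]) -> dict[str, list[str]]:
    # 5: a keyword inverted index standing in for a vector index
    index: dict[str, list[str]] = {}
    for i, c in enumerate(chunks):
        for word in c.split():
            index.setdefault(word, []).append(f"chunk-{i}")
    return index

def retrieve(index: dict[str, list[str]], question: str) -> dict:
    # 6: structured output with source citations
    hits = {src for w in question.split() for src in index.get(w, [])}
    return {"answer": "(LLM answer would go here)", "citations": sorted(hits)}

def run_rag(path: str, question: str) -> dict:
    return retrieve(embed_and_index(chunk(parse(load(path)))), question)
```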
Text2SQL processing flow
From natural language question to SQL result with summary.
1 Query input → User submits natural language question
2 Schema retrieval → Fetch database schema and metadata
3 Enhancement → AI enhances query with schema context and instructions
4 SQL generation → Optimised SQL for target dialect
5 Execution → Runs against any SQL database
6 Summarisation → AI generates natural language summary
7 Result return → Structured result with retrieved dataset and model response
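The flow above can be sketched end to end against an in-memory SQLite database. The LLM steps (enhancement, SQL generation, summarisation) are stubbed with fixed values; only the introspection and execution steps are real:

```python
import sqlite3

def text2sql(question: str) -> dict:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("north", 10.0), ("south", 25.0)])

    # 2: schema retrieval via introspection
    schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'").fetchall()

    # 3-4: enhancement + dialect-aware SQL generation (an LLM call in
    # practice; hard-coded here for illustration)
    sql = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"

    rows = conn.execute(sql).fetchall()            # 5: execution
    summary = f"{len(rows)} regions returned"      # 6: summarisation (stub)
    return {"schema": schema, "rows": rows, "summary": summary}  # 7: result
```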
Shared infrastructure
The capabilities that power Reflex
File Loaders
A flexible, multimodal ingestion layer built to handle diverse file types from simple documents to complex, visually rich PDFs.
- PDFLoader: pdfplumber-based extraction for text, tables, images, headers, footers, footnotes, and citations
- VLMPDFLoader: Vision-language model parsing for advanced PDF understanding
- ImageLoader: OCR for text extraction and/or VLM for detailed image descriptions
- ExcelLoader: Converts spreadsheets into CSV with structured metadata
Vision-Language PDF Parsing (VLM)
Leverage advanced language models with built-in visual understanding to extract rich, structured data from complex PDFs. Beyond standard text and table parsing, VLM analyzes charts and images using vision models to capture insights that traditional parsers miss.
- Comprehensive extraction: Text, tables, charts, and images
- Structured outputs: Extract structured data from tables, charts and images
- Batch processing: Efficiently process 1–5 pages per API call
- Custom schemas: Define your own JSON structure with built-in validation
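A custom extraction schema with validation might look like this sketch, using a stdlib dataclass as a stand-in for the SDK's built-in JSON-schema validation (field names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class ChartExtraction:
    """Target shape for data extracted from a chart by the VLM parser."""
    title: str
    x_axis: str
    y_axis: str
    series: list[dict]

    def __post_init__(self):
        # Validation stand-in: reject obviously malformed extractions.
        if not self.title:
            raise ValueError("chart title must not be empty")
        for point in self.series:
            if not {"label", "value"} <= point.keys():
                raise ValueError("each point needs 'label' and 'value'")
```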
Chunking Strategies
Five configurable chunking approaches ensure optimal downstream processing for retrieval, search, and LLM workflows:
- Paragraph: Table-aware splitting with preserved markdown tables
- Semantic: Recursive chunking guided by document structure and section headers
- Tokens: Token-based segmentation using tiktoken
- Context-Aware: Enhances chunks with neighboring context for better continuity
- AI-Driven: LLM-powered chunking with automatic metadata extraction
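As a toy illustration of the paragraph strategy's table awareness (not the real implementation), splitting only on blank lines keeps a markdown table in one chunk because its rows are contiguous:

```python
def paragraph_chunks(text: str) -> list[str]:
    """Split on blank lines; contiguous lines (e.g. table rows) stay together."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            chunks.append("\n".join(current))
            current = []
    if current:
        chunks.append("\n".join(current))
    return chunks
```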
Vector Stores & Embeddings
A modular retrieval layer supporting multiple vector backends and embedding strategies for scalable semantic search.
- Pluggable vector stores: Support for both local and cloud-based indexes
- Advanced retrieval modes: Hybrid, similarity, and semantic search options
- Intelligent filtering: LLM-assisted query parsing for structured filter generation
- Metadata handling: Automatic sanitization and normalization for consistent indexing
- Embedding models: Compatible with modern embedding APIs and open-source models
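The pluggable-store idea can be sketched with a minimal in-memory backend and cosine-similarity search. The letter-frequency `embed()` is a toy stand-in for a real embedding model, and the class name is illustrative:

```python
import math

def embed(text: str) -> list[float]:
    """Toy embedding: 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryVectorStore:
    def __init__(self):
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

A cloud-backed store would expose the same `add`/`search` surface, which is what makes the backends swappable.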
Database Connectivity
Flexible integration with analytical and transactional databases, optimized for query generation and data exploration.
- Relational databases: Full schema introspection with optimized SQL query generation
- Analytical engines: High-performance querying with embedded and in-process databases
- Lakehouse support: Integration with modern data lakehouse architectures and catalog systems
- Query optimization: Intelligent query construction tailored to underlying database capabilities
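Schema introspection, which the Text2SQL flow feeds to the LLM as context, can be sketched against SQLite's catalog (the function name and output shape are illustrative):

```python
import sqlite3

def introspect(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name to its list of 'column TYPE' strings."""
    schema: dict[str, list[str]] = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [f"{name} {ctype}" for _, name, ctype, *_ in cols]
    return schema
```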
Storage
Flexible, environment-aware storage designed to support both local development and production-scale deployments.
- Multi-backend support: Local file storage for development, scalable object storage for production
- Environment-aware configuration: Automatically adapts storage strategy based on runtime context
- Metadata persistence: Store document metadata, embeddings, and intermediate artifacts
- Caching layer: Intelligent caching of parsed data and query results to reduce latency and cost
- Extensible design: Plug in custom storage backends as needed
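Environment-aware backend selection might look like the sketch below. `REFLEX_ENV` is a hypothetical variable, and the production branch is deliberately left as a placeholder where an object-store backend would plug in:

```python
import os
import tempfile
from pathlib import Path

class LocalStorage:
    """Development backend: keys map to files under a root directory."""
    def __init__(self, root: str):
        self.root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()

def storage_for_env() -> LocalStorage:
    env = os.environ.get("REFLEX_ENV", "development")
    if env == "production":
        raise NotImplementedError("object-store backend would plug in here")
    return LocalStorage(tempfile.mkdtemp())
```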
Evaluation
A built-in evaluation framework to systematically measure, compare, and improve system performance.
- Standardized metrics: Evaluate quality across accuracy, relevance, and consistency
- Test harness: Run repeatable evaluations across prompts, queries, and pipelines
- Dataset management: Create, version, and benchmark against curated datasets
- Comparative evaluation: Test different models, prompts, or configurations side-by-side
- Pre-deployment validation: Ensure quality before releasing to production
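A minimal harness in this spirit runs a system over a labelled dataset and reports exact-match accuracy (the function and dataset shape are illustrative, not the SDK's evaluation API):

```python
from typing import Callable

def evaluate(system: Callable[[str], str],
             dataset: list[tuple[str, str]]) -> float:
    """Fraction of (query, expected) pairs the system answers exactly."""
    correct = sum(1 for q, expected in dataset if system(q) == expected)
    return correct / len(dataset)
```

Running the same dataset against two models or prompt variants and comparing the scores is the side-by-side comparison described above.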
Reliability
Behind the scenes, a robust middleware layer handles logging, error management, and environment-aware storage—ensuring reliability from development through production.
- Performance tracking: Monitor accuracy, latency, and overall system behavior
Built to scale with you—from experimentation to production.
Reflex follows well-established software engineering patterns so your team can extend, maintain, and deploy with confidence.
- Factory pattern: Used for creating embeddings, vector stores, databases, and storage providers. Swap implementations without changing business logic.
- Strategy pattern: Chunking strategies, retrieval methods, and parsing approaches can be selected and configured at runtime.
- Dependency injection: Components accept dependencies through constructor injection, enabling easy testing and customisation.
- Configuration as code: Pydantic models for type-safe configuration with validation, environment variable support, and clear schemas.
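The patterns above compose naturally; the sketch below shows a factory of chunking strategies, selected at runtime from a config object and injected into an engine through its constructor. A stdlib dataclass stands in for the Pydantic config models, and all names are illustrative:

```python
from dataclasses import dataclass
from typing import Callable

def by_paragraph(text: str) -> list[str]:
    return [p for p in text.split("\n\n") if p]

def by_tokens(text: str) -> list[str]:
    # Toy stand-in for token-based chunking: three words per chunk.
    words = text.split()
    return [" ".join(words[i:i + 3]) for i in range(0, len(words), 3)]

# Strategy pattern: interchangeable chunkers behind one callable signature.
CHUNKERS: dict[str, Callable[[str], list[str]]] = {
    "paragraph": by_paragraph,
    "tokens": by_tokens,
}

@dataclass
class EngineConfig:
    """Configuration as code (dataclass standing in for a Pydantic model)."""
    chunking: str = "paragraph"

class Engine:
    def __init__(self, chunker: Callable[[str], list[str]]):
        self.chunker = chunker          # dependency injection

    def ingest(self, text: str) -> list[str]:
        return self.chunker(text)

def make_engine(cfg: EngineConfig) -> Engine:
    """Factory pattern: build an engine from validated configuration."""
    return Engine(CHUNKERS[cfg.chunking])
```

Because `Engine` only sees a callable, tests can inject a trivial chunker without touching the registry or the config model.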

