DAKSH is architected as a modular AI stack, where each component can function independently or be scaled as needed. The core subsystems include:
a) Query Ingestion Layer
The entry point of the DAKSH pipeline, this subsystem accepts queries from diverse channels — web clients, mobile apps, APIs, WhatsApp bots, Slack workspaces, and voice interfaces. Text normalization, language detection, and tokenization happen here. All interactions are time-stamped, securely logged, and passed into a preprocessing buffer.
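The sketch below illustrates what such an ingestion step might look like, using the open-source langdetect package as a stand-in for DAKSH's language detection; the IngestedQuery structure and field names are illustrative assumptions, not DAKSH's actual internals.

```python
# Minimal ingestion sketch, assuming the langdetect package is installed.
# IngestedQuery and the buffer hand-off are illustrative, not DAKSH's API.
import time
import unicodedata
from dataclasses import dataclass, field

from langdetect import detect  # pip install langdetect


@dataclass
class IngestedQuery:
    text: str                # normalized query text
    language: str            # ISO 639-1 code guessed by langdetect
    channel: str             # e.g. "web", "whatsapp", "slack", "voice"
    timestamp: float = field(default_factory=time.time)


def ingest(raw_text: str, channel: str) -> IngestedQuery:
    # Unicode normalization and whitespace cleanup
    text = unicodedata.normalize("NFKC", raw_text).strip()
    language = detect(text)
    query = IngestedQuery(text=text, language=language, channel=channel)
    # In the real pipeline the query would also be securely logged and
    # pushed to a preprocessing buffer before retrieval begins.
    return query
```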
b) Embedding and Indexing Engine
DAKSH uses custom-trained transformer encoders optimized for enterprise data. These encoders process knowledge-base files, chunk them semantically, and generate vector embeddings. Each embedded chunk carries metadata such as source document, section headers, timestamp, and user-defined tags. The embeddings are stored in a highly optimized vector database (FAISS or DeepQuery’s proprietary alternative).
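As a rough illustration, the following sketch indexes pre-chunked text with an off-the-shelf sentence-transformer and FAISS; DAKSH's custom encoders and its proprietary vector store would slot into the same pattern, and the sample chunks and metadata fields are assumptions for the example only.

```python
# Illustrative chunk embedding and indexing with FAISS; the encoder model
# and sample chunks are stand-ins, not DAKSH's trained components.
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for DAKSH encoders

chunks = [
    {"text": "Refunds are processed within 5 business days.",
     "source": "refund_policy.pdf", "section": "Processing", "tags": ["billing"]},
    {"text": "API keys can be rotated from the admin console.",
     "source": "admin_guide.md", "section": "Security", "tags": ["api"]},
]

# Encode chunk texts and L2-normalize so inner product equals cosine similarity
vectors = encoder.encode([c["text"] for c in chunks]).astype("float32")
faiss.normalize_L2(vectors)

index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)       # vector store
metadata = chunks        # parallel metadata store, keyed by row id
```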
c) Retriever and Reranker
When a query arrives, a semantic similarity search retrieves the top-k document chunks. To refine these results, DAKSH applies a reranker model trained on supervised QA relevance scores. The reranker ensures that only the most relevant and recent information reaches the next stage, reducing hallucination and increasing precision.
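Continuing the indexing sketch above, a two-stage retrieve-then-rerank step might look like the following; the ms-marco cross-encoder is an open substitute for DAKSH's supervised relevance model, and the function assumes the `encoder`, `index`, and `metadata` objects from the previous sketch.

```python
# Dense top-k retrieval over the FAISS index, followed by cross-encoder
# reranking. The reranker checkpoint is a public stand-in, not DAKSH's model.
import faiss
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")


def retrieve(query: str, k: int = 5, keep: int = 2):
    # Stage 1: semantic similarity search for the top-k candidate chunks
    q = encoder.encode([query]).astype("float32")
    faiss.normalize_L2(q)
    _, ids = index.search(q, k)
    candidates = [metadata[i] for i in ids[0] if i != -1]

    # Stage 2: score (query, chunk) pairs for QA relevance and keep the best
    scores = reranker.predict([(query, c["text"]) for c in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    return [c for _, c in ranked[:keep]]
```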
d) LLM Inference Core – Proprietary to DAKSH
This is where DAKSH distinguishes itself. The inference engine runs a custom-trained LLM, architected with multi-head self-attention, positional encoding, and domain-tuned vocabulary embeddings. This model supports:
- Schema-aware generation (structured JSON, XML, etc.)
- Inline citation for source references
- Low-latency inference pipelines
- Multilingual context retention across prompt windows
Unlike API-bound LLMs, DAKSH's LLM is tuned iteratively on real user queries and fine-grained context windows, giving it an edge in adaptability and output reliability.
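To make the schema-aware, citation-carrying generation step concrete, here is a hedged sketch of how retrieved chunks could be assembled into a structured prompt; `daksh_generate` is a hypothetical placeholder for the proprietary inference call, and the JSON schema and prompt wording are assumptions rather than DAKSH's actual specification.

```python
# Hedged illustration of schema-aware generation with inline citations.
# daksh_generate is a placeholder; the schema and prompt are assumptions.
import json

ANSWER_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "citations": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"source": {"type": "string"},
                               "section": {"type": "string"}},
            },
        },
    },
    "required": ["answer", "citations"],
}


def build_prompt(query: str, chunks: list[dict]) -> str:
    # Each retrieved chunk is numbered so the model can cite it inline
    context = "\n".join(
        f"[{i}] ({c['source']} / {c['section']}) {c['text']}"
        for i, c in enumerate(chunks)
    )
    return (
        "Answer the question using only the numbered context below.\n"
        f"Return JSON matching this schema: {json.dumps(ANSWER_SCHEMA)}\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# response = daksh_generate(build_prompt(query, retrieve(query)))
# answer = json.loads(response)  # validated against ANSWER_SCHEMA downstream
```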