Every mid-market company runs on documents and conversations no one can find when they need them. Meeting minutes get saved as PDFs and forgotten. Policies live in Word files on someone's laptop.
"I know we decided this six months ago but I can't find which meeting."
CEO · 80-person logistics company
"What's our policy on travel reimbursements?"
HR Head · 150-person fintech
"What exactly did we agree to in the vendor contract with [X]?"
COO · 60-person manufacturing firm
"Did we ever discuss this project in a board meeting?"
CFO · 200-person retailer
"Send me everything we have on that client from the last two years."
Sales Lead · 45-person SaaS company
Get an answer with the source attached — document name, page number, or audio timestamp — every single time.
PDFs (text + scanned), Word, PowerPoint, spreadsheets, plain text, audio. Drag-and-drop or bulk import. Up to 100 MB per file.
Every document is extracted, chunked into three levels (2,048 / 512 / 128 tokens), embedded with BGE-M3, and enriched with summaries, keywords, and named entities. A 10-page PDF is ready in under 90 seconds.
Type any question on web or WhatsApp. Hybrid retrieval runs across your entire workspace — dense vector search plus BM25 full-text, fused with Reciprocal Rank Fusion.
Every answer links to the exact document, page number, or audio timestamp. If confidence is below threshold, skryb says so. Hallucination is not a product tradeoff we accept.
Every answer includes the source document name, page number or audio timestamp, and a confidence indicator. If retrieved context doesn't contain the answer, skryb returns "I don't have enough information to answer this confidently" . Never a fabricated answer. This is non-negotiable.
Max 100 MB per file · Max 4 hours per audio file · Google Drive and SharePoint connectors in v1.1
Every engineering decision is documented in the technical spec and grounded in benchmarks. Nothing is hand-wavy.
Self-hosted on AWS Lambda (Docker container). 1,024-dim vectors. 8,192-token context window. Covers English, Hindi, Tamil, and Arabic without a model change.
pgvector HNSW cosine similarity + PostgreSQL BM25 full-text search. Signals merged with Reciprocal Rank Fusion. Proper nouns, dates, and contract numbers score correctly.
Three-level hierarchy: 2,048 / 512 / 128 tokens. AutoMergingRetriever promotes grandchild chunks to parent context before synthesis. No information loss.
Nova Pro for complex multi-document synthesis (~30% of queries). Nova Lite for simple factual recall (~70%). Smart-routed per query. Blended cost ~$0.002/query.
Every model invocation is instructed to answer only from retrieved context, cite the source, and return a refusal if confidence is below threshold. No exceptions.
Vectors, metadata, conversations, and audit logs in Supabase Postgres. Raw files in S3. Row-Level Security enforces tenant isolation at the database level.
Title · Summary (with prev/next context) · Questions answered · Keywords · Named entities (people, orgs, amounts, dates). Runs on every chunk level at ingest.
pdfplumber primary. Auto-fallback to Textract when quality check fails. Full table/form analysis is feature-flagged to Growth and Boardroom tiers.
Batch mode. Speaker diarization for multi-speaker recordings. Timestamps preserved per chunk so every cited answer links to the exact second.
The audit trail is append-only and retained for 12 months minimum — exportable as CSV on Growth and Boardroom tiers. Built for regulated buyers: BFSI, listed companies, company secretaries managing board compliance under SS-1 and SS-2.
All tiers · Annual contracts · Monthly billing · 2 months free on annual prepay
We're engineers first. Services revenue funds product development — which means we know what companies at this scale actually need. skryb is Module 1 of 5 in the DemanualAI Org Intelligence OS.
We don't call ourselves revolutionary. We write software that works, price it for your market, and put our names on it.