How FactoryOS Retrieves the Right Context
A language model is only as good as the text you put in front of it. Hand it the right three paragraphs and it answers well; hand it the wrong ones and it answers wrong with total confidence.
So the decisive work in any AI system is retrieval, not generation. FactoryOS treats it as a stack of complementary methods, because no single retrieval trick is reliable on its own.
Retrieval Decides the Answer
Most AI quality problems are retrieval problems wearing a generation costume. The model rarely fails because it cannot reason; it fails because it was handed the wrong context and reasoned faithfully over it.
This is why "we do RAG" tells you almost nothing. Everything depends on how the retrieval is done, and that is where the engineering actually lives.
The haystack is mostly unstructured, which makes it harder. IDC estimates about 80% of enterprise data has no tidy rows and columns, so retrieval that only handles clean records misses most of what an organization knows [1].
Keywords and Vectors Together
FactoryOS runs both keyword and vector search, because each catches what the other misses. BM25 keyword search nails exact terms, names, part numbers, and codes, while vector search finds passages that mean the same thing in different words.
Run alone, each leaves a predictable gap. Keywords miss the paraphrase; vectors miss the one exact term that actually mattered. Run together, they cover the literal and the conceptual at once.
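The keyword half of that gap can be made concrete with a small sketch. The code below is a minimal, textbook-style BM25 scorer, not FactoryOS's implementation; the whitespace tokenizer, the parameter defaults (k1=1.5, b=0.75), and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems do more).
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # a term that is absent contributes nothing
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

docs = [
    "replace bearing PN-4471 on spindle",          # contains the exact part number
    "the shaft support wore out and was swapped",  # same idea, different words
    "quarterly safety review notes",
]
scores = bm25_scores("PN-4471 bearing", docs)
```

The first document scores well because it contains the literal part number, while the paraphrase in the second scores zero. That zero is the gap vector search is meant to close.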
Fusing and Reranking the Results
The two result sets are merged with reciprocal rank fusion, which scores each passage by its position in each list rather than by raw scores, so neither method dominates the combined ranking. A reranker then reorders that list by how well each passage actually fits the specific query.
The fusion gathers strong candidates; the reranker sharpens the order. What reaches the model has survived two passes: fusion keeps a passage that only one method found, and reranking demotes a passage that ranked well without really answering the question.
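A minimal sketch of the two passes follows. Reciprocal rank fusion gives each passage a score of 1 / (k + rank) per list and sums across lists; k=60 is a commonly used constant, assumed here rather than taken from FactoryOS. The rerank step takes a hypothetical score_fn standing in for whatever relevance model is used, which the text does not specify.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list using RRF."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

def rerank(query, candidates, score_fn, top_n=3):
    """Reorder fused candidates by a query-specific relevance score.

    score_fn(query, doc_id) is a hypothetical callable, e.g. a wrapper
    around a cross-encoder; it is an assumption of this sketch.
    """
    ordered = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ordered[:top_n]

keyword_hits = ["a", "b", "c"]   # illustrative ids from keyword search
vector_hits = ["b", "d", "a"]    # illustrative ids from vector search
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

In this example "b" comes out first because it ranks high on both lists, even though it tops only one of them. Only rank positions are used, so the two methods' incompatible score scales never need to be reconciled.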
Embeddings With Context
The vector search starts from better material because the embeddings are contextual. Rather than embedding a chunk in isolation, the system accounts for the surrounding context it came from, so its representation reflects where it sits in the larger document.
Better representations mean better matches before any ranking happens. Quality compounds from the bottom up, and good embeddings make every step above them easier.
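The text does not say how the surrounding context is incorporated, so the following is only one plausible sketch: prepend a short description of where the chunk came from before embedding it. The Chunk fields, the contextual_text format, and embed_fn are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_title: str   # title of the source document (assumed to be available)
    section: str     # heading the chunk sits under (assumed to be available)
    text: str        # the chunk itself

def contextual_text(chunk):
    """Build the string that gets embedded: context first, then the chunk."""
    return f"Document: {chunk.doc_title}\nSection: {chunk.section}\n\n{chunk.text}"

def embed_chunks(chunks, embed_fn):
    """Embed each chunk together with its context.

    embed_fn is a hypothetical callable mapping a string to a vector;
    any embedding model could sit behind it.
    """
    return [embed_fn(contextual_text(c)) for c in chunks]

chunk = Chunk(
    doc_title="Press Line 3 Maintenance Manual",
    section="Hydraulic pump",
    text="Replace the seal every 500 hours.",
)
to_embed = contextual_text(chunk)
```

On its own, "Replace the seal every 500 hours." could belong to any machine. With the title and section attached, the embedded text says which seal on which line, so a query about the hydraulic pump has something to match.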
Fresh Without Reprocessing Everything
Retrieval is only as current as the index behind it, so FactoryOS keeps the index fresh without brute force. Ingesters run on their own schedule, any file can be refreshed on demand, and the pipeline is hash-gated so it only reprocesses content that actually changed.
Freshness becomes cheap rather than a nightly tax. Nothing gets reprocessed because the clock advanced, only because the bytes did.
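Hash-gating can be sketched in a few lines. This is a minimal illustration under stated assumptions: SHA-256 as the hash, an in-memory dict as the hash store, and a hypothetical process callback standing in for the chunk, embed, and index work. The actual pipeline's choices are not described in the text.

```python
import hashlib

def content_hash(data: bytes) -> str:
    # SHA-256 of the raw bytes; the hash function is an assumption.
    return hashlib.sha256(data).hexdigest()

def ingest_if_changed(path, data, seen_hashes, process):
    """Run process(path, data) only when the content differs from last time.

    seen_hashes maps path -> last processed hash (a plain dict here;
    a real system would persist it). Returns True if work was done.
    """
    digest = content_hash(data)
    if seen_hashes.get(path) == digest:
        return False  # same bytes as before: skip the expensive work
    process(path, data)
    seen_hashes[path] = digest
    return True

seen = {}
processed = []
record = lambda path, data: processed.append(path)  # stand-in for real processing

first = ingest_if_changed("manual.txt", b"v1", seen, record)   # new file
second = ingest_if_changed("manual.txt", b"v1", seen, record)  # unchanged
third = ingest_if_changed("manual.txt", b"v2", seen, record)   # bytes changed
```

The second call does nothing because the bytes are identical. A scheduled run or an on-demand refresh can call the same function safely, since unchanged files cost only a hash.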
Why the Stack Wins
Each method covers a different failure mode, and stacking them covers far more ground than any one could alone. Keyword search, vector search, fusion, reranking, and contextual embedding can each be used on their own, but the strength is in the combination.
This is why answers track your actual data so closely: the model reasons over the right material because the system worked to find it. How often does your current tool quietly hand the model the wrong page?
Sources
- IDC, Data Age 2025; figure widely cited (e.g. Solutions Review). https://solutionsreview.com/data-management/80-percent-of-your-data-will-be-unstructured-in-five-years/