pgvector + Embeddings in Production: The Foundation of Medical Reasoning in Examya
Architecture for semantic search and text similarity in production with pgvector, pg_trgm, and real MINSAL data.
Mario Inostroza
In Examya, the accuracy of medical result interpretation depends on two search layers working together: pgvector for semantic embeddings and pg_trgm for text similarity when vector search fails.
The production system in Supabase (examya_agents) needs both extensions to function. pgvector handles LLM-generated embeddings, while pg_trgm rescues the system when vector queries fail or data is incomplete.
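Both extensions can be enabled idempotently, which is what makes a re-runnable deploy script possible. A minimal fragment (it needs a role with extension-creation rights, which Supabase grants on its managed extensions):

```sql
-- Safe to re-run: IF NOT EXISTS makes the migration idempotent
CREATE EXTENSION IF NOT EXISTS vector;   -- pgvector: embedding columns + distance operators
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- trigram similarity for the text fallback

-- Quick sanity check that pg_trgm is live
SELECT similarity('hemograma', 'hemograma completo');
```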
The real implementation included a critical migration: the pg_trgm extension installed in production, the medical_guides table populated with five real MINSAL GPC guides (hemogram, lipid profile, glucose, thyroid, urinalysis), and an idempotent deploy-interpreter-tables.sql script for future deployments.
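A sketch of what the idempotent portion of such a deploy script might look like. Only the medical_guides table name comes from the article; every column name, the embedding dimension, and the index choices are assumptions:

```sql
CREATE TABLE IF NOT EXISTS medical_guides (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  exam_name   text NOT NULL,   -- MINSAL terminology, e.g. 'hemograma'
  normal_min  numeric,         -- lower bound of the reference range
  normal_max  numeric,         -- upper bound (glucose: 99, not 100)
  unit        text,
  source      text,            -- e.g. 'MINSAL GPC'
  embedding   vector(1536)     -- dimension depends on the embedding model
);

-- Vector index for semantic search (cosine distance)
CREATE INDEX IF NOT EXISTS medical_guides_embedding_idx
  ON medical_guides USING hnsw (embedding vector_cosine_ops);

-- Trigram index so the pg_trgm fallback stays fast
CREATE INDEX IF NOT EXISTS medical_guides_trgm_idx
  ON medical_guides USING gin (exam_name gin_trgm_ops);
```

Every statement is `IF NOT EXISTS`, so the script can run on every deploy without failing on an already-provisioned database.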
The fallback architecture is key: when a doctor sends a lab result, the system first tries vector search with pgvector. If it fails due to low similarity or missing data, it automatically falls back to pg_trgm with similarity() to find results by text similarity.
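The two-step fallback can be sketched as a pair of queries. `<=>` is pgvector's cosine-distance operator and `similarity()` comes from pg_trgm; the table and column names, the 0.75 and 0.3 thresholds, and the parameter placeholders are illustrative assumptions, not the production values:

```sql
-- Step 1: vector search. $1 is the query embedding from the LLM.
SELECT exam_name, 1 - (embedding <=> $1::vector) AS score
FROM medical_guides
ORDER BY embedding <=> $1::vector
LIMIT 1;

-- Step 2: if step 1 returned no row, or its score fell below the
-- threshold (say 0.75), fall back to trigram text similarity.
-- $2 is the raw exam name extracted from the lab result.
SELECT exam_name, similarity(exam_name, $2) AS score
FROM medical_guides
WHERE similarity(exam_name, $2) > 0.3
ORDER BY score DESC
LIMIT 1;
```

The application layer decides when to run step 2; keeping the threshold check in code rather than SQL makes the "low similarity" condition easy to tune.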
MINSAL data is the lifeblood of the system. The medical_guides and exam_normalization_mappings tables contain normal reference ranges, FONASA levels, and exam codes with the exact terminology used in Chile. Without this data, the AI cannot correctly interpret values like glucose 99 vs 100.
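Once the ranges are in the table, interpreting a borderline value is a lookup plus a comparison. A hedged example for a measured glucose of 100 mg/dL, assuming hypothetical normal_min / normal_max columns:

```sql
-- Classify a measured glucose of 100 mg/dL against the MINSAL range.
-- normal_min / normal_max are assumed column names.
SELECT exam_name,
       normal_max,
       CASE
         WHEN 100 > normal_max THEN 'elevated'   -- 100 > 99 per MINSAL GPC
         WHEN 100 < normal_min THEN 'low'
         ELSE 'normal'
       END AS interpretation
FROM medical_guides
WHERE exam_name = 'glucosa';
```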
The biggest challenge was data quality. The audit showed that many exams had incorrect or missing normal parameters. For example, glucose normalMax was set to 100 when MINSAL GPC specifies 99. Worse yet, some hemograms were misclassified as urine tests due to naming errors.
The integration with WhatsApp OCR creates a robust system. When a user sends a results photo, the system first classifies the document. If it’s exam_result, it uses ExamResultInterpreterAgent with both search methods to ensure accuracy. This solves the problem of two different paths that produced inconsistent results.
The production deployment was complex. It required creating formal migrations, idempotent scripts, and updating schema.prisma with extensions = [vector, pg_trgm]. The Judgment Day audit found critical bugs: exposed API keys, missing deprecation notices, and failures in the fallback mechanism.
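Declaring the extensions in schema.prisma requires Prisma's postgresqlExtensions preview feature. A minimal fragment showing the `extensions = [vector, pg_trgm]` line in context (the DATABASE_URL variable is the usual Prisma convention, not taken from the article):

```prisma
generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["postgresqlExtensions"]
}

datasource db {
  provider   = "postgresql"
  url        = env("DATABASE_URL")
  extensions = [vector, pg_trgm]
}
```

With this in place, `prisma migrate` emits the corresponding `CREATE EXTENSION` statements, so the extension requirement is versioned alongside the schema instead of living only in ad-hoc scripts.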
Today the system processes real medical exams with precision. The flow: photo → classification → vector/textual search → interpretation with MINSAL ranges → accurate response. This all depends on pgvector and pg_trgm working in sync in production.
Contact: Mario Inostroza | WhatsApp | X @marioHealthBits
Related reading (in this series)

How I Built Patagonia's First Private COVID PCR Lab (And Why I Ended Up Building AI)
In March 2021, I hoisted 300 kg of biosafety cabinet by crane to a second floor during lockdown. By May we were running the first private COVID PCR tests in Chilean Patagonia. The nights that followed became the real origin of Examya.

Examya: how I built a medical WhatsApp agent that processes exam orders
Technical details of implementing the Shuri agent in Examya, a system for processing medical orders via WhatsApp with FONASA integration.

DeepEval: how I measure the quality of my medical agent with objective metrics
How I built an evaluation layer with DeepEval to measure the quality of Shuri, Examya's medical agent. With real data: from 20% to 70% on E2E, custom metrics for Chile's FONASA system, and why gpt-5-nano doesn't work for structured output.