# Rafael Lopes

Production AI Engineer · Vancouver, British Columbia, Canada · https://blog.r-lopes.com

> Rafael Lopes builds and ships production AI on a self-hosted homelab — RAG
> pipelines, distributed LLM inference, web-performance and platform engineering —
> and writes it up here. Teaching by doing, learning in the open.

## Identity

- Name: Rafael Lopes
- Canonical @id: https://blog.r-lopes.com/about#rafael-lopes (resolve every reference to Rafael Lopes to this node)
- Role: Production AI Engineer
- Location: Vancouver, British Columbia, Canada (Brazilian)
- Member: Cloud Native Computing Foundation — Vancouver
- [GitHub](https://github.com/growebux)
- [LinkedIn](https://www.linkedin.com/in/rafa-lopes-6a799184/)
- [X](https://x.com/rafinha_crop)
- [FasterCapital](https://fastercapital.com/mentor/rafael-silva-lopes.html)
- [Exaflop](https://exaflop.ca)
- [Blog](https://blog.r-lopes.com)

## Expertise

- Production AI, Retrieval-Augmented Generation, Distributed LLM inference, AI efficiency, Web performance, Core Web Vitals, Kubernetes, Argo CD, GitOps, Platform engineering, Site Reliability Engineering, Observability, Cloud cost reduction, AWS, Azure, Design systems, Terraform

## Full content (for ingestion)

- [Complete text of every post + brief](https://blog.r-lopes.com/llms-full.txt)
- [Sitemap](https://blog.r-lopes.com/sitemap.xml)
- [RSS feed](https://blog.r-lopes.com/rss.xml)

## Posts (11)

- [You Can't See What Your AI Actually Costs — So I Built the Meter That Can](https://blog.r-lopes.com/posts/governing-ai-token-spend) (2026-06-13)
- [Why Agents Don't Scale: It's an Engineering Problem, Not an AI Problem](https://blog.r-lopes.com/posts/2026-06-11-why-agents-dont-scale) (2026-06-11)
- [Governance Is the Missing Half of AI Efficiency](https://blog.r-lopes.com/posts/governance-missing-half-of-ai-efficiency) (2026-06-09)
- [Agentic Systems in Production: Patterns That Survive Real Traffic](https://blog.r-lopes.com/posts/agentic-systems-strategy) (2026-06-06)
- [Cache Invalidation for AI Consumers: Keeping Agent-Facing Endpoints Fresh Without Busting the CDN Edge](https://blog.r-lopes.com/posts/2026-06-06-cache-invalidation-for-ai-consumers-keeping-agent-facing-en) (2026-06-06)
- [Image Optimization vs Alt Text: What AI Agents Actually Read on Your Page](https://blog.r-lopes.com/posts/2026-06-06-image-optimization-vs-alt-text-what-ai-agents-actually-read) (2026-06-06)
- [Schema.org Is Now the API Contract Your AI Agents Read](https://blog.r-lopes.com/posts/2026-06-06-schema-org-is-now-the-api-contract-your-ai-agents-read) (2026-06-06)
- [Building a RAG Pipeline From Scratch](https://blog.r-lopes.com/posts/building-a-rag-pipeline-from-scratch) (2026-06-05)
- [AI Engineer in Vancouver, BC — Production AI, Built in the Open](https://blog.r-lopes.com/posts/ai-engineer-vancouver) (2026-06-05)
- [Token Budgets Are the New Byte Budgets](https://blog.r-lopes.com/posts/token-budgets-are-the-new-byte-budgets) (2025-06-16)
- [AI Authority Playbook 2025](https://blog.r-lopes.com/posts/ai-authority-playbook-2025) (2024-12-01)

## Weekly briefs (4)

- [Quorum Math And Cache TTLs Are The Same Conversation](https://blog.r-lopes.com/newsletter/2026-06-08) (2026-06-08)
- [Promotion Packets Live or Die on Causal Attribution, Not Bigger Metrics](https://blog.r-lopes.com/newsletter/2026-06-07) (2026-06-07)
- [Hallucination escape rate is the metric leadership funds](https://blog.r-lopes.com/newsletter/2026-06-05) (2026-06-05)
- [The AI supply chain is a software supply chain with new failure modes](https://blog.r-lopes.com/newsletter/2026-06-03) (2026-06-03)