<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Omkar Chebale - Blog</title>
    <link>https://omkarchebale.vercel.app/blogs</link>
    <description>Thoughts on AI, ML, and software engineering. What I build, break, and learn.</description>
    <language>en-us</language>
    <lastBuildDate>Thu, 02 Apr 2026 15:27:16 GMT</lastBuildDate>
    <atom:link href="https://omkarchebale.vercel.app/feed.xml" rel="self" type="application/rss+xml"/>
    <image>
      <url>https://omkarchebale.vercel.app/profile.jpg</url>
      <title>Omkar Chebale - Blog</title>
      <link>https://omkarchebale.vercel.app</link>
    </image>
    <item>
      <title>AI Memory Isn&apos;t Memory — It&apos;s Smart Context Injection</title>
      <link>https://omkarchebale.vercel.app/blogs/ai-memory-isn-t-memory-it-s-smart-context-injection</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/ai-memory-isn-t-memory-it-s-smart-context-injection</guid>
      <description>AI memory isn&apos;t like human memory—models forget everything. What we call memory is actually smart storing, searching, and injecting context at the right time using external systems.</description>
      <pubDate>Thu, 02 Apr 2026 14:13:16 GMT</pubDate>
      <category>AI</category>
      <category>LLM</category>
      <category>Memory Systems</category>
      <category>Context Window</category>
      <category>Vector Database</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>Encoder vs Decoder — What Each Half of the Transformer Actually Does</title>
      <link>https://omkarchebale.vercel.app/blogs/encoder-vs-decoder-what-each-half-of-the-transformer-actually-does</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/encoder-vs-decoder-what-each-half-of-the-transformer-actually-does</guid>
      <description>A clear breakdown of what the encoder and decoder each do in a Transformer — their internal structure, how multi-head self-attention works, what cross-attention is, and when you&apos;d use encoder-only vs decoder-only vs full encoder-decoder models.</description>
      <pubDate>Mon, 30 Mar 2026 07:11:03 GMT</pubDate>
      <category>transformers</category>
      <category>encoder</category>
      <category>decoder</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>attention</category>
      <category>ai</category>
    </item>
    <item>
      <title>Contextual Embeddings — How Transformers Make Words Context-Aware</title>
      <link>https://omkarchebale.vercel.app/blogs/contextual-embeddings-how-transformers-make-words-context-aware</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/contextual-embeddings-how-transformers-make-words-context-aware</guid>
      <description>How self-attention produces contextual embeddings by computing a weighted sum of value vectors — and what it means that the same word gets a different representation depending on the sentence it appears in.</description>
      <pubDate>Sun, 29 Mar 2026 07:10:00 GMT</pubDate>
      <category>transformers</category>
      <category>embeddings</category>
      <category>self-attention</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>ai</category>
    </item>
    <item>
      <title>Softmax Demystified — How Raw Scores Become Attention Weights</title>
      <link>https://omkarchebale.vercel.app/blogs/softmax-demystified-how-raw-scores-become-attention-weights</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/softmax-demystified-how-raw-scores-become-attention-weights</guid>
      <description>A deep dive into the softmax function — why it&apos;s used in self-attention, how it converts raw dot product scores into probabilities, and why the numerically stable variant (subtracting the max) matters in practice.</description>
      <pubDate>Sat, 28 Mar 2026 11:59:54 GMT</pubDate>
      <category>transformers</category>
      <category>softmax</category>
      <category>self-attention</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>math</category>
    </item>
    <item>
      <title>Self-Attention From Scratch — A Complete Numerical Walkthrough</title>
      <link>https://omkarchebale.vercel.app/blogs/self-attention-from-scratch-a-complete-numerical-walkthrough</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/self-attention-from-scratch-a-complete-numerical-walkthrough</guid>
      <description>A full step-by-step numerical walkthrough of self-attention using the sentence &quot;please study man&quot; — computing Q, K, V vectors, raw attention scores, softmax weights, and final contextual output vectors from scratch.</description>
      <pubDate>Fri, 27 Mar 2026 10:48:10 GMT</pubDate>
      <category>transformers</category>
      <category>self-attention</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>machine-learning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Query, Key, Value — The Database Analogy That Makes Self-Attention Click</title>
      <link>https://omkarchebale.vercel.app/blogs/query-key-value-the-database-analogy-that-makes-self-attention-click</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/query-key-value-the-database-analogy-that-makes-self-attention-click</guid>
      <description>A deep intuitive breakdown of the Q, K, V mechanism in self-attention — using a database retrieval analogy and real weight matrix math to show exactly how Transformers decide which words to attend to.</description>
      <pubDate>Thu, 26 Mar 2026 06:00:00 GMT</pubDate>
      <category>transformers</category>
      <category>self-attention</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>machine-learning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Attention Is All You Need — The Paper That Changed AI Forever</title>
      <link>https://omkarchebale.vercel.app/blogs/attention-is-all-you-need-the-paper-that-changed-ai-forever</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/attention-is-all-you-need-the-paper-that-changed-ai-forever</guid>
      <description>A deep dive into the Transformer architecture introduced in the landmark 2017 paper — what it is, how it works, why it replaced RNNs, and why every modern AI model from GPT to Gemini traces its roots here.</description>
      <pubDate>Wed, 25 Mar 2026 14:37:16 GMT</pubDate>
      <category>transformers</category>
      <category>deep-learning</category>
      <category>nlp</category>
      <category>attention</category>
      <category>ai</category>
      <category>machine-learning</category>
    </item>
    <item>
      <title>The Real Difference Between Training, Fine-Tuning, and Inference (My Mental Model)</title>
      <link>https://omkarchebale.vercel.app/blogs/the-real-difference-between-training-fine-tuning-and-inference-my-mental-model</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/the-real-difference-between-training-fine-tuning-and-inference-my-mental-model</guid>
      <description>Breaking down the difference between training, fine-tuning, and inference—why they&apos;re not the same thing, what actually happens in each stage, and why understanding this makes LLM systems way less confusing.</description>
      <pubDate>Sun, 22 Mar 2026 13:19:24 GMT</pubDate>
      <category>Machine Learning</category>
      <category>LLM</category>
      <category>Training</category>
      <category>Fine-Tuning</category>
      <category>Inference</category>
      <category>Learning In Public</category>
      <category>Deep Learning</category>
    </item>
    <item>
      <title>Why Tokenization Is More Important Than You Think</title>
      <link>https://omkarchebale.vercel.app/blogs/why-tokenization-is-more-important-than-you-think</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/why-tokenization-is-more-important-than-you-think</guid>
      <description>Why tokenization is the most underrated part of LLMs—how tokens aren&apos;t words, why they affect cost and performance, and why bad tokenization breaks everything downstream.</description>
      <pubDate>Sat, 21 Mar 2026 15:03:38 GMT</pubDate>
      <category>Tokenization</category>
      <category>LLM</category>
      <category>NLP</category>
      <category>Machine Learning</category>
      <category>Embeddings</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>The Only Blog You Need to Understand Encoder-Decoder Architecture</title>
      <link>https://omkarchebale.vercel.app/blogs/the-only-blog-you-need-to-understand-encoder-decoder-architecture</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/the-only-blog-you-need-to-understand-encoder-decoder-architecture</guid>
      <description>A complete breakdown of encoder-decoder architectures—how they compress sequences into context vectors, generate outputs step-by-step, why teacher forcing matters, and the four key limitations that led to attention mechanisms.</description>
      <pubDate>Fri, 20 Mar 2026 16:41:21 GMT</pubDate>
      <category>Encoder-Decoder</category>
      <category>Sequence to Sequence</category>
      <category>LSTM</category>
      <category>RNN</category>
      <category>NLP</category>
      <category>Machine Learning</category>
      <category>Deep Learning</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>I Built &quot;Legal Lens&quot; — A Fine-Tuned AI That Translates Legal Jargon Into Plain English</title>
      <link>https://omkarchebale.vercel.app/blogs/i-built-legal-lens-a-fine-tuned-ai-that-translates-legal-jargon-into-plain-english</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/i-built-legal-lens-a-fine-tuned-ai-that-translates-legal-jargon-into-plain-english</guid>
      <description>Building a fine-tuned AI to translate legal jargon into plain English—from FLAN-T5 failures to Gemma-2B success using QLoRA on a free GPU, and the engineering lessons learned along the way.</description>
      <pubDate>Wed, 25 Feb 2026 21:29:44 GMT</pubDate>
      <category>LLM</category>
      <category>Fine-Tuning</category>
      <category>NLP</category>
      <category>Legal Tech</category>
      <category>QLoRA</category>
      <category>Gemma</category>
      <category>Machine Learning</category>
      <category>Hugging Face</category>
      <category>Building In Public</category>
    </item>
    <item>
      <title>I Built an MCP Server for My Portfolio — Death by Tiny Bugs</title>
      <link>https://omkarchebale.vercel.app/blogs/i-built-an-mcp-server-for-my-portfolio-death-by-tiny-bugs</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/i-built-an-mcp-server-for-my-portfolio-death-by-tiny-bugs</guid>
      <description>How I exposed my portfolio blog system as an MCP server so Claude could operate it with natural language — and the 5 small but painful bugs that stood in the way.</description>
      <pubDate>Tue, 17 Feb 2026 14:23:16 GMT</pubDate>
      <category>MCP</category>
      <category>Python</category>
      <category>FastMCP</category>
      <category>Debugging</category>
      <category>Developer Tools</category>
      <category>Learning In Public</category>
      <category>Portfolio</category>
    </item>
    <item>
      <title>Embeddings, Vector Databases, and Re-Ranking: My Confusion Dump</title>
      <link>https://omkarchebale.vercel.app/blogs/embeddings-vector-databases-and-re-ranking-my-confusion-dump</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/embeddings-vector-databases-and-re-ranking-my-confusion-dump</guid>
      <description>An honest, unstructured brain dump about embeddings, vector databases, and re-ranking—from confusion about what the numbers mean to understanding coordinates, similarity search, and retrieval optimization.</description>
      <pubDate>Mon, 16 Feb 2026 09:32:54 GMT</pubDate>
      <category>Embeddings</category>
      <category>Vector Database</category>
      <category>Re-ranking</category>
      <category>RAG</category>
      <category>Semantic Search</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>Quantization Isn&apos;t Scary: What I Wish Someone Told Me Earlier</title>
      <link>https://omkarchebale.vercel.app/blogs/quantization-isn-t-scary-what-i-wish-someone-told-me-earlier</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/quantization-isn-t-scary-what-i-wish-someone-told-me-earlier</guid>
      <description>Breaking down quantization from scary optimization technique to simple concept—how reducing bit precision makes models smaller and faster, and why calibration matters more than the math.</description>
      <pubDate>Fri, 06 Feb 2026 15:16:45 GMT</pubDate>
      <category>Quantization</category>
      <category>Model Optimization</category>
      <category>Inference</category>
      <category>LLM</category>
      <category>Machine Learning</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>Text Classification Inference Benchmark: What Actually Happens on CPU vs GPU</title>
      <link>https://omkarchebale.vercel.app/blogs/text-classification-inference-benchmark-what-actually-happens-on-cpu-vs-gpu</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/text-classification-inference-benchmark-what-actually-happens-on-cpu-vs-gpu</guid>
      <description>A practical inference benchmark comparing DistilBERT performance on CPU vs GPU—measuring latency, throughput, and memory across different batch sizes to understand what actually happens in production.</description>
      <pubDate>Thu, 05 Feb 2026 09:37:51 GMT</pubDate>
      <category>Transformers</category>
      <category>Inference</category>
      <category>Machine Learning</category>
      <category>DistilBERT</category>
      <category>PyTorch</category>
    </item>
    <item>
      <title>I Learned Machine Learning Without a Single Mentor</title>
      <link>https://omkarchebale.vercel.app/blogs/i-learned-machine-learning-without-a-single-mentor</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/i-learned-machine-learning-without-a-single-mentor</guid>
      <description>Learning machine learning alone in a Tier-3 city without mentors, bootcamps, or a tech ecosystem—why constraints became advantages and how building in public taught me more than any course.</description>
      <pubDate>Mon, 02 Feb 2026 16:33:27 GMT</pubDate>
      <category>Machine Learning</category>
      <category>Self-Taught</category>
      <category>Learning In Public</category>
      <category>Remote Learning</category>
      <category>Developer Journey</category>
    </item>
    <item>
      <title>Why I Stopped Chasing Perfect Code (And Started Shipping Instead)</title>
      <link>https://omkarchebale.vercel.app/blogs/why-i-stopped-chasing-perfect-code-and-started-shipping-instead</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/why-i-stopped-chasing-perfect-code-and-started-shipping-instead</guid>
      <description>A personal reflection on breaking free from perfectionism—why I stopped over-engineering side projects and started shipping imperfect code that actually reaches users.</description>
      <pubDate>Wed, 14 Jan 2026 10:45:23 GMT</pubDate>
      <category>engineering</category>
      <category>productivity</category>
      <category>career</category>
      <category>lessons</category>
      <category>Developer Mindset</category>
    </item>
    <item>
      <title>Vision Language Models: How Machines Learned to See and Understand</title>
      <link>https://omkarchebale.vercel.app/blogs/vision-language-models-how-machines-learned-to-see-and-understand</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/vision-language-models-how-machines-learned-to-see-and-understand</guid>
      <description>Breaking down Vision Language Models into their core components—vision encoders, text encoders, fusion mechanisms—and the two main paradigms: contrastive learning (CLIP-style) and generative models.</description>
      <pubDate>Thu, 08 Jan 2026 20:56:52 GMT</pubDate>
      <category>Vision Language Models</category>
      <category>VLM</category>
      <category>Machine Learning</category>
      <category>CLIP</category>
      <category>Transformers</category>
      <category>Computer Vision</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>How I Built an AI-Powered Blog Recommendation System From Scratch</title>
      <link>https://omkarchebale.vercel.app/blogs/how-i-built-an-ai-powered-blog-recommendation-system-from-scratch</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/how-i-built-an-ai-powered-blog-recommendation-system-from-scratch</guid>
      <description>Building a semantic blog recommendation system from scratch using embeddings, vector databases, and pre-computed results—why tags aren&apos;t enough and how I integrated ML into my Next.js portfolio.</description>
      <pubDate>Sun, 04 Jan 2026 10:26:59 GMT</pubDate>
      <category>Machine Learning</category>
      <category>Embeddings</category>
      <category>Vector Database</category>
      <category>Python</category>
      <category>Next.js</category>
      <category>Pinecone</category>
      <category>Recommendation System</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>Logistic Regression From Scratch — What It Actually Does (Without Skipping the Thinking)</title>
      <link>https://omkarchebale.vercel.app/blogs/logistic-regression-from-scratch-what-it-actually-does-without-skipping-the-thinking</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/logistic-regression-from-scratch-what-it-actually-does-without-skipping-the-thinking</guid>
      <description>Breaking down Logistic Regression from first principles—why it exists to express confidence in binary outcomes, how sigmoid transforms linear scores into probabilities, and a minimal from-scratch implementation.</description>
      <pubDate>Sat, 03 Jan 2026 13:47:36 GMT</pubDate>
      <category>Machine Learning</category>
      <category>Logistic Regression</category>
      <category>From Scratch</category>
      <category>Classification</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>AI, ML, and Deep Learning: clearing the confusion once and for all</title>
      <link>https://omkarchebale.vercel.app/blogs/ai-ml-and-deep-learning-clearing-the-confusion-once-and-for-all</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/ai-ml-and-deep-learning-clearing-the-confusion-once-and-for-all</guid>
      <description>Understanding AI, Machine Learning, and Deep Learning as a hierarchy rather than competing terms—from the broad AI umbrella to data-driven ML to neural-network-based deep learning.</description>
      <pubDate>Mon, 29 Dec 2025 05:08:41 GMT</pubDate>
      <category>AI</category>
      <category>Machine Learning</category>
      <category>Deep Learning</category>
      <category>Learning In Public</category>
      <category>Fundamentals</category>
    </item>
    <item>
      <title>RAG Explained: The 5 Steps That Make LLMs Smarter</title>
      <link>https://omkarchebale.vercel.app/blogs/rag-explained-the-5-steps-that-make-llms-smarter</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/rag-explained-the-5-steps-that-make-llms-smarter</guid>
      <description>A beginner-friendly breakdown of RAG&apos;s five core steps: from document preprocessing and chunking to embeddings, vector databases, and how LLMs use retrieved context to generate accurate answers.</description>
      <pubDate>Thu, 25 Dec 2025 05:50:32 GMT</pubDate>
      <category>RAG</category>
      <category>LLM</category>
      <category>Embeddings</category>
      <category>Vector Database</category>
      <category>AI</category>
      <category>Learning In Public</category>
    </item>
    <item>
      <title>Understanding MCP: Servers, Tools, and Why They Matter</title>
      <link>https://omkarchebale.vercel.app/blogs/understanding-mcp-servers-tools-and-why-they-matter</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/understanding-mcp-servers-tools-and-why-they-matter</guid>
      <description>Breaking down MCP (Model Context Protocol) through a simple analogy: tools are functions, MCP servers are toolboxes, and LLMs can invoke them through natural language without any UI interaction.</description>
      <pubDate>Mon, 22 Dec 2025 05:36:41 GMT</pubDate>
      <category>MCP</category>
      <category>LLM</category>
      <category>AI Tools</category>
      <category>APIs</category>
      <category>Learning In Public</category>
      <category>System Design</category>
    </item>
    <item>
      <title>How I Built a Blog System Into My Portfolio (And Why I Did It)</title>
      <link>https://omkarchebale.vercel.app/blogs/how-i-built-a-blog-system-into-my-portfolio-and-why-i-did-it</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/how-i-built-a-blog-system-into-my-portfolio-and-why-i-did-it</guid>
      <description>How I built a simple blog system into my portfolio using a custom API, MongoDB, and markdown—so I can write and publish from anywhere.</description>
      <pubDate>Sun, 21 Dec 2025 13:53:09 GMT</pubDate>
      <category>Personal Portfolio</category>
      <category>API Design</category>
      <category>MongoDB</category>
      <category>Next.js</category>
      <category>Learning In Public</category>
      <category>Blogging</category>
    </item>
    <item>
      <title>My First Real n8n Workflow: Why It Took 12+ Hours (and What I Learned)</title>
      <link>https://omkarchebale.vercel.app/blogs/my-first-real-n8n-workflow-why-it-took-12-hours-and-what-i-learned</link>
      <guid isPermaLink="true">https://omkarchebale.vercel.app/blogs/my-first-real-n8n-workflow-why-it-took-12-hours-and-what-i-learned</guid>
      <description>My honest beginner experience with n8n, why my first simple workflow took 12+ hours, and what I learned about automation, triggers, and platform limitations.</description>
      <pubDate>Sat, 20 Dec 2025 11:04:04 GMT</pubDate>
      <category>n8n</category>
      <category>Automation</category>
      <category>Low-Code</category>
      <category>Debugging</category>
      <category>Learning In Public</category>
      <category>Daily Log</category>
    </item>
  </channel>
</rss>