# Omkar Chebale — AI Engineer & Full-Stack Developer

> Personal portfolio and technical blog at https://omkarchebale.vercel.app

## About

Omkar Chebale is an AI Engineer and Full-Stack Developer building production-grade LLM systems, RAG pipelines, and scalable web applications. Currently at AI Planet.

## Pages

- [Home](https://omkarchebale.vercel.app/): Portfolio landing — services, skills, featured work, testimonials
- [About](https://omkarchebale.vercel.app/about): Bio, experience timeline, education
- [Blog](https://omkarchebale.vercel.app/blogs): Technical blog on AI, ML, and software engineering
- [Skills](https://omkarchebale.vercel.app/skills): Full technology stack

## Blog Posts

- [AI Memory Isn't Memory — It's Smart Context Injection](https://omkarchebale.vercel.app/blogs/ai-memory-isn-t-memory-it-s-smart-context-injection): AI memory isn't like human memory—models forget everything. What we call memory is actually smart storing, searching, and injecting context at the right time using external systems. (AI, LLM, Memory Systems, Context Window, Vector Database, Learning In Public)
- [Encoder vs Decoder — What Each Half of the Transformer Actually Does](https://omkarchebale.vercel.app/blogs/encoder-vs-decoder-what-each-half-of-the-transformer-actually-does): A clear breakdown of what the encoder and decoder each do in a Transformer — their internal structure, how multi-head self-attention works, what cross-attention is, and when you'd use encoder-only vs decoder-only vs full encoder-decoder models. (transformers, encoder, decoder, deep-learning, nlp, attention, ai)
- [Contextual Embeddings — How Transformers Make Words Context-Aware](https://omkarchebale.vercel.app/blogs/contextual-embeddings-how-transformers-make-words-context-aware): How self-attention produces contextual embeddings by computing a weighted sum of value vectors — and what it means that the same word gets a different representation depending on the sentence it appears in. (transformers, embeddings, self-attention, deep-learning, nlp, ai)
- [Softmax Demystified — How Raw Scores Become Attention Weights](https://omkarchebale.vercel.app/blogs/softmax-demystified-how-raw-scores-become-attention-weights): A deep dive into the softmax function — why it's used in self-attention, how it converts raw dot product scores into probabilities, and why the numerically stable variant (subtracting the max) matters in practice. (transformers, softmax, self-attention, deep-learning, nlp, math)
- [Self-Attention From Scratch — A Complete Numerical Walkthrough](https://omkarchebale.vercel.app/blogs/self-attention-from-scratch-a-complete-numerical-walkthrough): A full step-by-step numerical walkthrough of self-attention using the sentence "please study man" — computing Q, K, V vectors, raw attention scores, softmax weights, and final contextual output vectors from scratch. (transformers, self-attention, deep-learning, nlp, machine-learning, ai)
- [Query, Key, Value — The Database Analogy That Makes Self-Attention Click](https://omkarchebale.vercel.app/blogs/query-key-value-the-database-analogy-that-makes-self-attention-click): A deep intuitive breakdown of the Q, K, V mechanism in self-attention — using a database retrieval analogy and real weight matrix math to show exactly how Transformers decide which words to attend to. (transformers, self-attention, deep-learning, nlp, machine-learning, ai)
- [Attention Is All You Need — The Paper That Changed AI Forever](https://omkarchebale.vercel.app/blogs/attention-is-all-you-need-the-paper-that-changed-ai-forever): A deep dive into the Transformer architecture introduced in the landmark 2017 paper — what it is, how it works, why it replaced RNNs, and why every modern AI model from GPT to Gemini traces its roots here. (transformers, deep-learning, nlp, attention, ai, machine-learning)
- [The Real Difference Between Training, Fine-Tuning, and Inference (My Mental Model)](https://omkarchebale.vercel.app/blogs/the-real-difference-between-training-fine-tuning-and-inference-my-mental-model): Breaking down the difference between training, fine-tuning, and inference—why they're not the same thing, what actually happens in each stage, and why understanding this makes LLM systems way less confusing. (Machine Learning, LLM, Training, Fine-Tuning, Inference, Learning In Public, Deep Learning)
- [Why Tokenization Is More Important Than You Think](https://omkarchebale.vercel.app/blogs/why-tokenization-is-more-important-than-you-think): Why tokenization is the most underrated part of LLMs—how tokens aren't words, why they affect cost and performance, and why bad tokenization breaks everything downstream. (Tokenization, LLM, NLP, Machine Learning, Embeddings, Learning In Public)
- [The Only Blog You Need to Understand Encoder-Decoder Architecture](https://omkarchebale.vercel.app/blogs/the-only-blog-you-need-to-understand-encoder-decoder-architecture): A complete breakdown of encoder-decoder architectures—how they compress sequences into context vectors, generate outputs step-by-step, why teacher forcing matters, and the four key limitations that led to attention mechanisms. (Encoder-Decoder, Sequence to Sequence, LSTM, RNN, NLP, Machine Learning, Deep Learning, Learning In Public)
- [I Built "Legal Lens" — A Fine-Tuned AI That Translates Legal Jargon Into Plain English](https://omkarchebale.vercel.app/blogs/i-built-legal-lens-a-fine-tuned-ai-that-translates-legal-jargon-into-plain-english): Building a fine-tuned AI to translate legal jargon into plain English—from FLAN-T5 failures to Gemma-2B success using QLoRA on a free GPU, and the engineering lessons learned along the way. (LLM, Fine-Tuning, NLP, Legal Tech, QLoRA, Gemma, Machine Learning, Hugging Face, Building In Public)
- [I Built an MCP Server for My Portfolio — Death by Tiny Bugs](https://omkarchebale.vercel.app/blogs/i-built-an-mcp-server-for-my-portfolio-death-by-tiny-bugs): How I exposed my portfolio blog system as an MCP server so Claude could operate it with natural language — and the 5 small but painful bugs that stood in the way. (MCP, Python, FastMCP, Debugging, Developer Tools, Learning In Public, Portfolio)
- [Embeddings, Vector Databases, and Re-Ranking: My Confusion Dump](https://omkarchebale.vercel.app/blogs/embeddings-vector-databases-and-re-ranking-my-confusion-dump): An honest, unstructured brain dump about embeddings, vector databases, and re-ranking—from confusion about what the numbers mean to understanding coordinates, similarity search, and retrieval optimization. (Embeddings, Vector Database, Re-ranking, RAG, Semantic Search, Learning In Public)
- [Quantization Isn't Scary: What I Wish Someone Told Me Earlier](https://omkarchebale.vercel.app/blogs/quantization-isn-t-scary-what-i-wish-someone-told-me-earlier): Breaking down quantization from scary optimization technique to simple concept—how reducing bit precision makes models smaller and faster, and why calibration matters more than the math. (Quantization, Model Optimization, Inference, LLM, Machine Learning, Learning In Public)
- [Text Classification Inference Benchmark: What Actually Happens on CPU vs GPU](https://omkarchebale.vercel.app/blogs/text-classification-inference-benchmark-what-actually-happens-on-cpu-vs-gpu): A practical inference benchmark comparing DistilBERT performance on CPU vs GPU—measuring latency, throughput, and memory across different batch sizes to understand what actually happens in production. (Transformers, Inference, machinelearning, DistilBERT, PyTorch)
- [I Learned Machine Learning Without a Single Mentor](https://omkarchebale.vercel.app/blogs/i-learned-machine-learning-without-a-single-mentor): Learning machine learning alone in a Tier-3 city without mentors, bootcamps, or a tech ecosystem—why constraints became advantages and how building in public taught me more than any course. (Machine Learning, Self-Taught, Learning In Public, Remote Learning, Developer Journey)
- [Why I Stopped Chasing Perfect Code (And Started Shipping Instead)](https://omkarchebale.vercel.app/blogs/why-i-stopped-chasing-perfect-code-and-started-shipping-instead): A personal reflection on breaking free from perfectionism—why I stopped over-engineering side projects and started shipping imperfect code that actually reaches users. (engineering, productivity, career, lessons, Developer Mindset)
- [Vision Language Models: How Machines Learned to See and Understand](https://omkarchebale.vercel.app/blogs/vision-language-models-how-machines-learned-to-see-and-understand): Breaking down Vision Language Models into their core components—vision encoders, text encoders, fusion mechanisms—and the two main paradigms: contrastive learning (CLIP-style) and generative models. (Vision Language Models, VLM, Machine Learning, CLIP, Transformers, Computer Vision, Learning In Public)
- [How I Built an AI-Powered Blog Recommendation System From Scratch](https://omkarchebale.vercel.app/blogs/how-i-built-an-ai-powered-blog-recommendation-system-from-scratch): Building a semantic blog recommendation system from scratch using embeddings, vector databases, and pre-computed results—why tags aren't enough and how I integrated ML into my Next.js portfolio. (Machine Learning, Embeddings, Vector Database, Python, Next.js, Pinecone, Recommendation System, Learning In Public)
- [Logistic Regression From Scratch — What It Actually Does (Without Skipping the Thinking)](https://omkarchebale.vercel.app/blogs/logistic-regression-from-scratch-what-it-actually-does-without-skipping-the-thinking): Breaking down Logistic Regression from first principles—why it exists to express confidence in binary outcomes, how sigmoid transforms linear scores into probabilities, and a minimal from-scratch implementation. (Machine Learning, Logistic Regression, From Scratch, Classification, Learning In Public)
- [AI, ML, and Deep Learning: clearing the confusion once and for all](https://omkarchebale.vercel.app/blogs/ai-ml-and-deep-learning-clearing-the-confusion-once-and-for-all): Understanding AI, Machine Learning, and Deep Learning as a hierarchy rather than competing terms—from the broad AI umbrella to data-driven ML to neural-network-based deep learning. (AI, Machine Learning, Deep Learning, Learning In Public, Fundamentals)
- [RAG Explained: The 5 Steps That Make LLMs Smarter](https://omkarchebale.vercel.app/blogs/rag-explained-the-5-steps-that-make-llms-smarter): A beginner-friendly breakdown of RAG's five core steps: from document preprocessing and chunking to embeddings, vector databases, and how LLMs use retrieved context to generate accurate answers. (RAG, LLM, Embeddings, Vector Database, AI, Learning In Public)
- [Understanding MCP: Servers, Tools, and Why They Matter](https://omkarchebale.vercel.app/blogs/understanding-mcp-servers-tools-and-why-they-matter): Breaking down MCP (Model Context Protocol) through a simple analogy: tools are functions, MCP servers are toolboxes, and LLMs can invoke them through natural language without any UI interaction. (MCP, LLM, AI Tools, APIs, Learning In Public, System Design)
- [How I Built a Blog System Into My Portfolio (And Why I Did It)](https://omkarchebale.vercel.app/blogs/how-i-built-a-blog-system-into-my-portfolio-and-why-i-did-it): How I built a simple blog system into my portfolio using a custom API, MongoDB, and markdown—so I can write and publish from anywhere. (Personal Portfolio, API Design, MongoDB, Next.js, Learning In Public, Blogging)
- [My First Real n8n Workflow: Why It Took 12+ Hours (and What I Learned)](https://omkarchebale.vercel.app/blogs/my-first-real-n8n-workflow-why-it-took-12-hours-and-what-i-learned): My honest beginner experience with n8n, why my first simple workflow took 12+ hours, and what I learned about automation, triggers, and platform limitations. (n8n, Automation, Low-Code, Debugging, Learning In Public, Daily Log)

## APIs

- [Blog Search](https://omkarchebale.vercel.app/api/search?q=YOUR_QUERY): Full-text search across all blog posts
- [Search Suggestions](https://omkarchebale.vercel.app/api/search/suggest?q=YOUR_QUERY): Autocomplete for titles and tags
- [Blog List](https://omkarchebale.vercel.app/api/blog): JSON list of all published blog posts
- [RSS Feed](https://omkarchebale.vercel.app/feed.xml): RSS 2.0 feed of all blog posts
- [Sitemap](https://omkarchebale.vercel.app/sitemap.xml): XML sitemap of all pages

## Contact

- Email: omkarchebale0@gmail.com
- GitHub: https://github.com/Chebaleomkar
- LinkedIn: https://www.linkedin.com/in/omkar-chebale-8b251726b/
- Twitter: https://twitter.com/chebalerushi

## Topics Covered

AI/ML, LLMs, RAG Systems, Agentic Workflows, LangChain, Next.js, React, Node.js, MongoDB, Python, TypeScript, Full-Stack Development, Software Engineering
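The search endpoints listed in the APIs section take a `q` query parameter that must be URL-encoded. A minimal Python sketch for building a correctly encoded request URL (the JSON shape of the response is not documented here, so the commented-out fetch treats any field names as unknown):

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://omkarchebale.vercel.app/api/search"

def build_search_url(query: str) -> str:
    """Build a Blog Search URL, percent-encoding the query string."""
    return f"{SEARCH_BASE}?{urlencode({'q': query})}"

print(build_search_url("self attention"))
# → https://omkarchebale.vercel.app/api/search?q=self+attention

# Fetching the results (left commented; inspect the live response before
# relying on any specific field names — the shape is not specified above):
# import json, urllib.request
# with urllib.request.urlopen(build_search_url("rag")) as resp:
#     results = json.load(resp)
```

The same pattern applies to the Search Suggestions endpoint at `/api/search/suggest`; only the base URL changes.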