Embeddings, Vector Databases, and Re-Ranking: My Confusion Dump
An honest, unstructured brain dump about embeddings, vector databases, and re-ranking—from confusion about what the numbers mean to understanding coordinates, similarity search, and retrieval optimization.
Disclaimer
This is not structured. This is literally me thinking through embeddings, vector databases, and re-ranking—dumping all my confusion and what finally clicked.
What Actually Happens With Embeddings?
So first question I had:
If I give a word like "king" to an embedding model... does it return something like:
[0.0213, -0.8892, 0.4421, ... 768 numbers]
And yes. That's basically what happens.
Most embedding models output something like:
- 384 dimensions
- 768 dimensions
- 1024 dimensions
- 1536 dimensions
Depends on the model.
So yes, it's just a big array of floating point numbers.
But the thing that confused me earlier was—what are these numbers even representing?
They're not "meaning" directly. They're coordinates.
Like imagine semantic space:
- "King" lives somewhere in that space
- "Queen" lives nearby
- "Apple" lives somewhere else
So the 768 numbers are basically coordinates in a high-dimensional space.
Nothing mystical. Just math.
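To make the "coordinates" idea concrete, here's a toy sketch with made-up 3-d vectors (real models output hundreds of dimensions, and these numbers are invented for illustration):

```python
import numpy as np

# Toy 3-dimensional "embeddings" (made up for illustration;
# real models output 384-1536 dimensions)
king  = np.array([0.8, 0.6, 0.1])
queen = np.array([0.7, 0.7, 0.1])
apple = np.array([0.1, 0.2, 0.9])

# Distance between points = how far apart they live in semantic space
king_queen = np.linalg.norm(king - queen)
king_apple = np.linalg.norm(king - apple)

# "queen" sits closer to "king" than "apple" does
print(king_queen < king_apple)
```

Same idea, just with way more dimensions in a real model.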
How Are Embeddings Trained? What Do the Numbers Mean?
This part took time to click.
Embeddings are trained through neural networks.
Basically during training:
- Model sees tons of text
- It learns patterns
- It adjusts weights
- Tokens get mapped into vectors
The goal is: words (or sentences) used in similar contexts → end up with similar vectors.
So mathematically:
- Similar meaning → smaller cosine distance
- Different meaning → larger distance
The numbers themselves don't "mean" anything individually.
Dimension 432 does not mean "royalty" or "power".
Meaning emerges from:
- All dimensions together
- Relative position in space
So embeddings are relational, not symbolic.
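A quick sketch of cosine similarity with NumPy, using the same kind of made-up toy vectors (not real embeddings), just to show that "similar meaning → smaller angle" is ordinary math:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors, invented for illustration
king  = np.array([0.8, 0.6, 0.1])
queen = np.array([0.7, 0.7, 0.1])
apple = np.array([0.1, 0.2, 0.9])

sim_kq = cosine_similarity(king, queen)  # close to 1.0
sim_ka = cosine_similarity(king, apple)  # much lower
```

Cosine distance is just `1 - cosine_similarity`, so similar meaning → higher similarity → smaller distance.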
Why Store Embeddings in a Vector Database?
Because once you convert text → vector, you can't search it using normal SQL LIKE queries.
You need:
- Similarity search
- Nearest neighbor search
- Cosine similarity / dot product
And doing that efficiently for:
- 10,000 vectors
- 1 million vectors
- 100 million vectors
is not trivial.
That's where vector databases come in.
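Here's what the naive version looks like (random toy vectors, plain NumPy): every query touches every stored vector, one dot product each, which is exactly why this stops scaling and why dedicated indexes exist.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 stored vectors, 768-d, normalized once so dot product = cosine
db = rng.normal(size=(10_000, 768)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)

# A query that's a slightly noisy copy of stored vector #42
query = db[42] + 0.01 * rng.normal(size=768).astype(np.float32)
query /= np.linalg.norm(query)

# Brute-force nearest neighbor: one dot product per stored vector -> O(n)
scores = db @ query
top5 = np.argsort(scores)[::-1][:5]
```

At 100 million vectors, this linear scan per query is what a vector database is built to avoid.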
What Is the "Dimension" of a Vector Database?
This confused me initially.
The database itself doesn't have a "dimension."
Each embedding has a fixed dimension.
For example:
- OpenAI embedding → 1536 dims
- BERT-style → 768 dims
The vector DB just stores vectors of a fixed size.
So if your embedding model outputs 768-d vectors, your DB schema will expect vectors of size 768.
That's it.
What Architecture Does a Vector Database Use?
This is where it becomes more systems-level.
Vector DBs use:
- ANN (Approximate Nearest Neighbor) algorithms
- HNSW graphs
- IVF (inverted file index)
- Product quantization sometimes
Instead of scanning all vectors linearly, they build special index structures to find "closest vectors" fast.
So it's not like:
```sql
SELECT * FROM embeddings ORDER BY cosine_distance LIMIT 5
```
It's way more optimized internally.
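A toy IVF-style sketch (plain NumPy, made-up data, nothing like a production index) to show the core idea: bucket vectors around centroids, then scan only the query's nearest buckets instead of everything.

```python
import numpy as np

rng = np.random.default_rng(1)
vecs = rng.normal(size=(5000, 64)).astype(np.float32)

# Pick some vectors as rough "centroids" (a real IVF index runs k-means)
n_lists = 16
centroids = vecs[rng.choice(len(vecs), n_lists, replace=False)]

# Assign each vector to its nearest centroid -> the "inverted lists"
assign = np.argmin(((vecs[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
lists = {i: np.where(assign == i)[0] for i in range(n_lists)}

def ivf_search(query, nprobe=2, k=3):
    # Probe only the nprobe closest buckets, not all 5000 vectors
    order = np.argsort(((centroids - query) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([lists[i] for i in order])
    dists = ((vecs[cand] - query) ** 2).sum(-1)
    return cand[np.argsort(dists)[:k]]

result = ivf_search(vecs[123])
```

With `nprobe=2` out of 16 buckets, each query scans roughly an eighth of the data. HNSW takes a different route (a navigable graph) but serves the same goal: don't touch every vector.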
How Are Embeddings Stored Physically?
This was another doubt I had.
Inside the DB, embeddings are just:
- Arrays of floats
- Usually float32
- Sometimes compressed
Internally stored in:
- Memory
- Disk
- SSD
Depending on the database and configuration.
Just like SQL or Mongo, eventually everything lives on disk.
But vector DBs optimize:
- Indexing in memory
- Faster similarity search
- Sometimes memory-mapped files
So at the lowest level, yes, they're still stored on hard disk or SSD.
Nothing magical.
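The raw storage math is easy to sanity-check:

```python
# How much raw storage for 1 million 768-d float32 embeddings?
n, d = 1_000_000, 768
bytes_needed = n * d * 4       # float32 = 4 bytes per number
gb = bytes_needed / 1e9        # ~3 GB of raw floats

# Compression (e.g. float16, or product quantization) exists
# precisely to shrink this before any index structures are added
bytes_float16 = n * d * 2      # half the size
```

That ~3 GB is just the vectors themselves; index structures like HNSW graphs add overhead on top.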
What Is Re-Ranking?
Okay, this part is interesting.
Vector search gives you top-k similar results.
But similarity search is not perfect.
So re-ranking means:
- Take top-k candidates
- Run a more expensive model on them
- Reorder them more accurately
For example:
- Use embeddings to fetch top 50 similar documents
- Pass those 50 to a cross-encoder model
- Re-score them more precisely
- Return top 5
So re-ranking improves precision.
It's like: fast filter first → accurate judge later.
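A sketch of the two-stage idea with toy data. The "expensive" re-scorer here is just a stand-in function; in a real system it would be a cross-encoder running a full transformer over each (query, document) pair.

```python
import numpy as np

rng = np.random.default_rng(2)
docs = rng.normal(size=(1000, 128)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

query = docs[7] + 0.05 * rng.normal(size=128).astype(np.float32)
query /= np.linalg.norm(query)

# Stage 1: fast filter -> cheap dot products, keep top 50 candidates
candidates = np.argsort(docs @ query)[::-1][:50]

def expensive_rescore(q, doc_id):
    # Stand-in for a cross-encoder; a real one would jointly read the
    # query and document text and output a relevance score
    return float(docs[doc_id] @ q)

# Stage 2: accurate judge -> re-score only the 50 candidates, return top 5
rescored = sorted(candidates, key=lambda i: expensive_rescore(query, i),
                  reverse=True)
top5 = rescored[:5]
```

The win: the expensive model runs 50 times instead of 1000 (or 100 million) times.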
How Does Storage Compare to SQL or Mongo?
In SQL:
- Rows and columns
In Mongo:
- JSON-like documents
In vector DB:
- Vector + metadata
For example:
```json
{
  "id": 123,
  "embedding": [0.12, -0.55, ...],
  "text": "This is a paragraph...",
  "category": "AI"
}
```
So you still store metadata. The vector is just an additional field.
But indexing is different. That's the key difference.
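A toy sketch of "vector + metadata" records, with a metadata filter applied before the similarity step (field names here are illustrative, not any specific database's schema):

```python
import numpy as np

# Each record = vector + ordinary metadata, like a document with an
# extra "embedding" field
records = [
    {"id": 1, "embedding": np.array([0.9, 0.1], dtype=np.float32),
     "text": "Intro to transformers", "category": "AI"},
    {"id": 2, "embedding": np.array([0.1, 0.9], dtype=np.float32),
     "text": "Sourdough basics", "category": "cooking"},
    {"id": 3, "embedding": np.array([0.8, 0.3], dtype=np.float32),
     "text": "Vector search explained", "category": "AI"},
]

query = np.array([1.0, 0.0], dtype=np.float32)

# Filter on metadata first, then rank the survivors by similarity
ai_only = [r for r in records if r["category"] == "AI"]
best = max(ai_only, key=lambda r: float(r["embedding"] @ query))
```

Real vector DBs do this filtering with proper indexes, but conceptually it's the same: metadata narrows the set, the vector ranks it.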
Word vs Sentence vs Paragraph Embeddings
This confused me a lot.
If I input:
- One word
- One sentence
- Full paragraph
Do I get:
- One embedding?
- Or one embedding per token?
Answer: Usually one embedding per input.
For most sentence embedding models:
- You pass a full sentence
- You get one fixed-size vector
Internally:
- Model generates token embeddings
- Then applies pooling (like mean pooling or CLS token)
- Gives one final embedding
So even if a paragraph has 100 tokens, you still get one 768-d vector (for example).
Unless you explicitly ask for token-level embeddings, you get one embedding per input text.
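The pooling step above can be sketched with random stand-in token embeddings (no real model involved):

```python
import numpy as np

# A model produces one vector per token...
rng = np.random.default_rng(3)
n_tokens, dim = 100, 768
token_embeddings = rng.normal(size=(n_tokens, dim)).astype(np.float32)

# ...then mean pooling collapses them into ONE fixed-size embedding:
# average over the token axis, keeping the dimension axis
sentence_embedding = token_embeddings.mean(axis=0)
```

100 tokens in, one 768-d vector out. CLS pooling would instead just take `token_embeddings[0]` (the special first token), same shape either way.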
What Clicked Overall
- Embeddings are just coordinates
- Similarity is geometry
- Vector DB is optimized nearest-neighbor search
- Re-ranking improves quality
- Storage is still disk + memory, just optimized
- One input → one embedding (usually)
Earlier all of this felt abstract.
Now it feels more like:
- Linear algebra
- Indexing
- Systems design
- Retrieval optimization
Still learning. But this is my current understanding dump.