2026-02-16

Embeddings, Vector Databases, and Re-Ranking: My Confusion Dump

An honest, unstructured brain dump about embeddings, vector databases, and re-ranking—from confusion about what the numbers mean to understanding coordinates, similarity search, and retrieval optimization.

Tags: Embeddings, Vector Database, Re-ranking, RAG, Semantic Search, Learning In Public

Disclaimer

This is not structured. This is literally me thinking through embeddings, vector databases, and re-ranking—dumping all my confusion and what finally clicked.

What Actually Happens With Embeddings?

So the first question I had:

If I give a word like "king" to an embedding model... does it return something like:

[0.0213, -0.8892, 0.4421, ... 768 numbers]

And yes. That's basically what happens.

Most embedding models output something like:

  • 384 dimensions
  • 768 dimensions
  • 1024 dimensions
  • 1536 dimensions

Depends on the model.

So yes, it's just a big array of floating point numbers.

But the thing that confused me earlier was: what do these numbers even represent?

They're not "meaning" directly. They're coordinates.

Like imagine semantic space:

  • "King" lives somewhere in that space
  • "Queen" lives nearby
  • "Apple" lives somewhere else

So the 768 numbers are basically coordinates in a high-dimensional space.

Nothing mystical. Just math.

How Are Embeddings Trained? What Do the Numbers Mean?

This part took time to click.

Embeddings are trained through neural networks.

Basically during training:

  • Model sees tons of text
  • It learns patterns
  • It adjusts weights
  • Tokens get mapped into vectors

The goal is: words (or sentences) used in similar contexts → end up with similar vectors.

So mathematically:

  • Similar meaning → smaller cosine distance
  • Different meaning → larger distance
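The "similarity is geometry" idea can be sketched in plain Python. The 3-d vectors below are made up for illustration (real embeddings have hundreds of dimensions), but the math is the same:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings", invented for this example
king  = [0.90, 0.80, 0.10]
queen = [0.85, 0.75, 0.20]
apple = [0.10, 0.20, 0.90]

print(cosine_similarity(king, queen))  # close to 1 -> similar meaning
print(cosine_similarity(king, apple))  # much lower -> different meaning
```

Same function, any number of dimensions: swap in 768-d vectors and nothing changes except the loop length.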

The numbers themselves don't "mean" anything individually.

Dimension 432 does not mean "royalty" or "power".

Meaning emerges from:

  • All dimensions together
  • Relative position in space

So embeddings are relational, not symbolic.

Why Store Embeddings in a Vector Database?

Because once you convert text → vector, you can't search it using normal SQL LIKE queries.

You need:

  • Similarity search
  • Nearest neighbor search
  • Cosine similarity / dot product

And doing that efficiently for:

  • 10,000 vectors
  • 1 million vectors
  • 100 million vectors

is not trivial.

That's where vector databases come in.

What Is the "Dimension" of a Vector Database?

This confused me initially.

The database itself doesn't have a "dimension."

Each embedding has a fixed dimension.

For example:

  • OpenAI embedding → 1536 dims
  • BERT-style → 768 dims

The vector DB just stores vectors of a fixed size.

So if your embedding model outputs 768-d vectors, your DB schema will expect vectors of size 768.

That's it.
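A minimal sketch of what that fixed-size constraint looks like in practice. The `insert` function and in-memory `store` here are stand-ins I made up, but most vector DB clients reject mismatched sizes in roughly this way:

```python
DIM = 768  # must match whatever your embedding model outputs

store = []  # stand-in for a vector DB collection

def insert(vector, metadata):
    """Reject any vector whose size doesn't match the collection's dimension."""
    if len(vector) != DIM:
        raise ValueError(f"expected {DIM}-d vector, got {len(vector)}-d")
    store.append({"embedding": vector, **metadata})

insert([0.0] * 768, {"text": "ok"})       # accepted
# insert([0.0] * 1536, {"text": "bad"})   # would raise ValueError
```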

What Architecture Does a Vector Database Use?

This is where it becomes more systems-level.

Vector DBs use:

  • ANN (Approximate Nearest Neighbor) algorithms
  • HNSW graphs
  • IVF (inverted file index)
  • Product quantization sometimes

Instead of scanning all vectors linearly, they build special index structures to find "closest vectors" fast.

So it's not like:

SELECT * FROM embeddings ORDER BY cosine_distance LIMIT 5

It's way more optimized internally.
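To see what the index structures are avoiding, here's the naive linear-scan version, written out as a toy with random vectors. Every query touches every vector, so it's O(n·d) per query, which is exactly the cost HNSW/IVF sidestep:

```python
import heapq
import math
import random

random.seed(0)
DIM, N = 8, 1000  # toy sizes; real setups are 768-d and millions of rows

db = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
query = [random.gauss(0, 1) for _ in range(DIM)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Linear scan: score every single vector, keep the 5 best indices.
top5 = heapq.nlargest(5, range(N), key=lambda i: cosine(query, db[i]))
print(top5)  # indices of the 5 most similar vectors
```

An ANN index gets roughly the same answer while visiting only a small fraction of the vectors, at the cost of being approximate.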

How Are Embeddings Stored Physically?

This was another doubt I had.

Inside the DB, embeddings are just:

  • Arrays of floats
  • Usually float32
  • Sometimes compressed

Internally stored in:

  • Memory
  • Disk
  • SSD

Depending on the database and configuration.

Just like SQL or Mongo, eventually everything lives on disk.

But vector DBs optimize:

  • Indexing in memory
  • Faster similarity search
  • Sometimes memory-mapped files

So at the lowest level, yes, they're still stored on hard disk or SSD.
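A quick back-of-the-envelope for why this matters: raw float32 vectors alone (no index overhead, no metadata) already add up fast. A million 768-d vectors is roughly 2.9 GiB:

```python
DIM = 768
BYTES_PER_FLOAT32 = 4

for n in (10_000, 1_000_000, 100_000_000):
    gib = n * DIM * BYTES_PER_FLOAT32 / (1024 ** 3)
    print(f"{n:>11,} vectors -> {gib:8.2f} GiB")
```

This is why compression tricks like product quantization exist, and why some DBs keep only the index in memory and leave raw vectors on disk.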

Nothing magical.

What Is Re-Ranking?

Okay, this part is interesting.

Vector search gives you top-k similar results.

But similarity search is not perfect.

So re-ranking means:

  1. Take top-k candidates
  2. Run a more expensive model on them
  3. Reorder them more accurately

For example:

  1. Use embeddings to fetch top 50 similar documents
  2. Pass those 50 to a cross-encoder model
  3. Re-score them more precisely
  4. Return top 5

So re-ranking improves precision.

It's like: fast filter first → accurate judge later.
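The two-stage pipeline can be sketched with stand-ins. Both scoring functions here are toys I invented: `fast_score` plays the role of embedding similarity (cheap, run over everything) and `slow_score` plays the role of a cross-encoder (expensive, run only on candidates). In a real system, stage 1 is a vector DB query and stage 2 is an actual model:

```python
corpus = [
    "the king ruled the country",
    "the queen ruled with grace",
    "apple pie recipe with cinnamon",
    "monarchy and royal succession",
    "how to grow apple trees",
]

def fast_score(query, doc):
    # Stage 1 stand-in: crude word overlap (cheap, approximate).
    return len(set(query.split()) & set(doc.split()))

def slow_score(query, doc):
    # Stage 2 stand-in: overlap normalized by combined vocabulary
    # (pretend this is a slow, accurate cross-encoder).
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)

def search(query, k_retrieve=4, k_final=2):
    # Stage 1: fast filter over the whole corpus.
    candidates = sorted(corpus, key=lambda d: fast_score(query, d),
                        reverse=True)[:k_retrieve]
    # Stage 2: expensive re-scoring of just the candidates.
    return sorted(candidates, key=lambda d: slow_score(query, d),
                  reverse=True)[:k_final]

print(search("the king and queen"))
```

The shape is the important part: a wide, cheap retrieval step followed by a narrow, expensive re-ranking step.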

How Does Storage Compare to SQL or Mongo?

In SQL:

  • Rows and columns

In Mongo:

  • JSON-like documents

In vector DB:

  • Vector + metadata

For example:

{
  "id": 123,
  "embedding": [0.12, -0.55, ...],
  "text": "This is a paragraph...",
  "category": "AI"
}

So you still store metadata. The vector is just an additional field.

But indexing is different. That's the key difference.

Word vs Sentence vs Paragraph Embeddings

This confused me a lot.

If I input:

  • One word
  • One sentence
  • Full paragraph

Do I get:

  • One embedding?
  • Or one embedding per token?

Answer: Usually one embedding per input.

For most sentence embedding models:

  • You pass a full sentence
  • You get one fixed-size vector

Internally:

  • Model generates token embeddings
  • Then applies pooling (like mean pooling or CLS token)
  • Gives one final embedding

So even if a paragraph has 100 tokens, you still get one 768-d vector (for example).

Unless you explicitly ask for token-level embeddings, you get one embedding per input text.
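The pooling step can be sketched like this. The per-token vectors are made-up 4-d toys; real models mean-pool 768-d token embeddings the same way:

```python
# Hypothetical token embeddings for "the king rules" (toy 4-d values)
token_embeddings = [
    [0.1, 0.2, 0.3, 0.4],  # "the"
    [0.9, 0.8, 0.1, 0.0],  # "king"
    [0.5, 0.4, 0.6, 0.2],  # "rules"
]

def mean_pool(vectors):
    """Average token vectors dimension-wise -> one fixed-size vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

sentence_embedding = mean_pool(token_embeddings)
print(sentence_embedding)  # one 4-d vector, regardless of token count
```

Feed in 3 tokens or 300: the output size only depends on the embedding dimension, which is why every input gets one fixed-size vector.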

What Clicked Overall

  • Embeddings are just coordinates
  • Similarity is geometry
  • Vector DB is optimized nearest-neighbor search
  • Re-ranking improves quality
  • Storage is still disk + memory, just optimized
  • One input → one embedding (usually)

Earlier all of this felt abstract.

Now it feels more like:

  • Linear algebra
  • Indexing
  • Systems design
  • Retrieval optimization

Still learning. But this is my current understanding dump.
