2026-02-16

Embeddings, Vector Databases, and Re-Ranking: My Confusion Dump

An honest, unstructured brain dump about embeddings, vector databases, and re-ranking—from confusion about what the numbers mean to understanding coordinates, similarity search, and retrieval optimization.

Tags: Embeddings, Vector Database, Re-ranking, RAG, Semantic Search, Learning In Public

Disclaimer

This is not structured. This is literally me thinking through embeddings, vector databases, and re-ranking—dumping all my confusion and what finally clicked.

What Actually Happens With Embeddings?

So the first question I had:

If I give a word like "king" to an embedding model... does it return something like:

[0.0213, -0.8892, 0.4421, ... 768 numbers]

And yes. That's basically what happens.

Most embedding models output something like:

  • 384 dimensions
  • 768 dimensions
  • 1024 dimensions
  • 1536 dimensions

Depends on the model.

So yes, it's just a big array of floating point numbers.

But the thing that confused me earlier was: what do these numbers even represent?

They're not "meaning" directly. They're coordinates.

Like imagine semantic space:

  • "King" lives somewhere in that space
  • "Queen" lives nearby
  • "Apple" lives somewhere else

So the 768 numbers are basically coordinates in a high-dimensional space.

Nothing mystical. Just math.

How Are Embeddings Trained? What Do the Numbers Mean?

This part took time to click.

Embeddings are trained through neural networks.

Basically during training:

  • Model sees tons of text
  • It learns patterns
  • It adjusts weights
  • Tokens get mapped into vectors

The goal is: words (or sentences) used in similar contexts → end up with similar vectors.

So mathematically:

  • Similar meaning → smaller cosine distance
  • Different meaning → larger distance
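The "similarity is geometry" idea can be sketched in plain Python. The 3-d vectors below are made up for illustration (real embeddings have hundreds of dimensions), but the math is the same:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings", invented for this example
king  = [0.90, 0.80, 0.10]
queen = [0.85, 0.75, 0.20]
apple = [0.10, 0.20, 0.90]

print(cosine_similarity(king, queen))  # close to 1 -> similar meaning
print(cosine_similarity(king, apple))  # much lower -> different meaning
```

Same function, any number of dimensions: swap in 768-d vectors and nothing changes except the loop length.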

The numbers themselves don't "mean" anything individually.

Dimension 432 does not mean "royalty" or "power".

Meaning emerges from:

  • All dimensions together
  • Relative position in space

So embeddings are relational, not symbolic.

Why Store Embeddings in a Vector Database?

Because once you convert text → vector, you can't search it using normal SQL LIKE queries.

You need:

  • Similarity search
  • Nearest neighbor search
  • Cosine similarity / dot product

And doing that efficiently for:

  • 10,000 vectors
  • 1 million vectors
  • 100 million vectors

is not trivial.

That's where vector databases come in.

What Is the "Dimension" of a Vector Database?

This confused me initially.

The database itself doesn't have a "dimension."

Each embedding has a fixed dimension.

For example:

  • OpenAI embedding → 1536 dims
  • BERT-style → 768 dims

The vector DB just stores vectors of a fixed size.

So if your embedding model outputs 768-d vectors, your DB schema will expect vectors of size 768.

That's it.
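A minimal sketch of what that fixed-size constraint looks like in practice. The `insert` function and in-memory `store` here are stand-ins I made up, but most vector DB clients reject mismatched sizes in roughly this way:

```python
DIM = 768  # must match whatever your embedding model outputs

store = []  # stand-in for a vector DB collection

def insert(vector, metadata):
    """Reject any vector whose size doesn't match the collection's dimension."""
    if len(vector) != DIM:
        raise ValueError(f"expected {DIM}-d vector, got {len(vector)}-d")
    store.append({"embedding": vector, **metadata})

insert([0.0] * 768, {"text": "ok"})       # accepted
# insert([0.0] * 1536, {"text": "bad"})   # would raise ValueError
```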

What Architecture Does a Vector Database Use?

This is where it becomes more systems-level.

Vector DBs use:

  • ANN (Approximate Nearest Neighbor) algorithms
  • HNSW graphs
  • IVF (inverted file index)
  • Product quantization sometimes

Instead of scanning all vectors linearly, they build special index structures to find "closest vectors" fast.

So it's not like:

SELECT * FROM embeddings ORDER BY cosine_distance LIMIT 5

It's way more optimized internally.
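To see what the index structures are avoiding, here's the naive linear-scan version, written out as a toy with random vectors. Every query touches every vector, so it's O(n·d) per query, which is exactly the cost HNSW/IVF sidestep:

```python
import heapq
import math
import random

random.seed(0)
DIM, N = 8, 1000  # toy sizes; real setups are 768-d and millions of rows

db = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
query = [random.gauss(0, 1) for _ in range(DIM)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Linear scan: score every single vector, keep the 5 best indices.
top5 = heapq.nlargest(5, range(N), key=lambda i: cosine(query, db[i]))
print(top5)  # indices of the 5 most similar vectors
```

An ANN index gets roughly the same answer while visiting only a small fraction of the vectors, at the cost of being approximate.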

How Are Embeddings Stored Physically?

This was another doubt I had.

Inside the DB, embeddings are just:

  • Arrays of floats
  • Usually float32
  • Sometimes compressed

Internally stored in:

  • Memory
  • Disk
  • SSD

Depending on the database and configuration.

Just like SQL or Mongo, eventually everything lives on disk.

But vector DBs optimize:

  • Indexing in memory
  • Faster similarity search
  • Sometimes memory-mapped files

So at the lowest level, yes, they're still stored on hard disk or SSD.
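A quick back-of-the-envelope for why this matters: raw float32 vectors alone (no index overhead, no metadata) already add up fast. A million 768-d vectors is roughly 2.9 GiB:

```python
DIM = 768
BYTES_PER_FLOAT32 = 4

for n in (10_000, 1_000_000, 100_000_000):
    gib = n * DIM * BYTES_PER_FLOAT32 / (1024 ** 3)
    print(f"{n:>11,} vectors -> {gib:8.2f} GiB")
```

This is why compression tricks like product quantization exist, and why some DBs keep only the index in memory and leave raw vectors on disk.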

Nothing magical.

What Is Re-Ranking?

Okay, this part is interesting.

Vector search gives you top-k similar results.

But similarity search is not perfect.

So re-ranking means:

  1. Take top-k candidates
  2. Run a more expensive model on them
  3. Reorder them more accurately

For example:

  1. Use embeddings to fetch top 50 similar documents
  2. Pass those 50 to a cross-encoder model
  3. Re-score them more precisely
  4. Return top 5

So re-ranking improves precision.

It's like: fast filter first → accurate judge later.
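The two-stage pipeline can be sketched with stand-ins. Both scoring functions here are toys I invented: `fast_score` plays the role of embedding similarity (cheap, run over everything) and `slow_score` plays the role of a cross-encoder (expensive, run only on candidates). In a real system, stage 1 is a vector DB query and stage 2 is an actual model:

```python
corpus = [
    "the king ruled the country",
    "the queen ruled with grace",
    "apple pie recipe with cinnamon",
    "monarchy and royal succession",
    "how to grow apple trees",
]

def fast_score(query, doc):
    # Stage 1 stand-in: crude word overlap (cheap, approximate).
    return len(set(query.split()) & set(doc.split()))

def slow_score(query, doc):
    # Stage 2 stand-in: overlap normalized by combined vocabulary
    # (pretend this is a slow, accurate cross-encoder).
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)

def search(query, k_retrieve=4, k_final=2):
    # Stage 1: fast filter over the whole corpus.
    candidates = sorted(corpus, key=lambda d: fast_score(query, d),
                        reverse=True)[:k_retrieve]
    # Stage 2: expensive re-scoring of just the candidates.
    return sorted(candidates, key=lambda d: slow_score(query, d),
                  reverse=True)[:k_final]

print(search("the king and queen"))
```

The shape is the important part: a wide, cheap retrieval step followed by a narrow, expensive re-ranking step.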

How Does Storage Compare to SQL or Mongo?

In SQL:

  • Rows and columns

In Mongo:

  • JSON-like documents

In vector DB:

  • Vector + metadata

For example:

{
  "id": 123,
  "embedding": [0.12, -0.55, ...],
  "text": "This is a paragraph...",
  "category": "AI"
}

So you still store metadata. The vector is just an additional field.

But indexing is different. That's the key difference.

Word vs Sentence vs Paragraph Embeddings

This confused me a lot.

If I input:

  • One word
  • One sentence
  • Full paragraph

Do I get:

  • One embedding?
  • Or one embedding per token?

Answer: Usually one embedding per input.

For most sentence embedding models:

  • You pass a full sentence
  • You get one fixed-size vector

Internally:

  • Model generates token embeddings
  • Then applies pooling (like mean pooling or CLS token)
  • Gives one final embedding

So even if a paragraph has 100 tokens, you still get one 768-d vector (for example).

Unless you explicitly ask for token-level embeddings, you get one embedding per input text.
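The pooling step can be sketched like this. The per-token vectors are made-up 4-d toys; real models mean-pool 768-d token embeddings the same way:

```python
# Hypothetical token embeddings for "the king rules" (toy 4-d values)
token_embeddings = [
    [0.1, 0.2, 0.3, 0.4],  # "the"
    [0.9, 0.8, 0.1, 0.0],  # "king"
    [0.5, 0.4, 0.6, 0.2],  # "rules"
]

def mean_pool(vectors):
    """Average token vectors dimension-wise -> one fixed-size vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

sentence_embedding = mean_pool(token_embeddings)
print(sentence_embedding)  # one 4-d vector, regardless of token count
```

Feed in 3 tokens or 300: the output size only depends on the embedding dimension, which is why every input gets one fixed-size vector.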

What Clicked Overall

  • Embeddings are just coordinates
  • Similarity is geometry
  • Vector DB is optimized nearest-neighbor search
  • Re-ranking improves quality
  • Storage is still disk + memory, just optimized
  • One input → one embedding (usually)

Earlier all of this felt abstract.

Now it feels more like:

  • Linear algebra
  • Indexing
  • Systems design
  • Retrieval optimization

Still learning. But this is my current understanding dump.
