Hello, blog
A quick note on what this space is for and how it's organised.
Notes on machine learning, books & ideas:
Attention is permutation-invariant — it treats 'the cat sat' identically to 'sat cat the' without help. Positional encoding is the elegant fix. Here's the sinusoidal construction and why it works.
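
For reference, a minimal sketch of the standard sinusoidal construction (sine on even dimensions, cosine on odd ones, with wavelengths in a geometric progression up to 10000·2π). The numpy layout and the function name here are illustrative, not taken from the post itself.

    import numpy as np

    def sinusoidal_positions(seq_len, d_model):
        # One d_model-dimensional vector per position: even indices use sine,
        # odd indices use cosine, wavelengths form a geometric progression.
        positions = np.arange(seq_len)[:, None]                      # (seq_len, 1)
        dims = np.arange(d_model)[None, :]                           # (1, d_model)
        angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
        angles = positions * angle_rates                             # (seq_len, d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles[:, 0::2])
        pe[:, 1::2] = np.cos(angles[:, 1::2])
        return pe

Adding these vectors to the token embeddings gives each position a unique, smoothly varying signature, which is what breaks the permutation invariance.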
One attention head is a single lens. Multi-head attention runs several lenses in parallel — each free to specialise on a different relationship type. Here's exactly how and why.
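
A compact sketch of the "several lenses in parallel" idea, assuming the usual split of the model dimension into per-head subspaces; the function name, the numpy layout, and passing the projection matrices as arguments are all illustrative choices, not the post's own code.

    import numpy as np

    def multi_head_attention(x, n_heads, w_q, w_k, w_v, w_o):
        # x: (seq_len, d_model); w_q, w_k, w_v, w_o: (d_model, d_model).
        seq_len, d_model = x.shape
        d_head = d_model // n_heads

        def split(h):
            # (seq_len, d_model) -> (n_heads, seq_len, d_head): one subspace per head.
            return h.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

        q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)          # (n_heads, seq_len, seq_len)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)               # row-wise softmax per head
        heads = weights @ v                                          # (n_heads, seq_len, d_head)
        out = heads.transpose(1, 0, 2).reshape(seq_len, d_model)     # concatenate heads
        return out @ w_o                                             # mix heads back together

Each head attends over the full sequence but in its own d_head-dimensional subspace, so different heads are free to track different relationships; the final projection w_o recombines their outputs.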
Before multi-head attention, before positional encoding, before the encoder-decoder stack — there's one core idea that makes Transformers work. Let's build it from scratch.
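
That core idea is scaled dot-product attention; a bare-bones sketch, with hypothetical names, might look like this:

    import numpy as np

    def scaled_dot_product_attention(q, k, v):
        # q, k: (seq_len, d_k); v: (seq_len, d_v).
        scores = q @ k.T / np.sqrt(k.shape[-1])              # every query scored against every key
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
        return weights @ v                                   # weighted mix of the values

Self-attention is just this with q, k, and v all derived from the same input sequence via learned projections.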
Kahneman's magnum opus on the two systems of thought — what still holds up, what's been replicated, and what I take away as a practitioner.