The Math Doesn't Exactly Math
A lifelong Vijay fan tries to explain an election result that defies explanation.
Finding joy in making things and making sense of things.
I explore ideas, build software, and write about whatever catches my curiosity. Welcome to my corner of the internet.
How LLM inference works step by step: prefill, decode, the KV cache, sampling, tool use, and the engineering that makes it economical.
Inside the months-long pipeline that turns trillions of words of text into a deployable LLM: data, tokenizer, pre-training, alignment, and shipping.
From raw text to streaming response - every step of how large language models are built and how they process your prompts.
From the text you type to the binary that runs - the full compiler pipeline, type safety ladder, and patterns every developer should know.