Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples
Date | Post | Archived |
---|---|---|
2025-10-05 | Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)<br>Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples | |
2025-08-09 | From GPT-2 to gpt-oss: Analyzing the Architectural Advances<br>And How They Stack Up Against Qwen3 | |
2025-07-19 | The Big LLM Architecture Comparison<br>From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design | |
2025-06-17 | Understanding and Coding the KV Cache in LLMs from Scratch<br>KV caches are one of the most critical techniques for efficient inference in LLMs in production. | |
2025-04-19 | The State of Reinforcement Learning for LLM Reasoning<br>Understanding GRPO and New Insights from Reasoning Model Papers | |
2025-03-08 | The State of LLM Reasoning Model Inference<br>Inference-Time Compute Scaling Methods to Improve Reasoning Models | |
2025-02-05 | Understanding Reasoning LLMs<br>Methods and Strategies for Building and Refining Reasoning Models | |
2025-01-15 | Noteworthy AI Research Papers of 2024 (Part Two)<br>Six influential AI papers from July to December | |
2024-12-31 | Noteworthy AI Research Papers of 2024 (Part One)<br>Six influential AI papers from January to June | |
2024-12-08 | LLM Research Papers: The 2024 List<br>A curated list of interesting LLM-related research papers from 2024, shared for those looking for something to read over the holidays. | |
2024-11-03 | Understanding Multimodal LLMs<br>An introduction to the main techniques and latest models | |
2024-08-31 | Building LLMs from the Ground Up: A 3-hour Coding Workshop<br>If your weekend plans include catching up on AI developments and understanding Large Language Models (LLMs), I've prepared a 1-hour presentation on the… | |
2024-08-17 | New LLM Pre-training and Post-training Paradigms<br>A Look at How Modern LLMs Are Trained | |
2024-07-20 | Instruction Pretraining LLMs<br>The Latest Research in Instruction Finetuning | |
2024-06-08 | Developing an LLM: Building, Training, Finetuning<br>A Deep Dive into the Lifecycle of LLM Development | |
2024-06-02 | LLM Research Insights: Instruction Masking and New LoRA Finetuning Experiments<br>Discussing the Latest Model Releases and AI Research in May 2024 | |
2024-05-12 | How Good Are the Latest Open LLMs? And Is DPO Better Than PPO?<br>Discussing the Latest Model Releases and AI Research in April 2024 | |
2024-04-20 | Using and Finetuning Pretrained Transformers<br>What are the different ways to use and finetune pretrained large language models (LLMs)? The most common ways to use and finetune pretrained LLMs… | |
2024-03-31 | Tips for LLM Pretraining and Evaluating Reward Models<br>Discussing AI Research Papers in March 2024 | |
2024-03-03 | A LoRA Successor, Small Finetuned LLMs Vs Generalist LLMs, and Transparent LLM Research<br>Once again, this has been an exciting month in AI research. This month, I'm covering two new openly available LLMs, insights into small finetuned LLMs… | |
2024-02-18 | Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch<br>Low-rank adaptation (LoRA) is a machine learning technique that modifies a pretrained model (for example, an LLM or vision transformer) to better suit a… | |