Sebastian Raschka • 12/3/2025

From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates

This article gives a detailed technical breakdown of the DeepSeek V3.2 large language model: how its architecture evolved from V3, how Multi-Head Latent Attention (MLA) and sparse attention are implemented, and what changed in its reinforcement learning training (RLVR/GRPO). Drawing on the official technical reports, it also compares the model's performance with proprietary counterparts such as GPT-5 and Gemini 3.0 Pro.

#Reinforcement Learning #Deepseek #LLM Architecture