tag

#llm-pretraining

2 posts in total

AI · 2026.05.03 · 10 min · Advanced · LLM Pretraining Deep Dive · 1

From Kaplan's power law to Chinchilla's joint law, the Broken Scaling Law, and the inherent limits of scaling laws: tracing the mathematical decision-making behind LLM pretraining.

AI · 2026.05.03 · 12 min · Advanced · LLM Pretraining Deep Dive · 3

From the four root causes of loss spikes to Embedding LR separation, QK-norm, z-loss, RMSNorm, and AdamW ε: tracing the single diagnostic frame shared by LLM training stabilization techniques.