tag

#scaling-laws

총 3개의 글

AI 2026.05.03 · 13 min Advanced Llm Pretraining Deep Dive · 4

LLM 사전학습 데이터는 어떻게 설계되는가

말뭉치 구성과 품질 필터링부터 MinHash 중복 제거, DoReMi 도메인 가중치 최적화, Data Mixing Laws까지 — LLM 사전학습 데이터 파이프라인의 핵심 원리를 추적한다.

AI 2026.04.28 · 12 min Advanced Generalization Theory Deep Dive · 7

LLM의 스케일링은 예측 가능한가

Chinchilla compute-optimal ratio의 수학적 유도부터 Broken Scaling Law, Emergent Abilities 논쟁, ICL의 implicit gradient descent 이론까지, LLM 스케일링의 예측 가능성을 추적한다.

AI 2026.04.27 · 15 min Advanced Transformer Deep Dive · 7

LLM은 왜 클수록 똑똑한가 — Scaling Laws의 세계

Kaplan 2020의 power-law 발견부터 Chinchilla의 compute-optimal 역전, In-Context Learning의 출현, CoT의 emergence, 그리고 Transformer의 이론적 한계까지, 현대 LLM 설계의 과학적 토대를 추적한다.