DawsonChen's Blog
Archive
2024
6
April
2
Megatron-LM Explained: Principles and Implementation of DDP
April 27, 2024
· 1 min · Dawson Chen
An Introduction to MoE to Dense and a Quick Survey of Related Papers
April 22, 2024
· 1 min · Dawson Chen
March
1
Megatron-LM Explained: How MoE Is Implemented
March 19, 2024
· 5 min · Dawson Chen
February
1
Megatron-LM Explained: Pipeline Parallelism Principles and Code Walkthrough
February 5, 2024
· 2 min · Dawson Chen
January
2
Miscellaneous Notes on Batch Size
January 22, 2024
· 1 min · Dawson Chen
DeepSpeed-HybridEngine Development Guide
January 7, 2024
· 2 min · Dawson Chen
2023
7
December
2
Scaling Laws for MoE
December 22, 2023
· 2 min · Dawson Chen
MoE (Mixture of Experts): A Technical Survey
December 20, 2023
· 5 min · Dawson Chen
November
1
Practical Experience with PPO
November 14, 2023
· 1 min · Dawson Chen
August
1
The Mathematical Imagination Behind RoPE
August 6, 2023
· 1 min · Dawson Chen
July
2
How DeepSpeed Works (Handwritten Notes)
July 5, 2023
· 1 min · Dawson Chen
Mixed-Precision Training
July 5, 2023
· 1 min · Dawson Chen
April
1
ChatGPT Plugins: How They Work, with Discussion
April 7, 2023
· 2 min · Dawson Chen