Alpa: Automating Model Sharding for Distributed Deep Learning
This led to an increased effort within the deep learning community to develop larger models. To keep up with the growing network size, the scaling of training compute has also accelerated ...
This led to an increased effort within the deep learning community to develop larger models. To keep up with the growing network size, the scaling of training compute has also accelerated ...
scaling, effort, training compute, larger models, growing network size, deep learning community