Microsoft Releases DeepSpeed-FastGen for High-Throughput Text Generation
Microsoft has released DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference, and is based on the Dynamic SplitFuse technique.
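The summary names Dynamic SplitFuse without describing it. As the DeepSpeed team has publicly described it, the technique splits long prompts into chunks spread over several forward passes and packs short prompts together, so that each pass processes a roughly constant number of tokens. The toy Python scheduler below is a minimal sketch of that idea only, not DeepSpeed code: `Request`, `schedule`, and `token_budget` are hypothetical names chosen for illustration, and the model simplifies decoding to one token per request per pass.

```python
from dataclasses import dataclass, replace


@dataclass
class Request:
    name: str
    prompt_tokens: int  # prompt tokens still waiting to be processed
    decode_tokens: int  # output tokens still waiting to be generated


def schedule(requests, token_budget):
    """Plan forward passes under a fixed per-pass token budget.

    Returns a list of passes; each pass is a list of (name, n_tokens).
    Hypothetical illustration of split-and-fuse scheduling, not a real API.
    """
    assert token_budget > 0
    # Work on copies so the caller's objects are not mutated.
    queue = [replace(r) for r in requests]
    passes = []
    while queue:
        remaining = token_budget
        batch = []
        unfinished = []
        index = 0
        while index < len(queue) and remaining > 0:
            req = queue[index]
            index += 1
            if req.prompt_tokens > 0:
                # Split: take only as much of the prompt as the budget allows.
                take = min(req.prompt_tokens, remaining)
                req.prompt_tokens -= take
            else:
                # Decode step: one generated token for this request.
                take = 1
                req.decode_tokens -= 1
            batch.append((req.name, take))
            remaining -= take
            if req.prompt_tokens > 0 or req.decode_tokens > 0:
                unfinished.append(req)
        # Fuse: unfinished work and untouched requests share the next pass.
        queue = unfinished + queue[index:]
        passes.append(batch)
    return passes


if __name__ == "__main__":
    plan = schedule([Request("A", 10, 2), Request("B", 3, 1)], token_budget=8)
    for number, batch in enumerate(plan, start=1):
        print(number, batch)
```

In this sketch the 10-token prompt of request A is split across two passes, and the remainder of A shares the second pass with the short prompt of request B, so no pass exceeds the 8-token budget.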