Anyscale’s Post


🚀 Excited to announce that Elastic Distributed Training is now available on Anyscale!

🔍 With elastic training, you can see up to 60% lower cloud costs by using spot instances, plus faster training with uninterrupted progress even as computational resources come and go during the run. Elastic training adapts to the dynamically available resources: training recovers from spot instance preemptions and hardware failures, and instead of waiting potentially hours for a fixed number of GPUs to become available, it continues on the resources already at hand.

⚒ You can try this out on Anyscale with minimal code changes: a simple one-line change to specify (min_workers, max_workers) as a tuple instead of a fixed worker group size, plus checkpointing so progress is preserved across resizes (a sketch follows below).

Read more in our announcement: https://lnkd.in/gK64MEiS
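To make the "one-line change plus checkpointing" concrete, here is a minimal Python sketch built on Ray Train's TorchTrainer. It assumes, per the post's description, that the worker group size in ScalingConfig can be given as a (min_workers, max_workers) tuple on Anyscale; the model, epoch loop, and file names are illustrative assumptions, not Anyscale's actual example code.

    # Sketch only: assumes Anyscale's elastic training accepts a
    # (min_workers, max_workers) tuple for the worker group size.
    import os
    import tempfile

    import torch
    import ray.train
    from ray.train import Checkpoint, ScalingConfig
    from ray.train.torch import TorchTrainer


    def train_loop_per_worker(config):
        model = torch.nn.Linear(10, 1)
        model = ray.train.torch.prepare_model(model)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

        # Resume from the latest checkpoint if the worker group was resized
        # or restarted after a spot preemption or hardware failure.
        start_epoch = 0
        checkpoint = ray.train.get_checkpoint()
        if checkpoint:
            with checkpoint.as_directory() as ckpt_dir:
                state = torch.load(os.path.join(ckpt_dir, "state.pt"))
                model.load_state_dict(state["model"])
                start_epoch = state["epoch"] + 1

        for epoch in range(start_epoch, config["num_epochs"]):
            # ... run one epoch of training on this worker's data shard ...

            # Report a checkpoint each epoch so progress survives
            # elasticity events instead of restarting from scratch.
            with tempfile.TemporaryDirectory() as ckpt_dir:
                torch.save(
                    {"model": model.state_dict(), "epoch": epoch},
                    os.path.join(ckpt_dir, "state.pt"),
                )
                ray.train.report(
                    {"epoch": epoch},
                    checkpoint=Checkpoint.from_directory(ckpt_dir),
                )


    trainer = TorchTrainer(
        train_loop_per_worker,
        train_loop_config={"num_epochs": 10},
        # The one-line elastic change: a (min_workers, max_workers) tuple
        # instead of a fixed worker count.
        scaling_config=ScalingConfig(num_workers=(2, 8), use_gpu=True),
    )
    result = trainer.fit()

With a fixed worker count, the same script would use ScalingConfig(num_workers=8, use_gpu=True); the tuple form lets training start and keep running on however many workers (between the minimum and maximum) are currently available.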
