Alex Putnam

Build Transformative AI Applications at Scale with HPE Machine Learning Development Environment

November 4, 2021

Building and training optimized machine learning (ML) models at scale is considered the most demanding and critical stage of ML development. Doing it well requires researchers and data scientists to overcome many challenges typically encountered in High Performance Computing (HPC) environments.

These challenges often include setting up and managing a fast-moving ML software ecosystem alongside infrastructure that spans specialized compute, storage, network fabric, and accelerators (e.g., GPUs). Users also need to program, schedule, and train their models in a way that makes full use of that specialized infrastructure, which adds complexity and impedes productivity.

To meet these challenges, ML engineers and data scientists are continually searching for solutions that let them focus on building better models and shorten their time to production. HPE Machine Learning Development Environment is designed to help them do exactly that.

HPE Machine Learning Development Environment is built upon the widely popular open-source Determined Training Platform. It reduces the complexity and cost of machine learning model development by removing the need to write infrastructure code, and it makes it easy for IT administrators to set up, manage, secure, and share AI compute clusters.

By adopting HPE Machine Learning Development Environment, model developers can:

  • Train models faster using state-of-the-art distributed training without changing the model code.

  • Automatically find high-quality models with advanced hyperparameter tuning from the creators of Hyperband.

  • Maximize GPU performance with smart scheduling and cut cloud GPU costs by seamlessly using spot/preemptible instances.

  • Track and reproduce work with experiment tracking that works out of the box, covering code versions, metrics, checkpoints, and hyperparameters.
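
To give a feel for the hyperparameter tuning item above, here is a minimal sketch of successive halving, the early-stopping idea that Hyperband builds on. It is an illustration only, not the platform's API or its actual search implementation: the function names, the toy loss, and the learning-rate range are all hypothetical.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configs each round and give the
    survivors eta times more training budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving config at the current budget (lower is better).
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(cfg, budget):
    """Hypothetical stand-in for validation loss after `budget` epochs."""
    # Distance of the learning rate from an arbitrary optimum of 1e-2
    # (in log space), plus a term that shrinks as the budget grows.
    return abs(math.log10(cfg["lr"]) + 2.0) + 1.0 / budget


rng = random.Random(0)
# 27 random learning rates between 1e-5 and 1 (an assumed search space).
candidates = [{"lr": 10 ** rng.uniform(-5, 0)} for _ in range(27)]
best = successive_halving(candidates, toy_loss)
print(best)
```

With 27 candidates and `eta=3`, the search narrows from 27 to 9 to 3 to 1 configuration, so most of the training budget goes to the most promising candidates. In a real workflow, `evaluate` would train a model for the given budget and return a validation metric.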

HPE Machine Learning Development Environment integrates these features into an easy-to-use, high-performance machine learning environment, so you can spend your time building models instead of managing infrastructure.

To learn more about HPE Machine Learning Development Environment, visit our landing page and get in touch with our team of ML and distributed systems experts.

To learn more about the open-source project that powers HPE Machine Learning Development Environment, we invite you to check out the Determined Training Platform on GitHub, read the Documentation, and join the Determined Community Slack to get started. Stay tuned to the HPE DEV blog for more informative articles on this subject.

