The Determined AI Training Platform is built to let deep learning engineers focus on the task at hand: training high-quality models. The platform tightly integrates the features a DL engineer needs to train models at scale. Determined AI takes a pragmatic, results-driven approach to deep learning, with the aim of dramatically improving the productivity of deep learning developers.
Its features include:
1. High performance: Determined's distributed training builds on Horovod, a popular distributed training framework, and adds a suite of optimizations that result in twice the performance of stock Horovod. Distributed training is also easy to set up (no code changes are needed to move from single-GPU to distributed training), and multiple users can seamlessly share the same GPU cluster.
2. State-of-the-art hyperparameter search: Determined's search functionality builds on cutting-edge research from the past decade. The hyperparameter search integrates tightly with the job scheduler and runs in parallel by default. This helps users reach accurate models 100x faster than standard search methods and 10x faster than Bayesian optimization methods.
3. DL tools for individuals and teams: Determined supports experiment management through experiment tracking, log management, metrics visualization, reproducibility, and dependency management. These tools boost the productivity of individual DL engineers over the lifespan of a project, and they help growing teams collaborate and scale efficiently.
4. Hardware-agnostic and integrated with the open source ecosystem: Determined supports both public cloud and on-prem infrastructure, so users can avoid getting locked into proprietary solutions. Additionally, Determined works with the DL framework of the user's choice, exports to popular serving frameworks, and more generally integrates with a wide range of data prep and model serving technologies.
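Item 2 does not name the search algorithm behind those speedups. To illustrate how an early-stopping search can outpace one that trains every candidate to completion, here is a minimal sketch of successive halving, one well-known technique from that line of research. It is not Determined's implementation: `successive_halving`, `toy_loss`, `min_budget`, and `eta` are hypothetical names chosen for this example, and the loss function is a toy stand-in for real training.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configs each round; survivors get eta times more budget."""
    survivors = list(configs)
    if not survivors:
        raise ValueError("need at least one configuration")
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving config at the current budget (lower is better).
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    """Hypothetical stand-in for the validation loss after `budget` epochs."""
    # Loss is smallest near lr = 0.01 and shrinks as the budget grows.
    return (math.log10(config["lr"]) + 2.0) ** 2 + 1.0 / budget


# Sample 27 learning rates log-uniformly between 1e-5 and 1.
rng = random.Random(0)
candidates = [{"lr": 10 ** rng.uniform(-5, 0)} for _ in range(27)]
best = successive_halving(candidates, toy_loss)
print(best)
```

With 27 candidates and `eta=3`, the rounds evaluate 27, 9, and 3 configurations at budgets of 1, 3, and 9 epochs, so most of the compute goes to the promising candidates. In a real system the evaluations within each round are independent and could run in parallel across a cluster.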
The Determined Training Platform powers hundreds of GPUs at innovative companies, and Determined AI plans to keep building the future of AI-native infrastructure together with its users. To get started, install the product and check out the GitHub page.