Elastic Machine Learning Algorithms in Amazon SageMaker


There is a large body of research on scalable machine learning (ML). Nevertheless, training ML models on large, continuously evolving datasets remains a difficult and costly undertaking for many companies and institutions. We discuss these challenges and derive requirements for an industrial-scale ML platform. Next, we describe the computational model behind Amazon SageMaker, which is designed to meet these challenges. SageMaker is an ML platform provided as part of Amazon Web Services (AWS) that supports online learning, automatic hyperparameter optimization, and incremental training, as well as resumable and elastic learning. We detail how to adapt several popular ML algorithms to its computational model. Finally, we present an experimental evaluation on large datasets that compares SageMaker to several scalable, JVM-based implementations of ML algorithms, which SageMaker significantly outperforms with regard to computation time and cost.