Overview of the compute platform for generative AI on AWS
-
The Amazon Elastic Compute Cloud (Amazon EC2) accelerated computing portfolio, which includes instances powered by GPUs and by purpose-built ML silicon, offers the broadest choice of accelerators to power generative AI workloads.
-
Coupled with managed services such as Amazon SageMaker HyperPod and Amazon Elastic Kubernetes Service (Amazon EKS), these instances provide developers with the industry’s best platform for building and deploying generative AI applications.
-
Amazon EC2 Trn1 instances, powered by AWS Trainium accelerators, are purpose-built for high-performance deep learning (DL) training of generative AI models, including large language models (LLMs) and latent diffusion models.
-
AWS Inferentia accelerators are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications.
-
The AWS Neuron SDK helps developers deploy models on AWS Inferentia accelerators and train them on AWS Trainium accelerators. It integrates natively with popular frameworks such as PyTorch and TensorFlow, so you can continue to use your existing code and workflows while running on Inferentia accelerators.
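
To illustrate what "keep your existing code" can look like with PyTorch, here is a minimal sketch rather than a definitive recipe. It assumes the torch-neuronx package (the PyTorch integration for Inf2 and Trn1 instances) and its torch_neuronx.trace(model, example_input) entry point. The helper names neuron_available and compile_for_inferentia are hypothetical and used only for this example. On a machine without the Neuron SDK, the helper returns the model unchanged, so the surrounding workflow still runs.

```python
import importlib.util


def neuron_available() -> bool:
    """Return True if the torch-neuronx package can be imported here."""
    return importlib.util.find_spec("torch_neuronx") is not None


def compile_for_inferentia(model, example_input):
    """Compile a PyTorch model for Neuron devices, or return it unchanged.

    Assumption: torch_neuronx.trace(model, example_input) is the tracing
    entry point on hosts where the Neuron SDK is installed.
    """
    if not neuron_available():
        # No Neuron SDK on this machine: keep the original model so the
        # rest of the workflow still runs on CPU or GPU.
        return model

    # Imported lazily because the package only exists on Neuron hosts.
    import torch_neuronx

    return torch_neuronx.trace(model, example_input)
```

A typical call would pass an existing torch.nn.Module in eval mode together with a representative input tensor. The traced result can then be used for inference in place of the original model.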