Announcing smaller machine types for A3 High VMs
Today, an increasing number of organizations are using GPUs to run inference on their AI/ML models. Since the number of GPUs needed to serve a single inference workload varies, organizations need more granularity in the number of GPUs in their virtual machines (VMs) to keep costs low while scaling with user demand.
You can use A3 High VMs powered by NVIDIA H100 80GB GPUs in multiple generally available machine types with 1, 2, 4, or 8 GPUs. The 1-, 2-, and 4-GPU machine types are new.
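To illustrate how the finer granularity helps with right-sizing, here is a minimal Python sketch that picks the smallest A3 High shape covering a workload's GPU requirement. The GPU counts (1, 2, 4, 8) come from the announcement; the machine type names in the mapping and the helper `smallest_a3_high_type` are assumptions made for illustration, so check them against the Compute Engine documentation before relying on them.

```python
# GPU counts are from the announcement. The machine type names are an
# assumption for illustration; verify them in the Compute Engine docs.
A3_HIGH_MACHINE_TYPES = {
    1: "a3-highgpu-1g",
    2: "a3-highgpu-2g",
    4: "a3-highgpu-4g",
    8: "a3-highgpu-8g",
}


def smallest_a3_high_type(gpus_needed: int) -> str:
    """Return the smallest A3 High machine type with at least gpus_needed GPUs.

    Hypothetical helper: it only consults the local mapping above and does
    not call any Google Cloud API.
    """
    if gpus_needed < 1:
        raise ValueError("gpus_needed must be at least 1")
    # Walk the available shapes from smallest to largest.
    for gpu_count in sorted(A3_HIGH_MACHINE_TYPES):
        if gpu_count >= gpus_needed:
            return A3_HIGH_MACHINE_TYPES[gpu_count]
    raise ValueError("a single A3 High VM offers at most 8 GPUs")


if __name__ == "__main__":
    for needed in (1, 3, 8):
        print(needed, "->", smallest_a3_high_type(needed))
```

With only an 8-GPU shape, a model that fits on one GPU would leave seven GPUs idle; with the smaller shapes, the same lookup returns a 1-GPU VM instead.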
Accessing smaller H100 machine types
All A3 machine types are available through the fully managed Vertex AI, as nodes through Google Kubernetes Engine (GKE), and as VMs through Google Compute Engine.
The 1-, 2-, and 4-GPU A3 High machine types are available as Spot VMs and through Dynamic Workload Scheduler (DWS) Flex Start mode.
A3 VMs portfolio powered by NVIDIA H100 GPUs
Machine Type (GPUs count, GPU ...
Copyright of this story solely belongs to the Google Cloud Blog.