NVIDIA Corporation, USA.
World Journal of Advanced Research and Reviews, 2025, 26(01), 1955-1963
Article DOI: 10.30574/wjarr.2025.26.1.1233
Received on 25 February 2025; revised on 12 April 2025; accepted on 14 April 2025
This article explores comprehensive strategies for optimizing GPU utilization for artificial intelligence workloads on Amazon Elastic Kubernetes Service (EKS). As organizations increasingly deploy computationally intensive AI applications, effective GPU resource management has become critical for balancing performance requirements with cost considerations. The article examines four key optimization domains: GPU instance selection and scheduling strategies, cost optimization and resource allocation techniques, performance enhancement using NVIDIA-specific tools, and model-level optimization methods. Investigation findings and industry benchmarks show how proper instance type selection, combined with advanced scheduling tools such as Karpenter and Cluster Autoscaler, creates a foundation for efficient GPU utilization. The article further explores how spot instances, precise resource allocation, and comprehensive monitoring solutions can substantially reduce infrastructure costs. Additionally, it highlights the performance advantages of specialized NVIDIA tools such as TensorRT and Triton Inference Server, and examines how model-specific techniques, including mixed precision training, gradient accumulation, knowledge distillation, quantization, and pruning, can maximize computational efficiency while preserving model accuracy.
GPU optimization; AWS EKS; Machine Learning Infrastructure; Inference Acceleration; Resource Allocation
Praneel Madabushini. Optimizing GPU Utilization for AI Workloads on AWS EKS. World Journal of Advanced Research and Reviews, 2025, 26(01), 1955-1963. Article DOI: https://doi.org/10.30574/wjarr.2025.26.1.1233.
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.