Nvidia claims 10x cost savings with open-source inference models
Nvidia has released an analysis showing a 4X to 10X reduction in cost per token for AI inference from switching to open-source models. The cost reductions were achieved by pairing Nvidia's Blackwell GPU platform with open-source models served by Baseten, DeepInfra, Fireworks AI, and Together AI. The tests showed significant cost improvements across healthcare, gaming, agentic chat, and customer service. […