Problem deploying Llama2

Hello everyone.
I am trying to deploy Llama2 from Vertex AI Model Garden and I am getting an error.
When I try to deploy it in us-central1 (Iowa) on an a2-highgpu-2g machine with 2 NVIDIA Tesla A100 GPUs, I get the following error:
Error Messages: The following quotas are exceeded:
CustomModelServingA2CPUsPerProjectPerRegion,CustomModelServingA100GPUsPerProjectPerRegion

Do you know what could be the cause?

As far as I can see, it has something to do with the selected region, but I don't understand why.

Best regards

It looks like you are hitting a quota limit for the selected resources (the error message names both A2 CPUs and A100 GPUs). You can request a quota increase for these resources. Please be aware that there are also limits on Vertex AI resources, and these limits are unrelated to the quota system.

Please visit the quota page for Vertex AI for more information about quota and limits: https://cloud.google.com/vertex-ai/docs/quotas
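
To see why the region matters: both quota names in the error end in "PerProjectPerRegion", so each limit is counted separately for every region in the project. Below is a rough sketch of how you could work out what to ask for in the quota request. It assumes that an a2-highgpu-2g replica uses 24 vCPUs and 2 A100 GPUs and that each quota counts those units directly; the helper names (exceeded_quotas, required_limits) are made up for illustration and are not part of any Google Cloud API.

```python
# Error text copied from the question in this thread.
ERROR_TEXT = (
    "Error Messages: The following quotas are exceeded:\n"
    "CustomModelServingA2CPUsPerProjectPerRegion,"
    "CustomModelServingA100GPUsPerProjectPerRegion"
)

# Assumption: resources consumed by ONE a2-highgpu-2g replica
# (24 vCPUs, 2 A100 GPUs), keyed by the quota it is charged against.
A2_HIGHGPU_2G = {
    "CustomModelServingA2CPUsPerProjectPerRegion": 24,
    "CustomModelServingA100GPUsPerProjectPerRegion": 2,
}


def exceeded_quotas(error_text):
    """Return the quota names listed after 'quotas are exceeded:'."""
    _, _, tail = error_text.partition("quotas are exceeded:")
    return [name.strip() for name in tail.split(",") if name.strip()]


def required_limits(per_replica, replicas, in_use=None):
    """Minimum quota value per quota name for `replicas` replicas.

    `in_use` is an optional dict of units already consumed in the region.
    """
    in_use = in_use or {}
    return {
        quota: in_use.get(quota, 0) + units * replicas
        for quota, units in per_replica.items()
    }


if __name__ == "__main__":
    needed = required_limits(A2_HIGHGPU_2G, replicas=1)
    for quota in exceeded_quotas(ERROR_TEXT):
        print(f"{quota}: request at least {needed.get(quota, 'unknown')}")
```

Compare these numbers with the current values shown for us-central1 on the quotas page of your project. If a limit there is lower than the number above, the deployment fails with the error from the question.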

Great, thank you very much.
I'm going to request a quota increase.