Create DWS (Flex Start) VMs [Coming soon by end of March 2025]#
Flex Start uses the TPU queued resources API to request TPU resources in a queued manner. When the requested resource becomes available, it's assigned to your Google Cloud project for your immediate, exclusive use. After the requested run duration, the TPU VMs are deleted and the queued resource moves to the SUSPENDED state. For more information about queued resources, see Manage queued resources.
To request TPUs using Flex Start, use the gcloud alpha compute tpus queued-resources create command with the --provisioning-model flag set to FLEX-START and the --max-run-duration flag set to the duration you want your TPUs to run.
gcloud alpha compute tpus queued-resources create \
<your-queued-resource-id> \
--zone=<your-zone> \
--accelerator-type=<your-accelerator-type> \
--runtime-version=<your-runtime-version> \
--node-id=<your-node-id> \
--provisioning-model=FLEX-START \
--max-run-duration=<run-duration>
Replace the following placeholders:
<your-queued-resource-id>: A user-assigned ID for the queued resource request.
<your-zone>: The zone in which to create the TPU VM.
<your-accelerator-type>: Specifies the version and size of the Cloud TPU to create. For more information about supported accelerator types for each TPU version, see TPU versions.
<your-runtime-version>: The Cloud TPU software version.
<your-node-id>: A user-assigned ID for the TPU that is created when the queued resource request is allocated.
<run-duration>: How long the TPUs should run. Format the duration as the number of days, hours, minutes, and seconds followed by d, h, m, and s, respectively. For example, specify 72h for a duration of 72 hours, or specify 1d2h3m4s for a duration of 1 day, 2 hours, 3 minutes, and 4 seconds. The maximum is 7 days.
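The duration format described above can be checked locally before submitting a request. The following is a minimal sketch, not part of gcloud: `duration_to_seconds` and `check_max_run_duration` are hypothetical helper names, and the sketch assumes the units appear in d, h, m, s order, which may be stricter than what gcloud itself accepts.

```shell
#!/bin/sh
# Hypothetical helpers (not part of gcloud) for the duration format
# described above, for example "72h" or "1d2h3m4s".

# Print the number of seconds in a duration string.
# Assumption: units appear in d, h, m, s order; each unit is optional.
duration_to_seconds() (
  rest=$1
  total=0
  [ -n "$rest" ] || { echo "empty duration" >&2; return 1; }
  for unit in d h m s; do
    case $rest in
      *"$unit"*)
        num=${rest%%"$unit"*}   # digits before the unit letter
        rest=${rest#*"$unit"}   # remainder after the unit letter
        case $num in
          ''|*[!0-9]*) echo "invalid duration: $1" >&2; return 1 ;;
        esac
        # Strip leading zeros so that "08" is not read as octal.
        while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
          num=${num#0}
        done
        case $unit in
          d) mult=86400 ;;
          h) mult=3600 ;;
          m) mult=60 ;;
          s) mult=1 ;;
        esac
        total=$(( total + num * mult ))
        ;;
    esac
  done
  # Anything left over means an unknown unit or a missing unit letter.
  [ -z "$rest" ] || { echo "invalid duration: $1" >&2; return 1; }
  echo "$total"
)

# Fail if the duration exceeds the 7-day maximum stated above.
check_max_run_duration() (
  seconds=$(duration_to_seconds "$1") || return 1
  if [ "$seconds" -gt $(( 7 * 86400 )) ]; then
    echo "duration exceeds 7 days: $1" >&2
    return 1
  fi
  echo "$seconds"
)
```

For example, `check_max_run_duration 1d2h3m4s` prints the duration in seconds, while `check_max_run_duration 8d` fails with an error message.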
You can further customize your queued resource request to run at specific times with additional flags:
--valid-after-duration: The duration before which the TPU must not be provisioned.
--valid-after-time: The time before which the TPU must not be provisioned.
--valid-until-duration: The duration for which the request is valid. If the request hasn't been fulfilled by this duration, the request expires and moves to the FAILED state.
--valid-until-time: The time for which the request is valid. If the request hasn't been fulfilled by this time, the request expires and moves to the FAILED state.
For more information about optional flags, see the gcloud alpha compute tpus queued-resources create documentation.
Get the status of a Flex Start request#
To monitor the status of your Flex Start request, use the gcloud alpha compute tpus queued-resources describe command:
gcloud alpha compute tpus queued-resources describe <your-queued-resource-id> \
--zone <your-zone>
A queued resource can be in one of the following states:
WAITING_FOR_RESOURCES: The request has passed initial validation and has been added to the queue.
PROVISIONING: The request has been selected from the queue and its TPU VMs are being created.
ACTIVE: The request has been fulfilled, and the VMs are ready.
FAILED: The request could not be completed. Use the describe command for more details.
SUSPENDING: The resources associated with the request are being deleted.
SUSPENDED: The resources specified in the request have been deleted.
For more information, see Retrieve state and diagnostic information about a queued resource request.
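Because a request can sit in WAITING_FOR_RESOURCES for some time, it can be convenient to poll the state from a script. The following is a rough sketch rather than an official workflow: `get_state` and `wait_for_queued_resource` are hypothetical helper names, and the sketch assumes the state is exposed at the `state.state` path of the describe output.

```shell
#!/bin/sh
# Sketch: poll a queued resource until it becomes ACTIVE or can no longer
# become ACTIVE. Helper names are hypothetical.

# Print the current state of a queued resource.
# Assumption: the describe output exposes the state at `state.state`.
get_state() {
  gcloud alpha compute tpus queued-resources describe "$1" \
    --zone "$2" --format="value(state.state)"
}

# Usage: wait_for_queued_resource <queued-resource-id> <zone> [interval-seconds]
# Returns 0 on ACTIVE, 1 on FAILED/SUSPENDING/SUSPENDED, 2 if describe fails.
wait_for_queued_resource() (
  id=$1
  zone=$2
  interval=${3:-60}
  while :; do
    state=$(get_state "$id" "$zone") || return 2
    echo "state: $state" >&2
    case $state in
      ACTIVE) return 0 ;;
      FAILED|SUSPENDING|SUSPENDED) return 1 ;;
    esac
    # Any other state (for example WAITING_FOR_RESOURCES or PROVISIONING)
    # means the request is still in progress, so keep polling.
    sleep "$interval"
  done
)
```

A script could then run `wait_for_queued_resource <your-queued-resource-id> <your-zone>` and start its workload only when the call succeeds.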
Monitor the run time of Flex Start TPUs#
You can monitor the run time of Flex Start TPUs by checking the TPU's termination timestamp:
Get the details of your queued resource request using the steps in the previous section, Get the status of a Flex Start request.
If the queued resource is waiting for resources: In the output, see the maxRunDuration field. This field specifies how long the TPUs will run once they're created.
If the TPUs associated with the queued resource have been created: In the output, see the terminationTimestamp field listed for each node in the queued resource. This field specifies when the TPU will be terminated.
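Once a termination timestamp appears in the output, the remaining run time is the difference between that timestamp and the current time. The sketch below assumes GNU `date` (for the `-d` option) and assumes the timestamp is an RFC 3339 string copied from the describe output; `seconds_until` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print how many seconds remain until a termination timestamp.
# Assumptions: GNU date is available, and the timestamp is an RFC 3339
# string such as 2025-01-01T00:01:00Z copied from the describe output.

# Usage: seconds_until <timestamp> [now-epoch-seconds]
# The optional second argument overrides the current time.
seconds_until() (
  end=$(date -u -d "$1" +%s) || return 1
  now=${2:-$(date -u +%s)}
  echo $(( end - now ))
)
```

A negative result means the timestamp is already in the past.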
Delete a queued resource#
Important: Queued resources consume quota regardless of their state. Delete queued resources after use to avoid blocking future requests on quota limits.
To delete a queued resource request together with its associated TPUs, pass the --force flag to the gcloud alpha compute tpus queued-resources delete command:
gcloud alpha compute tpus queued-resources delete <your-queued-resource-id> \
--zone <your-zone> \
--force
If you delete the TPU directly, you also need to delete the queued resource, as shown in the following example. When you delete the TPU, the queued resource request transitions to the SUSPENDED state, after which you can delete the queued resource request.
To delete a TPU, use the gcloud compute tpus tpu-vm delete command:
gcloud compute tpus tpu-vm delete <your-node-id> \
--zone <your-zone>
Then, to delete the queued resource, use the gcloud compute tpus queued-resources delete command:
gcloud compute tpus queued-resources delete <your-queued-resource-id> \
--zone <your-zone>
For more information, see Delete a queued resource request.
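The two-step deletion (delete the TPU, wait for the queued resource to reach SUSPENDED, then delete the queued resource) could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the state is assumed to be readable at `state.state` in the describe output, and the global `--quiet` flag is used to skip confirmation prompts.

```shell
#!/bin/sh
# Sketch: delete a TPU directly, then clean up its queued resource.
# Helper names are hypothetical.

# Assumption: the describe output exposes the state at `state.state`.
get_state() {
  gcloud compute tpus queued-resources describe "$1" \
    --zone "$2" --format="value(state.state)"
}

# Usage: delete_tpu_then_queued_resource <node-id> <queued-resource-id> <zone> \
#          [interval-seconds] [max-attempts]
delete_tpu_then_queued_resource() (
  node=$1
  qr=$2
  zone=$3
  interval=${4:-30}
  max_attempts=${5:-40}

  # Step 1: delete the TPU itself.
  gcloud compute tpus tpu-vm delete "$node" --zone "$zone" --quiet || return 1

  # Step 2: wait for the queued resource to reach SUSPENDED.
  attempts=0
  while [ "$(get_state "$qr" "$zone")" != "SUSPENDED" ]; do
    attempts=$(( attempts + 1 ))
    if [ "$attempts" -ge "$max_attempts" ]; then
      echo "timed out waiting for SUSPENDED" >&2
      return 1
    fi
    sleep "$interval"
  done

  # Step 3: delete the queued resource so that it stops consuming quota.
  gcloud compute tpus queued-resources delete "$qr" --zone "$zone" --quiet
)
```

The attempt limit keeps the loop from running indefinitely if the queued resource never reaches SUSPENDED.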