Unable To Run Tasks On Azure Batch: Nodes Go Into Unusable State After Starting Up
I am trying to parallelize a Python app using Azure Batch. The workflow that I have followed in the Python client-side script is: 1) Upload local files to Azure Blob Container using
Solution 1:
Answers and a few observations:
- It is unclear to me why you need a custom image. You can use a platform image, i.e., `Canonical, UbuntuServer, 18.04-LTS`, and then just install what you need as part of the start task. Python 3.6 can simply be installed via apt on 18.04. You may be prematurely optimizing your workflow by opting for a custom image when in fact a platform image plus a start task may be faster and more stable.
- Your script is in Python, yet you are calling out to the Azure CLI. You may want to consider using the Azure Batch Python SDK directly instead (samples).
- When nodes go unusable, you should first examine the node for errors. Check whether the ComputeNodeError field is populated. Additionally, you can try to fetch the `stdout.txt` and `stderr.txt` files from the `startup` directory to diagnose what's going on. You can do both of these in the Azure Portal or via Batch Explorer. If that doesn't work, you can fetch the compute node service logs and file a support request. However, unusable typically means that your custom image was provisioned incorrectly, you have a virtual network with a misconfigured NSG, or you have an application package that is incorrect.
- Your application package consists of a single Python file; use a resource file instead. Simply upload the script to an Azure Storage blob and reference it in your task as a resource file using a SAS URL. See the `--resource-files` argument in `az batch task create` if using the CLI. Your command to invoke would then simply be `python3 pdf_processing.py` (assuming you keep the resource file downloading to the task working directory).
- If you insist on using an application package, consider using a task application package instead. This decouples node startup from application package problems: a bad package then shows up as a task failure you can debug, rather than as an unusable node.
- The `blobxfer` error is pretty clear: your locale is not set properly. The easy way to fix this is to set the environment variables for the task. See the `--environment-settings` argument if using the CLI, and set two environment variables, `LC_ALL=C.UTF-8` and `LANG=C.UTF-8`, as part of your task.
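To make the suggestions above concrete, here is a minimal sketch in Python that builds a pool definition (platform image plus an elevated start task) and a task definition (script as a resource file, locale set through environment settings) as plain dictionaries, using the field names of the Batch REST API JSON format. The function names, pool and task IDs, VM size, apt package list, and the SAS URL placeholder are all assumptions for illustration, and the node agent SKU shown should be checked against what your Batch account reports as supported for the image.

```python
import json


def build_pool_spec(pool_id, vm_size, node_count):
    """Pool on a platform image; Python is installed by the start task."""
    return {
        "id": pool_id,
        "vmSize": vm_size,
        "targetDedicatedNodes": node_count,
        "virtualMachineConfiguration": {
            "imageReference": {
                "publisher": "Canonical",
                "offer": "UbuntuServer",
                "sku": "18.04-LTS",
                "version": "latest",
            },
            # Assumption: node agent SKU matching this image; verify it.
            "nodeAgentSKUId": "batch.node.ubuntu 18.04",
        },
        "startTask": {
            # apt needs root, so the start task runs as an admin auto-user.
            # The package list is an assumption; install what your app needs.
            "commandLine": (
                "/bin/bash -c 'apt-get update && "
                "apt-get install -y python3 python3-pip'"
            ),
            "userIdentity": {
                "autoUser": {"scope": "pool", "elevationLevel": "admin"}
            },
            # Nodes only accept tasks after the start task succeeds.
            "waitForSuccess": True,
        },
    }


def build_task_spec(task_id, script_sas_url):
    """Task that downloads the script as a resource file and runs it."""
    return {
        "id": task_id,
        "commandLine": "python3 pdf_processing.py",
        # The file lands in the task working directory under filePath.
        "resourceFiles": [
            {"httpUrl": script_sas_url, "filePath": "pdf_processing.py"}
        ],
        # Locale settings that address the blobxfer locale error.
        "environmentSettings": [
            {"name": "LC_ALL", "value": "C.UTF-8"},
            {"name": "LANG", "value": "C.UTF-8"},
        ],
    }


if __name__ == "__main__":
    # Hypothetical IDs, VM size, and SAS URL placeholder.
    pool = build_pool_spec("pdf-pool", "STANDARD_D2_V2", 2)
    task = build_task_spec("pdf-task-1", "<SAS URL of pdf_processing.py>")
    print(json.dumps(pool, indent=2))
    print(json.dumps(task, indent=2))
```

If you stay with the CLI, definitions like these could be saved to files and passed through the `--json-file` argument of `az batch pool create` and `az batch task create`; with the Batch Python SDK, the same settings map onto the corresponding model classes.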