Configuring compute targets for job runs is an important part of performing data science tasks in Azure Machine Learning. Let's take a thorough look at how to configure these compute resources correctly when preparing for the DP-100 Designing and Implementing a Data Science Solution on Azure certification exam.
In the context of Azure Machine Learning, compute resources refer to the cloud-based resources used for model training, prediction, and automation routines. Compute resources also provide the processing power necessary to analyze large-scale datasets, run complex mathematical calculations, and deploy machine learning models.
Configuring Compute Instances
Compute instances in Azure are development workstations used for machine learning tasks. These workstations can provide a managed, single-user environment for tasks such as data exploration, data cleaning, and model development.
Here’s how to configure a compute instance:
```python
from azureml.core.compute import ComputeInstance, ComputeTarget

# Define the instance's size and settings
config = ComputeInstance.provisioning_configuration(
    vm_size="STANDARD_D2_V2",  # the type of machine
    ssh_public_access=False)   # do not enable SSH

# Create the instance in the workspace and wait until it is ready
instance = ComputeTarget.create(workspace, "my-instance", config)
instance.wait_for_completion(show_output=True)
```
Where `workspace` is the Azure Machine Learning workspace where you want to create the compute instance.
Configuring Compute Clusters
Compute clusters in Azure are multi-use, shared compute resources where you can execute machine learning tasks such as model training and batch scoring.
Creating a compute cluster is a bit different since it requires a scale setting:
```python
from azureml.core.compute import AmlCompute, ComputeTarget

# Define the cluster's size and scale settings
config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_D2_V2",  # the type of machine
    min_nodes=0,               # scale down to zero when idle
    max_nodes=4)               # the max number of nodes

# Create the cluster in the workspace and wait until it is ready
my_cluster = ComputeTarget.create(workspace, "my-cluster", config)
my_cluster.wait_for_completion(show_output=True)
```
Here, `max_nodes` refers to the maximum number of nodes that the cluster can scale out to when demand is high, while `min_nodes` sets how many nodes stay running even when the cluster is idle.
Configuring Inference Clusters
For Model Deployments, you would want to configure an Inference Cluster using Azure Kubernetes Service (AKS). The AKS cluster then provides the necessary power and functionality for deployed machine learning models, enabling real-time inferencing.
Here’s how to create an AKS cluster:
```python
from azureml.core.compute import AksCompute, ComputeTarget

# Define the AKS cluster's location and size
config = AksCompute.provisioning_configuration(
    location="eastus",
    vm_size="STANDARD_D2_V2",
    agent_count=3)

# Create the AKS cluster in the workspace and wait until it is ready
my_aks = ComputeTarget.create(workspace, "my-aks", config)
my_aks.wait_for_completion(show_output=True)
```
In this configuration, `agent_count` refers to the number of agents or nodes you want your AKS cluster to have.
In conclusion, understanding how to configure computes for job runs in Azure Machine Learning is crucial for Data Scientists or any professional preparing for the DP-100 certification exam. Remember that each compute type (Compute Instance, Compute Cluster, Inference Cluster) has a specific purpose and is suitable for different types of tasks, so make sure to select the right compute resource for your job.
Also, take into account that compute resources in Azure Machine Learning are not static – you can scale them up or down as per your job requirements or budget constraints. Thus, mastering the configuration of these resources allows for flexible and efficient job runs in Azure.
Practice Test
True or False: The Azure Machine Learning SDK supports configuring compute instances for a job run.
- True
- False
Answer: True
Explanation: The Azure Machine Learning SDK provides scriptable interfaces to configure compute instances for a job run, allowing a high degree of customization and control.
Which of these are common computing resources in Azure Machine Learning? Select all that apply.
- a) Compute target
- b) Compute cluster
- c) Compute instance
- d) Compute adapter
Answer: a, b, c
Explanation: In Azure Machine Learning, a compute target is the general term for any compute resource where you run training or deployment; compute clusters and compute instances are specific kinds of compute target. A "compute adapter" is not a valid computing resource in Azure.
True or False: Azure Managed Disks can be used to scale up the computational power for a job run.
- True
- False
Answer: False
Explanation: Azure Managed Disks are a storage service, not a compute service. They are used for disk storage, not for increasing computational power.
When configuring a compute cluster, what parameter is used to indicate the minimum number of nodes to be always on standby?
- a) min_nodes
- b) max_nodes
- c) standby_nodes
- d) constant_nodes
Answer: a) min_nodes
Explanation: In Azure Machine Learning, when creating a compute cluster, the min_nodes parameter specifies the minimum number of nodes that are always kept running, even when the cluster is idle.
True or False: A compute instance in Azure Machine Learning is fully managed and ideal for a notebook-style environment.
- True
- False
Answer: True
Explanation: Azure Machine Learning compute instances are fully managed cloud-based workstations optimized for data science notebook-based workflows.
What function should you use to reference an existing compute target in your workspace?
- a) ws.compute_targets('my_instance')
- b) ws.get_compute('my_instance')
- c) ws.get('my_instance')
- d) ws.compute_targets.get('my_instance')
Answer: d) ws.compute_targets.get('my_instance')
Explanation: In the Azure Machine Learning SDK, `Workspace.compute_targets` is a dictionary that maps compute target names to objects, not a callable, so you retrieve an existing target with dictionary access such as ws.compute_targets.get('my_instance') or ws.compute_targets['my_instance']. You can also instantiate ComputeTarget(workspace=ws, name='my_instance').
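Since `ws.compute_targets` behaves like an ordinary Python dictionary, the lookup idiom can be illustrated with a plain dict standing in for the property (the names and values here are purely illustrative):

```python
# A plain dict standing in for ws.compute_targets (illustrative only)
compute_targets = {
    "my_instance": "ComputeInstance object",
    "my-cluster": "AmlCompute object",
}

# .get() returns None instead of raising KeyError when the name is unknown
existing = compute_targets.get("my_instance")
missing = compute_targets.get("not-there")

print(existing)  # ComputeInstance object
print(missing)   # None
```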
You can use Azure Machine Learning designer to configure compute resources for a job run. True or False?
- True
- False
Answer: True
Explanation: Azure Machine Learning designer provides a visual interface where you can drag and drop modules to create your workflows and also configure compute resources for a job run.
Which of the following can be used to scale up or down the computational power for a job run in Azure Machine Learning?
- a) Compute clusters
- b) Managed disk
- c) Cold storage
- d) Virtual networks
Answer: a) Compute clusters
Explanation: Compute clusters in Azure Machine Learning can be used to scale up or down the computational power depending on the workload.
True or False: For every job run, you need to configure and specify a new compute target in Azure Machine Learning.
- True
- False
Answer: False
Explanation: You can reuse already created compute targets across different runs and experiments. You don’t have to create a new one with each run unless it specifically requires different compute specifications.
Which command would you use to delete a compute instance in Azure Machine Learning?
- a) compute_target.delete()
- b) compute_instance.remove()
- c) compute_target.remove()
- d) compute_instance.delete()
Answer: a) compute_target.delete()
Explanation: You can delete a compute instance by calling the delete() method on the compute target object; there is no remove() method in the SDK.
Interview Questions
Which service in Azure should you use to configure compute for a job run?
You can use Azure Batch, which lets you run large-scale parallel and high-performance computing (HPC) applications efficiently in the cloud.
What is Azure Batch?
Azure Batch is a cloud-based job scheduling service that parallelizes and distributes the processing of large volumes of data across many computers.
How can you specify the type and size of compute resources that your jobs require in Azure Batch?
You can specify the type and size of compute resources your jobs require by creating a pool of compute nodes within Azure Batch.
Can Azure Batch automatically scale compute resources?
Yes, Azure Batch can automatically scale compute resources based on parameters you define.
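Autoscaling is driven by a formula that you attach to the pool. An illustrative formula (the 15-minute window and the cap of 10 nodes are arbitrary choices) that tracks the pending workload might look like:

```
// Average number of pending tasks over the last 15 minutes
$tasks = avg($PendingTasks.GetSample(TimeInterval_Minute * 15));
// Scale the pool to the pending workload, capped at 10 dedicated nodes
$TargetDedicatedNodes = min(10, $tasks);
// Only remove nodes once their running tasks have finished
$NodeDeallocationOption = taskcompletion;
```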
What is the role of the Azure Machine Learning Compute in running jobs?
Azure Machine Learning Compute is a managed service for training machine learning models on clusters of Azure virtual machines; it is typically used to run jobs related to machine learning workloads.
Does Azure support distributing the computation for machine learning workloads across multiple machines?
Yes, Azure supports the distribution of computation across multiple machines through Azure Machine Learning service’s distributed training feature.
In Azure, what is a compute target?
A compute target in Azure is a designated compute resource/environment where you run your training script or host your service deployment.
How can you choose the size of the Azure Batch AI compute cluster for your jobs?
You can select the size of the Azure Batch AI compute cluster based on your need for CPU, GPU, memory, and storage resources. Azure provides a range of virtual machine sizes to select from.
What does VM priority in Azure Batch AI determine?
VM priority in Azure Batch AI determines the allocation of the VMs. Low-priority VMs are allocated when the capacity is available and may be preempted by high-priority requests.
Which Azure service would you use if you need to reserve compute capacity in advance?
You would use Azure Reservations if you need to reserve compute capacity in advance.
How do you monitor and manage Azure Batch AI cluster usage?
Azure Monitor and Azure Cost Management can be used to monitor and manage Azure Batch AI cluster usage.
What is the role of a Job Manager Task in Azure Batch?
A Job Manager Task is a special task that the Batch service starts automatically when the job to which it belongs starts. It is used to manage and monitor the other tasks in the job, and it can also perform job preparation.
What are Task Dependencies in Azure Batch?
Task dependencies in Azure Batch are a feature that allows one task to start only after specific other tasks have completed, which is especially useful when a task relies on results from previous tasks.
What tool can you use to automatically shut down idle compute instances in Azure?
You can use Azure Automation to automatically shut down idle compute instances.
How can you control access to the data that your jobs process in Azure Batch?
You can control access to the data your jobs process in Azure Batch using Azure Active Directory and role-based access control (RBAC).