A handy feature is the capability to include custom code components. This adds flexibility by letting you design a pipeline that meets your specific needs, going beyond the standard, preconfigured components available in Azure ML.
Creating Custom Code in Azure ML Designer
Within the Azure ML Designer, you can create custom code components by writing Python or R scripts and embedding them directly into the designer’s modules. The two most commonly used components are:
- “Execute Python Script”
- “Execute R Script”
The “Execute Python Script” module runs a Python script within your machine learning workflow. It supports plain Python code and can access features through Azure Machine Learning workspaces. Likewise, “Execute R Script” incorporates an R script into the workflow; the script operates on the provided inputs and produces outputs via Azure ML workspaces.
The general steps to incorporate custom code into the designer include sourcing data, defining the script, and running the experiment. Inside the script, you may load the required Python modules, define functions, and handle data passed within the workflow.
Example: Using Custom Python Code in Azure ML Designer
Here’s a simple example of using a custom Python script to perform data cleaning.
First, place the “Execute Python Script” into your workflow:
[Your workflow > + Add > Designer > Custom Python Script]
In the Python Script window, use code like below to perform data cleaning:
import pandas as pd
# The script MUST contain a function named azureml_main
# which is the entry point for this module.
def azureml_main(dataframe1=None, dataframe2=None):
    # Remove unwanted columns from the dataset.
    dataframe1 = dataframe1.drop(['Column_Name1', 'Column_Name2', 'Column_Name3'], axis=1)
    # Execute data preprocessing or other operations here.
    # The trailing comma returns a one-element tuple containing the DataFrame.
    return dataframe1,
Once you’ve added this, connect your modules and datasets and run the experiment.
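Before pasting a script into the designer, it can help to call azureml_main locally on a small DataFrame. The following is a minimal sketch of that idea, not a definitive workflow: the sample data and the extra dropna step are assumptions added for illustration, and the column names are placeholders like those in the example above.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    """Entry point with the same shape as the designer example above."""
    # Drop columns that are not needed downstream (names are placeholders).
    cleaned = dataframe1.drop(columns=["Column_Name1", "Column_Name2"])
    # Remove rows that still contain missing values (illustrative extra step).
    cleaned = cleaned.dropna()
    # Return a one-element tuple, matching the example's `return dataframe1,`.
    return cleaned,


# Hypothetical sample data standing in for the dataset wired to the first input.
sample = pd.DataFrame(
    {
        "Column_Name1": [1, 2, 3],
        "Column_Name2": ["a", "b", "c"],
        "value": [10.0, None, 30.0],
    }
)

# Unpack the one-element tuple and inspect the cleaned result.
(result,) = azureml_main(sample)
print(result)
```

Running the function this way catches typos in column names and shape mistakes before a full pipeline run is spent on them.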
Performance of Custom Code Components
It’s essential to understand that custom code components can have a significant impact on the performance of your machine learning pipeline. A custom code component runs in whatever language you wrote it in, and its script is executed from scratch each time the pipeline runs. So, while highly customizable, custom code components are usually not as performant as the built-in modules provided by Azure ML.
| | Custom Code Component | Built-in Azure ML Component |
|---|---|---|
| Customizable | Yes | Limited |
| Performance | Slower | Faster |
Summary
By using custom code components in Azure ML Designer, you can tailor your workflow to unique requirements, though this can come with performance trade-offs. In essence, the task is to balance the need for customization against the need for performance.
Whether you’re studying for the DP-100 exam or implementing a data science solution on Azure, understanding and making use of this feature is undoubtedly beneficial.
Practice Test
In Azure Machine Learning Studio, the Designer supports the use of custom Python code in a pipeline.
- True
- False
Answer: True
Explanation: The designer provides the “Execute Python Script” module, where you can enter Python code to process data in your pipeline.
You can use custom Python scripts in a pipeline that runs in Azure Machine Learning designer.
- True
- False
Answer: True
Explanation: You can use the “Execute Python Script” module to include Python code in your pipeline.
You can use custom R scripts in a pipeline on Azure Machine Learning designer.
- True
- False
Answer: True
Explanation: Azure Machine Learning designer supports both Python and R for writing custom code.
What are the modules in Azure Machine Learning designer that you can use to include your custom code?
- Python Script
- R Script
- SQL Script
- All of the above
Answer: All of the above
Explanation: Azure Machine Learning designer supports Python, R and SQL for writing custom code.
Azure Machine Learning designer uses Docker containers to encapsulate custom modules and their dependencies.
- True
- False
Answer: True
Explanation: Docker containers provide a way to package software into standardized units for development, execution, and distribution.
You cannot reuse custom modules across experiments in Azure Machine Learning designer.
- True
- False
Answer: False
Explanation: Custom modules, once created, can be reused across multiple experiments in Azure Machine Learning designer.
Custom code components should always be defined in the Python language.
- True
- False
Answer: False
Explanation: Custom code components can be written in Python, R, and SQL.
You cannot run multiple code versions concurrently on Azure Machine Learning designer.
- True
- False
Answer: False
Explanation: Docker containers allow for multiple versions of code to run concurrently.
Where can you save your custom modules for reuse in future experiments?
- Azure Blob Storage
- Custom Modules Library
- Python Notebook
- R Script Notebook
Answer: Custom Modules Library
Explanation: Custom Modules Library provides an option to save and reuse your modules.
Azure Machine Learning Designer has built-in support for version control of custom modules.
- True
- False
Answer: True
Explanation: Azure Machine Learning Designer allows you to version your custom modules.
You can only use the built-in Python libraries in your code components on Azure Machine Learning designer.
- True
- False
Answer: False
Explanation: You can install and use custom Python libraries by adding dependencies to your custom code components.
You cannot use custom SQL scripts in a pipeline on Azure Machine Learning designer.
- True
- False
Answer: False
Explanation: Azure Machine Learning Designer supports the use of custom SQL scripts in your pipeline.
Azure Machine Learning designer does not provide the ability to use Docker containers for custom modules.
- True
- False
Answer: False
Explanation: Docker containers are used to package and distribute custom modules together with their dependencies.
Which of the following are valid types of code components you can add to a pipeline in the Azure Machine Learning designer?
- Designer Components
- Command Line Components
- Python Components
- All of the above
Answer: All of the above
Explanation: Azure Machine Learning designer can incorporate Designer, command line and Python components.
The ‘Python Script’ module in Azure Machine Learning designer only supports Python
- True
- False
Answer: False
Explanation: The ‘Python Script’ module in Azure Machine Learning designer supports both Python 2 and Python
Interview Questions
What is the purpose of custom code in Azure Machine Learning Designer?
Custom code components, also known as modules, are used to perform custom tasks or functions that are not available among the built-in components of Azure Machine Learning Designer.
Can you import external libraries in custom Python/R code?
Yes, it’s possible to import external libraries using custom Python or R code in Azure Machine Learning Designer.
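One pattern sometimes used for this is to install a missing package with pip at run time, before importing it. The sketch below assumes the compute target has network access and that pip is available in the script's environment. `ensure_package` is a hypothetical helper written for this illustration, not part of Azure ML.

```python
import importlib
import importlib.util
import subprocess
import sys


def ensure_package(import_name, pip_name=None):
    """Import a package, installing it with pip first if it is missing.

    Hypothetical helper for illustration; not an Azure ML API.
    """
    if importlib.util.find_spec(import_name) is None:
        # Use the interpreter running this script so the package is
        # installed into the same environment.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", pip_name or import_name]
        )
        importlib.invalidate_caches()
    return importlib.import_module(import_name)


def azureml_main(dataframe1=None, dataframe2=None):
    # "json" ships with Python, so nothing is installed here; replace it
    # with the third-party package your script actually needs.
    json = ensure_package("json")
    print(json.dumps({"rows": len(dataframe1)}))
    return dataframe1,
```

Installing at run time adds to every pipeline run, so declaring dependencies up front is usually preferable when the component allows it.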
How can you use custom code components in Azure Machine Learning Designer?
Custom code components in Azure Machine Learning Designer can be used by creating a Python/R script and then converting that script into a module that can be used in the Azure ML Designer pipeline.
What types of input and output ports can you define in a custom code module?
You can define both dataset ports (for input and/or output of data) and parameter ports (for input of parameters) in a custom code module.
What types of scripts can you use to create a custom code module?
You can use either Python or R scripts to create a custom code module.
Do the custom modules created in Azure Machine Learning Designer support versioning?
Yes. Custom modules in Azure Machine Learning Designer do support versioning.
Are there any limitations on the size of the data that can be processed by a custom code module?
Each module can process up to 10GB of data. For larger datasets, Azure Machine Learning service provides various techniques such as data chunking.
Can you use the custom code module to create machine learning models?
Yes. You can use custom code to create machine learning models within the Azure Machine Learning Designer.
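As an illustration, a script could fit a model and return its predictions as a DataFrame. The sketch below fits an ordinary least-squares line with NumPy purely to keep dependencies light. The column names `x` and `y` and the sample data are hypothetical, and a real script would more likely use a dedicated library such as scikit-learn.

```python
import numpy as np
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    # Hypothetical schema: one feature column "x" and one target column "y".
    x = dataframe1["x"].to_numpy(dtype=float)
    y = dataframe1["y"].to_numpy(dtype=float)

    # Design matrix with an intercept column, solved by least squares.
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    # Attach the fitted values so downstream modules can use them.
    scored = dataframe1.copy()
    scored["prediction"] = design @ coef
    return scored,


# Hypothetical training data that lies exactly on the line y = 2x + 1.
train = pd.DataFrame({"x": [0, 1, 2, 3], "y": [1, 3, 5, 7]})
(scored,) = azureml_main(train)
print(scored)
```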
What should you do to make a Python or an R script compatible with Azure Machine Learning Designer?
To make a Python or an R script compatible with Azure Machine Learning Designer, you need to follow the module schema defined by Azure ML, which includes defining input and output ports and the operation expected of the module.
In case of a Python script in an Azure Machine Learning Designer module, which method in the script usually contains the module’s operations?
The method named azureml_main usually contains the operations of the module.
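A sketch of an azureml_main that uses both inputs and returns two DataFrames is shown below. It assumes the component accepts two input DataFrames and a returned sequence of DataFrames. The shared `id` column and the sample data are hypothetical.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    # Hypothetical schema: both inputs share an "id" column.
    merged = dataframe1.merge(dataframe2, on="id", how="left", indicator=True)

    # Rows from the first input that found a match in the second input.
    matched = merged[merged["_merge"] == "both"].drop(columns="_merge")
    # Rows from the first input with no match, kept for inspection.
    unmatched = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

    # Return both results as a sequence of DataFrames.
    return matched, unmatched


orders = pd.DataFrame({"id": [1, 2, 3], "amount": [9.5, 20.0, 7.25]})
customers = pd.DataFrame({"id": [1, 3], "name": ["Ada", "Lin"]})
matched, unmatched = azureml_main(orders, customers)
print(matched)
print(unmatched)
```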
Can you reuse custom modules in other pipelines?
Yes. Once you’ve created a custom module, you can save it to your workspace and then reuse it in other pipelines.
How can you debug a custom code module in Azure Machine Learning Designer?
You can use the standard Azure Machine Learning logs to debug a custom code module, which can be accessed by clicking on the module and viewing its outputs.
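To make those logs useful, the script itself can report what it sees. The sketch below writes a few diagnostics to standard output, on the assumption that the script's standard output is captured in the run logs. The logger name `custom_script` and the dropna step are hypothetical.

```python
import logging
import sys

import pandas as pd

# Route log records to standard output so they sit alongside print() output.
logger = logging.getLogger("custom_script")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stdout))


def azureml_main(dataframe1=None, dataframe2=None):
    # Record what arrived on the input before changing anything.
    logger.info("input shape: %s", dataframe1.shape)
    logger.info("missing values per column:\n%s", dataframe1.isna().sum())

    cleaned = dataframe1.dropna()

    # Record what is handed to the next module.
    logger.info("output shape: %s", cleaned.shape)
    return cleaned,
```

Logging shapes and missing-value counts at the start and end of the function makes it easier to tell whether a failure comes from the input data or from the script's own logic.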
Can you share a custom code module with other users?
Yes. Custom modules can be shared within the workspace they have been created in. Other users with necessary permissions can reuse these modules in their designs.
What should you do if external libraries used in custom code are not preinstalled in Azure Machine Learning Designer?
If external libraries used in your custom code are not preinstalled, you can specify them and their versions in the ‘Conda dependencies file’ when creating your Python/R modules.
Can custom code modules in Azure Machine Learning Designer be scheduled to run at specific times?
Yes, once the pipeline including custom code is published, it can be scheduled to run at specific times.