Training a Custom Translator model involves preparing your data, creating a project, and then training the model.
- Prepare Your Documents: Before you start the process, ensure that your documents are in a TMX, XLIFF, or parallel file format. Moreover, your files should be UTF-8 encoded, no larger than 100 MB, and have a minimum of 10,000 aligned sentence pairs.
- Create a Project: Once your documents are prepared, create a project on the Custom Translator portal. This project will hold your training data.
- Train Your Model: After uploading your documents, you can train your model from the Custom Translator portal. The training process may take a few hours, depending on the amount of training data.
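The document requirements listed in the first step can be checked locally before upload. The following is a minimal sketch, assuming the training data is a pair of plain-text parallel files with one sentence per line; the function name `check_parallel_files` and the constants are hypothetical helpers, and the limits are simply the figures quoted above.

```python
import os

MAX_BYTES = 100 * 1024 * 1024  # file size limit quoted above
MIN_PAIRS = 10_000             # minimum aligned sentence pairs quoted above


def check_parallel_files(source_path, target_path,
                         max_bytes=MAX_BYTES, min_pairs=MIN_PAIRS):
    """Return a list of problems; an empty list means the pair looks usable."""
    problems = []
    line_counts = []
    for path in (source_path, target_path):
        if os.path.getsize(path) > max_bytes:
            problems.append(f"{path}: larger than {max_bytes} bytes")
        try:
            with open(path, encoding="utf-8") as handle:
                # Each line is treated as one sentence.
                line_counts.append(sum(1 for _ in handle))
        except UnicodeDecodeError:
            problems.append(f"{path}: not valid UTF-8")
            line_counts.append(None)
    if None not in line_counts:
        if line_counts[0] != line_counts[1]:
            problems.append("source and target have different sentence counts")
        elif line_counts[0] < min_pairs:
            problems.append(f"only {line_counts[0]} pairs, need {min_pairs}")
    return problems
```

Calling `check_parallel_files("en.txt", "de.txt")` on two hypothetical files returns an empty list when nothing looks wrong. TMX and XLIFF files would need an XML parser instead of line counting.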
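The following example does not train a model. It shows how a translation request can be sent to the Translator endpoint from SQL Server by running a Python script through sp_execute_external_script: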
USE [translator-text]
DECLARE @Text NVARCHAR(4000) = N'Text you want translated.';
-- Set the default language for translation.
DECLARE @FromLanguage NVARCHAR(10) = N'auto-detect';
EXEC sp_execute_external_script
@language = N'Python',
@script = N'import requests, json, os, uuid
subscription_key = os.environ["TRANSLATOR_TEXT_SUBSCRIPTION_KEY"]
endpoint = os.environ["TRANSLATOR_TEXT_ENDPOINT"]
path = "/translate?api-version=3.0"
params = "&to=de"
constructed_url = endpoint + path + params
headers = {
"Ocp-Apim-Subscription-Key": subscription_key,
"Content-type": "application/json",
"X-ClientTraceID": str(uuid.uuid4())
}
# You can pass more than one object in body.
body = [{
"text" : "Hello World!"
}]
request = requests.post(constructed_url, headers=headers, json=body)
response = request.json()
output = json.dumps(response, sort_keys=True, indent=4, separators=(",", ": "))
print(output)'
WITH RESULT SETS UNDEFINED;
Improving Your Model
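Improving your translation model mainly entails adding more and better-quality parallel training data, retraining, and comparing the new model's test results (such as its BLEU score) with those of the previous model. Inspecting real output also helps you find where the model goes wrong: the request below sets includeAlignment=true so that the response includes alignment information mapping the source text to the translated text.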
import os, requests, uuid, json
subscription_key = os.environ["TRANSLATOR_TEXT_SUBSCRIPTION_KEY"]
endpoint = os.environ["TRANSLATOR_TEXT_ENDPOINT"]
path = "/translate?api-version=3.0"
params = "&to=de&includeAlignment=true"
constructed_url = endpoint + path + params
headers = {
"Ocp-Apim-Subscription-Key": subscription_key,
"Content-type": "application/json",
"X-ClientTraceID": str(uuid.uuid4())
}
# You can pass more than one object in body.
body = [{
"text" : "Hello World!"
}]
request = requests.post(constructed_url, headers=headers, json=body)
response = request.json()
output = json.dumps(response, sort_keys=True, indent=4, separators=(",", ": "))
print(output)
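The alignment information requested above can be turned into readable fragment pairs. This is a minimal sketch, assuming the alignment arrives as a space-separated string of `srcStart:srcEnd-tgtStart:tgtEnd` items with inclusive character offsets (the `proj` value in the response); `parse_alignment` is a hypothetical helper, and the sample string is hand-written for illustration rather than captured API output.

```python
def parse_alignment(proj, source_text, target_text):
    """Turn an alignment string into (source_fragment, target_fragment) pairs."""
    pairs = []
    for item in proj.split():
        src_span, tgt_span = item.split("-")
        src_start, src_end = (int(n) for n in src_span.split(":"))
        tgt_start, tgt_end = (int(n) for n in tgt_span.split(":"))
        # Offsets are inclusive, so add 1 to the end index when slicing.
        pairs.append((source_text[src_start:src_end + 1],
                      target_text[tgt_start:tgt_end + 1]))
    return pairs


# Hand-written sample alignment for "Hello World!" -> "Hallo Welt!".
sample_proj = "0:4-0:4 6:10-6:9 11:11-10:10"
print(parse_alignment(sample_proj, "Hello World!", "Hallo Welt!"))
# [('Hello', 'Hallo'), ('World', 'Welt'), ('!', '!')]
```

Reading the pairs side by side makes it easier to spot terms the model consistently mistranslates, which can then guide what training data to add.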
Publishing Your Model
After training and improving your model, you can publish it. The publishing process involves selecting your model and clicking the ‘Publish’ button in your workspace. Once published, the model can be used in translation requests by passing its category ID.
The following example shows how to use your published model in a translation request:
import os, requests, uuid, json
subscription_key = os.environ["TRANSLATOR_TEXT_SUBSCRIPTION_KEY"]
endpoint = os.environ["TRANSLATOR_TEXT_ENDPOINT"]
path = "/translate?api-version=3.0"
params = "&to=de&includeAlignment=true&category=your-category-id"
constructed_url = endpoint + path + params
headers = {
"Ocp-Apim-Subscription-Key": subscription_key,
"Content-type": "application/json",
"X-ClientTraceID": str(uuid.uuid4())
}
# You can pass more than one object in body.
body = [{
"text" : "Your text"
}]
request = requests.post(constructed_url, headers=headers, json=body)
response = request.json()
output = json.dumps(response, sort_keys=True, indent=4, separators=(",", ": "))
print(output)
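Rather than printing the raw JSON, a caller will usually want only the translated strings. The sketch below assumes the parsed response is either a list of items that each carry a `translations` list with a `text` field, or a dictionary with an `error` object on failure; `extract_translations` is a hypothetical helper, not part of any SDK.

```python
def extract_translations(response):
    """Return the translated strings from a parsed translate response.

    Raises RuntimeError if the payload is an error object instead.
    """
    if isinstance(response, dict) and "error" in response:
        error = response["error"]
        raise RuntimeError(
            f"Translator error {error.get('code')}: {error.get('message')}"
        )
    return [
        translation["text"]
        for item in response
        for translation in item.get("translations", [])
    ]


# Hand-written sample payload in the assumed success shape.
sample = [{"translations": [{"text": "Ihr Text", "to": "de"}]}]
print(extract_translations(sample))  # ['Ihr Text']
```

In the request code above, this would be called as `extract_translations(request.json())`.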
In conclusion, to pass the AI-102: Designing and Implementing a Microsoft Azure AI Solution exam, you need to understand how to train, improve, and publish custom translation models. You start with data preparation, then build the model, improve it, and finally publish it. Implementing custom translations in Azure AI is therefore a crucial skill in your AI-102 exam preparation.
Practice Test
True or False: Custom Translator needs a sufficient number of parallel sentences to train a custom model.
- True
- False
Answer: True.
Explanation: Custom Translator requires your translated documents to be in a parallel sentence format for effective training of a custom model.
Which of the following is NOT a step in training a custom translation model?
- a) Data collection
- b) Preprocessing
- c) Coding
- d) Evaluating
Answer: c) Coding.
Explanation: The process of training a custom model does not involve coding but requires steps like data collection, preprocessing, and evaluating the model’s performance.
True or False: A well-trained custom model can be published without verifying its accuracy.
- True
- False
Answer: False.
Explanation: Before publishing, the model’s accuracy should be verified by reviewing its test results, such as the BLEU score, and checking sample translations.
Which Azure service is used for creating, training and deploying your custom translation models?
- a) Azure Language Understanding
- b) Azure Translator Text
- c) Azure QnA Maker
- d) Azure Bot Service
Answer: b) Azure Translator Text.
Explanation: Azure Translator Text allows you to create, train and publish custom translation models.
True or False: Azure’s Custom Translator does not support testing the trained model directly on the portal.
- True
- False
Answer: False.
Explanation: Azure’s Custom Translator provides the testing function directly on the portal itself.
Which of the following is NOT an advantage of using custom translation models on Azure?
- a) Lower cost
- b) Better accuracy for domain-specific translations
- c) Need for manual intervention
- d) Improved model performance
Answer: c) Need for manual intervention.
Explanation: Custom translation models on Azure reduce the need for manual intervention by empowering users with automated machine learning capabilities.
In custom translation, what does the process of improving a trained model mainly entail?
- a) Collecting more data
- b) Hyperparameter tuning
- c) Both a & b
- d) None of the above
Answer: c) Both a & b.
Explanation: Improving a trained model often involves collecting more data and tuning the model’s hyperparameters.
True or False: After publishing a trained model, it’s not possible to unpublish it.
- True
- False
Answer: False.
Explanation: It’s always possible to unpublish a trained model if you want to update it or if you’re not satisfied with its performance.
What kind of data can be used to train a custom translation model?
- a) Parallel sentences
- b) Monolingual data
- c) Both a & b
- d) None of the above
Answer: a) Parallel sentences.
Explanation: Custom translation models are generally trained on parallel sentence pairs.
True or False: Azure Custom Translator is used to perform both training and inferencing operations.
- True
- False
Answer: True.
Explanation: Azure Custom Translator can be used to both train a custom model and use that model to perform translation tasks.
Which of the following languages can be used to make requests to the Azure Custom Translator API?
- a) JavaScript
- b) Python
- c) Both a & b
- d) None of the above
Answer: c) Both a & b.
Explanation: Requests to Azure Custom Translator API can be made using both JavaScript and Python, among other languages.
True or False: You can use the same dataset for both training and testing a custom translation model.
- True
- False
Answer: False.
Explanation: To get a realistic measure of a model’s performance, different datasets should be used for training and testing the model.
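The separation of training and testing data described in this explanation can be sketched as a simple held-out split. This is an illustration only: `split_pairs` is a hypothetical helper, the 10% test fraction is an arbitrary assumption, and it operates on an in-memory list of (source, target) sentence pairs rather than on uploaded documents.

```python
import random


def split_pairs(pairs, test_fraction=0.1, seed=0):
    """Shuffle sentence pairs and hold out a fraction for testing."""
    shuffled = list(pairs)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, round(len(shuffled) * test_fraction))
    # No pair appears in both the training and the test set.
    return shuffled[test_size:], shuffled[:test_size]


pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(100)]
train, test = split_pairs(pairs)
print(len(train), len(test))  # 90 10
```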
The model identifier for custom trained translator models is prefixed by which of the following?
- a) CUSTOM_
- b) TRANSLATOR_
- c) MODEL_
- d) All of the above
Answer: a) CUSTOM_.
Explanation: Custom Translator assigns a unique identifier to each trained model that is prefixed by “CUSTOM_”.
True or False: You can use more than one custom translator model in a single API call.
- True
- False
Answer: False.
Explanation: You cannot specify more than one model in a single API call. Separate calls need to be made for each custom translator model.
For creating a custom translation model on Azure, which of the following is a prerequisite?
- a) Subscription to Azure
- b) Translator Text resource created
- c) Both a & b
- d) None of the above
Answer: c) Both a & b.
Explanation: To create a custom translation model on Azure, you need a subscription to Azure and a Translator Text resource.
Interview Questions
How do you train a custom translator model in Azure AI?
Custom Translator models are trained using parallel documents that are translations of each other. The documents can be sentence-aligned parallel documents or a bilingual dictionary. After uploading them to a project, you start training from the Custom Translator portal.
What are some ways to improve the accuracy of your custom translation model?
The accuracy of a custom translation model can be improved by increasing the amount of training data, ensuring the quality of the training data, or by tuning the model based on the errors it is making.
What should be the format of training data for implementing custom translation?
Training data for custom translation should be in TMX, XLIFF, or parallel text file format, and it should contain sentence-aligned, parallel text pairs in two languages.
What are some prerequisites for implementing a custom model in Azure AI?
The prerequisites for implementing a custom model in Azure AI include having an Azure subscription, creating an instance of the Translator Text API, obtaining a training dataset, and preparing the training data.
Can you use your own data to train a custom translation model on Azure AI?
Yes, you can train a custom translation model using your own data. The data should contain sentence-aligned, parallel text pairs in two languages.
What is the next step after training a custom model in Azure AI?
Once the training of a custom model in Azure AI is completed, you can evaluate the model using test data to determine how well it performs before publishing it for use.
Can a custom model be tested without publishing in Azure AI?
Yes, after training a custom model you can evaluate it using test data without publishing it. This allows you to verify the quality and improve the model before it goes into production.
How do you publish a custom translation model in Azure AI?
Once you’re satisfied with the performance of your model, navigate to the models page in Custom Translator Portal, select the model you want to publish, then click the Publish button.
What happens to the custom model after it is published in Azure AI?
After the model is published, it becomes available to use in production. It can be referenced in the API call via its category ID.
Can a published custom model be unpublished in Azure AI?
Yes, a published model can be unpublished by selecting the model and clicking the ‘Unpublish’ button. This will remove the model from the list of models that can be used in production.
Can a custom model be deleted in Azure AI?
Yes, a trained custom model can be deleted. However, once a model is deleted, all information associated with it is permanently lost.
Is it possible to use multiple custom models in a single translation request in Azure AI?
No, multiple custom models cannot be used in a single translation request. Only one model can be specified in a translation request.
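Because each request can name only one custom model, comparing several models means issuing one request per model. The following minimal sketch builds one request URL per model, assuming the model is selected with the `category` query parameter of the version 3.0 translate endpoint; `build_translate_urls`, the endpoint, and the category IDs are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_translate_urls(endpoint, to_language, category_ids):
    """Return a mapping of category ID to a translate URL for that category."""
    urls = {}
    for category in category_ids:
        # urlencode escapes any characters that are unsafe in a query string.
        query = urlencode({
            "api-version": "3.0",
            "to": to_language,
            "category": category,
        })
        urls[category] = f"{endpoint.rstrip('/')}/translate?{query}"
    return urls


urls = build_translate_urls(
    "https://example.invalid", "de", ["legal-category-id", "medical-category-id"]
)
for category, url in urls.items():
    print(category, url)
```

Each URL would then be posted to separately with the same headers and body, and the results compared side by side.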
What are the resource requirements for training a custom model in Translator Text API?
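Custom Translator is a managed cloud service, so training runs on Microsoft's infrastructure and you do not provision CPU or memory yourself. What you need is an Azure subscription, a Translator resource, and enough parallel training data; training time grows with the amount of data.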
Can Azure Custom Translator be used in offline scenarios?
No, Azure Custom Translator is a cloud-based service and doesn’t support offline scenarios.
Is there a limit to the number of custom models that can be trained in Azure AI?
Yes, there is a limit to the number of custom models that can be trained. The exact number may vary based on your subscription level and conditions.