Microsoft 365 trainable classifiers are machine learning models designed to recognize a variety of patterns across different data types, which aids administrators in effective data classification, protection, and governance.
Understanding Trainable Classifiers
In this context, a classifier is essentially a mathematical model employed to assign items into categories. Initially, a classifier learns from a training data set, where the items are pre-classified. By observing these patterns, it can categorize unclassified items with a high degree of confidence.
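As a rough illustration of learning from pre-classified items, the toy sketch below counts words per category and assigns a new item to the category with the most overlap. It is only an analogy: the `train` and `classify` functions and the sample data are hypothetical, and this is not how Microsoft 365 implements its classifiers.

```python
from collections import Counter

def train(labeled_items):
    """Build per-category word counts from pre-classified (text, category) pairs."""
    word_counts = {}
    for text, category in labeled_items:
        word_counts.setdefault(category, Counter()).update(text.lower().split())
    return word_counts

def classify(word_counts, text):
    """Return the category whose training vocabulary overlaps the text most."""
    words = text.lower().split()
    scores = {
        category: sum(counts[w] for w in words)
        for category, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

# Hypothetical training set: each item is already labeled with its category.
training_set = [
    ("work experience education skills references", "resume"),
    ("skills certifications employment history", "resume"),
    ("invoice number amount due payment terms", "invoice"),
    ("amount due total tax invoice date", "invoice"),
]
model = train(training_set)
print(classify(model, "education and employment history"))  # resume
```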
Trainable classifiers in Microsoft 365 use machine learning to identify content that falls into particular categories, so that the content can then be handled accordingly, for example by applying sensitivity labels or retention labels.
Testing a Trainable Classifier
After creating and training these classifiers, it’s imperative to test them before using them in production to ensure they accurately classify your data.
The steps involved in testing a classifier are broadly similar to those involved in its creation:
- Prepare the test dataset: Collect a broad range of items that represents the content the classifier will encounter in production, including items that should match and items that should not. The larger and more diverse your sample set, the more reliable your test results will be.
- Run the test: Apply the classifier to the test dataset and capture the output. This process helps to determine how well the classifier differentiates among the data it’s intended to distinguish.
- Review the results: Use these results to review the effectiveness of the classifier. If the classifier misclassifies data, it may need retraining with a more comprehensive training dataset.
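The run-and-review steps above can be sketched outside the product as follows. This is a minimal sketch with made-up data: `evaluate` and `predict` are hypothetical names, and `predict` is a simple stand-in for a trained classifier, not a Microsoft 365 API.

```python
def evaluate(test_items, predict):
    """Run `predict` over (item, expected_label) pairs and summarize the outcome."""
    correct = 0
    misclassified = []
    for item, expected in test_items:
        predicted = predict(item)
        if predicted == expected:
            correct += 1
        else:
            misclassified.append((item, expected, predicted))
    accuracy = correct / len(test_items) if test_items else 0.0
    return accuracy, misclassified

# Hypothetical stand-in for a trained classifier.
def predict(text):
    return "match" if "contract" in text.lower() else "no match"

# Hypothetical test set with the expected label for each item.
test_set = [
    ("Master services contract, signed", "match"),
    ("Contract renewal notice", "match"),
    ("Team lunch menu", "no match"),
    ("Agreement between the parties", "match"),
]
accuracy, misses = evaluate(test_set, predict)
print(f"accuracy: {accuracy:.2f}")
for item, expected, predicted in misses:
    print(f"review: {item!r} expected {expected!r}, got {predicted!r}")
```

Items that land in the misclassified list are the natural candidates to examine when deciding whether the training set needs more representative examples.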
Triggers and SharePoint Syntex Integration
Microsoft 365 enables administrators to leverage their trainable classifiers in policies or SharePoint Syntex. A content understanding service, SharePoint Syntex uses advanced AI and machine learning to amplify human expertise, automate content processing, and transform content into knowledge.
Use Cases for Trainable Classifiers
- Document Identification: Documents that require identification often have small distinctive features. Trainable classifiers enable the identification of such documents.
- Compliance Requirement: Classifiers can group items of a specific content type, helping ensure that sensitive data is handled properly for compliance purposes.
Comparison: Default and Custom Classifiers
Microsoft 365 includes default classifiers (such as resumes, source code, and profanity) and also supports custom classifiers for specific organizational requirements. While default classifiers cover generic patterns of data, custom trainable classifiers can recognize more specific patterns according to the needs and context of the organization.
Default Classifier | Custom Classifier |
---|---|
Designed for easy out-of-the-box use. | Tailored to specific needs. |
Broad applications across various data sets. | Niche applications for specific content types. |
Similar level of accuracy across data types. | Highly accurate for specific data types. |
No need to handle training data. | Requires adequate training data. |
Reliable Testing Aids Confidence
Testing a trainable classifier is an essential step in gaining confidence in your data classification process. Good testing practices help ensure that the classifier is ready for production and minimize the risk of incorrectly classifying sensitive or valuable data.
In essence, as you prepare for the SC-400 Microsoft Information Protection Administrator exam, understanding the role of trainable classifiers in protecting sensitive data, as well as how to test and refine these classifiers, is a vital part of information protection and compliance. Applying the testing process systematically and thoroughly allows your organization to get the most benefit from data classification and protection.
Practice Test
True or False: The trainable classifier in Microsoft 365 uses machine learning to reapply intellectual property tags.
- Answer: False
Explanation: The trainable classifier in Microsoft 365 with its machine learning approach doesn’t reapply intellectual property tags; it recognizes specific types of content and classifies suitable documents accordingly.
How do trainable classifiers work in Microsoft 365?
- a) By using regular expression patterns
- b) By using machine learning
- c) By manual configuration
- d) By using built-in data patterns
Answer: b) By using machine learning
Explanation: Trainable classifiers classify content by using machine learning. They’re trained to recognize certain types of data.
Which of the following is not a step in creating a trainable classifier?
- a) Predictive modeling
- b) Training
- c) Relevance Feedback
- d) Review Set creation
Answer: a) Predictive modeling
Explanation: Predictive modeling is not a step in the creation of a trainable classifier. The steps are: Train, Test, and then Use.
Multiple Select: Which roles do you need to test a trainable classifier?
- a) Compliance Administrator
- b) Compliance Data Administrator
- c) E-Discovery Manager
- d) Global Administrator
Answer: a) Compliance Administrator, b) Compliance Data Administrator, d) Global Administrator
Explanation: To test a trainable classifier, you need to have roles such as Compliance Administrator, Compliance Data Administrator, or Global Administrator.
True or False: To create a custom trainable classifier in Microsoft 365, you need 75 to 200 individual examples of the item type.
- Answer: True
Explanation: You need between 75 and 200 individual examples of the specific item type in order to create a custom trainable classifier in Microsoft 365.
Which one of the following is used by classifiers to categorize data?
- a) Sensitivity labels
- b) Encryption
- c) Retention labels
- d) Malware detection
Answer: c) Retention labels
Explanation: Classifiers use retention labels to categorize data, which help in managing and enforcing different types of retention policies for classified content.
True or False: You can modify a built-in classifier.
- Answer: False
Explanation: Built-in classifiers are provided by Microsoft and cannot be modified. You can create and manage only custom classifiers.
What are the prebuilt classifiers Microsoft provides?
- a) Offensive Language
- b) Harassment
- c) Source Code
- d) All of the above
Answer: d) All of the above
Explanation: Microsoft provides some pre-built classifiers like Harassment, Offensive Language, and Source Code.
How many types of trainable classifiers are there in the Microsoft 365 compliance center?
- a) One
- b) Two
- c) Three
- d) Four
Answer: b) Two
Explanation: There are two types of classifiers: Built-in and Custom.
True or False: You are able to use a trainable classifier in the Compliance Center before training it.
- Answer: False
Explanation: Before you use your classifier, you must train it to recognize the kinds of content you want to find.
Multiple Select: What are the three principal stages for producing a trainable classifier?
- a) Training
- b) Testing
- c) Tuning
- d) Using
Answer: a) Training, b) Testing, d) Using
Explanation: The three principal stages for producing a trainable classifier are Training, Testing, and Using. Tuning is not a stage in creating a trainable classifier.
True or False: It is possible to test a trainable classifier in Microsoft 365 with a minimum of 30 items.
- Answer: False
Explanation: When testing a classifier, you need to provide a test set of at least 50 items for validation.
Which tool is used in the testing of a trainable classifier?
- a) Compliance Manager
- b) Compliance Administrator
- c) Microsoft Information Protection
- d) Microsoft 365 compliance center
Answer: d) Microsoft 365 compliance center
Explanation: Microsoft 365 compliance center provides the functionality to review and test the trainable classifiers.
What makes a trainable classifier different from sensitive info types or pattern detection?
- a) Its use of machine learning
- b) It can apply encryption
- c) It can include retention labels
- d) All of the above
Answer: a) Its use of machine learning
Explanation: The distinguishing feature of a trainable classifier is that it uses machine learning to categorize and classify data; it does not apply encryption or include retention labels.
True or False: A new trainable classifier must pass a tuning phase before it can be published.
- Answer: True
Explanation: The classifier must first be trained, then tested, and based on the results of the accuracy test, it will move to a tuning phase to improve its precision. After passing this phase, it can be published and used.
Interview Questions
What is a trainable classifier in Microsoft 365 Compliance?
A trainable classifier in Microsoft 365 is a tool that helps identify and categorize data based on the pattern it fits. By teaching it with examples of a specific type of content, organizations can classify and protect sensitive information across their environment.
How do you train a trainable classifier in Microsoft 365 Compliance?
You train a trainable classifier in Microsoft 365 Compliance by providing and tagging a series of examples, then running an accuracy test and reviewing the outcome. Re-train the classifier if the accuracy isn’t sufficient, then publish the classifier when it is ready.
What is the minimum number of items you need to train a Microsoft 365 classifier?
The minimum number of items required to train a Microsoft 365 classifier is 50. However, it’s recommended to provide at least 200 items for higher accuracy.
What are the pre-populated categories of trainable classifiers available in Microsoft 365?
The pre-populated categories include Harassment, Profanity, Threat, and Offensive Language. Additionally, there is a Custom category available for unique organizational needs.
What are some common use cases for trainable classifiers in Microsoft 365?
Trainable classifiers can be used for various purposes, such as identifying sensitive data, data classification for information protection, information governance, records management, and data loss prevention.
What is the process called when you manually review and correct predictions by a trainable classifier?
The process of manually reviewing and correcting the predictions made by a trainable classifier is known as ‘Tuning’.
Can you use trainable classifiers for locations that use Double Byte Character Set (DBCS)?
No, currently Microsoft 365 does not support the use of trainable classifiers for locations that use DBCS.
What happens when a trainable classifier does not meet the necessary accuracy?
If the trainable classifier doesn’t meet the necessary accuracy, it needs to be retrained. This involves adding more representative items, reviewing the originally tagged items for accuracy, and then testing again.
Are there any limitations to the size of files when creating a trainable classifier in Microsoft 365 Compliance?
Yes, currently, Microsoft 365 Compliance supports files up to 4 MB when creating a trainable classifier.
What are some best practices for training a classifier?
Best practices include providing as many examples as possible so the classifier can learn accurately, revising and retraining your classifier regularly as the nature of your data changes over time, and defining your labels clearly so that they are not ambiguous.
What tools or services in Microsoft 365 can you use to apply classifications made by trainable classifiers?
You can use policies in Microsoft 365 solutions such as Data Loss Prevention, Information Protection, Records Management, and Advanced eDiscovery to apply the classifications made by trainable classifiers.
In relation to the trainable classifiers, what is sensitivity?
Sensitivity is a statistical measure that refers to the ability of a classifier to correctly identify true positives. In other words, a model with high sensitivity correctly identifies a high proportion of actual positive instances.
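In the statistical sense described above, sensitivity is the number of true positives divided by the number of actual positives (true positives plus false negatives). The sketch below computes it from two hypothetical label lists; the `sensitivity` function and the "match" / "no match" labels are assumptions made for illustration.

```python
def sensitivity(expected, predicted, positive="match"):
    """Sensitivity (recall) = true positives / (true positives + false negatives)."""
    true_pos = sum(
        1 for e, p in zip(expected, predicted) if e == positive and p == positive
    )
    false_neg = sum(
        1 for e, p in zip(expected, predicted) if e == positive and p != positive
    )
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0

# Hypothetical ground truth and classifier output for five items.
expected = ["match", "match", "match", "match", "no match"]
predicted = ["match", "match", "match", "no match", "match"]
print(sensitivity(expected, predicted))  # 0.75
```

The false positive on the last item does not affect the result, because sensitivity only considers items that are actually positive.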
What do you need to consider when training and testing a classifier?
When training and testing a classifier, it’s essential to consider the relevancy and diversity of your training data. Including a diverse range of examples in your training data can improve the accuracy of your classifier. Moreover, the dataset for testing must be different from the training set.
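One way to keep the test set separate from the training set is to shuffle the collected items once and split them. The sketch below assumes a hypothetical list of document identifiers and a hypothetical `split_items` helper; the 20% test fraction is an arbitrary choice for the example, not a Microsoft 365 requirement.

```python
import random

def split_items(items, test_fraction=0.2, seed=0):
    """Shuffle items and split them into non-overlapping training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, int(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]

# Hypothetical document identifiers.
documents = [f"doc-{n}" for n in range(10)]
train_set, test_set = split_items(documents)

# No item may appear in both sets.
assert not set(train_set) & set(test_set)
print(len(train_set), len(test_set))  # 8 2
```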
Before deploying a trainable classifier, what test must be conducted?
Before deploying a trainable classifier, an accuracy test must be conducted to evaluate how well the classifier is able to correctly identify and categorize data.
Can I delete a published classifier in Microsoft 365?
No, once a classifier is published, it cannot be deleted. However, you can stop it from classifying content by disabling or removing any policies or rules that are using it.