Computer Vision is one of the artificial intelligence (AI) services provided by Microsoft Azure. It uses advanced algorithms to process and analyze visual content in images and videos, offering insights and automation that would otherwise be time-consuming or impossible to achieve through human observation alone. For those preparing for the AI-900 Microsoft Azure AI Fundamentals exam, understanding the capabilities of the Computer Vision service is crucial.

Overview of Computer Vision Capabilities

The Computer Vision service allows your application to understand visual content in detail. Its key capabilities include:

  • Image Analysis: Computer Vision can identify and categorize objects in images, from general objects such as trees and animals to highly specific items such as electronic devices and individual foods.
  • Spatial Analysis (Preview): This involves the assessment of the movements and interactions of people in a physical space, such as a retail store or a community park.
  • Read: The Computer Vision service can recognize and read printed and handwritten text in various languages.
  • Form Recognizer: This allows the extraction of key-value pairs and tables from documents and forms, which can be printed or handwritten.
  • Custom Vision (Preview): This feature allows you to build, deploy, and improve your own image classifiers.

Deep Dive into Computer Vision Capabilities

Let’s go ahead and explore each of these capabilities in more detail:

Image Analysis

The image analysis capability of the Computer Vision service identifies and categorizes visual features in images, providing a detailed understanding of the content. With it, you can perform operations such as:

  • Object Detection: Detects distinct objects within an image and returns their coordinates.
  • Brand Detection: Identifies brands within an image, for example from their logos.
  • Adult or Racy Content Detection: Detects potentially offensive or explicit content in an image.
  • Face Detection: Locates faces within an image and provides additional information such as age and gender.
  • Color Scheme Detection: Detects the dominant colors in an image.
  • Image Type Detection: Determines whether an image is clip art or a line drawing.
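As a rough illustration of how these operations are requested and read back, the sketch below builds an Analyze Image request URL and summarizes a response. It assumes the v3.2 REST path (`/vision/v3.2/analyze`) and its `visualFeatures` query parameter; the endpoint name, the helper functions, and the sample response are hypothetical and hand-written for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical resource endpoint; replace with your own Computer Vision resource.
ENDPOINT = "https://example-resource.cognitiveservices.azure.com"


def build_analyze_url(endpoint, features):
    """Build the Analyze Image URL for the requested visual features (v3.2 assumed)."""
    query = urlencode({"visualFeatures": ",".join(features)})
    return f"{endpoint}/vision/v3.2/analyze?{query}"


# Hand-written sample shaped like an Analyze Image response (not real service output).
SAMPLE_RESPONSE = json.loads("""
{
  "objects": [
    {"rectangle": {"x": 25, "y": 43, "w": 172, "h": 140},
     "object": "dog", "confidence": 0.93}
  ],
  "color": {"dominantColors": ["Green", "White"]},
  "imageType": {"clipArtType": 0, "lineDrawingType": 0}
}
""")


def summarize(response):
    """Collect each detected object with its bounding box, plus the dominant colors."""
    objects = [
        (o["object"], o["rectangle"]["x"], o["rectangle"]["y"],
         o["rectangle"]["w"], o["rectangle"]["h"])
        for o in response.get("objects", [])
    ]
    colors = response.get("color", {}).get("dominantColors", [])
    return {"objects": objects, "dominant_colors": colors}


url = build_analyze_url(ENDPOINT, ["Objects", "Color", "ImageType"])
summary = summarize(SAMPLE_RESPONSE)
print(url)
print(summary)
```

In a real call, the URL would be sent as a POST with the resource key in the `Ocp-Apim-Subscription-Key` header and the image URL or bytes in the body.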

Spatial Analysis

The Spatial Analysis feature enables understanding of people’s movement and interaction in a physical space. This can prove highly beneficial for scenarios like:

  • Monitoring social distancing in a location.
  • Counting the number of people in an area.
  • Understanding movement patterns for better space utilization.

Read

With this capability, the Computer Vision service can extract printed and handwritten text from images. This is useful in numerous scenarios such as:

  • Reading license plates from security footage.
  • Extracting data from printed or written invoices.
  • Converting printed and handwritten documents into editable formats.
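To make the output of text extraction more concrete, here is a minimal sketch that pulls the recognized lines out of a finished Read result. It assumes the v3.2 result shape (`analyzeResult.readResults[].lines[].text`); the sample data and the `extract_lines` helper are hypothetical and written by hand for illustration.

```python
# Hand-written sample shaped like a finished Read result (not real service output).
SAMPLE_READ_RESULT = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "INVOICE 1042"}, {"text": "Total: 18.50"}]},
            {"page": 2, "lines": [{"text": "Thank you"}]},
        ]
    },
}


def extract_lines(result):
    """Return (page_number, text) pairs, or an empty list if the read did not succeed."""
    if result.get("status") != "succeeded":
        return []
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [(page["page"], line["text"])
            for page in pages
            for line in page.get("lines", [])]


lines = extract_lines(SAMPLE_READ_RESULT)
for page_number, text in lines:
    print(page_number, text)
```

Keeping the page number next to each line makes it easier to map extracted text back to a multi-page invoice or document.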

Form Recognizer

Form Recognizer extracts key-value pairs and tables from documents. By doing so, it can:

  • Automate data entry from paper-based forms into a system.
  • Automatically extract and analyze business data from forms like invoices and receipts.

Custom Vision

The Custom Vision feature provides tools to build, train, and deploy customized image classification models. Example tasks include:

  • Identifying defective parts in a manufacturing process.
  • Distinguishing between wildlife species in a conservation study.
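A trained classifier returns a probability for each tag, and the application decides what to do with it. The sketch below picks the most likely tag from a classification response. It assumes the response carries a `predictions` list with `tagName` and `probability` fields; the sample values, the `top_prediction` helper, and the 0.5 threshold are hypothetical choices for illustration.

```python
# Hand-written sample shaped like a Custom Vision classification response
# (illustrative values, not real service output).
SAMPLE_PREDICTION = {
    "predictions": [
        {"tagName": "ok", "probability": 0.12},
        {"tagName": "defective", "probability": 0.88},
    ]
}


def top_prediction(response, min_probability=0.5):
    """Return (tag, probability) for the most likely tag, or None if below the threshold."""
    predictions = response.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["probability"])
    if best["probability"] < min_probability:
        return None
    return best["tagName"], best["probability"]


result = top_prediction(SAMPLE_PREDICTION)
print(result)
```

Returning `None` below the threshold lets a manufacturing line, for example, route uncertain parts to manual inspection instead of trusting a weak prediction.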

This brief overview should provide a solid understanding of the Computer Vision Service offered by Microsoft Azure and help in preparing for the AI-900 Microsoft Azure AI Fundamentals exam. Along with theoretical knowledge, hands-on experience is crucial. Thus, it is recommended to explore these capabilities practically as well.

Practice Test

True or False: Computer Vision can analyze imagery to extract information about objects, colors, and locations.

  • True
  • False

Answer: True

Explanation: Microsoft Azure’s Computer Vision service has the ability to analyze images and extract key information, such as what objects are present, their colors, and their locations.

Which of these are capabilities of Microsoft Azure’s Computer Vision Service?

  • A) Handwriting detection
  • B) Optical Character Recognition (OCR)
  • C) Image classification
  • D) Speech recognition

Answer: A,B,C

Explanation: Handwriting detection, OCR, and image classification are capabilities of Azure’s Computer Vision Service. Speech recognition is a capability of Azure’s Speech service, not Computer Vision.

True or False: Azure’s Computer Vision service can identify celebrities and landmarks in images.

  • True
  • False

Answer: True

Explanation: Azure’s Computer Vision service includes a model for identifying thousands of recognized celebrities and landmarks in an image.

Can Microsoft Azure’s Computer Vision service understand video contexts?

  • Yes
  • No

Answer: No

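Explanation: The core image analysis features of Microsoft Azure’s Computer Vision service work on still images and do not interpret general video content. The exception is Spatial Analysis, which processes camera video to follow people’s movement in a physical space.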

Which of the following can Azure’s Computer Vision provide as part of the output of image analysis?

  • A) Existing objects and scenes
  • B) Colors and Categories
  • C) Popular brands and Faces
  • D) Image format

Answer: A,B,C

Explanation: The output of Azure’s Computer Vision analysis includes identification of objects and scenes, color scheme, categorization, detection of faces, and recognition of popular brands. It does not provide information on the image format.

True or False: The Computer Vision API in Azure uses a modifiable machine learning model.

  • True
  • False

Answer: False

Explanation: The Computer Vision API uses a pre-trained machine learning model that can’t be modified by users.

Can Azure’s Computer Vision service generate a description of an image in natural language automatically?

  • Yes
  • No

Answer: Yes

Explanation: Azure’s Computer Vision can extract rich information from images to categorize and process visual data, and it can describe an image in human-readable, natural language.

Is Optical Character Recognition (OCR) capability of Microsoft Azure’s Computer Vision limited only to English?

  • Yes
  • No

Answer: No

Explanation: OCR is part of Azure’s Computer Vision, and it is not limited to English. It supports extracting printed text in multiple languages and from various image formats.

Can Computer Vision API be used for Area of Interest (AOI) extraction from an image?

  • Yes
  • No

Answer: Yes

Explanation: Azure Computer Vision service includes features that can find areas of interest within an image, such as significant foreground objects, people, or relevant parts to draw the viewer’s attention.

True or False: Image tagging in Azure’s Computer Vision Service is always 100% accurate.

  • True
  • False

Answer: False

Explanation: While Azure’s Computer Vision Service provides a confidence score with each tag it assigns, it’s not 100% accurate as it’s dependent on the quality of images and the learning model’s capability.
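Because each tag comes with a confidence score, a common pattern is to keep only the tags above a chosen threshold. The following is a minimal sketch under the assumption that the response contains a `tags` list with `name` and `confidence` fields; the sample data, the `confident_tags` helper, and the 0.8 threshold are hypothetical.

```python
# Hand-written sample shaped like an image tagging response (not real service output).
SAMPLE_TAGS = {
    "tags": [
        {"name": "grass", "confidence": 0.99},
        {"name": "dog", "confidence": 0.95},
        {"name": "frisbee", "confidence": 0.41},
    ]
}


def confident_tags(response, threshold=0.8):
    """Return the names of tags whose confidence is at or above the threshold."""
    return [tag["name"]
            for tag in response.get("tags", [])
            if tag["confidence"] >= threshold]


kept = confident_tags(SAMPLE_TAGS)
print(kept)
```

A higher threshold yields fewer but more reliable tags; the right value depends on how costly a wrong tag is for the application.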

Interview Questions

What is Computer Vision in Azure AI?

Computer Vision in Azure AI is a service that analyses and catalogues visual data. Using machine learning models, it can identify and classify objects, people, text, scenes, and activities in images and videos.

What are the capabilities of Azure’s Computer Vision Service?

Azure’s Computer Vision Service provides several capabilities such as image analysis, spatial analysis, form recognition, face detection, object detection, Optical Character Recognition (OCR), custom vision, and more.

What is the use of the ‘analyze image’ feature in Azure’s Computer Vision Service?

The ‘analyze image’ feature extracts visual features based on image content. It can identify objects and actions and generate human-readable sentences that describe the image.

What is the Optical Character Recognition (OCR) capability in Azure’s Computer Vision Service?

Optical Character Recognition (OCR) is a capability of Azure’s Computer Vision that extracts text from images. It handles both printed and handwritten text and supports multiple languages.

What kind of scenario can the Spatial Analysis feature of the Computer Vision service be used for?

The Spatial Analysis feature can be used in scenarios such as people counting, queue management, and safety compliance where you need to analyze the movements of people within a specific space.

What is the purpose of the ‘Read’ feature in Azure’s Computer Vision Service?

The ‘Read’ feature is used to extract printed and handwritten text from images and documents with multiple pages. It is designed for text-heavy images and large documents.

Can Azure’s Computer Vision Service identify celebrities and landmarks?

Yes, Azure’s Computer Vision Service can recognize around 200,000 celebrities from business, politics, sports and entertainment sectors, and 9,000 natural and man-made landmarks worldwide.

What is the Custom Vision capability in Azure’s Computer Vision Service?

Custom Vision in Azure’s Computer Vision Service is a feature where you can train your own classifier by uploading and labeling images according to your needs.

What is the purpose of the Face detection capability in Azure’s Computer Vision Service?

The Face detection capability can be used in scenarios where there is a requirement to detect, recognize, and analyze human faces in images.

Can Azure’s Computer Vision Service generate thumbnail images?

Yes, Azure’s Computer Vision Service can generate high-quality thumbnail images from a larger image while maintaining its original aspect ratio.

What is Read API in Azure’s Computer Vision Service?

The Read API is used in Azure Computer Vision Service to extract printed and handwritten text from images and documents in a variety of languages.

Can Azure’s Computer Vision Service be used to analyze videos?

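General video analysis is handled by Video Indexer, a separate Azure service rather than a feature of Computer Vision. It analyzes the visual and auditory channels of a video and can catalog, extract, and index information. Within Computer Vision itself, the Spatial Analysis feature processes camera video to understand people’s movements.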

How does the Computer Vision Service support working with batches of images?

The batch Read operation can be utilized to analyze a large volume of images. It is an asynchronous operation where you send a batch of images and poll for the results.
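The submit-then-poll pattern described above can be sketched as a small loop. The status values (`notStarted`, `running`, `succeeded`, `failed`) follow the Read operation as assumed here; the `poll_until_done` helper, its parameters, and the fake status source are hypothetical stand-ins so the example runs without a network call.

```python
import time


def poll_until_done(fetch_status, interval_seconds=1.0, max_attempts=10, sleep=time.sleep):
    """Call fetch_status() until the operation succeeds or fails, waiting between attempts."""
    for _ in range(max_attempts):
        result = fetch_status()
        if result.get("status") in ("succeeded", "failed"):
            return result
        sleep(interval_seconds)
    raise TimeoutError("operation did not finish within the allowed attempts")


# Stand-in for a GET request to the operation URL returned when the job was submitted.
_fake_responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"readResults": []}},
])

# The sleep function is replaced with a no-op so the example finishes immediately.
final = poll_until_done(lambda: next(_fake_responses), sleep=lambda seconds: None)
print(final["status"])
```

In a real client, `fetch_status` would issue the GET request and return the parsed JSON body, and the polling interval would be chosen to stay within the service’s rate limits.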

How does the Computer Vision Service handle noise in text extraction?

The Noise Reduction mechanism in the Computer Vision service handles the noise in text extraction. It works by improving the distinction between text and non-text regions, hence enhancing the text extraction accuracy.

Can Azure’s Computer Vision Service extract 3-D spatial information?

Yes, Azure’s Computer Vision Service can extract 3-D spatial information using an advanced feature called Spatial Analysis. It processes video from cameras to understand people’s movements in physical spaces.
