Azure Synapse Link is a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables near real-time analytics over operational data in Azure Cosmos DB. It does this through a tight integration between Azure Cosmos DB and Azure Synapse Analytics.
Azure Synapse Link addresses the challenge of running near real-time analytics on operational data in a way that is seamless, cost-effective, and efficient.
Enabling Azure Synapse Link
There are two main methods to enable Azure Synapse Link for Cosmos DB.
- From the Azure portal:
- Open the Azure portal and navigate to the Azure Cosmos DB account for which you want to enable Azure Synapse Link.
- In the ‘Features’ section, find ‘Azure Synapse Link’ and select ‘Enable’.
- Using Azure CLI:
With the Azure CLI, a single command turns on analytical storage for the account. Replace the quoted placeholders with your own values.
az cosmosdb update \
  --name "Your Cosmos DB Account Name" \
  --resource-group "Your Resource Group Name" \
  --enable-analytical-storage true
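Enabling the account is only the first step, because the analytical store is also turned on per container. The sketch below assembles both Azure CLI invocations as argument lists so they can be reviewed before anything runs. It is a minimal illustration, not a definitive script: the function name and the sample values are hypothetical, and the flag spellings (`--enable-analytical-storage`, `--analytical-storage-ttl`) should be checked against `az cosmosdb update --help` and `az cosmosdb sql container create --help` for your CLI version.

```python
import shlex


def synapse_link_commands(account, resource_group, database, container,
                          partition_key_path, analytical_ttl_seconds=-1):
    """Return two Azure CLI invocations as argument lists.

    Nothing is executed here; the caller decides whether to run them.
    """
    # Step 1: turn on analytical storage (Synapse Link) for the account.
    enable_account = [
        "az", "cosmosdb", "update",
        "--name", account,
        "--resource-group", resource_group,
        "--enable-analytical-storage", "true",
    ]
    # Step 2: create a container whose analytical store is enabled.
    # An analytical TTL of -1 is assumed to mean "retain indefinitely".
    create_container = [
        "az", "cosmosdb", "sql", "container", "create",
        "--account-name", account,
        "--resource-group", resource_group,
        "--database-name", database,
        "--name", container,
        "--partition-key-path", partition_key_path,
        "--analytical-storage-ttl", str(analytical_ttl_seconds),
    ]
    return [enable_account, create_container]


if __name__ == "__main__":
    # Hypothetical names, printed as shell-ready command lines for review.
    for command in synapse_link_commands(
            "my-cosmos-account", "my-resource-group",
            "RetailDb", "Sales", "/storeId"):
        print(shlex.join(command))
```

Printing the commands instead of running them keeps the sketch safe to try, which matters here because enabling Synapse Link on an account cannot be undone.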
Benefits of Azure Synapse Link
- No ETL operations necessary: Azure Synapse Link eliminates the need for cumbersome and resource-intensive ETL operations to run analytics over operational data.
- Simplified Architecture: Azure Synapse Link simplifies the architecture for building modern applications that require both transactional and analytical processing from a single data source.
- Improved Performance: Analytical queries run against a separate analytical store rather than the transactional store, so they do not compete with the transactional workload, and no ETL jobs are needed to keep the data fresh.
- Cost-Effective: Because there is no need to build and operate separate ETL pipelines, Azure Synapse Link can lower total cost of ownership (TCO).
When to Use Azure Synapse Link
Azure Synapse Link benefits applications that require transactional and analytical processing from a single data source. For example, a retail application may need both transactional processing (e.g., sales transaction processing) and analytical processing (e.g., trend analysis or sales forecasting) over the same Cosmos DB data.
Consider Azure Synapse Link when you need to run near real-time analytics on your operational data but don’t want to deal with the complexity and performance overhead of ETL operations.
In contrast, if you have a workload where there is a clear boundary separating your transactional and analytical processing, and that boundary is not expected to change frequently, then Azure Synapse Link might not add as much value.
Conclusion
Azure Synapse Link is an innovative service that brings together the worlds of transactional and analytical processing into a single, seamless solution integrated with Azure Cosmos DB.
It removes the need for traditional ETL pipelines, enabling near real-time analytics on operational data while reducing operational overhead. This service could be a game-changer for applications across many industries that require both transactional and analytical processing from a single data source.
Note: Enabling Synapse Link on your Azure Cosmos DB account is permanent and cannot be reversed. Therefore, it’s crucial to consider its usage carefully based on the implications and benefits it brings to your workload.
Practice Test
True or False: Azure Synapse Link is a cloud-native hybrid transactional and analytical processing (HTAP) capability.
- True
- False
Answer: True
Explanation: Azure Synapse Link for Azure Cosmos DB is indeed a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables you to run near real-time analytics over operational data.
Multiple Select: What kind of data sources does Azure Synapse Link support?
- a) Azure Cosmos DB
- b) SQL Server
- c) MongoDB
- d) MySQL
Answer: a) Azure Cosmos DB
Explanation: Currently, Azure Synapse Link is only supported with Azure Cosmos DB, a globally distributed, multi-model database service.
True or False: To implement Azure Synapse Link, you need to first copy your transactional data to a separate analytical store.
- True
- False
Answer: False
Explanation: With Azure Synapse Link, you no longer need to manage the ETL process to copy your transactional data to a separate analytical store. It provides a tight integration between Azure Cosmos DB and Azure Synapse Analytics.
Single Select: Which of the following is NOT an advantage of Azure Synapse Link?
- a) Allows you to perform analytics over near real-time data
- b) Eliminates cost, effort, and complexity of ETL
- c) Supports hybrid transactional and analytical processing (HTAP)
- d) Supports data replication across multiple geographical regions
Answer: d) Supports data replication across multiple geographical regions
Explanation: Azure Synapse Link does not directly support data replication across multiple geographical regions. This is a feature of Azure Cosmos DB itself.
True or False: Azure Synapse Link is turned on automatically for all Azure Cosmos DB accounts.
- True
- False
Answer: False
Explanation: Azure Synapse Link is not enabled by default on any Azure Cosmos DB account. You must enable it on the account and then turn on the analytical store for each container you want to analyze.
Single Select: What is the transactional data latency in Azure Synapse Link?
- a) Immediate
- b) Few seconds
- c) Few minutes
- d) Few hours
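Answer: c) Few minutes
Explanation: Changes in the transactional store are synced to the analytical store automatically. The sync latency is typically within about two minutes and can reach up to five minutes in some configurations, which is why the capability is described as near real-time rather than real-time.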
True or False: Using Azure Synapse Link, you can use both Synapse SQL and Spark to perform analytics on your operational data.
- True
- False
Answer: True
Explanation: Azure Synapse Link allows you to use both Synapse SQL and Spark to perform data exploration, data preparation, and analytics on your operational data in Azure Cosmos DB.
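As a rough illustration of the Spark path, the sketch below wraps the read call in a small function. It assumes a Synapse Spark notebook where a `spark` session already exists and a linked service to the Cosmos DB account has been configured. The function name and the sample linked service and container names are hypothetical.

```python
def read_analytical_store(spark, linked_service: str, container: str):
    """Load a Cosmos DB container's analytical store as a Spark DataFrame.

    `spark` is the session provided by a Synapse Spark notebook.
    `linked_service` is the name of the Synapse linked service that
    points at the Cosmos DB account (assumed to exist already).
    """
    return (
        spark.read.format("cosmos.olap")  # read from the analytical store
        .option("spark.synapse.linkedService", linked_service)
        .option("spark.cosmos.container", container)
        .load()
    )


# Example use inside a Synapse notebook (hypothetical names):
#   sales = read_analytical_store(spark, "CosmosDbRetail", "Sales")
#   sales.groupBy("storeId").count().show()
```

Because the read targets the analytical store through the `cosmos.olap` format, it is not expected to consume the request units provisioned for the transactional workload.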
Multiple Select: Which Azure Cosmos DB APIs support Azure Synapse Link?
- a) MongoDB API
- b) SQL API
- c) Cassandra API
- d) Gremlin API
Answer: a) MongoDB API, b) SQL API
Explanation: At the time of writing, you can enable Azure Synapse Link for Azure Cosmos DB accounts that use the SQL (Core) API or the API for MongoDB.
Multiple Select: What are the pre-requisites to enable Azure Synapse Link?
- a) Enable Azure Synapse Workspace
- b) Enable Cosmos DB Account with Analytical Store
- c) Azure Subscription
- d) Python or Java SDK
Answer: a) Enable Azure Synapse Workspace, b) Enable Cosmos DB Account with Analytical Store, c) Azure Subscription
Explanation: To use Azure Synapse Link, you need an Azure subscription, an Azure Synapse workspace, and an Azure Cosmos DB account with the analytical store turned on. No particular SDK is required.
True or False: Azure Synapse Link for Cosmos DB supports the processing of data in JSON format.
- True
- False
Answer: True
Explanation: Yes, Azure Synapse Link for Cosmos DB, using SQL (Core) API, supports the processing of data in JSON format.
Interview Questions
What is Azure Synapse Link for Azure Cosmos DB?
Azure Synapse Link for Azure Cosmos DB is a hybrid transactional and analytical processing (HTAP) capability that allows you to run near real-time analytics over operational data in Azure Cosmos DB.
What are the benefits of using Azure Synapse Link for Azure Cosmos DB?
Azure Synapse Link for Azure Cosmos DB allows you to run analytics directly on your operational data without impacting the performance of your transactional workloads and without requiring any data movement.
How does Azure Synapse Link enable no-ETL analytics over operational data in Azure Cosmos DB?
Synapse Link creates a tight integration between Azure Cosmos DB and Azure Synapse Analytics that allows Azure Synapse Analytics to directly query Azure Cosmos DB data with no ETL (extract, transform, load) process required.
How is the decoupling of operational and analytical workloads ensured in Azure Synapse Link for Cosmos DB?
Azure Synapse Link creates a fully isolated column store over your operational data that is kept up to date in near real time. Because analytical queries run against this separate store, they do not impact the performance of transactional workloads.
What is the transactional-analytical latency provided by the Azure Synapse Link?
Azure Synapse Link provides near real-time latency between the transactional and analytical stores, typically under five minutes.
Can Azure Synapse Link be enabled or disabled on demand?
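Partly. The analytical store is turned on per container, so you choose which containers take part. At the account level, however, enabling Synapse Link is permanent and cannot be reversed, as the note in the conclusion points out.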
What is the purpose of the analytical store in Azure Synapse Link?
The analytical store in Azure Synapse Link is a fully isolated column store for large-scale analytics against operational data. The analytical store is automatically and transparently populated without any impact on the transactional workload performance.
What are the two types of stores in Azure Cosmos DB when Synapse Link is enabled?
When Synapse Link is enabled, Azure Cosmos DB maintains two types of stores – a row store for transactional processing (OLTP) and a column store for analytical processing (OLAP).
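The difference between the two layouts can be shown with a toy in-memory example. This is only an illustration of row-oriented versus column-oriented access, not a description of how Azure Cosmos DB implements either store. The sample data and function names are hypothetical.

```python
# Row layout: each item is stored together, which suits point reads and writes.
orders = [
    {"id": "1", "store": "Seattle", "amount": 20.0},
    {"id": "2", "store": "Austin", "amount": 35.5},
    {"id": "3", "store": "Seattle", "amount": 12.5},
]


def to_columns(rows):
    """Pivot rows into a column layout: one list of values per field.

    Assumes every row has the same fields.
    """
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def point_lookup(rows, item_id):
    """OLTP-style access: fetch one whole item by its id."""
    return next(row for row in rows if row["id"] == item_id)


def total(columns, field):
    """OLAP-style access: aggregate one field without touching the others."""
    return sum(columns[field])


columns = to_columns(orders)
print(point_lookup(orders, "2"))   # one full item
print(total(columns, "amount"))    # aggregate over a single column
```

The aggregate only reads the `amount` list, which is the intuition behind using a column store for analytical queries that scan a few fields across many items.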
How can I query the Azure Cosmos DB analytical store with Azure Synapse Analytics?
You can use the serverless SQL pool in Azure Synapse Analytics to send SQL queries directly to the Azure Cosmos DB analytical store. Apache Spark pools in Azure Synapse Analytics can read the analytical store as well.
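A serverless SQL query against the analytical store typically goes through `OPENROWSET` with the `CosmosDB` provider. The sketch below only builds the query text so its shape is easy to see. The helper name, account, database, and container names are hypothetical, the key is a placeholder, and the inputs are assumed to be trusted because no escaping is applied. In practice a stored credential is usually preferable to an inline key.

```python
def build_openrowset_query(account: str, database: str, container: str,
                           key: str, top: int = 10) -> str:
    """Build a serverless SQL query over a Cosmos DB analytical store.

    Inputs are interpolated as-is, so they must come from a trusted source.
    """
    connection = f"Account={account};Database={database};Key={key}"
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        "    'CosmosDB',\n"          # provider name for Cosmos DB
        f"    '{connection}',\n"     # connection string
        f"    {container}\n"         # container name, unquoted
        ") AS documents"
    )


# Hypothetical names; paste the printed text into a serverless SQL script.
print(build_openrowset_query("myaccount", "RetailDb", "Sales", "<account-key>"))
```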
Is there an additional cost for enabling Azure Synapse Link on Cosmos DB accounts?
There’s no additional cost for enabling Azure Synapse Link on Cosmos DB accounts. However, there are costs associated with using Azure Synapse Analytics and storing data in the Cosmos DB analytical store.
Is Azure Synapse Link for Cosmos DB available in all Azure regions?
Azure Synapse Link for Cosmos DB is available in all Azure regions where both Azure Cosmos DB and Azure Synapse Analytics are available.
Which type of Azure Cosmos DB APIs support Azure Synapse Link?
Currently, Azure Cosmos DB’s SQL (Core) API and Azure Cosmos DB’s API for MongoDB support Azure Synapse Link.
How to enable Azure Synapse Link on an existing Azure Cosmos DB container?
Azure Synapse Link can be enabled on an existing Azure Cosmos DB container using the Azure portal, REST API, or Azure Resource Manager.
Can the analytical store hold the complete history of my transactional data in Cosmos DB?
It depends on the analytical Time to Live (TTL) setting. The analytical store retains data according to the default or user-defined analytical TTL, which is managed separately from the transactional TTL and does not affect the lifespan of transactional data. With an analytical TTL of -1, data is retained in the analytical store indefinitely, even after it has expired from the transactional store.
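The effect of the analytical TTL can be made concrete with a small sketch. It assumes the commonly documented meaning of the setting: missing or 0 means the analytical store is disabled, -1 means data is kept indefinitely, and a positive value is a lifetime in seconds counted from the item's last modification. The function names are hypothetical.

```python
from typing import Optional


def analytical_retention(analytical_ttl: Optional[int]) -> str:
    """Describe what a container-level analytical TTL value means."""
    if analytical_ttl is None or analytical_ttl == 0:
        return "analytical store disabled"
    if analytical_ttl == -1:
        return "retained indefinitely"
    if analytical_ttl > 0:
        return f"expires {analytical_ttl} seconds after last modification"
    raise ValueError("analytical TTL must be -1, 0, or a positive integer")


def is_retained(analytical_ttl: Optional[int],
                seconds_since_last_modified: int) -> bool:
    """Return True if an item of the given age is still in the analytical store."""
    if analytical_ttl is None or analytical_ttl == 0:
        return False  # nothing is synced when the store is disabled
    if analytical_ttl == -1:
        return True   # full history is kept
    if analytical_ttl < -1:
        raise ValueError("analytical TTL must be -1, 0, or a positive integer")
    return seconds_since_last_modified < analytical_ttl


one_day = 24 * 60 * 60
print(analytical_retention(-1))
print(is_retained(one_day, seconds_since_last_modified=3600))
```

Under these assumptions, keeping the complete history in the analytical store comes down to setting the analytical TTL to -1, regardless of the transactional TTL.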