In a nutshell, T-SQL (Transact-SQL) is an extension of SQL (Structured Query Language) that adds procedural programming, local variables, and a range of built-in functions for string processing, date processing, and mathematics.
It becomes pivotal when a transformation requires transaction control, error and exception handling, or row-by-row processing. T-SQL also provides a set of key data transformation functions that help you manipulate data in various ways.
Key T-SQL Data Transformation Functions
1. CAST and CONVERT Functions
The CAST and CONVERT functions change the data type of an expression or column value. A common use is converting string values into numeric or date types.
Example:
SELECT CAST('25' AS INT) AS Age_Cast, CONVERT(INT, '25') AS Age_Convert;
The difference between the two lies in their usage. CAST follows the ANSI SQL standard, while CONVERT is specific to T-SQL and accepts an optional style argument for formatting dates, times, and some numeric types.
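To make the style argument concrete, here is a minimal sketch, assuming SQL Server 2012 or later or Azure SQL Database. The variable name `@d` is hypothetical. It also shows `TRY_CONVERT` and `TRY_CAST`, which return NULL instead of raising an error when a value cannot be converted, which is often useful when cleaning raw data.

```sql
-- Hypothetical sample value; DATE literals in yyyy-mm-dd form are unambiguous.
DECLARE @d DATE = '2024-03-15';

SELECT
    CONVERT(VARCHAR(10), @d, 23)  AS IsoDate,      -- style 23:  2024-03-15
    CONVERT(VARCHAR(10), @d, 103) AS BritishDate,  -- style 103: 15/03/2024
    TRY_CONVERT(INT, 'abc')       AS BadNumber,    -- NULL instead of a conversion error
    TRY_CAST('42' AS INT)         AS GoodNumber;   -- 42
```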
2. String Functions
String functions are vital in data transformation where string values need to be transformed or manipulated. Some of the most commonly used string functions are:
- `LEFT()`, `RIGHT()`, and `SUBSTRING()`: These functions are used to extract a specific portion of a string.
- `LEN()`: This function returns the number of characters in a string, excluding trailing spaces.
- `UPPER()` and `LOWER()`: These functions convert string values to upper case or lower case respectively.
- `TRIM()`, `LTRIM()`, `RTRIM()`: These functions remove leading and/or trailing spaces from a string. `LTRIM()` strips the left side, `RTRIM()` the right side, and `TRIM()` (SQL Server 2017 and later) both.
Example:
SELECT UPPER('Azure') as Upper_Text, LOWER('Azure') as Lower_Text;
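The extraction and trimming functions listed above can be combined as in the following sketch. The variable `@raw` and its value are made up for illustration. `LTRIM(RTRIM(...))` is used instead of `TRIM()` so the query also runs on versions older than SQL Server 2017.

```sql
-- Hypothetical input with leading and trailing spaces.
DECLARE @raw VARCHAR(50) = '  Azure Data Engineer  ';

SELECT
    LTRIM(RTRIM(@raw))           AS Trimmed,        -- 'Azure Data Engineer'
    LEN(LTRIM(RTRIM(@raw)))      AS TrimmedLength,  -- 19
    LEFT(LTRIM(@raw), 5)         AS FirstFive,      -- 'Azure'
    RIGHT(RTRIM(@raw), 8)        AS LastEight,      -- 'Engineer'
    SUBSTRING(LTRIM(@raw), 7, 4) AS MiddleWord;     -- 'Data' (start position 7, length 4)
```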
3. Date and Time Functions
A suite of date and time functions in T-SQL helps in manipulating and transforming data related to dates and times. Among these functions, `GETDATE()`, `SYSDATETIME()`, `DATEADD()`, `DATEDIFF()`, and `FORMAT()` are notable ones.
Example:
SELECT GETDATE() AS Current_DateTime, FORMAT(GETDATE(), 'dd-MM-yy') AS Formatted_Date;
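`DATEADD()` and `DATEDIFF()` are easier to understand with fixed dates than with `GETDATE()`. The sketch below uses two hypothetical variables, `@start` and `@end`. Note that `DATEDIFF()` counts the number of datepart boundaries crossed, not elapsed full periods.

```sql
-- Hypothetical fixed dates so the results are reproducible (2024 is a leap year).
DECLARE @start DATE = '2024-01-31';
DECLARE @end   DATE = '2024-03-15';

SELECT
    DATEADD(DAY, 30, @start)      AS Plus30Days,       -- 2024-03-01
    DATEADD(MONTH, 1, @start)     AS PlusOneMonth,     -- 2024-02-29 (clamped to the month end)
    DATEDIFF(DAY, @start, @end)   AS DaysBetween,      -- 44
    DATEDIFF(MONTH, @start, @end) AS MonthBoundaries;  -- 2 (January to March)
```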
4. Mathematical Functions
T-SQL also provides mathematical functions like `ABS()`, `ROUND()`, `CEILING()`, `FLOOR()` etc., which are often used to perform mathematical transformations on your data.
Example:
SELECT ABS(-10) as Absolute_Value, CEILING(7.1) as Ceil_Value;
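A few more cases that tend to cause confusion, shown as a small sketch: rounding to a negative length, the behavior of `FLOOR()` and `CEILING()` on negative numbers, and integer versus decimal division.

```sql
SELECT
    ROUND(123.456, 2)  AS RoundedTwoPlaces,  -- 123.460
    ROUND(123.456, -1) AS RoundedToTens,     -- 120.000
    FLOOR(-7.1)        AS FloorValue,        -- -8 (rounds toward negative infinity)
    CEILING(-7.1)      AS CeilingValue,      -- -7 (rounds toward positive infinity)
    10 / 4             AS IntegerDivision,   -- 2 (both operands are integers)
    10 / 4.0           AS DecimalDivision;   -- 2.5 as a decimal value
```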
5. Aggregate Functions
Aggregate functions like `SUM()`, `COUNT()`, `AVG()`, `MIN()`, `MAX()` are frequently used in data transformation to compute a single value from multiple rows.
Example:
SELECT SUM(Salary) as Total_Salary, COUNT(EmployeeId) as Employee_Count FROM Employees;
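The query above depends on an existing `Employees` table. A self-contained variant is sketched below, assuming SQL Server or Azure SQL Database, where table variables are available. The table variable `@Employees` and its rows are invented for illustration. It also shows that aggregates other than `COUNT(*)` ignore NULL values.

```sql
-- Hypothetical sample data; employee 4 has no salary recorded.
DECLARE @Employees TABLE (EmployeeId INT, Department VARCHAR(20), Salary DECIMAL(10, 2));

INSERT INTO @Employees (EmployeeId, Department, Salary)
VALUES (1, 'Sales', 50000), (2, 'Sales', 60000), (3, 'IT', 70000), (4, 'IT', NULL);

SELECT
    COUNT(*)      AS TotalRows,      -- 4: counts every row
    COUNT(Salary) AS SalaryCount,    -- 3: the NULL salary is skipped
    SUM(Salary)   AS TotalSalary,    -- 180000
    AVG(Salary)   AS AverageSalary,  -- 60000: averaged over the 3 non-NULL rows
    MIN(Salary)   AS LowestSalary,   -- 50000
    MAX(Salary)   AS HighestSalary   -- 70000
FROM @Employees;
```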
These transformations constitute an important part of preparing structured data for analysis, data modeling, and ultimately, deriving meaningful insights that can guide business decisions. Consequently, the knowledge of these transformations is highly beneficial for the DP-203 Data Engineering on Microsoft Azure exam.
Never overlook the power of T-SQL while working with data on Azure. It lets you both transform data for further analysis and manipulate it to fit your requirements. As you gear up for the DP-203 exam, take some time to familiarize yourself with these essential T-SQL functions, and refer to the official Microsoft documentation to explore them in depth.
Practice Test
True/False: T-SQL is the primary language used for managing and manipulating data in Microsoft SQL Server.
- True
- False
Answer: True
Explanation: Transact-SQL, or T-SQL, is the main language used for managing and manipulating databases in Microsoft’s SQL Server.
Multiple Select: Which of the following are common tasks using T-SQL?
- a) Data extraction
- b) Data transformation
- c) Managing database backups
- d) Data loading
Answer: a, b, d
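Explanation: T-SQL is used for data extraction, transformation, and loading. Although T-SQL does include BACKUP and RESTORE statements, managing backups is an administrative task that is typically handled through DBMS tooling or automated by the platform, rather than a common data-processing task.
Single Select: Which command is used to modify existing records in a SQL Server table?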
- a) SELECT
- b) UPDATE
- c) EXECUTE
- d) DELETE
Answer: b) UPDATE
Explanation: The UPDATE statement is used in SQL Server to modify the existing records in a table.
Multiple Select: Which two of the following are types of subqueries in T-SQL?
- a) Correlated subquery
- b) Named subquery
- c) Scalar subquery
- d) Column subquery
Answer: a) Correlated subquery, c) Scalar subquery
Explanation: Correlated and scalar subqueries are two types of subqueries commonly used in T-SQL.
True/False: T-SQL can be used to handle complex processing and business logic.
- True
- False
Answer: True
Explanation: T-SQL is a powerful language with capabilities to handle complex data processing and incorporate advanced business logic.
Single Select: Which T-SQL command is used to retrieve data from the database?
- a) GET
- b) FETCH
- c) PULL
- d) SELECT
Answer: d) SELECT
Explanation: The SELECT statement is used in SQL to select data from a database.
True/False: T-SQL cannot create stored procedures.
- True
- False
Answer: False
Explanation: T-SQL is commonly used to write stored procedures in SQL Server.
Multiple Select: Which of the following are T-SQL command categories?
- a) Data Definition Language (DDL)
- b) Data Transfer Language (DTL)
- c) Data Manipulation Language (DML)
- d) Transaction Control Language (TCL)
Answer: a) Data Definition Language (DDL), c) Data Manipulation Language (DML), d) Transaction Control Language (TCL)
Explanation: T-SQL includes DDL, DML and TCL commands. DTL does not exist.
Single Select: Which T-SQL statement is used to remove rows from a table?
- a) DELETE
- b) REMOVE
- c) DROP
- d) CLEAN
Answer: a) DELETE
Explanation: The DELETE statement is used in SQL to delete existing records in a table.
True/False: CAST and CONVERT are T-SQL functions used to convert an expression of one data type to another.
- True
- False
Answer: True
Explanation: Both CAST and CONVERT functions in T-SQL are used to convert an expression from one data type to another.
Multiple Select: Which are the valid T-SQL mathematical functions?
- a) ABS()
- b) ROUND()
- c) ADD()
- d) SQRT()
Answer: a) ABS(), b) ROUND(), d) SQRT()
Explanation: ABS, ROUND, and SQRT are all valid mathematical functions in T-SQL. ADD() is not a T-SQL function; addition is performed with the + operator.
Single Select: Which T-SQL command is used to create a new database?
- a) MAKE
- b) CREATE
- c) BUILD
- d) NEW
Answer: b) CREATE
Explanation: CREATE DATABASE is the command used to create a new database.
True/False: Transforming data with T-SQL can make data more understandable and usable.
- True
- False
Answer: True
Explanation: T-SQL can be used to transform raw data into a more meaningful format, making this data more understandable and usable for end users.
Multiple Select: Which are the types of JOIN in T-SQL?
- a) LEFT JOIN
- b) COMBINE JOIN
- c) RIGHT JOIN
- d) INNER JOIN
Answer: a) LEFT JOIN, c) RIGHT JOIN, d) INNER JOIN
Explanation: LEFT JOIN, RIGHT JOIN, and INNER JOIN are types of joins in T-SQL. COMBINE JOIN does not exist.
Single Select: Which T-SQL command is used to execute a stored procedure?
- a) RUN
- b) START
- c) EXECUTE
- d) LAUNCH
Answer: c) EXECUTE
Explanation: The EXECUTE (or simply EXEC) command is used to execute a stored procedure.
Interview Questions
What is Transact-SQL (T-SQL)?
Transact-SQL (T-SQL) is Microsoft’s and Sybase’s proprietary extension to SQL. SQL, or Structured Query Language, is a standard language that provides a systematic way of accessing, manipulating, and controlling data in databases. T-SQL enhances standard SQL by adding procedural programming, local variables, and various support functions for string processing, date processing, mathematics, etc.
How to extract the first name from a full name in a column using T-SQL?
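You can use a combination of the CHARINDEX and LEFT functions. The example below treats a space as the separator between first name and last name. Appending a space inside CHARINDEX ensures that a value with no space returns the whole string instead of raising an error:
SELECT LEFT(FullName, CHARINDEX(' ', FullName + ' ') - 1) AS FirstName FROM TableName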
How can you use T-SQL to modify data in Azure SQL Database?
By using the UPDATE statement. The UPDATE statement is used to change or modify the existing records in a database table. For example:
UPDATE tablename SET column1 = value1, column2 = value2,... WHERE condition
How can you sort data in descending order using T-SQL?
You can use the ORDER BY clause and specify DESC for descending order. For example:
SELECT * FROM TableName ORDER BY ColumnName DESC
How to remove spaces from a string in T-SQL?
The REPLACE function can be used to remove all spaces from a string by replacing each space with an empty string. For example:
SELECT REPLACE(columnName, ' ', '') AS NoSpaceColumnName FROM TableName
What is the purpose of the DISTINCT keyword in T-SQL?
The DISTINCT keyword is used in a SELECT statement to eliminate duplicate rows from the result set and return only unique rows.
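As a small illustration, the sketch below runs DISTINCT over an inline list of values, so no table is required. The alias `e` and the department names are hypothetical.

```sql
-- Five input rows, three unique departments; row order is not guaranteed.
SELECT DISTINCT Department
FROM (VALUES ('Sales'), ('IT'), ('Sales'), ('HR'), ('IT')) AS e(Department);
```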
What can you do with the GROUP BY clause in T-SQL?
The GROUP BY clause groups rows that have the same values in the specified columns so that aggregate functions such as SUM, AVG, MAX, MIN, and COUNT return one result per group.
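A minimal sketch of grouping, using made-up inline data, follows. It also adds a HAVING clause, which filters groups after aggregation, in the same way that WHERE filters rows before it.

```sql
-- Hypothetical rows: two in Sales, one in IT.
SELECT
    Department,
    COUNT(*)    AS EmployeeCount,
    SUM(Salary) AS TotalSalary
FROM (VALUES ('Sales', 50000), ('Sales', 60000), ('IT', 70000)) AS e(Department, Salary)
GROUP BY Department
HAVING COUNT(*) > 1;   -- keeps only the Sales group (2 employees, total 110000)
```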
What is the function of the IF...ELSE statement in T-SQL?
The IF...ELSE statement in T-SQL is a way to conditionally execute SQL statements. If the condition evaluates to TRUE, the statement or block following IF is executed. Otherwise, the statement or block following the optional ELSE is executed.
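For illustration, a short sketch with a hypothetical variable `@RowsLoaded` is shown below. BEGIN...END groups multiple statements into one block; without it, only the single statement after IF or ELSE is conditional.

```sql
-- Hypothetical counter, e.g. the number of rows a load step produced.
DECLARE @RowsLoaded INT = 0;

IF @RowsLoaded > 0
BEGIN
    PRINT 'Load succeeded.';
END
ELSE
BEGIN
    PRINT 'No rows were loaded.';   -- this branch runs because @RowsLoaded is 0
END
```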
How does the BETWEEN Operator work in T-SQL?
The BETWEEN operator in T-SQL selects values within a given range of numbers, text, or dates. The range is inclusive, so both boundary values are included. Syntax: column_name BETWEEN value1 AND value2.
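The inclusive behavior can be checked with a small sketch over inline values. The alias `t` and column `n` are hypothetical.

```sql
-- 9 and 21 fall outside the range; both boundaries (10 and 20) are returned.
SELECT n
FROM (VALUES (9), (10), (15), (20), (21)) AS t(n)
WHERE n BETWEEN 10 AND 20;   -- returns 10, 15, 20
```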
How to get the current SQL Server Date and Time using T-SQL?
You can use GETDATE() function to get the current date and time in SQL Server. For example:
SELECT GETDATE() AS CurrentDateTime
What is the use of the JOIN clause in T-SQL?
In T-SQL, a JOIN clause is used to combine rows from two or more tables based on a related column between them. The available join types include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
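The difference between INNER JOIN and LEFT JOIN is sketched below, assuming SQL Server or Azure SQL Database, where table variables are available. The tables `@Customers` and `@Orders` and their rows are invented for illustration.

```sql
-- Hypothetical data: Ben has no orders.
DECLARE @Customers TABLE (CustomerId INT, Name VARCHAR(20));
DECLARE @Orders    TABLE (OrderId INT, CustomerId INT);

INSERT INTO @Customers (CustomerId, Name) VALUES (1, 'Ada'), (2, 'Ben');
INSERT INTO @Orders (OrderId, CustomerId) VALUES (100, 1), (101, 1);

-- INNER JOIN: only customers with a matching order (two rows, both for Ada).
SELECT c.Name, o.OrderId
FROM @Customers AS c
INNER JOIN @Orders AS o ON o.CustomerId = c.CustomerId;

-- LEFT JOIN: every customer; Ben appears once with OrderId = NULL.
SELECT c.Name, o.OrderId
FROM @Customers AS c
LEFT JOIN @Orders AS o ON o.CustomerId = c.CustomerId;
```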
What is the difference between UNION and UNION ALL in T-SQL?
The UNION operator combines the result sets of two or more SELECT statements and eliminates duplicate rows. UNION ALL does the same but keeps duplicate rows.
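A minimal sketch with made-up city names shows the difference in row counts. Because UNION ALL skips the duplicate-removal step, it is usually the cheaper choice when duplicates are impossible or acceptable.

```sql
-- 'Rome' appears in both inputs.
SELECT City FROM (VALUES ('Oslo'), ('Rome')) AS a(City)
UNION
SELECT City FROM (VALUES ('Rome'), ('Lima')) AS b(City);   -- 3 rows: Rome appears once

SELECT City FROM (VALUES ('Oslo'), ('Rome')) AS a(City)
UNION ALL
SELECT City FROM (VALUES ('Rome'), ('Lima')) AS b(City);   -- 4 rows: Rome appears twice
```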
What is the INSERT INTO SELECT statement in T-SQL?
The INSERT INTO SELECT statement allows you to copy data from one table and insert it into another table.
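As a sketch, the following copies only the positive amounts from one table variable into another. The names `@Source` and `@Target` are hypothetical, and table variables are assumed to be available (SQL Server or Azure SQL Database).

```sql
DECLARE @Source TABLE (Id INT, Amount DECIMAL(10, 2));
DECLARE @Target TABLE (Id INT, Amount DECIMAL(10, 2));

INSERT INTO @Source (Id, Amount) VALUES (1, 10.00), (2, -5.00), (3, 25.50);

-- Copy a filtered subset of the source rows into the target.
INSERT INTO @Target (Id, Amount)
SELECT Id, Amount
FROM @Source
WHERE Amount > 0;

SELECT Id, Amount FROM @Target;   -- rows with Id 1 and 3
```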
How can you concatenate strings in T-SQL?
You can use the + operator to concatenate strings in T-SQL. For example:
SELECT Column1 + ' ' + Column2 AS FullName FROM TableName
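One caveat worth knowing: with the default `CONCAT_NULL_YIELDS_NULL` setting, the + operator returns NULL if any operand is NULL. The `CONCAT()` function (SQL Server 2012 and later) treats NULL as an empty string instead. The sketch below uses hypothetical variables to show the difference.

```sql
-- Hypothetical name parts; the middle name is missing.
DECLARE @First  VARCHAR(20) = 'Ada',
        @Middle VARCHAR(20) = NULL,
        @Last   VARCHAR(20) = 'Lovelace';

SELECT
    @First + ' ' + @Middle + ' ' + @Last     AS PlusResult,    -- NULL
    CONCAT(@First, ' ', @Middle, ' ', @Last) AS ConcatResult;  -- 'Ada  Lovelace' (two spaces)
```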
How to delete a table in T-SQL?
You can use the DROP TABLE statement to delete a table in T-SQL. For example:
DROP TABLE TableName