Home » SQL Data Discovery and Classification

SQL Data Discovery and Classification

SQL Server Data Discovery and classification




SQL Server Management Studio 17.5 introduced a new tool for data discovery and data classification.

This tool uses a classification engine that scans your database for columns containing sensitive data and provides a list of recommended column classifications, that you can easily review and apply the appropriate classification recommendations, as well as it also provides a facility to add manually classify columns.

Here we have taken a database Adventurework2017 that we want to classify, and using the instance of SQL Server 2019.

Right click on Database > Next, Click on Tasks, then Data Discovery and classification > Classify Data..

 

Once you click on Classify Data.., The classification engine scans your database for columns containing sensitive data and generates the Data Classification report that contains the list of recommended column classifications.

Now you can see, the number of columns with the recommendations.

 

Now to to view the list of recommended column classifications. click on click to view button.

You can see, the list of columns with classification recommendations in Grid.

 

Now you have an option to classify the Information Type by default that is recommended by classification engine but you can change it.

Therefore, you can select whether the column contains credentials, credit card, banking or any other option that is available in dropdown lists.

 

Again you also have an option to classify the Sensitivity Label, as you did earlier for InformationType, So can select any of option available in dropdown to classify the sensitivity label for column.

 

However, you notice that any column that might have not been recommended by classification engine, and that is supposed to have a senstive data then you can also manually add that column as shown below.

Click on Add Classification tab >Then at the left side, you see a window Add classification opens.

Select schema, table, column, information type, and sensitivity label from dropdown list.

click on Add button.

 

Once you click on add button, then click on Save button to complete the classification and persistently label the database columns with the new classification metadata.

 

After saving the classifications, you see the list of columns in Grid that you have added maually.

So far we have seen, how to add the column classification manually, Now add the classifications that is recommended by the classification engine.

Just check the those columns that you want to add, if you want to change the information type , and senstivity label just change it from respective dropdown lists.

Next, click on Accept selected recommendations button.

After that you see the list of classified columns, including both classified that you have added manually and that are recommended by classification engine.

To complete the final step just click on save button.

Then click on View Report button to see  the SQL Data Classification Report that displays the information of database classification state such as classified columns, tables containing sensitive data, and unique information.

You can also review all the classified that you have just added using below T-SQL query on SQL Server 2019 instances:

SELECT 
schema_name(Obj.schema_id) AS schema_name,
Obj.NAME AS table_name,
C.NAME AS column_name,
information_type,
label
FROM sys.sensitivity_classifications sc
JOIN sys.objects Obj
ON sc.major_id = Obj.object_id
JOIN sys.columns C 
ON sc.major_id = C.object_id AND sc.minor_id = C.column_id

 

For instances of SQL Server 2017 and previous, you can use below T-SQL Query.

SELECT
schema_name(O.schema_id) AS schema_name,
O.NAME AS table_name,
C.NAME AS column_name,
information_type,
sensitivity_label 
FROM
(
SELECT
IT.major_id,
IT.minor_id,
IT.information_type,
L.sensitivity_label 
FROM
(
SELECT
major_id,
minor_id,
value AS information_type 
FROM sys.extended_properties 
WHERE NAME = 'sys_information_type_name'
) IT 
FULL OUTER JOIN
(
SELECT
major_id,
minor_id,
value AS sensitivity_label 
FROM sys.extended_properties 
WHERE NAME = 'sys_sensitivity_label_name'
) L 
ON IT.major_id = L.major_id AND IT.minor_id = L.minor_id
) EP
JOIN sys.objects O
ON EP.major_id = O.object_id 
JOIN sys.columns C 
ON EP.major_id = C.object_id AND EP.minor_id = C.column_id




Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.