Home » Profile Data in Power BI

Profile Data in Power BI

The data profiling allows a intuitive ways to clean, transform, and understand data in Power Query Editor, which helps you identifying the nuances of the data such as determining anomalies, examining and developing the underlying data structures, and querying data statistics such as row counts, value distributions, minimum and maximum values, averages, errors and so on.



Data profiling includes following options:

  1. Column quality
  2. Column distribution
  3. Column profile

Lets see how to use data profiling in Power BI step by step.

First load a sample data set in Power BI desktop, here we have used a dataset financials that comes by default within Power BI desktop (October 2020 and later version).

Once you load data in Power BI Desktop, lets click on Transform data tab in ribbon bar.





Once you click on Transform data, a Power Query Editor opens as shown below.

Next in Power Query Editor, go to View tab, there you will see the Column Quality, Column Distribution, and Column Profile options inside thee Data Preview Group.

Note that By default, Power Query will perform this data profiling over the first 1,000 rows of your data in dataset. To perform data profiling on entire dataset, you can check the lower left corner of Power query editor window to change how column profiling is performed.

Lets see understand about these option one by one.

Column Quality

Column quality shows you the percentages of data that is Valid shown in green, in Error shown in red, and Empty shown in dark grey.

When you enable the Column quality option, you will see a percentage of data in terms of Valid, Error and Empty are displayed directly underneath the name of the column shown below.

When you hover over any of the columns, you will get the information about the numerical distribution of the quality of values throughout the column. Also, selecting the ellipsis (…) opens some quick action buttons for operations on the values as shown below.

Column distribution

Column distribution shows you the distribution of the data within the column and the counts of distinct and unique values.

Distinct values are all values in a column, including duplicates and null values while unique values do not include duplicates or nulls. Therefore, distinct in this table tells you the total count of how many values are present, while unique tells you how many of those values are not duplicates or nulls.

When you enable Column distribution option you will see a small bar chart underneath the names of the columns which shows the the frequency and distribution of the values in each of the column, and data in these small chart is sorted in descending order from the value with the highest frequency.





When you hover over any of the columns, you will get the information about the overall data in the column in terms of distinct count and unique values. Also, selecting the ellipsis (…) opens some quick action buttons for operations on the values as shown below.

Column profile

Column profile shows you a more in-depth look into the statistics within the column such as different values distinct and unique values, empty string, including the count of rows which is important when verifying whether the importing of your data was successful.

When you enable this option, you will see the information of column statistics and a chart that shows a value distribution underneath the data preview section.

In below screenshot you can see, the column statistic and value distribution for segment column. similarly, you can select other column to see the information.

Filter by value

You can also interact with the value distribution chart on the right side and select any of the bars by hovering over the parts of the chart as shown below.

now either you can click on ellipsis (…) or simply right click on bar, it gives you some filter options that you can perform as per your requirement.

Copy Data

You can also copy the data for column statistics and value distribution chart, In the upper-right corner of both the column statistics and value distribution sections, you can select the ellipsis (…) to display a Copy shortcut menu. Select it to copy the data displayed in either section to the clipboard, notepad or excel.

Group by value

You can also perform grouping based on value and text length, when you select the ellipsis (…) in the upper-right corner of the value distribution chart, in addition to Copy you can select Group by. This feature groups the values in your chart by a set of available options.

By default you can see grouping is done values, you can also change that to Group by Text length.

Lets change the grouping to Text Length, and you can see, the column distribution is done now based on text length.

Enabling column profiling for entire dataset

BY default, Power Query perform the data profiling over the first 1,000 rows of your data in dataset as shown below.





In case, if you want to perform data profiling on entire dataset, you can check the lower left corner of Power query editor window to change how column profiling is performed.

Lets click on that button, and you will see there it gives you one more option that is column profiling based on entire data set, just click on that.

 

Also Read..

Group By the data using Power Query

Create Index column using Power Query

Merge Queries

Import text using examples feature Power BI

Pivot columns – turn rows into columns

Split Columns By Positions

Choose and Remove Columns from table

SQL Basics TutorialSQL Advance TutorialSSRSInterview Q & A
SQL Create tableSQL Server Stored ProcedureCreate a New SSRS Project List Of SQL Server basics to Advance Level Interview Q & A
SQL ALTER TABLESQL Server MergeCreate a Shared Data Source in SSRSSQL Server Question & Answer Quiz
SQL DropSQL Server PivotCreate a SSRS Tabular Report / Detail Report
..... More.... More....More
Power BI TutorialAzure TutorialPython TutorialSQL Server Tips & Tricks
Download and Install Power BI DesktopCreate an Azure storage accountLearn Python & ML Step by stepEnable Dark theme in SQL Server Management studio
Connect Power BI to SQL ServerUpload files to Azure storage containerSQL Server Template Explorer
Create Report ToolTip Pages in Power BICreate Azure SQL Database ServerDisplaying line numbers in Query Editor Window
....More....More....More

 1,054 total views,  1 views today

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.