In the last blog we saw how we can replace values using power query and change data type. Below is the link for the first part.
In continuation to the Query editor, we will be using Assignment-4 to understand the usage of query editor.
Assignment - 4
Use Case – Working with Date & Delimiter
Data Source – Game of Thrones
Analytics – The raw data - your task is to transform it into something more presentable.
Our ask is to: -
1. Split the author column in two and replace the resulting nulls with blanks.
2. The dates needs to be format as I am not happy they are appearing.
3. You should also filter the data to show only episodes directed by Daniel.
Before transforming our data, it looks like this: -
Now let us go step by step and transform the raw data before moving to dashboard.
Before doing any transformation, lets make changes to the column names. You can do it simply by double clicking on the column name.
I will change "Written by -> Author", "Created by -> Director". the first column is season, I will add a new column named episodes from "Add Column -> Index Column -> From 1". Remove the season column as we don't need it now.
Once this is done our data now looks like this.
Now let us split the author column in two and replace the resulting nulls with blanks.
Go to "Split Column -> by Delimiter". Choose custom. It should be like this: -
Press OK. Once this is done. you will see that your author column has now split into two separate columns. In the second column, some the cells are showing null.
We need to remove the word null and make it empty. Follow the steps as shown in the image below to remove null.
The second ask of the assignment is to change the date format from current "4/17/2011" to "17 April 2011".
We will extract the date, month and year one by one and then merge it into one column to get the desired output.
Right click on "release date" -> "Duplicate Column". Rename the new column as "Date".
Right click on "Date"-> "Transform" -> "Day" ->"Day. The column will now have only Date.
Repeat the same procedure for month and year as well.
Once you have three separate columns. Select three columns together and then right click & select merge. While merging columns, you need to choose separator. In this case I have used "space".
Please note that you need to select date, month and year in the same order in order to get the output as "17 April 2011". Once you have followed these steps your output should be like this:-
Now our data looks fine. Let us see how we can filter the data. Our third ask is to extract all the episodes directed by "Daniel". On the director column, click on the dropdown and select "Daniel". Our table will now show all the episodes directed by Daniel. It should look like this: -
I have used treemap and line chart to understand the most watched director and the viewers each day. Below is the image of the small analysis that I have done.
In the next series I will come up with something new.
Please comment if you have any questions!!
Share the blog and like it if you found this useful.