Visualizations are a big part of analytics. You will need to produce visually engaging graphics for presentations, reports, and dashboards. You will also make graphs for your own use in data discovery and analysis. As a bonus, unlike data cleaning, data viz can be pretty fun.
Matplotlib and Pyplot
Matplotlib is a module you can import into Python that will help you build some basic graphs and charts. Pyplot is part of Matplotlib, and it is the part we will be using in the following examples.
Note: If you are using the Anaconda Python distribution, Matplotlib is already installed. If not, you will need to install it separately, for example with pip install matplotlib.
Line Graph
Syntax
%matplotlib inline – this line lets you view your graphs inside Jupyter notebooks
from matplotlib import pyplot as plt – here we import pyplot from matplotlib into our program (note that we only want pyplot, not everything in matplotlib). Adding “as plt” gives us a shorter alias to work with
age and height lines – fill our lists with age and height information for an individual
plt.plot(age, height, color = 'blue') – here we tell Python to plot age against height and to color our line blue
plt.show() – displays our graph
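The lines described above can be put together roughly as follows. This is a minimal sketch: the age and height values are made up for illustration, and the %matplotlib inline magic is left as a comment because it only works inside a Jupyter notebook.

```python
# In a Jupyter notebook, run this magic command first (it is not valid
# in a plain .py script, so it is left commented out here):
# %matplotlib inline
from matplotlib import pyplot as plt

# Illustrative values only: one person's age (years) and height (inches)
age = [5, 10, 15, 20, 25, 30]
height = [40, 54, 66, 70, 70, 70]

# Plot age on the x-axis against height on the y-axis, with a blue line
plt.plot(age, height, color='blue')

# Display the graph
plt.show()
```

The two lists must be the same length, since each age is paired with the height at the same position.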
Bar Chart
For this example, we will make a bar chart showing the ages of 4 people.
Syntax
You should understand the first few lines from the first example.
count_names = [i for i,_ in enumerate(name)] – the name list is a list of strings, and each bar needs a numeric position on the x-axis. This line converts each name into a number: its position in the list.
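A basic version of the bar chart might look like the sketch below. The four names and ages are placeholders invented for this example, not data from the original post.

```python
from matplotlib import pyplot as plt

# Illustrative data only: four made-up people and their ages
name = ['Ann', 'Bob', 'Cara', 'Dan']
age = [23, 35, 41, 29]

# Convert each name into its position in the list: [0, 1, 2, 3]
count_names = [i for i, _ in enumerate(name)]

# Draw one bar per person, using the positions on the x-axis
plt.bar(count_names, age)
plt.show()
```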
Wait, what does for i,_ mean? Let’s look at it more closely.
A plain Python list holds only its elements, but enumerate() walks through the list and hands back a tuple (index number, element) for each item. So if instead of i,_ we asked for both elements in the tuple, (i,j), we would get each index paired with its name.
So by iterating with for i,_ we keep only the first element of each tuple (the index). The underscore is a conventional name for a value we do not intend to use.
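To see the tuples that enumerate() produces, compare the two comprehensions in this short sketch (the names are made up for illustration):

```python
name = ['Ann', 'Bob', 'Cara', 'Dan']  # made-up names

# Keep both parts of each tuple: (index, element)
pairs = [(i, j) for i, j in enumerate(name)]
print(pairs)    # [(0, 'Ann'), (1, 'Bob'), (2, 'Cara'), (3, 'Dan')]

# Keep only the index and ignore the element
indexes = [i for i, _ in enumerate(name)]
print(indexes)  # [0, 1, 2, 3]
```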
Note: we are using a list comprehension. If you are unfamiliar with list comprehensions, check out my earlier post: Python: List Comprehension
Let’s clean up our bar chart a little now.
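plt.ylabel('Age') – label the y-axis Age
plt.title('Age of People') – give the graph a title
plt.xticks([i+0.5 for i,_ in enumerate(name)], name) – this function uses a list comprehension to choose each label’s position on the x-axis, and name supplies the person’s name for each label. The +0.5 offset centers the labels under bars drawn from their left edge, which was the default in older versions of Matplotlib. In Matplotlib 2.0 and later, bars are centered on their x position by default, so the offset can be dropped.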
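Putting the clean-up steps together gives something like the sketch below. The names and ages are invented placeholders. It also assumes Matplotlib 2.0 or later, where bars are centered on their x positions, so the tick labels are placed at the positions themselves rather than at i+0.5.

```python
from matplotlib import pyplot as plt

# Illustrative data only: four made-up people and their ages
name = ['Ann', 'Bob', 'Cara', 'Dan']
age = [23, 35, 41, 29]
count_names = [i for i, _ in enumerate(name)]

plt.bar(count_names, age)
plt.ylabel('Age')            # label the y-axis
plt.title('Age of People')   # give the graph a title

# Assumes Matplotlib 2.0+: bars are centered on their x positions,
# so each name is placed at the same position as its bar.
plt.xticks(count_names, name)
plt.show()
```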
This lesson is a continuation of an earlier lesson. If you are already familiar with Tableau, feel free to continue on. Otherwise, check out my first Tableau lesson: Line and Bar Charts
If you want to add 3 or more measures to a line chart, you need to take a different approach than in regular charts.
Import the Data
Select Excel from the Connect menu and select the school lunch Excel file you have downloaded.
If you are continuing on from the Line and Bar Charts lesson, you can skip this step; your data is already loaded.
Create a New Worksheet
Click the New Worksheet icon at the bottom of your screen.
Drag Year to Columns and Measure Values to Rows.
Get rid of Sum(Number of Records) by dragging it back into Measures.
While holding down Ctrl, drag Measure Names from the Dimensions slot to Color. Each measure now gets its own colored line.
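This is not part of the Tableau workflow, but readers who also followed the Matplotlib lesson may find it useful to see the same idea in Python: several measures on one line chart, one color per measure. The sketch below uses placeholder numbers, not the real school lunch data, and the variable names are invented for illustration.

```python
from matplotlib import pyplot as plt

# Placeholder numbers only, NOT the real school lunch data
year = [2011, 2012, 2013, 2014, 2015]
measures = {
    'Free': [10, 12, 13, 15, 16],
    'Reduced Price': [4, 4, 3, 3, 3],
    'Full Price': [14, 13, 11, 10, 9],
}

fig, ax = plt.subplots()

# One line per measure; Matplotlib gives each line a different color,
# much like dropping Measure Names on Color in Tableau
for measure_name, values in measures.items():
    ax.plot(year, values, label=measure_name)

ax.set_xticks(year)              # show whole years on the x-axis
ax.set_xlabel('Year')
ax.set_ylabel('Value')
ax.legend(title='Measure Names')
plt.show()
```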
This lesson is a continuation of an earlier lesson. If you are already familiar with Tableau, feel free to continue on. Otherwise, check out my first Tableau lesson: Line and Bar Charts
Import the Data
Select Excel from the Connect menu and select the school lunch Excel file you have downloaded.
If you are continuing on from the Line and Bar Charts lesson, you can skip this step; your data is already loaded.
Create a New Worksheet
Click the New Worksheet icon at the bottom of your screen.
Drag Year from Dimensions and Free from Measures into Columns and Rows respectively. You should now have a line chart. (If not, refer to Lesson 1 for troubleshooting tips.)
Now, drag Full Price into Rows. You should now see two graphs: Free on top and Full Price on the bottom.
Now you could just stop there. You do have both Measures graphed. But this really isn’t the best way to analyze this data, because it is hard to compare two lines that sit in separate charts.
Dual Axis
For better analysis, we are going to create a Dual Axis Chart.
Right-click on the Y-axis of the bottom chart and select Dual axis.
Now you have both measures on one graph.
If you look closely at the left and right Y-axes, you will notice they do not use the same scale. This could skew how someone would interpret this data.
To fix this, right-click on the right Y-axis and select Synchronize axis.
Finally, since both Y-axes now match up, you don’t need them both. Right-click on the right axis again and uncheck Show header.
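For comparison only, here is how the same three steps (dual axis, synchronize, hide the extra header) could be sketched in Matplotlib. This is not what Tableau does internally. The numbers are placeholders rather than the real school lunch data, and the variable names are invented for this example.

```python
from matplotlib import pyplot as plt

# Placeholder numbers only, NOT the real school lunch data
year = [2011, 2012, 2013, 2014, 2015]
free = [10, 12, 13, 15, 16]
full_price = [14, 13, 11, 10, 9]

fig, left = plt.subplots()
right = left.twinx()  # dual axis: a second y-axis sharing the same x-axis

left.plot(year, free, color='blue', label='Free')
right.plot(year, full_price, color='orange', label='Full Price')

# Synchronize: give both y-axes the same range so the lines are comparable
low = min(min(free), min(full_price))
high = max(max(free), max(full_price))
left.set_ylim(low, high)
right.set_ylim(low, high)

# Both axes now match, so hide the ticks and labels of the right one
right.get_yaxis().set_visible(False)

left.set_xticks(year)
left.set_xlabel('Year')
fig.legend()
plt.show()
```

Without the two set_ylim calls, each axis would pick its own range, which is the mismatch the Synchronize axis option corrects.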
When you start up Tableau, the first thing you need to do is select a data source.
In this case, select Excel and choose the file you downloaded above (schoolLunch.xls).
Once loaded, the Data Source Page will open up.
a. Data Source File
b. Shows Sheets in the file (there is only one sheet in this particular file)
c. Shows data.
The Data
In this example, we are looking at the number of kids receiving Free, Reduced Price, and Full Price lunches at American public schools from 1971 to 2015.
Line Chart
Start by making a new sheet
A Quick Note About Dimensions and Measures:
Notice on the new sheet that the columns from your imported Excel sheet have been placed into two boxes on the left of the screen: Dimensions and Measures. Think of Dimensions as factors or labels, while Measures are columns you would perform calculations against (adding, averaging, etc.).
Drag Year from Dimensions and Free from Measures into Columns and Rows respectively.
The line graph should appear automatically. If not, follow the next steps:
First, make sure your Row variable says SUM(Free), which means we are summing up all numbers in the Free column. If it doesn’t, hover over the measure until a small downward arrow appears, click it, then go to Measure and select Sum.
If you don’t have a line chart, go to Marks and select Line from the drop-down menu.
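If it helps to think of SUM(Free) in code terms, the sketch below shows the same idea in plain Python: Year acts as the dimension (the label we group by) and Free as the measure (the numbers we add up). The rows are placeholders invented for this example, not values from the lunch file.

```python
from collections import defaultdict

# Placeholder rows only: (Year, Free) pairs, not the real data
rows = [(2013, 5), (2013, 7), (2014, 6), (2014, 9), (2015, 10)]

# Group by the dimension (Year) and sum the measure (Free)
free_by_year = defaultdict(int)
for year, free in rows:
    free_by_year[year] += free

print(dict(free_by_year))  # {2013: 12, 2014: 15, 2015: 10}
```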
Bar Chart
Now, go to Marks again and select Bar. Your chart will change over to a bar chart. Try a few of the other options like Area and Shape.
This visualization (made using Tableau) shows the CPI (Consumer Price Index) for common food items. While 2014 was a bad year for staples such as dairy and meat, 2015 showed a nice recovery. The main exception was eggs: look at the massive increase in egg prices caused by the bird flu epidemic of 2015. Note: the purple dot represents the 20-Year Historical Average.
PMs (preventive maintenance inspections) can be a real resource drain, especially in the heavy months. One thing has confounded me for as long as I have worked as an HTM professional (or Biomed; the name keeps changing): I have never understood why you would have one month with over 1000 scheduled inspections and only 200 scheduled inspections in another month.
The problem is that balancing the workload is a tedious job of sifting through pages of work order lists and moving schedules around. However, if you don’t feel like going cross-eyed staring at all of that small text, give data visualization a try.
The graph above (produced using Tableau) shows the scheduled work order load for an imaginary hospital. Note that each color block represents a separate department.
Mousing over each colored block provides a flyout showing the Department and record count. You can try the interactive visualization out for yourself: Click here to interact
Looking at this visualization, it is easy to see where departments can be quickly moved around to balance the load. This visualization, if connected to your database, can also become part of a dashboard, allowing you to keep an eye on the workload and prevent it from becoming unbalanced as departments open and close and equipment comes and goes.
Now the final data model I created takes more into account than simply the number of records. Using historical data, I add a timing factor to each inspection (this takes into account that the inspection of a diagnostic ultrasound machine will take more time than the inspection of an IV pump). I also add in a personnel factor (if you only have one technician qualified to work on imaging equipment vs 8 technicians for standard patient care equipment, the imaging technician’s time should be weighted to represent that).
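As a rough illustration of that kind of weighting (not the actual model, whose details are not given here), one simple assumption is to multiply each inspection count by its average duration and divide by the number of technicians qualified to do the work. Every figure and name in this sketch is hypothetical.

```python
# Hypothetical figures for one month; none of these come from real data
inspections = [
    # (equipment type, scheduled count, avg hours each, qualified technicians)
    ('IV pump', 400, 0.5, 8),
    ('Diagnostic ultrasound', 20, 3.0, 1),
]


def hours_per_technician(count, hours_each, technicians):
    """Total inspection hours spread across the qualified technicians."""
    return count * hours_each / technicians


for equipment, count, hours_each, technicians in inspections:
    load = hours_per_technician(count, hours_each, technicians)
    print(f'{equipment}: {load:.1f} hours per technician')
# IV pump: 25.0 hours per technician
# Diagnostic ultrasound: 60.0 hours per technician
```

Under these made-up numbers, 20 ultrasound inspections weigh more heavily on the single imaging technician than 400 IV pump inspections do on each of the eight general technicians, which a raw record count would hide.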
This visualization depicts Free, Reduced, and Full priced lunches served by the National School Lunch Program (NSLP) in United States Public and Private Non-Profit Schools.
The sharp rise in Free lunches, in conjunction with the sharp decline in Full priced lunches since 2008, hints that the effects of the 2008 economic recession are still being felt.