Analytics: An Introduction

So exactly what is Analytics? Everyone is talking about it. Colleges and Universities are scrambling to develop programs in it. But what exactly does it mean?

Definition

The the definition I like the best is this:

Analytics: Discovering and communicating meaningful patterns in data.

Analytics are traditionally broken down into the following catagories:

  • Descriptive Analytics: Most people are familiar with this form. So familiar in fact, they probably do not refer to it as analytics. This is looking at past and current data to describe what is going on. Most standard business reporting falls into this category.
  • Predictive Analytics: This is using available data to help predict future events or to provide best guess answers to fill in gaps in data. Using predictive analytics, you can predict how much a house will sell for or what items you should stock near the registers based on current conditions (example: Walmart discovered Pop-Tarts tend to sell well during hurricanes).
  • Prescriptive Analytics: This is the cutting edge of analytics. Prescriptive analytics not only makes predictions about future events, but it utilizes decision making algorithms to determine how to respond to the events. Prescriptive analytics engines could, using the Pop Tarts example above, automatically reroute the shipment of Pop Tarts to stores in hurricane affected areas without any human intervention.

It should be noted that most companies today are still spending most of their time in the descriptive analytics world. That is not necessarily a bad thing. Being able to get the right information in front of a decision maker, in a format that is easily digestible, is a talent all within itself. 

Components

Analytics is not a 1 step process. It is actually a series of steps, often performed in an iterative manner. And just as each business problem is unique, so are the steps to the analytics process used to find the solution.

While the statement above is 100% percent true, I find it very unsatisfying. This is the kind of information I would find when I first developed an interest in analytics. So while I cannot give you a one size fits all answer, I feel that I at least owe you a better explanation than that.

For me, perhaps the best way to understand analytics, is to look at some of the more common tasks performed.

  • Data Management: While designing, building, and maintaining databases and data warehouses may not typically fall under the responsibility of an analytics professional, having a general understanding of how they work is none the less important. Databases and data warehouses are where most businesses keep their data. If you want to be taken seriously as a data professional, you need to have a fundamental understanding of how data is stored and how to query the stored data. (Example Technologies: Hadoop, SQL Server, Oracle)
  • Data Modeling: Data modeling is organizing data into logical structures so that is can be understood and manipulated by a machine. As a simple exercise, make a quick spreadsheet for sales amounts for  5 salespeople across 4 quarters. When you are done, look at the table you created. You have just modeled data. (Example Technologies: Excel, SQL Server, Oracle, Visio)
  • Data Cleaning: While this may not be the sexiest part of the job, it is the part you will spend the most time on. 60-80% of your time will be spent in this phase of the job. And while there are some third party software applications out there that can help ease the pain (Alteryx comes immediately to mind), they are expensive and not every boss will be willing to spring for it. My suggestion is to put sometime aside to become very familiar with Excel. I do 90% of my data cleaning work in Excel and MS SQL Server. (Example Technologies: Excel, SQL Server, Oracle, Alteryx)
  • Data Mining (Machine Learning): Now this is the cool stuff everyone is talking about. Data mining or machine learning, whichever you prefer to call it,  is the Artificial Intelligence (AI) portion of analytics. Data mining is difficult to provide a simple explanation for, but I will try anyway: In traditional programming, the programmer provides explicit instructions to the computer as to how to perform a task. With data mining, data sets are fed through an algorithm. The computer then determines the best way to solve the problem based on the data provided. 

 To help make this a little clearer, how about you try your hand at being the machine.

spam

Look at the pattern above. Without me providing you with any more information,                  you should be able to determine, that two blue squares in a row = SPAM. This is, at                 the most fundamental level, how data mining works. It pours over data and finds                   patterns. Knowing this pattern, if you were now shown only the first three columns               you would be able to predict whether the last column would be red or green.(Example Technologies: R, Python, SAS, XLMiner)

  • Data Visualization: DataViz is fun. It is the real show stopper in the data world. Visualizations make the patterns pop off the page. There are a lot of great programs out there for data visualization. (Again, do not discount Excel — it has some great DataViz features). Now DataViz should rightfully be broken into two separate categories. The first is Exploratory. This is visualizations used by the data professional to help analyze and understand the data. The second is Production. This the finished product that ends up on reports and dashboards for the business users to see. (Example Technologies: Excel, Tableau, R, SAS)
  • Optimization and Simulation: How often is there truly only one solution for a problem? Reality is sometimes the hardest part isn’t coming up with a solution to a problem, but deciding which solution to use. Building optimization models and running simulations helps to provide decision makers with quantitative data as to which solutions will be most effective. (Example Technologies: CPLEX, SAS, Solver)

So I have to learn all of this…

That depends – If your goal to is be a Data Scientist, then yes, you need to learn everything mentioned above and then some (I hope you love Statistics). However, if you are a business user just trying to add analytic skill to your toolbox, my recommendation is to focus your efforts on becoming efficient in data cleaning. In the real world, when trying to put a report together, you often are given data from multiple sources and you have to cobble it together to make sense of it. Learning some data cleaning skills can save you hours on tasks like that.

Once you have workable data, take some time to learn some visualization techniques. An eye popping chart will always garner more attention than pages of numeric columns. Also, take a little time to learn some data mining skills. No one is expecting you to write the complex algorithms the PhD’s at Stanford and MIT are kicking out, but there actually are some pretty user friendly data mining programs out there that help you cull some real insight out of your data.

However you decide to go about it, Analytics is a fascinating, fast growing field. It truly is a 21st century skill. Here at Analytics4All.org, the philosophy is that everyone should develop some analytical talent. Computers were once the sole territory of the science geeks of the world and now they are in everyone’s pockets and purses. Analytics and data driven decision making should also be a accessible to all.

Leave a Reply