SSRS: Grouping

In our last lesson we made the basic report seen below:

2018-04-06_10-25-55

**Note, you can follow the link below to the first lesson:

SSRS: Introduction: 1st Report

Grouping

Our data is currently in table form, but is otherwise still nothing more than a raw data dump. Let’s make our report a little nicer with some grouping. Right click on the data row (not header) in your table. Mouse over to Add Group and select Parent Group…

2018-04-06_10-34-35.png

In the new window, select [Name] from the drop down.

2018-04-06_10-35-12

Check the Add group header and Add group footer boxes, then click OK.

2018-04-06_10-35-27.png

Now a group has been added to your report.

2018-04-06_10-36-00

If you click on Preview, you will now see the table is grouped by Name.

2018-04-06_10-36-27.png

But you will notice we now have 2 columns showing the Name, one – our new grouping column and the other – the original column. This is redundant. To get rid of it, go back to Design, right click on the second name column and select Column Visibility…

2018-04-06_10-36-52

When the new window opens up, click Hide.

2018-04-06_10-37-11.png

Now when you look at the report, you will see the second Name column is hidden.

2018-04-06_10-38-16.png

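Group Total:

Next, let’s set up a total of the Hours for each person. Note that Add Total places a Sum in the group footer, giving one total per group; it is not a cumulative running total. To do so, right click on the Hours text box and select Add Total.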

2018-04-06_10-39-04

Now when we go back to preview, we will see a total in the group footer for each person.

2018-04-06_10-39-36

Average:

Now what if we wanted an average instead? Right click on the textbox that says [Sum(Hours)] and select Expression

2018-04-06_10-40-15

You can just type =Avg(Fields!Hours.Value) in the expression builder box, but if you don’t know the code, you can use the information in the boxes below. As you can see in the example below, if you go to Common Functions > Aggregate you will see the code for lots of functions like average (Avg), count (Count), and standard deviation (StDev).

2018-04-06_10-41-25

Now when you go to preview, you will see an average.

But now we have a new problem. If you are trying to average something like work hours, odds are you will not need to go out to 10 decimal places. So a number like 8.272727272727 is pretty much ridiculous for a report like this.

2018-04-06_10-42-19.png

To fix this, right click on the text box, select Text Box Properties…, then go to Number > Number and set the Decimal Places to 2.

2018-04-06_10-43-26.png

So if you look at it again, you will see you only have 2 decimal places now.

2018-04-06_10-46-41
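If it helps to see the arithmetic behind the group footer outside of SSRS, here is a minimal Python sketch of what the report is doing: group the rows by Name, take a Sum and an Avg per group, and format the result to 2 decimal places for display. The sample rows and the function names are made up for illustration; they are not the data from the lesson.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the report dataset: (Name, Hours).
ROWS = [
    ("Ann", 8.0), ("Ann", 7.5), ("Ann", 9.0),
    ("Bob", 6.0), ("Bob", 10.5),
]

def group_footers(rows):
    """Return {name: (total_hours, average_hours)} for each Name group."""
    hours_by_name = defaultdict(list)
    for name, hours in rows:
        hours_by_name[name].append(hours)
    return {
        name: (sum(hours), sum(hours) / len(hours))
        for name, hours in hours_by_name.items()
    }

def format_hours(value, decimal_places=2):
    """Format for display only, like the Number > Decimal Places setting."""
    return f"{value:.{decimal_places}f}"

for name, (total, average) in group_footers(ROWS).items():
    print(name, "Sum:", format_hours(total), "Avg:", format_hours(average))
```

Like the Number setting in SSRS, format_hours only changes how the value is shown; the underlying average keeps its full precision.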

 

 


Blockchain: P2p distributed networks

Another concept you need to be familiar with to understand Blockchain is the P2P distributed network. This is the physical architecture that allows Blockchain to work and provides a blockchain with redundancy.

The P2P in P2P distributed network stands for peer to peer, indicating a network comprised of peers. What do I mean by that? The majority of computer networks in place right now are what is known as Server/Client networks.

In the picture below, the center square represents a server, with the boxes around it representing nodes (or in your case, the computer/tablet/phone you are reading this on). When you want to view a web page, you send a request from your node to a server. The server will then respond with the requested information.

p2p1.png

While this works well, it does have some drawbacks. First off, since the server is the central point of communication and the holder of all the information (webpages, databases, etc.), if the server goes down, the network is essentially dead. This is the whole idea behind one of the more successful methods of cyber attack, the Denial of Service, in which a server is targeted with more traffic than it can handle, shutting it down. You will often see it called a Distributed Denial of Service, or DDoS, because in order to hit the server with enough traffic to break it, hackers use multiple computers synced to deliver requests to the server all at the same time, overwhelming it. In other words, the attack is “distributed” across multiple computers.

p2p2

Blockchain does not use a server/client approach. Instead it uses a P2P, or peer to peer, network to function. In a peer to peer network, the nodes (laptops, tablets, etc.) all talk directly to each other. Instead of a server holding all the information, the data that makes up the blockchain is distributed across all the different nodes. So the more nodes that are part of the blockchain, the more copies of it that exist.

p2p3.png

This works great for redundancy, as even if you took out a couple of nodes in the network, it would still be able to function as normal. And as we will see in a future lesson, even if you were able to hack in and corrupt the blockchain in one of the nodes, the fact that copies of it exist on all the other nodes protects it from corruption.
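To make the redundancy argument concrete, here is a small Python sketch comparing the two layouts. It is a toy model only: the Node class, the read_ledger and is_available helpers, and the sample transactions are all invented for illustration, and real networks involve a great deal more than checking which machines are online.

```python
class Node:
    """A machine that keeps its own full copy of the ledger."""

    def __init__(self, name, ledger):
        self.name = name
        self.ledger = list(ledger)  # each node gets an independent copy
        self.online = True


def read_ledger(nodes):
    """Return the ledger from the first node that is still online."""
    for node in nodes:
        if node.online:
            return node.ledger
    raise RuntimeError("no nodes online: the network is down")


def is_available(nodes):
    """True if at least one copy of the ledger can still be read."""
    try:
        read_ledger(nodes)
        return True
    except RuntimeError:
        return False


ledger = ["tx1", "tx2", "tx3"]

# Server/client: one machine holds the only copy.
server_network = [Node("server", ledger)]
# Peer to peer: every node holds a copy.
p2p_network = [Node(f"peer{i}", ledger) for i in range(5)]

# Knock out the server, and a couple of the peers.
server_network[0].online = False
p2p_network[0].online = False
p2p_network[1].online = False

print("server/client still available:", is_available(server_network))
print("peer to peer still available:", is_available(p2p_network))
```

Losing the one server takes the whole server/client network down, while the peer to peer network keeps serving the same ledger from the remaining nodes.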

This architecture is also at the heart of the philosophy around crypto-currencies like BitCoin. Unlike traditional banking systems that have centralized management, BitCoin is programmed with a deflationary policy that no one person (or group of people) can control. This is an interesting economic experiment unfurling before all of us. And I for one am curious to see how it plays out. While organizations like the Fed (in the United States) have done a relatively good job of keeping the US dollar strong, poor centralized economic management has spelled disaster in countries like Venezuela and Zimbabwe.

I’ll discuss more on the economic theory behind BitCoin in later lessons. It is enough for now for you to know that the P2P decentralized nature of the network is all part of the design in ensuring no one person can make such drastic changes.

If you want to learn more about networks and how they interact, here are some further resources:

https://www.khanacademy.org/computing/computer-science/internet-intro

https://www.techsoup.org/support/articles-and-how-tos/networking-101-concepts-and-definitions

http://www.tomshardware.com/reviews/local-area-network-wi-fi-wireless,3020.html

SSRS: Introduction: 1st Report

SSRS stands for SQL Server Reporting Services. This is Microsoft’s BI reporting tool integrated into their SQL Server platform. SSRS allows you to create, deploy, and manage reports from a server platform.

SSRS comes as part of the SQL Server suite. It is not available as part of Express, but if you have the Developer edition, you will get SSRS (as well as SSIS and SSAS). You may need to download and install it separately. You’ll find it under the title SQL Server Data Tools (SSDT).

SSDT download link

In this tutorial, I will walk you through creating your first SSRS report. In this example, I created two SQL Server tables from the following Excel files:

JobDataSet

JobDesc

If you aren’t sure how to upload an Excel file to SQL Server, you can go to my tutorial on how to do it. The SQL Server upload tutorial was actually created to support the SSRS lessons.

SQL Server: Importing Excel File to SQL Server

To create a new SSRS project, open Visual Studio (the platform SSDT runs on) and go to File->New->Project

2018-04-06_8-56-18

Select Reporting Services -> Report Server Project

Name your project. I typically leave Create directory for solution checked.

2018-04-06_8-58-27.png

Your new “solution” will open up. I still haven’t really figured out why MS changes the name from Project when creating and opening it to Solution once you are working on it. I am sure someone, somewhere had a reason for it. That person is probably retired now and won’t return any email requests as to why he decided on the wording choice.  We’ll just have to chalk it up to another one of life’s mysteries. Like why is Regis Philbin famous?

But I digress…

Now that our “solution” is open, we will see 3 sub folders: Shared Data Sources, Shared Datasets and Reports.

2018-04-06_9-01-16

To create a new report, right click on Reports -> Add -> New Item.  Don’t click Add New Report unless you want to meet the world’s most unhelpful Wizard. I call him Gandalf the Drunk…

2018-04-06_9-01-47

After clicking New Item, click Report and name your report.

2018-04-06_9-02-32

Now your new report will open up.

2018-04-06_9-33-25.png

Data Source

So, before we can actually report on anything, we are going to need data. And in order to get data, we are going to need a Data Source. Now you will notice you have Shared Data Sources on the right and Data Sources on the left. Shared Data Sources, once established, can be used in all reports you create. If you create an “embedded” data source in the report, you will only be able to use it in that report.

For this example, we will make a shared data source.

Go to the right under Solution Explorer and right click Shared Data Sources. This time you can click Add New Data Source.

2018-04-06_9-33-48

This wizard is just Gandalf the Tipsy. While I harbor a general dislike for most wizards, this one isn’t completely useless at least.

First, name your data source something you will remember later.

Select Type ->Microsoft SQL Server

Click Edit

2018-04-06_9-37-25

Copy and paste your server name in the 2nd box. In this example, my SQL Server is locally installed on my computer, so I just used localhost as my server name. Next select the Database you want to work with from the drop down. I created a database called SSRSTraining for this example.

2018-04-06_9-40-23.png

Hit Test Connection. You should get a success message.

2018-04-06_9-40-42

Click OK. You’ll now see your Data Source in the Solution Explorer.

2018-04-06_9-41-33.png

Now go to the left and right click on Data Source for your report. Select Add New Data Source

2018-04-06_9-43-10

Name your data source and click on the Use shared data source reference radio button.

Pick your data source from the drop down. There should only be one to choose from.

2018-04-06_9-43-36

Datasets:

Now click okay, go back to the left and right click on Dataset.

2018-04-06_9-44-02

Select New Data Set

Name the Data Set

Select Use a dataset embedded in my report. This is generally how I do things: Data Sources are usually reusable, while Datasets tend to be designed for specific reports, so I leave them embedded.

Select your Data source from the drop down

For Query type we are using Text in this example

I am using a simple select all statement from the dbo.JobDataSet table I created

2018-04-06_9-47-16.png

If you click on fields in the upper right, you’ll now see the columns that will be feeding in from the query. You can rename the columns if you wish

2018-04-06_9-48-00.png

For now, let’s just click OK. Now you will see your data set expanded on the left side of your screen.

2018-04-06_9-48-46

Now to the far left, you should see the word Toolbox, click on that. This is a list of the tools we can work with within SSRS. Let’s start with a table

2018-04-06_9-49-09

Click on the table and drag it into the design window in the middle of the screen

2018-04-06_9-49-27.png

Now you can simply drag and drop columns from your dataset into your new table.

2018-04-06_9-49-53.png

By default, a table comes with 3 columns. You can add columns to this table by dragging a field over to the end of the table (note you will see a blue bar indicator letting you know your mouse is in the right spot. The column will populate to the right of the blue bar).

2018-04-06_9-50-24

This will add a new column to the end of your table. You can also use this method to insert a column in between existing table columns

2018-04-06_9-50-39

Now click on the Preview button above your table

2018-04-06_10-23-01

When you do, you will get to see your table complete with data. Notice how the job column is too small for the job description.

2018-04-06_9-51-14

Formatting

To fix the job column, let’s go back to our Design screen. Do this by clicking on the Design tab in the upper left.

2018-04-06_10-23-01

Now hover your mouse over the table until you get the double arrow icon seen below

2018-04-06_10-23-28

Once you have that, simply click and drag the column over to make it wider

2018-04-06_10-24-09

Since we are in the design window anyway, let us do a little more formatting. Click on the gray box to the left of the header row to highlight the entire row. Now we can do things like bold the font or change the background color.

2018-04-06_10-24-49.png

Go back to the preview window to check out your results.

2018-04-06_10-25-55

There you have your very first SSRS Report from top to bottom.

SQL Server: Importing Excel File to SQL Server

Working with data inside a database has many advantages over working with data in an Excel spreadsheet. Luckily SQL Server makes it relatively easy to import data from an Excel file into the database.

We will be using the Excel files below:

JobDataSet

JobDesc

Let’s start by opening SSMS (SQL Server Management Studio)

Next, let’s create a database to hold these files. You don’t need to create a new database to import data, but I am building this tutorial as part of a series on SSRS, so I am building a new database for that purpose.

To create a new Database, right click on Databases on the upper left and click New Database…

2018-04-09_8-55-16.png

Now name your new database, we will just accept the defaults

2018-04-09_8-59-56

Now go to your newly created database, right click, and go to Tasks

2018-04-09_9-00-48

From the Tasks sub-menu, select Import Data

2018-04-09_8-57-20.png

The Import Wizard will open. Simply click Next

2018-04-09_9-01-12.png

Next, select Excel from the drop down

2018-04-09_9-01-41.png

Next, click Browse

2018-04-09_9-02-13

Select your file

2018-04-09_9-05-24.png

Make sure First row has column names is checked and click Next

2018-04-09_9-07-04.png

On the next screen select SQL Server Native Client 11.0 (If you don’t have 11.0 – 10.0 should work)

2018-04-09_9-11-31.png

Make sure the database you want is selected and click Next

2018-04-09_9-11-52.png

In this example, we are going to use Copy data from one or more tables or views

2018-04-09_9-12-29.png

Make sure to name the table you want to create in SQL Server (red arrow)

2018-04-09_9-13-23

If you click Preview you can get a look at what the new table will be loaded with

2018-04-09_9-15-34.png

Click Okay on the preview window and click Next on the Import Wizard

Leave the default Run Immediately checked and click Next

2018-04-09_9-16-07.png

Review info on the next window and click Finish

2018-04-09_9-16-31

The package will run

Note the blue lettering will let you see how many rows transferred from the Excel file to SQL Server

2018-04-09_9-16-53.png

If you check your database, you will see your tables. (I loaded both spreadsheets into the database for the upcoming SSRS tutorial)

2018-04-09_9-19-58.png

Finally, run a select * on your new table to see the data you transferred into SQL Server

2018-04-09_9-20-35.png

Blockchain: Immutable Ledger

The next concept we need to cover to properly understand blockchain is the concept of an Immutable Ledger.  To translate this term into something that looks more like English, an Immutable Ledger simply means a record that cannot be changed.

The idea behind all of this is data security and proof that the data has not been altered. Why are we so concerned here? In a blockchain application like BitCoin, we are tracking transactions of money. Imagine if you sent me an electronic funds transfer for $100. How would you feel if I hacked into your bank and changed that $100 to $100,000? (For the record, if you tried that with my account, the computer would just laugh at you. The only time you’ll see a number that big associated with me is when you are looking at my student loans LOL).

Anyway, back to the point, you want to make sure that when you set up a transfer for $100, no one can alter it. With blockchain, that is true even if you want it to be altered (you made a mistake). If you want to fix an error, you will have to add another transaction to the blockchain to correct the issue. This is actually good accounting practice, as once an entry is made into a ledger, it should never be removed or altered.

Think of this like purchasing a car. Say you go to your neighbor and buy his used car for $2000. You give him the money, and he signs over the title to you. The title is proof of ownership. To ensure that your neighbor cannot just take the car back, you take the title down to the Department of Motor Vehicles and have the title registered in your name. Now you have a record of your transaction should the ownership of the vehicle ever come into question.
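The rule that a mistake is fixed by adding a new entry, never by editing an old one, can be sketched in a few lines of Python. This is only an illustration of the accounting idea; the Ledger class, its method names, and the sample transfers are invented here and are not how BitCoin stores transactions.

```python
class Ledger:
    """Append-only ledger: entries can be added but never edited or removed."""

    def __init__(self):
        self._entries = []

    def add(self, sender, receiver, amount, note=""):
        """Record a transfer by appending it to the end of the ledger."""
        self._entries.append((sender, receiver, amount, note))

    def entries(self):
        """Return a read-only snapshot of every entry ever recorded."""
        return tuple(self._entries)

    def balance(self, account):
        """Net amount received minus amount sent for one account."""
        total = 0
        for sender, receiver, amount, _note in self._entries:
            if receiver == account:
                total += amount
            if sender == account:
                total -= amount
        return total


ledger = Ledger()
# The mistake: 1000 was sent when only 100 was intended.
ledger.add("you", "me", 1000, "transfer (should have been 100)")
# The fix is a second entry; the first one stays on the record.
ledger.add("me", "you", 900, "correction of the previous entry")

print(ledger.balance("me"))   # 100
print(len(ledger.entries()))  # 2
```

Both the error and its correction remain visible, which is exactly what makes the record trustworthy as an audit trail.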

So how does blockchain ensure immutability of the ledger? It all resides in the concept of the hash. If a hacker tries to alter anything in the block below, its hash will change. Now that hash will no longer match the previous hash stored in the second block. So the hacker would have to change the next block, and the block after that, and so on.

2018-04-04_13-21-53
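Here is a rough Python sketch of that knock-on effect. It assumes a simplified block made of just some data, the previous hash, and its own stored hash (all field and function names are mine), and it uses SHA-256 from Python’s hashlib. The point is only to count how many blocks a hacker has to redo after touching one.

```python
import hashlib

GENESIS_PREVIOUS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(data, previous_hash):
    """Hash a block's contents together with the previous block's hash."""
    return hashlib.sha256(f"{previous_hash}|{data}".encode("utf-8")).hexdigest()


def build_chain(data_items):
    """Build a list of blocks, each storing the hash of the one before it."""
    chain = []
    previous_hash = GENESIS_PREVIOUS_HASH
    for data in data_items:
        current_hash = block_hash(data, previous_hash)
        chain.append({"data": data, "previous_hash": previous_hash, "hash": current_hash})
        previous_hash = current_hash
    return chain


def first_broken_link(chain):
    """Return the index of the first block that no longer checks out, or None."""
    previous_hash = GENESIS_PREVIOUS_HASH
    for index, block in enumerate(chain):
        recomputed = block_hash(block["data"], block["previous_hash"])
        if block["previous_hash"] != previous_hash or block["hash"] != recomputed:
            return index
        previous_hash = block["hash"]
    return None


def tamper_and_rehash(chain, index, new_data):
    """Alter one block, then redo it and every later block. Returns blocks rewritten."""
    chain[index]["data"] = new_data
    previous_hash = chain[index]["previous_hash"]
    rewritten = 0
    for block in chain[index:]:
        block["previous_hash"] = previous_hash
        block["hash"] = block_hash(block["data"], previous_hash)
        previous_hash = block["hash"]
        rewritten += 1
    return rewritten


chain = build_chain(["tx A", "tx B", "tx C", "tx D", "tx E"])

chain[1]["data"] = "tx B (altered)"        # change the data only
print(first_broken_link(chain))            # 1: the stored hash no longer matches

print(tamper_and_rehash(chain, 1, "tx B (altered)"))  # 4 blocks had to be redone
print(first_broken_link(chain))            # None: this one copy is consistent again
```

Even after all that work, the hacker has only fixed up a single copy of the chain.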

And even if they were able to pull that off, remember that the blockchain resides on multiple computers. The hacker would need to make all of these changes simultaneously on most of them. When you consider the thousands of nodes that make up a blockchain environment like BitCoin, you will see that would be practically impossible.

2018-04-04_14-50-00

In the next lesson, we will be looking at peer to peer distributed networks.

Blockchain: Cryptographic Hash

To fully understand blockchain, it helps to have a good understanding of what is known as a cryptographic hash. It is this hash that is at the very core of how blockchain works.

If you read my introduction to blockchain lesson, you would see that the Hash is part of every block. It serves as a unique identifier for the block. That is the general idea for all Hashes, whether cryptographic or not.

2018-04-04_12-39-34

It would actually be more proper for me to refer to it as a Hash Function. It is a function where you pass in text, a document, a picture, anything digital, and the function will return a fixed-length identifying number that is, for all practical purposes, unique to that input.

Hash functions were used even before cryptography was an issue. You may have heard the term Hash Tables. This is where computer programmers would store a table full of hashes indicating text or documents, instead of having to store large documents in memory.

The Hash tables worked kind of like this:

You would pass some text to the Hash Function and it would transform it to a hash.

Ben -> 0123
Data Science -> 4871
Analytics is a great field of study -> 2580

These hashes were then placed in a table and when the program needed to access the information, it used the Hash to look it up – kind of like an index in the back of a book.
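A toy version of that lookup might look like the Python sketch below. The toy_hash function is something I made up purely for illustration (it is not cryptographic and will not produce the example numbers above), and the table is just a dictionary keyed by hash.

```python
def toy_hash(text, buckets=10_000):
    """Toy, non-cryptographic hash: fold character codes into a number below `buckets`."""
    value = 0
    for character in text:
        value = (value * 31 + ord(character)) % buckets
    return value


table = {}


def store(text):
    """Place the text in the table under its hash and return that hash."""
    key = toy_hash(text)
    table.setdefault(key, []).append(text)  # a list, in case two texts share a hash
    return key


def lookup(key):
    """Use the hash like a book index to find what was stored under it."""
    return table.get(key, [])


for text in ("Ben", "Data Science", "Analytics is a great field of study"):
    key = store(text)
    print(f"{text} -> {key:04d}")

print(lookup(toy_hash("Ben")))
```

Note that two different texts can land on the same number in a small table like this, which is why each slot holds a list.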

Now the Hash Functions used by Blockchain are cryptographic, so they are a little different than just a simple hash table. In order for a hash to be cryptographic, it needs to follow some basic rules.

  1. It has to be 1 way. If you pass the text “Analytics” to a cryptographic hashing function it will return a hash (let’s say 1234). It will return the same hash every time you pass the text “Analytics” to it (again: 1234). So the hash function knows what hash to create for that text. But we want to ensure that if someone has the hash 1234, they cannot reverse it to obtain the text (they can’t reverse engineer it).

     Think of it like a fingerprint. If I have a person, I can always obtain a fingerprint. However, if all I have is a fingerprint, I cannot produce a person from it. A fingerprint on its own won’t let me derive information like the eye color or hair color of the individual who left the print behind.

  2. The hash function needs to be fast. You will understand better when we get to mining, but blockchain miners are passing millions of hashes a second. With a slow hash function, the entire concept would fail.
  3. The hash needs to ensure that similar items do not receive similar hashes. The example below shows what we do not want.
Text          Hash
Analytics     1234
Analytics4    1235

With the hashes 1234 and 1235 being so close, as well as the text being so close, it would be possible to reverse engineer a hash. For example, if you knew that Analytics4all was 1236, you might be able to back track the hash until you hit Analytics at 1234.

Instead we want something like this:

Text          Hash
Analytics     1234
Analytics4    8476

You see that similar texts do not get similar hashes. Let’s look at it through numbers

Number    Hash
20        8463
21        1258
22        6581
23        0874

SHA-256

Now different blockchain applications will use different cryptographic hash functions. For example, Ethereum uses Keccak-256. BitCoin, however, uses SHA-256, so that is what I will focus on.

SHA-256 was designed by the NSA and, as of this writing, it has not been cracked. The algorithm is published openly, so anyone can make use of it.

A SHA-256 HASH is 64 hexadecimal digits long (64 digits * 4 bits per digit = 256 bits).
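If you have Python handy, you can check that arithmetic yourself with the hashlib module from the standard library. This is just a quick sketch; the sha256_hex helper name is my own.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash of some text as uppercase hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()


digest = sha256_hex("Analytics")

print(len(digest))                        # 64 hexadecimal digits
print(len(digest) * 4)                    # 256 bits
print(sha256_hex("Analytics") == digest)  # True: same text in, same hash out
```

The last line also shows the first rule from earlier: the same text always produces the same hash.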

Here is a SHA-256 HASH

8BC775C7EFAACAD6AFF7CED25E0A793EF7DD2C5B0652EF5F85BA02FF57407A2B

If you want to try it out, go to this website

https://passwordsgenerator.net/sha256-hash-generator/

Below, I pass the text Analytics to the hash generator and I get a 64 digit hash returned

2018-04-06_14-09-56.png

Now when I just add the number 4 to the end of my text, the hash completely changes. The two hashes do not look anything alike. Again, this is designed to help reduce the chance of someone reverse engineering the hash.

2018-04-06_14-10-56.png

Just to drive the point home, here I removed the s from the end of the text. Note the new hash is again completely different.

2018-04-06_14-11-23.png

Go to the website. Try out some hashing for yourself.
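If you would rather experiment locally than on a website, the same comparison can be sketched in Python with the standard hashlib module. The helper names are mine, and I have deliberately not written out the hashes or counts here; run it and compare the output with the generator above.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash of some text as uppercase hexadecimal digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()


def differing_digits(hash_a, hash_b):
    """Count the positions at which two equal-length hex strings differ."""
    return sum(1 for a, b in zip(hash_a, hash_b) if a != b)


base = sha256_hex("Analytics")
print("Analytics ", base)

# Add a 4 to the end, then drop the trailing s, as in the screenshots above.
for text in ("Analytics4", "Analytic"):
    other = sha256_hex(text)
    print(f"{text:<10}", other, "->", differing_digits(base, other), "of 64 digits differ")
```

A one-character change in the text should leave almost none of the 64 digits in place.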

In the next lesson I will cover the concept of an immutable ledger. This will take us one step closer to understanding Blockchain.

Blockchain: An Introduction

Unless you’ve been living under a rock, you have undoubtedly heard the  term blockchain being batted around. Most likely you’ve heard of it in relation to cryptocurrency like BitCoin, but blockchain is quickly moving into many other areas. Much like the Big Data craze though, my personal experience has been the more you hear someone utter the term blockchain, the less that person actually knows about it. It is the Dunning-Kruger Effect in action.

In this lesson, I am going to introduce you to the concept of blockchain. I have boiled it down to its simplest concepts, and I will be speaking very broadly about the subject. In future lessons I will dive deeper into the more technical aspects of blockchain, providing much more in the way of specifics.

For those already familiar with blockchain, I am aware that I am glossing over some rather important concepts here, but my goal in this lesson is to provide a simple, easily understood tutorial. The goal of my website is to provide an accessible education into many of the complex concepts surrounding analytics and data science to everyone, regardless of past experience or education. I promise a deeper dive in the future, but for now, let us start simply.

What is a blockchain?

At the most basic level, a blockchain is a collection of data kept in a list.

2018-04-04_12-41-14

What makes these lists so interesting is that:

  1. They are connected using cryptography
  2. They are distributed amongst multiple computers, providing a redundant method of protection

To understand how they work, let us start by looking at a block

2018-04-04_12-39-34

Starting from the top

Block #: is just the number of the block in the chain, first block # is 1, second is 2, so on

Nonce: stands for “number used only once”. I am going to cover the Nonce in depth in the lesson on Mining, but for right now, just be aware that it is a number and every block needs a Nonce

Data: This is where the data is stored. In BitCoin this is often filled with transaction information

Previous Hash: The hash of the block before this one

Hash: I will cover hashes in depth in a future lesson too. For now, just know this is the cryptographic part of blockchain. A hash is a code number that identifies the block. An easy way to conceptualize it is to think of it like a VIN on a car. The VIN (or vehicle identification number) can be used to tell you the make, model, color, and many other characteristics of a car. The hash (in blockchain) will tell you everything found in the block (I know I am way oversimplifying here, but hey we have to start somewhere, and I promise a future lesson on hashes)

Now let’s add a second block to our chain

Notice the previous hash in block 2 is the same as the hash in block 1. This, you will soon see, is part of what makes blockchain so secure.

2018-04-04_12-40-26_1.png

In the picture below, you can see if someone tried to go into Block 1 and make a change, the Hash for block 1 would change. Any change to the first four fields in a block will cause the Hash to change, since the block is no longer the same. The Hash is like a VIN or a fingerprint: it can only represent a single individual block. And once you change any aspect of a block, it is no longer the same individual block.

2018-04-04_13-21-53.png

So as you can see, if the Hash in the first block changes, it will no longer match the Previous Hash in the second block. When this happens the blockchain is broken. So if a hacker tried to alter a transaction in block 1, the chain would break.
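For readers who like to see things in code, here is a minimal Python sketch of a block with the fields described above, and of the check that notices a broken chain. I am assuming the Hash is a SHA-256 hash computed from the other four fields joined into one string; that layout, the class and function names, the Nonce of 0, and the sample transactions are all simplifications of mine, not the real BitCoin format.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    number: int         # Block #
    nonce: int          # Nonce (covered properly in the mining lesson; 0 is a placeholder)
    data: str           # Data
    previous_hash: str  # Previous Hash

    def hash(self):
        """Hash: computed from the other four fields, so any change alters it."""
        payload = f"{self.number}|{self.nonce}|{self.data}|{self.previous_hash}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_is_intact(blocks):
    """Each block's Previous Hash must equal the current hash of the block before it."""
    for earlier, later in zip(blocks, blocks[1:]):
        if later.previous_hash != earlier.hash():
            return False
    return True


block1 = Block(1, 0, "Ann pays Bob 100", "0" * 64)
block2 = Block(2, 0, "Bob pays Cat 40", block1.hash())
chain = [block1, block2]

print(chain_is_intact(chain))  # True

block1.data = "Ann pays Bob 100000"  # a hacker alters a transaction in block 1
print(chain_is_intact(chain))  # False: block 2's Previous Hash no longer matches
```

Because the hash is recomputed from the block’s contents each time, there is no way to change the data and keep the old hash.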

Okay, so what is to prevent the hacker from just changing the second block? Aside from hashing issues that we will discuss in future lessons? The other deterrent is found in peer to peer sharing of blockchains

In most real world applications of blockchain, the chain will not reside on only one computer. Instead it will be replicated across multiple computers.

2018-04-04_12-41-29

So now suppose a hacker tries to change the third block on one computer.

2018-04-04_12-41-57

The third block’s hash will change, breaking its connection to the fourth block

2018-04-04_12-42-17.png

And even more importantly, the chain will no longer look like the one on the second computer. Now in real life, this will be spread across thousands of computers. So to determine which blockchain is correct, the network looks to see which version the majority of computers say is correct.

So as seen below, 3 of the 4 computers show 4 blue squares, while only one shows a yellow square. So based on the vote of the majority, the final result will be 4 blue squares.

2018-04-04_14-50-00.png
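The majority vote in the picture can be sketched in Python as follows. This mirrors the simplified picture in this lesson, not the actual consensus rules of BitCoin, and the function name and the color lists are just stand-ins for the blue and yellow squares.

```python
from collections import Counter


def majority_chain(copies):
    """Return the version held by more than half of the computers, or None if there is none."""
    counts = Counter(tuple(copy) for copy in copies)
    version, votes = counts.most_common(1)[0]
    if votes > len(copies) / 2:
        return list(version)
    return None


# Four computers, each with its own copy of a four-block chain.
copies = [
    ["blue", "blue", "blue", "blue"],
    ["blue", "blue", "blue", "blue"],
    ["blue", "blue", "yellow", "blue"],  # the copy the hacker altered
    ["blue", "blue", "blue", "blue"],
]

print(majority_chain(copies))  # ['blue', 'blue', 'blue', 'blue']
```

The altered copy is simply outvoted, so the hacker’s change never becomes the accepted version.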

So there you have it, blockchain in its simplest form. In future lessons I’ll dive deeper into the different concepts to show how it works from the inside out.

Click below for an interesting link for BitCoin information

Alexus Security

SQL: User Defined Functions

Functions in SQL work just like functions in most programming languages. If you aren’t familiar with a function, you should know that you are already using them without even knowing it.

Consider this for example:

Select Count(*)
From Table

COUNT() is a function. When you pass it rows from your query, it counts the rows and returns a value in the form of an integer. But COUNT() is a built in function, meaning it came as part of SQL Server, you did not have to create it.

SQL Server allows you the option of creating User Defined Functions, meaning functions you develop yourself. These are handy when you find yourself handling repeated tasks, such as date formatting or string manipulation. Instead of having to repeatedly code a complex command, you can just build a function once and then call on it whenever needed.

Let’s start with a basic example:

2018-04-03_14-23-36.png

Here I created a function called ADD_UP that accepts 2 numbers and outputs the sum of the two numbers (yes, I know you can already do this with the built-in + operator, but I want to start nice and easy).

Let’s start by discussing the syntax. The basic syntax for creating a function is as follows:

CREATE FUNCTION name (@var data-type)
RETURNS data-type
AS
BEGIN
   RETURN (some type of action)
END

In my example we are naming the function ADD_UP and supplying two integer variables: @NUM1 and @NUM2

CREATE FUNCTION ADD_UP (@NUM1 INT, @NUM2 INT)

Then we define the data-type our function will return. In this case, since we are adding 2 integers, our function will return an INT

RETURNS INT

Next we wrap our function in

AS
BEGIN
END

Finally, we perform an action

RETURN (@NUM1 + @NUM2)

Finally, when you want to call the Function, just use it in a select statement.

(**Note, user defined functions require you to use the schema prefix. Since I just used the default dbo schema, this example uses dbo.ADD_UP)

select dbo.ADD_UP(2,3) as ADDED

and as you see, we get 5 as our answer.

2018-04-03_14-23-36
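If you don’t have SQL Server in front of you, the same idea can be tried as a rough analogy with SQLite through Python’s built-in sqlite3 module. To be clear about the assumptions: SQLite is not SQL Server, there is no CREATE FUNCTION statement or dbo schema prefix in this version, and the function body is written in Python rather than T-SQL. It only shows the general pattern of defining your own function once and then calling it from a select statement.

```python
import sqlite3


def add_up(num1, num2):
    """Python stand-in for the ADD_UP function: returns the sum of two numbers."""
    return num1 + num2


conn = sqlite3.connect(":memory:")

# Register the Python function so SQL statements can call it by name with 2 arguments.
conn.create_function("ADD_UP", 2, add_up)

added = conn.execute("SELECT ADD_UP(2, 3) AS ADDED").fetchone()[0]
print(added)  # 5

conn.close()
```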

Now, let’s try something different. Here we are going to work with a date function. In this example I built a function called MNTH that accepts one variable @DT – a date data-type and returns an Integer representing the month of the date passed to it.

Again, all I am really doing is duplicating the built-in function MONTH(), but I wanted to show different data-types

2018-04-03_14-32-54.png

(** getdate() is a built-in function that returns the current date. I ran this SQL on 4/2/2018, so it returns 4 as a result)

Now finally here is an example of how you might use a function in real life. Let’s say you have lots of reports that call for your date to be represented in MM-YYYY format. Instead of having to repeatedly type a complex date formatting, you can build it into a User Defined Function and pass a regular date to the function.

2018-04-03_14-34-36

If you are not familiar with the cast(concat(month(@DT),'-',year(@DT)) as varchar(8)) statement, I’ll break it down here:

Let’s go from the inside out:

concat is a string function meaning to concatenate or “string together”, so

concat(month(@DT),'-',year(@DT))

concat(4, '-', 2018)

'4-2018'

Cast then converts the output of the concat statement into the string (varchar) data-type the function returns

cast('4-2018' as varchar(8)) = '4-2018'
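If it helps to step through that inside-out evaluation somewhere other than SQL Server, here is the same logic mirrored in Python. This is only a sketch to follow along with: the function names are hypothetical, and the second function is my own addition showing one way to get a zero-padded month if you need a strict MM-YYYY result such as 04-2018.

```python
from datetime import date


def month_year(dt):
    """Mirror of the breakdown above, one step per line."""
    month_part = dt.month                 # month(@DT)            -> 4
    year_part = dt.year                   # year(@DT)             -> 2018
    joined = f"{month_part}-{year_part}"  # concat(4, '-', 2018)  -> '4-2018'
    return joined[:8]                     # varchar(8): keep at most 8 characters


def month_year_padded(dt):
    """Variant that pads the month to two digits, e.g. '04-2018'."""
    return f"{dt.month:02d}-{dt.year}"


sample = date(2018, 4, 2)
print(month_year(sample))         # 4-2018
print(month_year_padded(sample))  # 04-2018
```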

Finally, if you want to find your functions after you create them, they are located under your database -> Programmability -> Functions

In this case, I only built Scalar-valued Functions. I’ll cover the other types in future lessons.

2018-04-03_14-39-21.png