Relational Database

Relational databases are one of the most widely used types of databases in the world. They are a type of database management system that organizes data into one or more tables, each with a unique identifier, and enforces relationships between them. In this article, we will explore what relational databases are, how they work, and their benefits.

What is a Relational Database?

A relational database is a type of database that uses tables to organize and store data. Each table contains rows of data, and each row represents a single record. The columns of the table represent different fields, or attributes, of the record.

The relationships between the tables are defined through the use of keys, which are unique identifiers that allow data to be linked between tables. The most common type of key used in a relational database is the primary key, which is a unique identifier for each row in a table. Other types of keys, such as foreign keys, can be used to link tables together.

How Does a Relational Database Work?

A relational database works by storing data in tables, each with a unique identifier. The tables are linked together through the use of keys, which allow data to be linked between tables. For example, a database for a library might have one table for books, another table for authors, and a third table for borrowers. Each of these tables would have its own set of fields, such as book title, author name, and borrower ID.

To retrieve data from a relational database, a user writes a query, which is a request for specific data from one or more tables. The database management system then executes the query and returns the requested data. The user can also modify or add data to the database by issuing update or insert commands.

Benefits of Relational Databases

Relational databases have several advantages over other types of databases. First, they are highly scalable, which means they can handle very large amounts of data. This makes them ideal for use in large enterprise systems or other applications that require the storage and retrieval of large amounts of data.

Relational databases are also highly flexible. They allow for complex relationships between data, which means they can be used to model many different types of systems. This flexibility makes them useful in a wide range of applications, from financial systems to social media platforms.

Finally, relational databases are highly secure. They include features like access control and encryption to ensure that data is protected from unauthorized access. This is particularly important in applications that handle sensitive data, such as medical records or financial data.

Conclusion

Relational databases are one of the most widely used types of databases in the world. They are highly scalable, flexible, and secure, and they allow for complex relationships between data. If you are working with large amounts of data, or if you need to store data in a secure and flexible way, a relational database is definitely worth considering.

SQL Intro – SELECT, FROM, WHERE, and Order By

Probably the most commonly used SQL keywords, Select and From are the minimum commands you would need to see data from a table.

Select Statement is used to tell the database what data you would like to see

The Select Statement is made up of the following parts:

If you are not in my class, but want to follow along, here is a data file you can import into SQL Server:

Instructions for importing the Excel File to SQL Server : Instructions

SQL: Common Data Types in SQL Server

Here is a table of the most commonly used data types in SQL Server

Data TypeDescription
Char()Fixed length string, unused spaces get padded and eat up memory: size 0-255
Varchar()Variable length string, unused spaces don’t use memory: 8000 chars
Nvarchar()Designed to handle Uni Code data (UFT-8): 4000 chars
nvarchar(max)536-870-912 characters
TextUp to 2GB of text data
Identity(x,y)Auto incrementing number with x being starting point and y = steps, so Identity(1,1) starts and1 and counts by 1
INTinteger (whole number, no decimals)
Decimal(x,d)floating point decimal number, x is size, d is number of places after the decimal
float(n)floating precision number, Float(24) = 4-bytes, Float(53) = 8-bytes — float(53) is default
Bool or BooleanBinary choice, 0 = False and 1 = True
DateDate data type “YYYY-MM-DD” (if set to US settings)
DATETIMEdatetime data type “YYYY-MM-DD HH:MM:SS” (if set to US settings)
TIMETime “HH:MM:SS”
YEARyear in for digit representation (ex 1908,1965,2011)

SQL: Intro to SQL Server – Create Database and Tables

We will be using SQL Server Management Studio in the following lessons. If you have SQL Server installed on your machine, search for MS SQL Server Management Studio in programs or search for SSMS. If you need to install MS Sql Server: click here

Once it opens, enter the server you are looking to connect to and pick your authentication method (I’m using Windows Authentication, but you could set up a SA account and use Server Authentication)

If you properly connect to the server, you should get an object explore like the one seen below

Create a database

If you are working on work or school SQL Server, you may not have rights to create a database, you will most likely have a database assigned to you that you can build tables in. You can skip to the table creation part of the lesson.

Method 1: Using the Gui

Right click on database in the object explore, click New Database

Next name your new database, leave all other settings as is. Click Ok

Your new database’s name will appear in the list of databases now

Method 2: Use SQL

This is my preferred method. And again, we will just be using the default settings here to make this lesson easier.

Click the New Query Button to open a new query window

In the new window, type the following (note the semicolon at the end of the line, this is standard SQL and used by most system. SQL Server allows you to replace ; with the word GO. It is completely legit, I just don’t use it because no other system does either)

Create Database Test2;

Then click Execute

If you don’t see your new database appear in the Object Explorer, right click Database ,and select Refresh

Select Database to Work With

Method 1: Gui

From your query workspace, select your database from the drop down menu

Method 2: SQL Code

Go to a query workspace and type in the following code

use Test;

I tend to like this method because you can put it on the top of code you might share and it will guide people to the right database

Create Table

Method 1: Gui

Hit the + next to your database to expand

Right Click Tables > New > Table…

Now manually enter column names and datatypes for your new table

Once you are done. Click the X to close this tab. You will be first asked to save changes (yes) then you will be asked to Name you new table

Method 2: Sql Code

From your query window, use the following code to create a table:

Create Table tableName (
    Column 1  datatype,
    Column 2  datatype));

The syntax is pretty straightforward. The code below with create a table names Contractor with 6 columns

create table Contractor (
ContractorID int primary key,
CompanyNM nvarchar(255),
LastNM	nvarchar(255),
FirstNM nvarchar(255),
Phone nvarchar(50),
email nvarchar(255));

Note I am able to assign the primary key to the first column by putting primary key after the datatype

Copy this into your SQL Server — Note you can run segments of code by highlighting them first and then hitting execute. Only the highlighted code is run.

To see if it runs successfully, expand your tables segment out on your object explorer

Let’s add another table. Copy the following code over to SQL Server and execute just like before

create table Permit (
PermitID nvarchar(255) primary key,
StartDate date,
ProjectTitle nvarchar(255),
[Location] nvarchar(255),
Fee money,
ContractorID int);

Now lets connect the two tables with a foreign key/primary key relationship. To create this relationship, use the following code

alter table Permit
add foreign key(ContractorID) References Contractor(ContractorID);

Note I am working with the table Permit, I am saying the Column ContractorID is the foreign key in the Permit table related to (References) the ContractorID column in the Contractor table

Add data to tables

Use the following code to add data to the two tables

insert into Contractor
values (1, 'Front Poarch Construction', 'Poarch','Ken', '555-1234', 'poarch@fpc.com'),
       (2, 'Mikrot Construction', 'Mikrot', 'Kim', '555-5678', 'MK@mikrot.com'),
	   (3, 'Sobaba Construction','Sobata', 'Jeri', '555-9012', 'SJ@sobaba.com');

Insert into Permit 
values ('B12345','2022-01-01','My Deck','Branchburg',550.00,3);

The syntax is basically

Insert into <tableName>

Values (data separated by commas, rows wrapped in parathesis, again separated by columns)

Add Data from Excel File

You can download the following file if you want to play along

Right click on your database, Tasks> Import Data

Click Next on the first Window to pop up

Choose Microsoft Excel as Data source, browse for your file, make sure First row has column names is selected, click Next

Select SQL Server Native client as destination. If you have more than one to pick from, choose the higher number. Click Next

Leave default options – Click Next

Select the top option, You can change the destination table name if you choose.

I choose to change it and then click Next

Leave default selections, click Next

Click Finish

Make sure you got 75 rows Transferred and click close

Move data to production tables

Permit_Landing is a Landing Table. That means a table you load data into initially before it is verified and moved to production tables

Refresh your database to see the new added tables. Right click on Permits_Landing and Select Top 1000 Rows

A query window should pop up and give you the following results

The goal is to move this data to the Permit table. But note, the Permit Table has a column ContractorID that is not present in Permits_Landing. So we have to use code like seen below.

  insert into permit (PermitID, StartDate, ProjectTitle, [Location], Fee)
  select * from Permit_Landing;

Note, we have Insert Into Permit (like before) — but we now include a list columns. We only list the columns we want to load data into. Since we don’t have ContractorID column in the landing table, we will not include it here.

Also, notice the [] around Location. This is because location is a SQL key word. To let SQL Server know we are talking about a column and not a keyword, we put square brackets around it

Finally, we choose the data to load into the table using a simple select statement: Select * from Permit_Landing

SQL: Rollback and Commit – undo mistakes in SQL

If you ever want to experience a heart attack, may I advise accidently deleting a production table from a database. And even if you survive the heart attack, your job may not.

This is, IMHO, this single most important piece of SQL code you will ever learn. It is called Rollback and Commit.

What Rollback does, is reverse and changes you have made to the database, allowing you to undelete the table you accidentally sent to the data afterworld.

Here is how it works. You start by letting the system know to you are making changes that you may want to reverse. The syntax is the as follows

Start transaction;

delete from <table> where <column> = <Value>;

Now comes the life saving part. If you realize you made a mistake, simply type Rollback;

If, on the other hand, you like the results, then type Commit;

The results will be saved and the transaction instance will be closed out.

Now lets try adding a new row, using start transaction
If that is not what you want, just type Rollback; and your mistake is gone
On the other hand, if you want to save the results, simply type commit; and the new record will stay

Important NOTE: Rollback only works if you first Start Transaction: — it is two words that will save your job. Start every instance of adding or deleting data or object with Start Transaction: – trust me on this one.

SQL: What is SQL and its 5 Subgroups DQL, DML, TCL, DDL, DCL?

SQL – short for Sequel Query Language is a programming language designed to work with data stored in RDBMS (relational database management systems). The data managed by SQL is tabular, meaning it formatted in rows and columns, very much like an Excel spreadsheet.

SQL is broken down into 5 sets of command groups:

DQL = Data Query Language

Querying (SELECT, FROM, etc)

DML = Data Manipulation Language

Data manipulation (INSERT, DELETE, etc)

TCL = Transaction Control Language

Transaction mgt. (COMMIT, ROLLBACK, etc)

DDL = Data Definition Language

Data definition (CREATE, DROP, etc)

DCL = Data Control Language

Data control (GRANT, REVOKE, etc)

SQL: Add, rename, or delete a column in an existing table

To add a column here is the syntax

alter table <table name>
add <column name> <datatype>;

example: (add a column named location to a table emp_info)

alter table emp_info
add location varchar(255);

To rename a column in sql, here is the syntax

alter table <table name>
rename column <old name> to <new name>;

example: (change name of column location to office)

alter table emp_info
rename column location to office;

To delete or remove a column, use the drop command

alter table <table name>
drop column <column name>;

Example:

alter table emp_info
drop column office;

Return to SQL page:

SQL: Create a temporary table

Temporary tables are a great way of working on complex data requests. They are easy to create and they delete themselves after every session, so you do not have to worry about creating a big mess with a bunch of tables you need to go clean up later.

In this tutorial, I am going to use a real world example from my work in Verizon’s Cyber Security Department. This is a simplified version of ask, and I am using completely made up data. There is no data from Verizon on my website every. I simply discuss use cases to make learning analytics more grounded in the real world

Below is a list of dates, PhoneNum: phone numbers called about, and the CallerNum: the number the person is calling from. While there are many legitimate reasons for someone to call customer support from another number (I drop and break my phone so I borrow my co-workers phone and call customer support to request a replacement), a number that calls in repeatedly about many different numbers is a red flag of someone that could be a fraudster.

If you want to play along, you can download the data set here:

I am using MySql in this example as my database, but I will include the code for SQL Server, Teradata, and Oracle platforms as well.

So the ask is find CallerNum that is calling about many different PhoneNum

While I am sure you can make a complex subquery to do this job, but I’m going to show you how to use temporary tables to make this ask very simple:

As you can see above, I loaded the data into a table called dbtest.numberslist

Now to find out how many CallerNum are calling about multiple PhoneNum, a simple solution is to get a list of all distinct combinations of PhoneNum and CallerNum and then do a count of CallerNums from this distinct list. Since the list is distinct, a CallerNum calling in about the same PhoneNum will only appear once, so a CallerNum calling about multiple PhoneNums will appear multiple times.

So using temporary tables, I will create a table that holds the distinct call combinations

MySql (code is create temporary table <table name> then query to fill table

create temporary table distCalls
select distinct phonenum, callerNum from dbtest.numberslist;
Select * from distCalls -- shows what is in the table now

Now, lets see if we can find potential fraud callers, let us do a count of callerNum from the distinct temporary table

As you can see above, there are 4 numbers that have called about 4 distinct phone numbers during this time period. Again, this could be for legitimate reasons, but this is still something we look at when trying to find questionable activity.

MySQL Code

create temporary table distCalls select distinct phonenum, callerNum

from dbtest.numberslist;

SQL Server

Select distinct PhoneNum, CallerNum 
into #distCalls
from dbtest.numberslist
go 

#tableName -- indicated temporary tables in SQL Server

Teradata

Create volatile table distCalls as (
select distinct PhoneNum, CallerNum
from dbtest.numberslist)
with data
on commit preserve rows;

with data and on commit preserve rows are needed at the end if you want any data to be in your table when you go to use it

Oracle

Create private temporary table distCalls as
select distinct PhoneNum, CallerNum
from dbtest.numberslist; 

Remember, temp tables delete themselves after each session (each time you log off the database). If you are working in the same session and need to recreate the temp table for some reason, you can always drop the table just as you would any other table object in SQL.

R: Connecting to SQL Server Database

You can query data from a SQL Server database directly from R using the RODBC package.

install.packages("RODBC")

First you need to form a connection

library(RODBC)
##connection string
cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client 11.0};server=localhost; database=SSRSTraining;trusted_connection=yes;")

We use the odbcDriverConnect() function. Inside we pass a connection = value

Driver = {SQL Server Native Client 11.0};  — this is based on the version of SQL Server you have

server=localhost;  — I used localhost because the SQL Server was on the same computer I was working with. Otherwise, pass the server name

database=SSRSTraining; — name of database I want to work with

trusted_connection=yes; — this means I am able to pass my Windows credentials.

If you don’t have a trusted connect pass the user Id and password like this

uid = userName; pwd = Password;

Note each parameter is separated by a semicolon

Query the database

> ##passes query to SQL Server
> df <- sqlQuery(cn, "select * FROM [SSRSTraining].[dbo].[JobDataSet]")
> head(df)

    Name              Job Hours Complete
1  Sally Predictive Model     1        n
2 Philip      Maintanence    10        n
3    Tom    Ad-hoc Report    12        y
4    Bob             SSRS     3        y
5 Philip         Tableau      7        n
6    Tom         Tableau      9        n

using sqlQuery() – pass through the connection string (cn) and enclose your query in ” ”