Relational databases are one of the most widely used types of databases in the world. They are a type of database management system that organizes data into one or more tables, each with a unique identifier, and enforces relationships between them. In this article, we will explore what relational databases are, how they work, and their benefits.
What is a Relational Database?
A relational database is a type of database that uses tables to organize and store data. Each table contains rows of data, and each row represents a single record. The columns of the table represent different fields, or attributes, of the record.
The relationships between the tables are defined through the use of keys, which are unique identifiers that allow data to be linked between tables. The most common type of key used in a relational database is the primary key, which is a unique identifier for each row in a table. Other types of keys, such as foreign keys, can be used to link tables together.
How Does a Relational Database Work?
A relational database works by storing data in tables, each with a unique identifier. The tables are linked together through the use of keys, which allow data to be linked between tables. For example, a database for a library might have one table for books, another table for authors, and a third table for borrowers. Each of these tables would have its own set of fields, such as book title, author name, and borrower ID.
To retrieve data from a relational database, a user writes a query, which is a request for specific data from one or more tables. The database management system then executes the query and returns the requested data. The user can also modify or add data to the database by issuing update or insert commands.
Benefits of Relational Databases
Relational databases have several advantages over other types of databases. First, they are highly scalable, which means they can handle very large amounts of data. This makes them ideal for use in large enterprise systems or other applications that require the storage and retrieval of large amounts of data.
Relational databases are also highly flexible. They allow for complex relationships between data, which means they can be used to model many different types of systems. This flexibility makes them useful in a wide range of applications, from financial systems to social media platforms.
Finally, relational databases are highly secure. They include features like access control and encryption to ensure that data is protected from unauthorized access. This is particularly important in applications that handle sensitive data, such as medical records or financial data.
Conclusion
Relational databases are one of the most widely used types of databases in the world. They are highly scalable, flexible, and secure, and they allow for complex relationships between data. If you are working with large amounts of data, or if you need to store data in a secure and flexible way, a relational database is definitely worth considering.
We will be using SQL Server Management Studio in the following lessons. If you have SQL Server installed on your machine, search for MS SQL Server Management Studio in programs or search for SSMS. If you need to install MS Sql Server: click here
Once it opens, enter the server you are looking to connect to and pick your authentication method (I’m using Windows Authentication, but you could set up a SA account and use Server Authentication)
If you properly connect to the server, you should get an object explore like the one seen below
Create a database
If you are working on work or school SQL Server, you may not have rights to create a database, you will most likely have a database assigned to you that you can build tables in. You can skip to the table creation part of the lesson.
Method 1: Using the Gui
Right click on database in the object explore, click New Database
Next name your new database, leave all other settings as is. Click Ok
Your new database’s name will appear in the list of databases now
Method 2: Use SQL
This is my preferred method. And again, we will just be using the default settings here to make this lesson easier.
Click the New Query Button to open a new query window
In the new window, type the following (note the semicolon at the end of the line, this is standard SQL and used by most system. SQL Server allows you to replace ; with the word GO. It is completely legit, I just don’t use it because no other system does either)
Create Database Test2;
Then click Execute
If you don’t see your new database appear in the Object Explorer, right click Database ,and select Refresh
Select Database to Work With
Method 1: Gui
From your query workspace, select your database from the drop down menu
Method 2: SQL Code
Go to a query workspace and type in the following code
use Test;
I tend to like this method because you can put it on the top of code you might share and it will guide people to the right database
Create Table
Method 1: Gui
Hit the + next to your database to expand
Right Click Tables > New > Table…
Now manually enter column names and datatypes for your new table
Once you are done. Click the X to close this tab. You will be first asked to save changes (yes) then you will be asked to Name you new table
Method 2: Sql Code
From your query window, use the following code to create a table:
Note I am able to assign the primary key to the first column by putting primary key after the datatype
Copy this into your SQL Server — Note you can run segments of code by highlighting them first and then hitting execute. Only the highlighted code is run.
To see if it runs successfully, expand your tables segment out on your object explorer
Let’s add another table. Copy the following code over to SQL Server and execute just like before
Now lets connect the two tables with a foreign key/primary key relationship. To create this relationship, use the following code
alter table Permit
add foreign key(ContractorID) References Contractor(ContractorID);
Note I am working with the table Permit, I am saying the Column ContractorID is the foreign key in the Permit table related to (References) the ContractorID column in the Contractor table
Add data to tables
Use the following code to add data to the two tables
Choose Microsoft Excel as Data source, browse for your file, make sure First row has column names is selected, click Next
Select SQL Server Native client as destination. If you have more than one to pick from, choose the higher number. Click Next
Leave default options – Click Next
Select the top option, You can change the destination table name if you choose.
I choose to change it and then click Next
Leave default selections, click Next
Click Finish
Make sure you got 75 rows Transferred and click close
Move data to production tables
Permit_Landing is a Landing Table. That means a table you load data into initially before it is verified and moved to production tables
Refresh your database to see the new added tables. Right click on Permits_Landing and Select Top 1000 Rows
A query window should pop up and give you the following results
The goal is to move this data to the Permit table. But note, the Permit Table has a column ContractorID that is not present in Permits_Landing. So we have to use code like seen below.
insert into permit (PermitID, StartDate, ProjectTitle, [Location], Fee)
select * from Permit_Landing;
Note, we have Insert Into Permit (like before) — but we now include a list columns. We only list the columns we want to load data into. Since we don’t have ContractorID column in the landing table, we will not include it here.
Also, notice the [] around Location. This is because location is a SQL key word. To let SQL Server know we are talking about a column and not a keyword, we put square brackets around it
Finally, we choose the data to load into the table using a simple select statement: Select * from Permit_Landing
If you ever want to experience a heart attack, may I advise accidently deleting a production table from a database. And even if you survive the heart attack, your job may not.
This is, IMHO, this single most important piece of SQL code you will ever learn. It is called Rollback and Commit.
What Rollback does, is reverse and changes you have made to the database, allowing you to undelete the table you accidentally sent to the data afterworld.
Here is how it works. You start by letting the system know to you are making changes that you may want to reverse. The syntax is the as follows
Start transaction;
delete from <table> where <column> = <Value>;
Now comes the life saving part. If you realize you made a mistake, simply type Rollback;
If, on the other hand, you like the results, then type Commit;
The results will be saved and the transaction instance will be closed out.
Now lets try adding a new row, using start transactionIf that is not what you want, just type Rollback; and your mistake is goneOn the other hand, if you want to save the results, simply type commit; and the new record will stay
Important NOTE: Rollback only works if you first Start Transaction: — it is two words that will save your job. Start every instance of adding or deleting data or object with Start Transaction: – trust me on this one.
SQL – short for Sequel Query Language is a programming language designed to work with data stored in RDBMS (relational database management systems). The data managed by SQL is tabular, meaning it formatted in rows and columns, very much like an Excel spreadsheet.
Temporary tables are a great way of working on complex data requests. They are easy to create and they delete themselves after every session, so you do not have to worry about creating a big mess with a bunch of tables you need to go clean up later.
In this tutorial, I am going to use a real world example from my work in Verizon’s Cyber Security Department. This is a simplified version of ask, and I am using completely made up data. There is no data from Verizon on my website every. I simply discuss use cases to make learning analytics more grounded in the real world
Below is a list of dates, PhoneNum: phone numbers called about, and the CallerNum: the number the person is calling from. While there are many legitimate reasons for someone to call customer support from another number (I drop and break my phone so I borrow my co-workers phone and call customer support to request a replacement), a number that calls in repeatedly about many different numbers is a red flag of someone that could be a fraudster.
If you want to play along, you can download the data set here:
I am using MySql in this example as my database, but I will include the code for SQL Server, Teradata, and Oracle platforms as well.
So the ask is find CallerNum that is calling about many different PhoneNum
While I am sure you can make a complex subquery to do this job, but I’m going to show you how to use temporary tables to make this ask very simple:
As you can see above, I loaded the data into a table called dbtest.numberslist
Now to find out how many CallerNum are calling about multiple PhoneNum, a simple solution is to get a list of all distinct combinations of PhoneNum and CallerNum and then do a count of CallerNums from this distinct list. Since the list is distinct, a CallerNum calling in about the same PhoneNum will only appear once, so a CallerNum calling about multiple PhoneNums will appear multiple times.
So using temporary tables, I will create a table that holds the distinct call combinations
MySql (code is create temporary table <table name> then query to fill table
create temporary table distCalls
select distinct phonenum, callerNum from dbtest.numberslist;
Select * from distCalls -- shows what is in the table now
Now, lets see if we can find potential fraud callers, let us do a count of callerNum from the distinct temporary table
As you can see above, there are 4 numbers that have called about 4 distinct phone numbers during this time period. Again, this could be for legitimate reasons, but this is still something we look at when trying to find questionable activity.
Select distinct PhoneNum, CallerNum
into #distCalls
from dbtest.numberslist
go
#tableName -- indicated temporary tables in SQL Server
Teradata
Create volatile table distCalls as (
select distinct PhoneNum, CallerNum
from dbtest.numberslist)
with data
on commit preserve rows;
with data and on commit preserve rows are needed at the end if you want any data to be in your table when you go to use itOracle
Create private temporary table distCalls as
select distinct PhoneNum, CallerNum
from dbtest.numberslist;
Remember, temp tables delete themselves after each session (each time you log off the database). If you are working in the same session and need to recreate the temp table for some reason, you can always drop the table just as you would any other table object in SQL.
We use the odbcDriverConnect() function. Inside we pass a connection = value
Driver = {SQL Server Native Client 11.0}; — this is based on the version of SQL Server you have
server=localhost; — I used localhost because the SQL Server was on the same computer I was working with. Otherwise, pass the server name
database=SSRSTraining; — name of database I want to work with
trusted_connection=yes; — this means I am able to pass my Windows credentials.
If you don’t have a trusted connect pass the user Id and password like this
uid = userName; pwd = Password;
Note each parameter is separated by a semicolon
Query the database
> ##passes query to SQL Server
> df <- sqlQuery(cn, "select * FROM [SSRSTraining].[dbo].[JobDataSet]")
> head(df)
Name Job Hours Complete
1 Sally Predictive Model 1 n
2 Philip Maintanence 10 n
3 Tom Ad-hoc Report 12 y
4 Bob SSRS 3 y
5 Philip Tableau 7 n
6 Tom Tableau 9 n
using sqlQuery() – pass through the connection string (cn) and enclose your query in ” ”