Python: Numpy

First off, CONGRATS for making it this far. Numpy really signifies the first step in real data science with Python.

Numpy is a library that adds advanced mathematical capabilities to Python. If you are using the Anaconda distribution of Python, you already have numpy installed. If not, you will need to go over to the numpy site and download it. Link to numpy site: www.numpy.org
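If you are not sure whether numpy is already on your machine, one quick check is to try importing it. This is only a small sketch; the variable name numpy_version is made up for illustration.

```python
# Try to import numpy and report which version is installed.
try:
    import numpy as np
    numpy_version = np.__version__
    print("numpy is installed, version " + numpy_version)
except ImportError:
    # The import fails if numpy has not been installed yet.
    numpy_version = None
    print("numpy is not installed yet")
```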

Numpy adds multi-dimensional array capabilities to Python. It also provides mathematical tools for dealing with these arrays.

Array

What is an array? In many other programming languages, arrays are the equivalent of lists in Python. In fact, a single dimension numpy array can be indexed and looped over just like a list, although arithmetic on an array works element by element.
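Here is a rough sketch of where a list and a single dimension array behave the same and where they differ. The variable names and numbers are just examples, and the import line is explained further down.

```python
import numpy as np

my_list = [8, 7, 4]
my_array = np.array([8, 7, 4])

# Indexing works the same way for both
first_from_list = my_list[0]
first_from_array = my_array[0]

# Arithmetic is where they differ
doubled_list = my_list * 2     # repeats the list: 8, 7, 4, 8, 7, 4
doubled_array = my_array * 2   # multiplies each element: 16, 14, 8

print(doubled_list)
print(doubled_array)
```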

What makes arrays special in Python is the ability to make multidimensional arrays. Those who have taken math courses like Linear Algebra will better know two-dimensional arrays by the term matrix. And if you paid attention in Linear Algebra, you will know there are a lot of cool things you can do with matrices.

For those who didn’t take Linear Algebra (you probably went out and made friends and had a social life), don’t worry, I will fill you in on what you need to know.

Numpy

Let’s start by creating a single dimension array

  • import numpy as np – imports numpy into your notebook, we set the alias to np
  • x = np.array([8,7,4]) – array is a function in np, so the syntax is np.array. **note the [] inside the ()
  • x[1] – you index an array just like a list; indexes start at 0, so this returns 7
  • x.shape – shows you the shape of your array. Since we are single dimension, the answer holds a single number. **note 3L – the L indicates a long integer in Python 2
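Put together, the steps above look roughly like this. The variable name second is only there for illustration.

```python
import numpy as np          # np is the conventional alias

x = np.array([8, 7, 4])     # note the [] inside the ()

second = x[1]               # indexing starts at 0, so this is the 7
print(second)

print(x.shape)              # a tuple with one entry for a single dimension array
```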

pythonnumpy.jpg

Now let’s make a 2 x 3 matrix. When describing a matrix (or multi-dim array), the first number indicates the number of rows (down) and the second number indicates the number of columns (across). This is standard mathematical notation.

You can see I create a new row by enclosing the number sets in [] separated by a ‘,’

Now when you want to call an item from the multi-dim array, you need to use 2 index numbers: y[0,1] = y[row, column]
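As a small sketch of the 2 x 3 case, the numbers below are arbitrary example values, and item is just an illustrative name.

```python
import numpy as np

# Two rows, three columns; each inner [] is one row
y = np.array([[1, 2, 3],
              [4, 5, 6]])

print(y.shape)     # (rows, columns)

item = y[0, 1]     # row 0, column 1
print(item)
```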

pythonnumpy2

np.arange()

np.arange() is a command we can use to auto populate an array.

  1. np.arange(5) – creates a one-dim array with 5 elements (0 through 4)
  2. np.arange(10).reshape(2,5) – reshape lets you convert a one-dim array into a multi-dim array
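Both commands can be tried like this. The names a and b are only placeholders.

```python
import numpy as np

a = np.arange(5)                  # 0, 1, 2, 3, 4
b = np.arange(10).reshape(2, 5)   # 0 through 9 arranged as 2 rows of 5

print(a)
print(b)
```

Keep in mind that reshape only works when the new shape holds exactly the same number of elements, so 10 elements can become 2 x 5 or 5 x 2, but not 3 x 4.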

pythonnumpy4

Transpose

Transposing a matrix means flipping it over its diagonal, so the rows become columns and the columns become rows. This comes in handy in applications like multi-variate regressions. To transpose our array in Python, we use the “.T” attribute
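A minimal sketch of a transpose, using a made-up 2 x 3 array named m:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns
m_t = m.T                        # 3 rows, 2 columns

print(m)
print(m_t)
```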

pythonnumpy5.jpg

Dot Product

Performing a dot product on matrices is a useful calculation, but it is very time consuming to do by hand. Numpy makes it easy though.
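As a rough example, here is a 2 x 3 matrix multiplied by a 3 x 2 matrix with np.dot. The values and variable names are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])     # 2 x 3
b = np.array([[1, 2],
              [3, 4],
              [5, 6]])        # 3 x 2

# Each entry is a row of a multiplied element by element
# with a column of b, then summed
product = np.dot(a, b)        # 2 x 2 result
print(product)
```

The number of columns in the first matrix has to match the number of rows in the second, which is why a 2 x 3 and a 3 x 2 work together here.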

pythonnumpy6.jpg

If you need a brush up on dot products, this is a great link: matrix multiplication


If you enjoyed this lesson, click LIKE below, or even better, leave me a COMMENT. 

Follow this link for more Python content: Python

 

 

Python: Install Python

 

While Python 3 has been out for a while now, it is not backwards compatible with Python 2, and most data science based libraries run on old reliable Python 2.7. So that is the version I will be teaching.

If you are an advanced computer user, you can run both versions of Python on your computer simultaneously. Feel free to do so if you want to see the differences in the versions. While version 3 is becoming more widely used in some areas, most of the libraries and information you will find involving Data Science or Analytics will still use version 2.7.

Downloading Python

Now, you can always download Python at http://www.python.org. However, I recommend downloading Anaconda. This distribution comes with more than 400 of the more popular Python packages in math, science, engineering, and more importantly – data analysis. The link to download Anaconda is: Anaconda Download.

For detailed installation instructions: Anaconda Install

Again, you do not need to use the Anaconda distribution of Python, but it will make following along with my tutorials much easier.

Another great advantage of Anaconda is that it comes with iPython (the Jupyter Notebook) already installed, which is a very popular interactive environment used by Data Scientists.

Running Python

Once you have Anaconda installed, open the Anaconda Prompt

pythonInstall.jpg

It will open like a Command Prompt / Terminal Window

At the prompt type: jupyter notebook

The Jupyter Notebook will open your default browser.

To start using Python, go to New in the upper right corner and select Python 2

pythonInstall1.jpg

Double-click Untitled at the top of your new notebook to change the name. Let’s call this one Fundamentals

pythonInstall2

pythonInstall3

Using the Notebook

In Jupyter Notebooks, we work in the shaded rectangles marked In[]. To see the output of your command, you press Shift+Enter. Enter alone adds another line to the code block you are working on, but does not execute the code.

pythonInstall4

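** note in the second example, the notebook only displays the result of 1-2, the last expression in the cell. When you have separate commands and want to see the output of each, put each one in its own cell and hit Shift+Enter on each one.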
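If you would rather keep several calculations in one cell, one option is to print each result, since a cell only echoes the value of its last bare expression. A small sketch, with variable names made up for illustration:

```python
# Both lines run when the cell is executed with Shift+Enter,
# and print() makes each result visible.
total = 1 + 2
difference = 1 - 2

print(total)
print(difference)
```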

 


Next: Fundamentals