Python zip and unpack

Posted on April 19, 2016; July 25, 2022. By Ben Larson Ph.D. In Python, Python: Learn Python Course. 1 Comment.

zip

zip() is a quick way to take multiple lists and combine them, pairing up the elements that sit at the same position in each list.
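Here is a minimal sketch of zip() in action. The two lists (names and ages) are made-up sample data, not taken from the original lesson.

```python
# Hypothetical sample data, used only for illustration.
names = ["Ann", "Bob", "Cal"]
ages = [31, 45, 27]

# In Python 3, zip() returns a lazy iterator, so wrap it in list() to see the pairs.
pairs = list(zip(names, ages))
print(pairs)  # [('Ann', 31), ('Bob', 45), ('Cal', 27)]
```

Each tuple holds the items that shared an index in the original lists. If the lists have different lengths, zip() stops at the shortest one.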

unpacking

To reverse the effect and turn a list of tuples back into separate sequences, use * inside the parentheses: zip(*x)
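A short sketch of unpacking, again with made-up sample data. Note that zip(*x) hands back tuples, so call list() on a result if you need a true list.

```python
# Hypothetical list of tuples, used only for illustration.
pairs = [("Ann", 31), ("Bob", 45), ("Cal", 27)]

# The * spreads the tuples out as separate arguments to zip().
names, ages = zip(*pairs)
print(names)  # ('Ann', 'Bob', 'Cal')
print(ages)   # (31, 45, 27)

# Convert back to a list if needed.
names_list = list(names)
```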



Last Lesson: lambda, map, reduce, filter

Next Lesson: list comprehension


Python: Regular Expressions

Posted on April 19, 2016; July 25, 2022. By Ben Larson Ph.D. In Python, Python: Learn Python Course. Leave a comment.

Regular Expressions are used to parse text. In the world of Big Data, being able to work with unstructured text is a big advantage.

To use regular expressions, first you must import the module. This is done by placing the command import re at the top of your code.

re.search()

Now, let us examine this code below:

We want to see if dog (assigned to x) is in the sentence 'I just saw a dog. He was chasing a cat.' (assigned to y).

Using the search() function from re, we call re.search(x, y). Note you place the item you are searching for first in the parentheses. re.search() returns a match object if the item is found and None if it is not. A match object counts as True and None counts as False, so you can use the result directly in an if statement.
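The search described above can be sketched like this. The variable names x and y follow the text; the "bird" check is an extra added here for contrast.

```python
import re

x = "dog"
y = "I just saw a dog. He was chasing a cat."

# re.search() returns a match object when x is found in y, otherwise None.
match = re.search(x, y)
if match:
    print("Found:", match.group())  # Found: dog
else:
    print("Not found")

# A term that is not in the sentence gives None.
print(re.search("bird", y))  # None
```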

You can use re.search with lists of search items as well.

Here z takes one item from the list x at a time and runs it through re.search. Notice 'one' is found (a match object, which counts as True), while 'two' is not (None, which counts as False).
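A rough sketch of looping over a list of search terms. The sentence in y is invented for this example, since the original one is not shown in the text.

```python
import re

x = ["one", "two"]                       # search terms
y = "Only one of these words is here."   # hypothetical sentence

# Check each term in turn; "is not None" turns the result into a plain True/False.
for z in x:
    print(z, re.search(z, y) is not None)

# The same check collected into a dictionary.
results = {z: re.search(z, y) is not None for z in x}
print(results)  # {'one': True, 'two': False}
```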


re.findall()

re.findall() returns a list of every non-overlapping match of your search term. Notice it found water whether it was a stand-alone word or part of a larger word.
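As an illustration, here is re.findall() on a made-up sentence. The second call, using the \b word-boundary marker, is an addition not covered in the lesson and shows one way to match only the stand-alone word.

```python
import re

# Hypothetical sentence, used only for illustration.
text = "The water in the waterfall ran past the watermelon patch."

# Matches "water" anywhere, including inside longer words.
matches = re.findall("water", text)
print(matches)  # ['water', 'water', 'water']

# \b marks a word boundary, so this only matches the stand-alone word.
whole_words = re.findall(r"\bwater\b", text)
print(whole_words)  # ['water']
```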

re.split()

The re.split() method does pretty much what you would think it does. You can pick a delimiter and the method will split your string at that delimiter.

In the example below, ';' is my delimiter. Notice how it split my string in two and removed the delimiter for me.
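A minimal sketch of the split, using an invented string with ';' as the delimiter:

```python
import re

# Hypothetical string, used only for illustration.
text = "first half;second half"

# Split at every ';'. The delimiter itself is dropped from the result.
parts = re.split(";", text)
print(parts)  # ['first half', 'second half']
```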


use re.search() to find position

You can use re.search() to find the starting and ending position of a search item in a string. The match object it returns exposes these through its start(), end(), and span() methods.
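A short sketch of reading positions from a match object, reusing the dog sentence from earlier in the lesson:

```python
import re

y = "I just saw a dog. He was chasing a cat."
m = re.search("dog", y)

print(m.start())  # 13  (index of the first matched character)
print(m.end())    # 16  (index just past the last matched character)
print(m.span())   # (13, 16)

# The positions work as a slice of the original string.
print(y[m.start():m.end()])  # dog
```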


exclusion

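If you want to exclude characters, put ^ as the first character inside square brackets [].

This example excludes the letter s with [^s] and puts each of the remaining characters in a list.

In the second example, I add + after the []. This matches runs of one or more allowed characters, which keeps neighboring characters together instead of returning them one at a time.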
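Both exclusion patterns can be sketched on a sample word. The word "mississippi" is chosen here for illustration and is not from the original lesson.

```python
import re

text = "mississippi"  # hypothetical sample word

# [^s] matches any single character that is not "s".
singles = re.findall("[^s]", text)
print(singles)  # ['m', 'i', 'i', 'i', 'p', 'p', 'i']

# [^s]+ matches runs of non-"s" characters, keeping neighbors together.
runs = re.findall("[^s]+", text)
print(runs)  # ['mi', 'i', 'ippi']
```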

This next example is a useful tool you will find yourself using in text mining. Here we use [^?!. ]+ to strip out punctuation and spaces, leaving a list of the words.
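Here is a small sketch of that pattern at work on an invented sentence:

```python
import re

# Hypothetical text, used only for illustration.
text = "Hello there! How are you? I am fine."

# Keep runs of characters that are NOT ?, !, ., or a space.
words = re.findall("[^?!. ]+", text)
print(words)  # ['Hello', 'there', 'How', 'are', 'you', 'I', 'am', 'fine']
```

Any punctuation mark not listed inside the brackets (a comma, for example) would still come through, so add more characters to the set as your text requires.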



Last Lesson: Generators

Next Lesson: kwargs and args


Python: Generators

Posted on April 18, 2016; July 25, 2022. By Ben Larson Ph.D. In Python, Python: Learn Python Course. 1 Comment.

I apologize in advance, as this topic is going to get a little computer sciency. Unfortunately there is no way around it. At first look, generators are going to resemble all the other iterables we have covered so far. But unlike the loops we have used so far, generators produce their values “lazily”.

So what do I mean by “lazily”?

Let’s consider this infinite loop I created below. This will run forever, adding to a list (at least until your computer’s memory runs out). And that is the issue. Eventually the computer will run out of memory.


**Note: if you actually decide to run the code above, you will need to force it to stop. Just closing the notebook won’t do it. Go to File > Close and Halt to stop this loop from running and eventually crashing your PC.


This becomes a consideration when working with large data sets. The more memory you chew up, the worse your performance. So a workaround is to use generators. Generators produce the data lazily, meaning they produce a value, yield it, and then forget it. They don’t store all the values in a list the way the loops above do. They only yield one value at a time.

Notice my continued use of the word yield? There is a reason for that. Look at the code for a generator below:

Note that a generator built with yield has to be a function. The yield keyword, which is what makes a function a generator, cannot be used outside of a function.
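A minimal sketch of a generator function follows. The name squares and the choice of squaring numbers are assumptions made for illustration; the original example is not reproduced in the text.

```python
def squares(n):
    """Yield the squares of 0..n-1, one value at a time."""
    for i in range(n):
        yield i ** 2  # hand back one value, then pause until asked again


gen = squares(5)

# next() pulls a single value from the generator.
print(next(gen))  # 0
print(next(gen))  # 1

# A for loop pulls whatever values are left.
for value in gen:
    print(value)  # 4, then 9, then 16
```

Notice the two loops: one inside the function to produce values, and one outside to consume them.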


Now, I know this looks like I am actually repeating work here. I mean, I am using two loops to do what I could do with one loop. The payoff comes when we have large numbers of elements. If you are only iterating over a few hundred numbers, you probably don’t need to worry about generators. However, if you are going to be iterating over 100,000 elements, you will see some major improvements, mostly in memory use, by using the code above.
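One rough way to see the difference is to compare the size of a full list with the size of a generator object. This is only a sketch: count_up is a hypothetical helper, and sys.getsizeof() measures just the container itself, not the integers inside the list, so the real gap is even larger.

```python
import sys


def count_up(n):
    """Hypothetical generator that yields 0..n-1 lazily."""
    for i in range(n):
        yield i


as_list = [i for i in range(100_000)]  # every value stored at once
as_gen = count_up(100_000)             # nothing produced yet

list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
print(list_bytes > gen_bytes)  # True

# The generator can still feed functions that consume values one at a time.
total = sum(count_up(100_000))
print(total)  # 4999950000
```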


Last lesson: list comprehension

Next Lesson: regular expressions


R: Intro to Statistics – Central Tendency

Posted on April 16, 2016. By Ben Larson Ph.D. In R, Uncategorized. Leave a comment.

Central Tendency

One of the primary purposes of statistics is to find a way to summarize data. To do so, we often look for numbers known collectively as measures of central tendency (mean, median, mode).

Look at the list below. This is a list of weekly gas expenditures for two vehicles. Can you tell me if one is better than the other, or are they both about the same?


How about if I show you this?


Using the average, you can clearly see car2 is more cost efficient. At a difference of approximately $21 a week, that is a savings of about $630 over the course of the 30 weeks in the chart. That is a big difference, and one that is easy to see. We can see this using one of the 3 main measures of central tendency, the arithmetic mean, popularly called the average.

Mean

The mean – more accurately the arithmetic mean – is calculated by adding up the elements in a list and dividing by the number of elements in the list.

In R, finding a mean is simple. Just put the values you want to average into a vector (**note: to make a vector in R, use var <- c(x, y, z)). We then put the vector through the function mean().

Median

The median is, simply enough, the middle value of an ordered list (or the average of the two middle values when the list has an even number of elements). This is also easy to find using R. As you can see, you do not even need to sort the list numerically first. Just feed the vector to the function median().

Mode

This is the last of the 3 main measures. The mode is the most common value found in a list. In the list 2,3,2,4,2, the mode is 2. Unfortunately R does not have a built-in function for the statistical mode (the built-in mode() reports an object's storage type instead), so for this we will have to build our own function.

For those familiar with functions in programming, this shouldn’t be too foreign of a concept. However, if you don’t understand functions yet, don’t fret. We will get to them soon enough. For now, just read over the code and see if you can figure any of it out for yourself.


Python: List Comprehension

Posted on April 15, 2016; July 25, 2022. By Ben Larson Ph.D. In Python, Python: Learn Python Course. 4 Comments.

List comprehensions are a method for taking a list of elements and performing a transformation on each element.

In our first example, we want to take numbers 0-9, square them, and have the result end up in a list.

In [5] shows how you would perform this task using a for loop to iterate.

In [1] does the same thing, but it does it in one line.


As you can see, the list comprehension basically crunches the for loop into one line. The syntax is simple enough:

S = [x**2 for x in range(10)]

Assign a variable = [operation for loop]
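To make the comparison concrete, here is a sketch of both versions side by side. The name squares_loop is made up for this example; S matches the syntax line above.

```python
# The for loop version: build the list one append at a time.
squares_loop = []
for x in range(10):
    squares_loop.append(x ** 2)

# The list comprehension version: same result in one line.
S = [x**2 for x in range(10)]

print(S)                  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(S == squares_loop)  # True
```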

Find even numbers in a list:

Here we add an if statement inside the iteration. (x%2 is the modulus operation, meaning it returns the remainder. So 4%2 returns a remainder of 0 and 5%2 returns a remainder of 1.)
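A quick sketch of the even-number filter, using an invented list of numbers:

```python
# Hypothetical input list, used only for illustration.
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Keep x only when dividing by 2 leaves no remainder.
evens = [x for x in numbers if x % 2 == 0]
print(evens)  # [2, 4, 6, 8, 10]
```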

You can use list comprehensions with more complex formulas:
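As one possible illustration, the sketch below converts temperatures from Celsius to Fahrenheit. The formula and the sample values are chosen for this example and are not the ones from the original lesson.

```python
# Hypothetical temperatures in Celsius.
celsius = [0, 25, 100]

# Any expression can go in the "operation" slot of the comprehension.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
print(fahrenheit)  # [32.0, 77.0, 212.0]
```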


You can even use functions from within a list comprehension.
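A short sketch of calling functions inside a comprehension. The function double and the word list are hypothetical, written only for this example; len() is Python's built-in.

```python
def double(n):
    """Hypothetical helper: return twice the input."""
    return n * 2


# Call your own function on each element.
doubled = [double(x) for x in range(5)]
print(doubled)  # [0, 2, 4, 6, 8]

# Built-in functions work the same way.
lengths = [len(word) for word in ["zip", "regex", "generator"]]
print(lengths)  # [3, 5, 9]
```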



Last Lesson: Zip and unpack

Next Lesson: Generators


Analytics4All

Tag: programming
