Python: Webscraping using BeautifulSoup and Requests

I covered an introduction to webscraping with Requests in an earlier post. You can check it out here: Requests

As a quick refresher, the requests module allows you to call a website from Python and retrieve the HTML behind it. In this lesson we are going to build on that functionality by adding the BeautifulSoup module.

BeautifulSoup

BeautifulSoup provides a useful HTML parser that makes it much easier to work with the HTML results from Requests. Let’s start by importing the libraries we will need for this lesson.

The syntax is BeautifulSoup(HTML, 'html.parser')
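As a minimal sketch of that call, the snippet below imports the library and parses a small hard-coded HTML string. The `html_doc` sample is made up for illustration and stands in for a downloaded page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here in place of a downloaded page
html_doc = "<html><head><title>Sample Page</title></head><body><p>Hello</p></body></html>"

# First argument: the HTML to parse; second: the name of the parser to use
soup = BeautifulSoup(html_doc, "html.parser")

print(type(soup).__name__)  # BeautifulSoup
print(soup.p)               # <p>Hello</p>
```

'html.parser' is the parser that ships with Python, so nothing beyond the bs4 package needs to be installed for this to run.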

The HTML I am sending to BeautifulSoup comes from my requests.get() call. In the last lesson, I used r.text to print out the HTML; here I am passing r.content to BeautifulSoup and printing out the results.

Note that I am also using soup.prettify() to make the printout easier for humans to read.
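Here is a rough sketch of how those pieces might fit together. The helper name `fetch_soup`, the timeout value, and the placeholder URL are assumptions for illustration, not part of the original lesson. The live request is left commented out, and prettify() is shown on a made-up sample so the snippet runs without a network connection.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url):
    """Download a page and return it parsed with html.parser (hypothetical helper)."""
    r = requests.get(url, timeout=10)
    r.raise_for_status()  # raise an error on 4xx/5xx responses
    # r.content is the raw bytes of the response body
    return BeautifulSoup(r.content, "html.parser")


# Offline demonstration: made-up bytes standing in for r.content
sample = b"<html><head><title>Sample Page</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(sample, "html.parser")

# prettify() returns the markup as an indented string, one tag per line
pretty = soup.prettify()
print(pretty)

# To try it against a live site (placeholder URL, swap in your own):
# soup = fetch_soup("https://example.com")
# print(soup.prettify())
```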

BeautifulSoup makes parsing the HTML code easier. Below I am asking to see soup.title, which returns the title element together with its “title” markup tags.

Going one step further, we can use soup.title.string to get just the string without the markup tags.
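The difference between the two is easiest to see side by side. This is a small sketch on a made-up HTML string (`html_doc` is an illustrative sample, not the page used in the lesson).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with a single <title> element
html_doc = "<html><head><title>Sample Page</title></head><body></body></html>"
soup = BeautifulSoup(html_doc, "html.parser")

# soup.title is the whole element, tags included
print(soup.title)         # <title>Sample Page</title>

# soup.title.string is only the text inside the element
print(soup.title.string)  # Sample Page
```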

soup.get_text() returns all the text in the HTML code without the markup tags.
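As a small illustration on made-up markup, the sketch below calls get_text() twice: once with no arguments, and once with the optional `separator` and `strip` arguments, which can help when text from neighboring tags runs together.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with nested tags
html_doc = "<html><body><h1>Heading</h1><p>First <b>bold</b> paragraph.</p></body></html>"
soup = BeautifulSoup(html_doc, "html.parser")

# All text, concatenated exactly as it appears between the tags
text = soup.get_text()
print(text)   # HeadingFirst bold paragraph.

# Strip each piece of text and join the pieces with a space
clean = soup.get_text(separator=" ", strip=True)
print(clean)  # Heading First bold paragraph.
```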

In HTML, the ‘a’ tag marks a link, and its ‘href’ attribute holds the address the link points to.

We can use that to build a for loop that reads all the links on the webpage.
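A loop along those lines might look like the sketch below. It runs on a made-up HTML string so it works offline; the `links` list and the sample addresses are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two real links and one anchor without an href
html_doc = """
<html><body>
<a href="https://example.com/first">First</a>
<a href="/second">Second</a>
<a name="anchor-only">No link target</a>
</body></html>
"""
soup = BeautifulSoup(html_doc, "html.parser")

links = []
# find_all("a") returns every <a> tag in the document, in order
for link in soup.find_all("a"):
    href = link.get("href")  # None when the tag has no href attribute
    if href:
        links.append(href)
        print(href)
```

Using link.get("href") rather than link["href"] avoids a KeyError on anchors that have no href attribute.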

