Web Scraping with Python - Beautiful Soup Crash Course

1h 08m video · Transcribed Jun 30, 2026 · freeCodeCamp.org

Beginner · 25 min read · For: beginner Python developers interested in learning web scraping with Beautiful Soup.

Views: 1.8M

Likes: 39.2K

Comments: 1.1K

Dislikes: 526

2.2%

📈 Moderate

AI Summary

This tutorial teaches web scraping with Python using the Beautiful Soup library. It covers basic HTML parsing, then progresses to scraping a real job listing website (TimesJobs) with requests, filtering results, and saving data to files. The course is structured for beginners and includes hands-on coding examples.

Chapters

1 Introduction and Setup 00:00

2 Scraping Basic HTML 05:36

3 Scraping Real Website with Requests 16:35

4 Extracting Job Details and Filtering 30:10

5 Saving Results and Automating 48:30

[00:33]

Goal of Tutorial

The tutorial teaches web scraping with Beautiful Soup, starting with a basic local HTML page, moving on to a real website, and finishing by storing the scraped data in files.

[05:36]

Installing Libraries

Install beautifulsoup4 and lxml parser using pip install.

[07:20]

Reading HTML File

Open home.html inside a with open(...) block and read its content into a variable.
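A minimal sketch of this step. The file contents are invented for illustration, and a temporary directory stands in for the project folder so the example runs anywhere; in the tutorial, home.html already sits next to the script.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the tutorial's home.html (contents are invented).
sample_html = "<html><body><h5>Python for beginners</h5></body></html>"

with tempfile.TemporaryDirectory() as tmp_dir:
    html_path = Path(tmp_dir) / "home.html"
    html_path.write_text(sample_html, encoding="utf-8")

    # The step itself: open the file in read mode and keep its text in a variable.
    with open(html_path, "r", encoding="utf-8") as html_file:
        content = html_file.read()

print(content)
```

The with block closes the file automatically, even if reading raises an error.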

[12:50]

Finding Tags with Beautiful Soup

Use find() to get first match and find_all() to get all matches. Access tag text via .text.
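The difference between the two calls can be sketched on a small inline snippet. The HTML below is invented, and the built-in "html.parser" is swapped in for the tutorial's "lxml" only so the sketch runs without the extra install.

```python
from bs4 import BeautifulSoup

# Invented sample markup, not the tutorial's actual home.html.
html = """
<div class="card"><h5>Python for beginners</h5></div>
<div class="card"><h5>Python Web Development</h5></div>
"""

# The tutorial passes "lxml"; "html.parser" ships with Python.
soup = BeautifulSoup(html, "html.parser")

first_title = soup.find("h5")      # first matching tag, or None if absent
all_titles = soup.find_all("h5")   # list-like collection of every match

print(first_title.text)
for title in all_titles:
    print(title.text)
```

Because find() returns None when nothing matches, calling .text on its result fails on pages that lack the tag; find_all() simply returns an empty result in that case.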

[16:35]

Inspecting Real Websites

Use browser inspect (right-click) to view HTML structure and identify elements to scrape.

[24:47]

Scraping TimesJobs with Requests

Use requests.get(url).text to fetch page HTML, then parse with Beautiful Soup.
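One way to wrap this step in a helper is sketched below. The function name fetch_soup, its get parameter (which allows substituting a fake fetcher when trying the code offline), and the timeout value are assumptions for illustration, not part of the tutorial. The built-in "html.parser" again stands in for "lxml".

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, get=requests.get):
    """Download a page and return it parsed (hypothetical helper)."""
    # .text gives the response body decoded to a string, as in the tutorial.
    html_text = get(url, timeout=10).text
    return BeautifulSoup(html_text, "html.parser")
```

Calling fetch_soup with the job search URL used in the video would return a soup object ready for find() and find_all(). Check a site's terms of use before scraping it.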

[33:10]

Extracting Job Details

Find the company name (h3.job-list-comp-name), skills (span.srp-skills), and posted date (span.sim-posted) by calling find() with the class_ parameter.
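A sketch of these lookups on an invented job card. The h3 and span class names are the ones listed above and should be checked against the live page with the browser inspector; the li class "job-card" and all text values are hypothetical.

```python
from bs4 import BeautifulSoup

# Invented sample; only the h3/span class names come from the notes above.
html = """
<ul>
  <li class="job-card">
    <h3 class="job-list-comp-name"> Acme Corp </h3>
    <span class="srp-skills"> python , django </span>
    <span class="sim-posted"><span>Posted few days ago</span></span>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")  # the tutorial uses "lxml"
job = soup.find("li", class_="job-card")

# class_ has a trailing underscore because "class" is a reserved word in Python.
company_name = job.find("h3", class_="job-list-comp-name").text.strip()
skills = job.find("span", class_="srp-skills").text.strip()
posted_date = job.find("span", class_="sim-posted").text.strip()

print(company_name, skills, posted_date, sep=" | ")
```

Scraped text often carries leading and trailing whitespace, which is why each value is passed through strip().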

[42:00]

Filtering Jobs by Date

Only include jobs with 'few' in the posted date text to get recent postings.
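The filter amounts to a substring check inside the loop over job cards. In this sketch the markup, company names, and the "job-card" class are invented; "html.parser" replaces the tutorial's "lxml" so nothing extra needs installing.

```python
from bs4 import BeautifulSoup

# Two invented job cards: one recent, one older.
html = """
<li class="job-card"><h3 class="job-list-comp-name">Acme Corp</h3>
  <span class="sim-posted">Posted few days ago</span></li>
<li class="job-card"><h3 class="job-list-comp-name">Globex</h3>
  <span class="sim-posted">Posted 6 days ago</span></li>
"""

soup = BeautifulSoup(html, "html.parser")

recent_companies = []
for job in soup.find_all("li", class_="job-card"):
    posted_date = job.find("span", class_="sim-posted").text
    # Keep only postings whose date text contains the word "few".
    if "few" in posted_date:
        name = job.find("h3", class_="job-list-comp-name").text.strip()
        recent_companies.append(name)

print(recent_companies)
```

Checking the date first means the remaining fields are only extracted for jobs that pass the filter.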

[56:00]

Saving Results to Files

Write each job's info to a separate text file in the 'posts' directory using with open and f.write.
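A sketch of the saving step, assuming the job details have already been collected into a list. The helper name save_jobs, the dictionary keys, and the sample data are hypothetical; a temporary directory is used so the example leaves no files behind, whereas the tutorial writes to a posts folder beside the script.

```python
import tempfile
from pathlib import Path

# Invented data standing in for scraped results.
jobs = [
    {"company": "Acme Corp", "skills": "python, django"},
    {"company": "Globex", "skills": "python, sql"},
]


def save_jobs(jobs, posts_dir):
    """Write one text file per job, named by its index (hypothetical helper)."""
    posts_dir = Path(posts_dir)
    posts_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    written = []
    for index, job in enumerate(jobs):
        file_path = posts_dir / f"{index}.txt"
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(f"Company Name: {job['company']}\n")
            f.write(f"Required Skills: {job['skills']}\n")
        written.append(file_path)
    return written


with tempfile.TemporaryDirectory() as tmp_dir:
    paths = save_jobs(jobs, Path(tmp_dir) / "posts")
    first_file_text = paths[0].read_text(encoding="utf-8")

print(first_file_text)
```

Opening with mode "w" overwrites any existing file of the same name, so rerunning the scraper replaces earlier results rather than appending to them.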

[68:00]

Conclusion

Final wrap-up: viewers are encouraged to subscribe and check the channel for more content.

This tutorial provides a comprehensive introduction to web scraping with Python, covering local HTML parsing, real website scraping with requests, filtering, and saving data to files.

Clickbait Check

90% Legit

"Delivers exactly what the title promises: a full crash course on web scraping with Beautiful Soup, from basics to advanced filtering and saving."

Mentioned in this Video

Beautiful Soup 4 (tool)

lxml parser (tool)

requests library (tool)

freeCodeCamp (service)

JimShapedCoding, the presenter's channel (person)

TimesJobs (service)

Tutorial Checklist

1 00:33 Understand the goal: learn web scraping with Beautiful Soup.

2 05:36 Install beautifulsoup4 and lxml using 'pip install beautifulsoup4 lxml'.

3 07:20 Open a local HTML file using 'with open('home.html', 'r') as html_file:' and read its content.

4 11:10 Create a Beautiful Soup object: soup = BeautifulSoup(content, 'lxml').

5 12:50 Use soup.find('tag') to get the first element or soup.find_all('tag') to get all.

6 15:57 Access text inside a tag using .text attribute.

7 20:02 Filter by class: use class_='class-name' in find/find_all.

8 24:47 For a real website, use requests.get(url).text to get HTML.

9 30:10 Parse the fetched HTML with Beautiful Soup and find list items (li) with a specific class.

10 33:10 Within each job element, find company name (h3), skills (span), and posted date (span) using find with class_.

11 42:00 Filter jobs by checking if 'few' is in the posted date text.

12 56:00 Save each job's info to a separate text file using 'with open(f'posts/{index}.txt', 'w') as f: f.write(...)'.

Study Flashcards (11)

What library is used for web scraping in this tutorial?

easy

Beautiful Soup (bs4).