Web scraping Python Html to JSON/Dict Project with Code

In this post, we’ll see how to build a Python web scraping project, with code. We are going to scrape Covid-19 statistics and convert the scraped HTML table to a Python dict/JSON using BeautifulSoup, list comprehensions, Python Requests and the lxml library.


The code below follows the video:

Transcript / Cheat Sheet:

Step 1: Search for the Required Link:

Finding a suitable link plays an important role in a web scraping project, because the whole code depends on the page being scraped. If you decide to change the link at a later stage, much of the code will have to be reworked.

Best Practices:

Check the authenticity of the link to be scraped.
The data you want to scrape shouldn’t be encrypted or obfuscated in Inspect Element / Page Source.
The site shouldn’t block you for scraping (a major issue): if it does, you need to find a workaround, such as sending a User-Agent header with your requests and adding a proxy.
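The User-Agent and proxy workaround from the last point can be sketched with requests roughly as follows. This is a minimal sketch, not a guaranteed way around a block: the User-Agent string and the proxy address are placeholder assumptions, and build_session and fetch_page are hypothetical helper names, not part of requests.

```python
import requests

# Placeholder values (assumptions): use a real browser User-Agent and your own proxy.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
PROXIES = {
    "http": "http://127.0.0.1:8080",
    "https": "http://127.0.0.1:8080",
}


def build_session(user_agent=USER_AGENT, proxies=None):
    """Return a requests.Session that sends a custom User-Agent and optional proxies."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    if proxies:
        session.proxies.update(proxies)
    return session


def fetch_page(session, url):
    """Download a page with the configured session and return its HTML text."""
    response = session.get(url, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text
```

Calling fetch_page(build_session(proxies=PROXIES), "https://covid-19.hackanons.com/test.html") would then send the request with the custom header through the proxy. A session is used so the same settings apply to every request made with it.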

Step 2: Analyze the Data:

Once we have the required link, the next step is to analyze the tags, classes and ids where our data resides. For this post we are scraping data from an HTML table, so we are interested in the id attached to that table. Later we’ll use this table id to filter the table out of the whole page source and convert the HTML table to JSON / a Python dict using list comprehensions and BeautifulSoup.
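Besides Inspect Element, the table ids can also be listed from Python. The sketch below runs on a small made-up HTML snippet rather than the real page, so the second table id and the cell values are assumptions for illustration. It uses Python's built-in "html.parser" to keep the example independent of lxml.

```python
from bs4 import BeautifulSoup

# Made-up page source standing in for the real site (assumption for illustration).
SAMPLE_HTML = """
<html><body>
  <table id="nav_table"><tr><td>Home</td></tr></table>
  <table id="main_table_countries_today">
    <tr><th>Country,Other</th><th>TotalCases</th></tr>
    <tr><td>India</td><td>100</td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# List the id of every table on the page to see which one holds the data.
table_ids = [table.get("id") for table in soup.find_all("table")]
print(table_ids)

# Select the table by its id and look at the header cells.
table = soup.find("table", id="main_table_countries_today")
header_texts = [th.get_text(strip=True) for th in table.find_all("th")]
print(header_texts)
```

On the real page, the same two steps show which id to pass to soup.find before writing the rest of the scraper.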

Step 3: Install the required Libraries:

pip install requests
pip install beautifulsoup4
pip install lxml

Step 4: Get your Hands dirty with Coding:

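import requests
import bs4

# Download the page; fail early on HTTP errors instead of parsing an error page.
page = requests.get("https://covid-19.hackanons.com/test.html", timeout=10)
page.raise_for_status()

soup = bs4.BeautifulSoup(page.text, "lxml")

# Pick out the statistics table by the id found in Step 2.
table = soup.find("table", id="main_table_countries_today")

# Column names; the "Country,Other" header is shortened to "Country".
headers = [heading.text.replace(",Other", "").strip() for heading in table.find_all("th")]

# One dict per table row, keyed by column name.
# Rows without <td> cells (such as the header row) are skipped.
results = [
    {headers[index]: cell.text.strip() for index, cell in enumerate(row.find_all("td"))}
    for row in table.find_all("tr")
    if row.find_all("td")
]

# Print the entry for a single country.
for entry in results:
    if entry.get("Country") == "India":
        print(entry)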
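The script above stops at a list of Python dicts. To get actual JSON, the list can be passed to the standard json module. The sketch below uses a couple of placeholder rows in the same shape as results. The column names other than "Country" and all the numbers are made up for illustration.

```python
import json

# Placeholder rows in the same shape as `results`; the values are made up.
results = [
    {"Country": "India", "TotalCases": "100", "TotalDeaths": "1"},
    {"Country": "Brazil", "TotalCases": "200", "TotalDeaths": "2"},
]

# Serialize the list of dicts to a JSON string.
json_text = json.dumps(results, indent=2, ensure_ascii=False)
print(json_text)

# Parsing the string again gives back the same list of dicts.
round_trip = json.loads(json_text)
```

To save the data instead of printing it, json.dump(results, file_handle) writes the same output to an open file.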
