This package contains a Python script that handles two tasks.

**Task 1 — Broken link check:**
- Read website page URLs from Excel
- Open each page
- Extract all `<a href="">` links
- Check which links are broken/inaccessible
- Export results to a new Excel file
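The core of the broken-link check can be sketched as below. This is a minimal illustration using the libraries the script depends on; the function names (`extract_links`, `is_broken`) are illustrative, not necessarily the names used inside `scraper_task.py`.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin


def extract_links(page_url, html):
    """Collect absolute URLs from every <a href> on a page."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(page_url, a["href"]) for a in soup.find_all("a", href=True)]


def is_broken(url, timeout=10):
    """A link counts as broken if the request fails or returns HTTP >= 400."""
    try:
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        if resp.status_code >= 400:
            # Some servers reject HEAD; retry with GET before reporting broken.
            resp = requests.get(url, timeout=timeout, allow_redirects=True)
        return resp.status_code >= 400
    except requests.RequestException:
        return True
```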
**Task 2 — Image download:**
- Read website page URLs from Excel
- Open each page
- Extract all `<img src="">` image links
- Download all images into one folder
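The image-download task can be sketched in the same style. This is an assumed implementation, not the packaged script itself; filenames are taken from the URL path, and images that fail to download are skipped.

```python
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse


def image_urls(page_url, html):
    """Absolute URLs for every <img src> on a page."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(page_url, img["src"]) for img in soup.find_all("img", src=True)]


def download_images(urls, folder):
    """Save each image into one folder, keeping its original filename."""
    os.makedirs(folder, exist_ok=True)
    for url in urls:
        name = os.path.basename(urlparse(url).path) or "image"
        try:
            resp = requests.get(url, timeout=10)
            resp.raise_for_status()
            with open(os.path.join(folder, name), "wb") as f:
                f.write(resp.content)
        except requests.RequestException:
            pass  # skip images that fail to download
```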
**Input Excel format** — keep the first column header as `URL`. Example rows:
- https://lanoequip.com/
- https://lanoequip.com/new-equipment.html
- https://lanoequip.com/parts-toro.html
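Reading that column with `openpyxl` might look like the sketch below (the helper name `read_urls` is illustrative; the real script's internals may differ). It skips the header row and ignores empty cells.

```python
from openpyxl import load_workbook


def read_urls(path):
    """Return non-empty values from the first column, skipping the header row."""
    ws = load_workbook(path).active
    return [row[0] for row in ws.iter_rows(min_row=2, values_only=True) if row[0]]
```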
**Setup and usage** — open a terminal / command prompt:

```
pip install requests beautifulsoup4 openpyxl
python scraper_task.py --task broken_links --input input_urls.xlsx --output broken_links_report.xlsx
python scraper_task.py --task download_images --input input_urls.xlsx --output downloaded_images
```

**Notes:**
- Some websites block scraping or reject `HEAD` requests; the script falls back to `GET` if needed.
- Relative links like `/about` are automatically converted to full URLs.
- `mailto:`, `tel:`, `javascript:`, and `#anchor` links are ignored.
- If a page itself does not open, the script records that page as an error in the report.
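The link-filtering rules described in the notes can be expressed as a small helper. This is a sketch under the stated rules (the name `normalize_link` is an assumption, not taken from the script): skippable schemes and same-page anchors return `None`, and everything else is resolved to an absolute URL.

```python
from urllib.parse import urljoin, urlparse

# Link schemes the scraper should never follow.
SKIPPED_SCHEMES = {"mailto", "tel", "javascript"}


def normalize_link(page_url, href):
    """Return an absolute URL, or None for links that should be skipped."""
    href = href.strip()
    if not href or href.startswith("#"):
        return None  # same-page anchor
    if urlparse(href).scheme in SKIPPED_SCHEMES:
        return None  # mailto:, tel:, javascript:
    return urljoin(page_url, href)  # e.g. /about -> https://site/about
```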
**Report Excel columns:** `Page URL`, `Broken Links`, `Status`
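Writing that report with `openpyxl` could be sketched as follows; the column headers match the ones listed above, while the helper name and the tuple-per-row shape are assumptions for illustration.

```python
from openpyxl import Workbook


def write_report(rows, path):
    """rows: iterable of (page_url, broken_link, status) tuples."""
    wb = Workbook()
    ws = wb.active
    ws.append(["Page URL", "Broken Links", "Status"])  # header row
    for row in rows:
        ws.append(list(row))
    wb.save(path)
```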