Create your very own web scraper and crawler using Go and Colly!
📂 makescraper
├── README.md
└── scrape.go
Visit github.com/new and create a new repository named `makescraper`.
Run each command line-by-line in your terminal to set up the project:
$ git clone git@github.com:Make-School-Labs/makescraper.git
$ cd makescraper
$ git remote rm origin
$ git remote add origin git@github.com:YOUR_GITHUB_USERNAME/makescraper.git
$ go mod download
Open `README.md` in your editor and replace all instances of `YOUR_GITHUB_USERNAME` with your GitHub username to enable the Go Report Card badge.
Complete each task in the order they appear. Use GitHub Task List syntax to update the task list.
- [ ] IMPORTANT: Complete the Web Scraper Workflow worksheet distributed in class.
- [ ] Create a `struct` to store your data.
- [ ] Refactor the `c.OnHTML` callback on line 16 to use the selector(s) you tested while completing the worksheet.
- [ ] Print the data you scraped to `stdout`.
- [ ] Add more fields to your `struct`. Extract multiple data points from the website. Print them to `stdout` in a readable format.
- [ ] Serialize the `struct` you created to JSON. Print the JSON to `stdout` to validate it.
- [ ] Write scraped data to a file named `output.json`.
- [ ] Add, commit, and push to GitHub.
- BEW 2.5 - Scraping the Web: Concepts and examples covered in class related to web scraping and crawling.
- Colly - Docs: Check out the sidebar for 20+ examples!
- Ali Shalabi - Syntax-Helper: Command line interface to help generate proper code syntax, pulled from the Golang documentation.
- JSON to Struct: Paste any JSON data and convert it into a Go structure that will support storing that data.
- GoByExample - JSON: Covers Go's built-in support for JSON encoding and decoding to and from built-in and custom data types (structs).
- GoByExample - Writing Files: Covers creating new files and writing to them.