Company
Date Published
Author
Sam Agnew
Word count
1207
Language
English
Hacker News points
None

Summary

With Python tools like Beautiful Soup, developers can scrape and parse data directly from web pages for use in their projects and applications. This gives them programmatic access to a wide variety of information, even when it is not available through a dedicated REST API. The article's running example is scraping MIDI data from the internet to train a neural network with Magenta that can generate classic Nintendo-sounding music. It covers setting up dependencies, getting familiar with Beautiful Soup, parsing and navigating HTML, and downloading files, and it provides a practical example of using Beautiful Soup to scrape data from the Video Game Music Archive. The article stresses that scraping code must be kept up to date, because changes to a page's HTML can break it, and it encourages readers to explore what they can do with scraped data, such as cleaning it up or training neural networks.
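
The parse-and-navigate step described above can be sketched roughly as follows. This is a minimal illustration, not the article's own code: the sample HTML, the `BASE_URL` value, and the `find_midi_links` helper are all assumptions made for the example, and a real page's markup may look quite different.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; real markup will differ.
SAMPLE_HTML = """
<html><body>
  <a href="/about.html">About</a>
  <a href="music/theme.mid">Theme</a>
  <a href="music/level1.mid">Level 1</a>
  <a href="music/theme.mid">Theme (duplicate link)</a>
</body></html>
"""

# Hypothetical base URL, used only to resolve relative links.
BASE_URL = "https://example.com/archive/"


def find_midi_links(html, base_url):
    """Return absolute URLs of links ending in .mid, in page order, without duplicates."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute at all.
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.lower().endswith(".mid"):
            url = urljoin(base_url, href)
            if url not in links:
                links.append(url)
    return links


midi_links = find_midi_links(SAMPLE_HTML, BASE_URL)
for link in midi_links:
    print(link)
```

From here, each URL could be fetched and saved with something like `urllib.request.urlretrieve(url, filename)`. Because the filter depends on how the page writes its links, a change to the page's HTML is exactly the kind of thing that would require updating this code.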