Last updated: 2021-06-30 18:45:06
Cover Page
Title Page
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Packt Upsell
Why subscribe?
PacktPub.com
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
Getting Started with Scraping
Introduction
Setting up a Python development environment
Getting ready
How to do it...
Scraping Python.org with Requests and Beautiful Soup
Getting ready...
How it works...
Scraping Python.org with urllib3 and Beautiful Soup
How it works
There's more...
Scraping Python.org with Scrapy
Scraping Python.org with Selenium and PhantomJS
Data Acquisition and Extraction
How to parse websites and navigate the DOM using Beautiful Soup
Searching the DOM with Beautiful Soup's find methods
Querying the DOM with XPath and lxml
Querying data with XPath and CSS selectors
Using Scrapy selectors
Loading data in Unicode / UTF-8
Processing Data
Working with CSV and JSON data
How to do it
Storing data using AWS S3
Storing data using MySQL
Storing data using PostgreSQL
Storing data in Elasticsearch
How to build robust ETL pipelines with AWS SQS
How to do it - posting messages to an AWS queue