edwardbickerton/Fed-Scraper

Fed Scraper

This web scraper, built using the Scrapy framework, collects text data from various documents surrounding Federal Open Market Committee (FOMC) meetings found on the Federal Reserve website.

Spiders

This Scrapy project consists of the following spiders:

  1. beige_book_archive
  2. beige_book_current
  3. fomc_calendar
  4. historical_materials

Usage

Run each spider with the Scrapy command-line tool by calling scrapy crawl followed by the spider name from the Scrapy project directory. I recommend running the spiders in the order listed above.
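As a minimal sketch of running the spiders in the recommended order, the following Python script shells out to scrapy crawl once per spider. The helper names build_commands and run_all are hypothetical and not part of this repository; the script assumes the scrapy executable is on your PATH and that it is called from the Scrapy project directory.

```python
import subprocess

# Spider names as listed above, in the recommended run order.
SPIDERS = [
    "beige_book_archive",
    "beige_book_current",
    "fomc_calendar",
    "historical_materials",
]


def build_commands(spiders=SPIDERS):
    """Return one `scrapy crawl <name>` command per spider, in order."""
    return [["scrapy", "crawl", name] for name in spiders]


def run_all(project_dir="."):
    """Run every spider sequentially; stop at the first failure."""
    for cmd in build_commands():
        # check=True raises CalledProcessError if a crawl exits non-zero.
        subprocess.run(cmd, cwd=project_dir, check=True)


# Uncomment to run from the Scrapy project directory:
# run_all()
```

Running the spiders one after another, rather than in parallel, keeps the recommended order intact.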

Alternatively, the data will be made available on Kaggle at https://www.kaggle.com/datasets/edwardbickerton/fomc-text-data.

Output

The Scrapy spiders save each document as a row of the CSV file data/fomc_documents.csv, which has the following columns:

  1. document_kind
    • A list of document kinds in the dataset can be found here.
  2. meeting_date
    • The date of the FOMC meeting associated with the document.
    • Beige Books scraped by the beige_book spiders provide a release_date but no meeting_date. For these documents I set meeting_date to the closest subsequent meeting via the Scrapy pipeline MeetingDatesPipeline.
  3. release_date
  4. url
    • The web address of the document.
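The rule for Beige Books described under meeting_date (use the closest subsequent meeting) can be sketched as follows. This is not the code of MeetingDatesPipeline, only an illustration of the idea; the function name next_meeting_date is hypothetical, the example dates are illustrative, and treating a meeting held on the release date itself as "subsequent" is an assumption.

```python
from bisect import bisect_left
from datetime import date
from typing import Iterable, Optional


def next_meeting_date(release_date: date,
                      meeting_dates: Iterable[date]) -> Optional[date]:
    """Return the earliest meeting on or after release_date, else None.

    Assumption: a meeting on the release date itself counts as a match.
    """
    ordered = sorted(meeting_dates)
    # Index of the first meeting date that is >= release_date.
    i = bisect_left(ordered, release_date)
    return ordered[i] if i < len(ordered) else None


# Illustrative dates only, not read from the scraped data.
meetings = [date(2023, 3, 22), date(2023, 2, 1)]
assigned = next_meeting_date(date(2023, 1, 18), meetings)
```

A release date later than every known meeting returns None, so the caller can decide how to handle documents whose next meeting has not been scraped yet.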

The documents are then grouped by document_kind and split into separate CSV files in the data/documents_by_type directory.
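A minimal sketch of this grouping step, using only the Python standard library, might look like the following. The function name split_by_kind and the output naming scheme (one file per kind, named after the document_kind value) are assumptions for illustration; the project's actual implementation and file names may differ.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_kind(src, out_dir):
    """Write one CSV per document_kind; return the row count per kind."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    groups = defaultdict(list)
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        for row in reader:
            groups[row["document_kind"]].append(row)

    counts = {}
    for kind, rows in groups.items():
        # Assumed naming scheme: <document_kind>.csv
        path = out_dir / f"{kind}.csv"
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        counts[kind] = len(rows)
    return counts


# Example call with the paths named above:
# split_by_kind("data/fomc_documents.csv", "data/documents_by_type")
```

If any column holds very long text, the csv module's default field size limit may need to be raised with csv.field_size_limit before reading.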

Details of the CSV files can be found in this table.
