Channel: Blog – Center for Data Innovation

Documenting Historical Newswire Articles

Researchers at Harvard University have created a dataset of almost three million newswire articles published between 1878 and 1977. Newswire services distribute news stories and other content to media outlets. The researchers built the dataset by extracting roughly 140 million articles from the front pages of local U.S. newspapers, using a deep learning model to analyze the page scans. For each newswire article, the dataset lists the newspapers that ran it, the publication dates, the dispatch location, the people mentioned in the text, and the general topic.

Get the data.

Image credit: Annie Spratt

