Page 1 of 1

We are excited to scale up our

Posted: Thu Jul 10, 2025 3:27 am
by asimm22
“Our first stage of the Archives Unleashed Project,” explains Professor Milligan, “built a stand-alone service that turns web archive data into a format that scholars could easily use. We developed several tools, methods and cloud-based platforms that allow researchers to download a large web archive from which they can analyze all sorts of information, from text and network data to statistical information. The next logical step is to integrate our service with the Internet Archive, which will allow a scholar to run the full cycle of collecting and analyzing web archival content through one portal.”

“Researchers, from both the sciences and the buy sales lead humanities, are finally starting to realize the massive trove of archived web materials that can support a wide variety of computational research,” said Bailey. “ collaboration with Archives Unleashed to make the petabytes of web and data archives collected by Archive-It partners and other web archiving institutions around the world more useful for scholarly analysis.”

The project begins in July 2020 and will begin releasing public datasets as part of the integration later in the year. Upcoming and future work includes technical integration of Archives Unleashed and Archive-It, creation and release of new open-source tools, datasets, and code notebooks, and a series of in-person “datathons” supporting a cohort of scholars using archived web data and collections in their data-driven research and analysis. We are grateful to The Andrew W. Mellon Foundation for their support of this integration and collaboration in support of critical infrastructure supporting computational scholarship and its use of the archived web.