About This Site
This site is a repository of indices for PyTerrier and Terrier.
The Vaswani NPL corpus is a small test collection of 11,000 abstracts, created in 1990, that has been used by the Glasgow IR group for many years. Due to its small size, it is used in many test cases in both Terrier and PyTerrier.
A document ranking corpus containing 3.2 million documents. Also used by the TREC Deep Learning track.
A passage ranking task based on a corpus of 8.8 million passages released by Microsoft, which should be ranked based on their relevance to questions. Also used by the TREC Deep Learning track.
A new version of the MSMARCO document ranking corpus, containing 11.9 million documents. Also used by the TREC 2021 Deep Learning track.
A revised corpus of 138 million passages released by Microsoft in July 2021, which should be ranked based on their relevance to questions. Also used by the TREC 2021 Deep Learning track.
A collection of scientific articles related to COVID-19. This uses the 2020-07-16 version of CORD-19, which is used by the TREC COVID complete benchmark.