- Download and unzip The CLEF Test Suite for the CLEF 2000-2003 Campaigns – Evaluation Package.
- Set the environment variable `CLEF_HOME` to point to the location of the unzipped dataset.
- (Optional) Download and unzip Swahili (SW) and Somali (SO) CLEF queries here.
- (Optional) Set `CLEF_LOWRES_DIR` in `clef_paths.py` to where you unzipped the dataset.
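
The setup steps above can be sanity-checked before running any dataloader. The following is a minimal sketch, not part of this repository: the helper name `missing_clef_dirs` is hypothetical, and it assumes that `CLEF_HOME` points at the top-level `clef/` folder containing the `DocumentData`, `RelAssess` and `Topics` sub-folders from the layout described in this README.

```python
import os
from pathlib import Path

# Assumption: CLEF_HOME points at the top-level "clef/" folder and contains
# these sub-folders; adjust the tuple if your copy is organised differently.
EXPECTED_SUBDIRS = ("DocumentData", "RelAssess", "Topics")


def missing_clef_dirs(clef_home=None):
    """Return the expected sub-folders that are absent under CLEF_HOME.

    Hypothetical helper: falls back to the CLEF_HOME environment variable
    when no path is passed explicitly.
    """
    root = clef_home if clef_home is not None else os.environ.get("CLEF_HOME")
    if not root:
        raise RuntimeError("CLEF_HOME is not set")
    root = Path(root).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"CLEF_HOME is not a directory: {root}")
    # Keep the order of EXPECTED_SUBDIRS so the report is predictable.
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

Calling `missing_clef_dirs()` with `CLEF_HOME` exported returns an empty list when all three folders are present, and otherwise the names of the folders that still need to be unzipped or moved into place.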
Dataloaders expect the following structure after downloading and unzipping CLEF:
clef/
├── clef-low-resource
│   └── long_paper
├── DocumentData
│   ├── dutch
│   ├── english
│   ├── finnish
│   ├── french
│   ├── german
│   ├── italian
│   └── russian
├── RelAssess
│   ├── 2001
│   ├── 2002
│   └── 2003
└── Topics
    ├── 2001
    ├── 2002
    └── 2003

@inproceedings{Bonab2019swahiliclef,
  author = {Bonab, Hamed and Allan, James and Sitaraman, Ramesh},
  title = {Simulating CLIR Translation Resource Scarcity Using High-Resource Languages},
  year = {2019},
  url = {https://doi.org/10.1145/3341981.3344236},
  booktitle = {Proceedings of ICTIR},
  pages = {129--136},
}

@inproceedings{braschler2003clef,
  title = {{CLEF} 2003--Overview of results},
  author = {Braschler, Martin},
  booktitle = {Workshop of the Cross-Language Evaluation Forum for European Languages},
  pages = {44--63},
  year = {2003},
  organization = {Springer}
}