This repo provides the source code and data of our paper.
Please install the main requirements by running `pip install -r requirements.txt`.
- Download all the processed data from the link under processed_data and place it in the `./processed_data/` folder.
- To reproduce our method's experimental results in open-domain QA (Table 1 of the paper), please refer to `./scripts/odqa_scripts/`.
- To reproduce our method's experimental results on synthetic data (Figure 2 of the paper), please refer to `./scripts/synthetic_scripts/`.
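Before launching the scripts, it can help to confirm that the directories named above are in place. The following is a minimal sketch of such a check, not part of the released code: the helper `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical names introduced here, and the check assumes only the three directory paths mentioned in this README (it knows nothing about the files inside them).

```python
from pathlib import Path

# Directories named in this README. The check itself is an illustrative
# helper (hypothetical), not something shipped with the repository.
EXPECTED_DIRS = (
    "processed_data",
    "scripts/odqa_scripts",
    "scripts/synthetic_scripts",
)


def missing_dirs(repo_root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        path = root / rel
        # Treat a directory as missing if it does not exist or has no entries.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run from the repository root, this reports which of the three folders still need to be downloaded or populated. An empty `processed_data` folder is flagged as well, since the data has to be placed there manually.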
If you want to construct retrieval data from scratch, you first need to download four original QA datasets from this link and place them in the `original_data` folder. Then refer to `./original_data/download_data.sh` to download Wikipedia passages, and then refer to `./scripts/retrieval_scripts/` to construct retrieval-augmented queries.
Our code builds primarily on NBCE and lost-in-the-middle. We thank the authors for their awesome implementations.