Visconde: Multi-document QA with GPT-3 and Neural Reranking

This repository contains the code used in the paper Visconde: Multi-document QA with GPT-3 and Neural Reranking, accepted for publication at the 45th European Conference on Information Retrieval (ECIR 2023).

Abstract: This paper discusses a question-answering system that can answer questions whose supporting evidence is spread over multiple (potentially long) documents. The system, called Visconde, uses a three-step pipeline to perform the task: decompose, retrieve, and aggregate. The first step decomposes the question into simpler questions using a few-shot large language model (LLM). Then, a state-of-the-art search engine is used to retrieve candidate passages from a large collection for each decomposed question. In the final step, we use the LLM in a few-shot setting to aggregate the contents of the passages into the final answer. The system is evaluated on three datasets: IIRC, Qasper, and StrategyQA. Results suggest that current retrievers are the main bottleneck and that readers are already performing at the human level as long as relevant passages are provided. The system is also shown to be more effective when the model is induced to give explanations before answering a question.
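
The decompose, retrieve, and aggregate steps described in the abstract can be pictured with a minimal sketch. It is an illustration only, not the code in this repository: `answer_question`, the three callables, and the toy stand-ins below are hypothetical names, and the toy functions replace the few-shot LLM and the search engine with simple word matching so the example runs on its own.

```python
from typing import Callable, List


def answer_question(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    aggregate: Callable[[str, List[str]], str],
    k: int = 5,
) -> str:
    """Decompose the question, retrieve passages per sub-question, then aggregate."""
    # Fall back to the original question if decomposition returns nothing.
    sub_questions = decompose(question) or [question]
    passages: List[str] = []
    for sub_question in sub_questions:
        for passage in retrieve(sub_question, k):
            if passage not in passages:  # keep only the first occurrence
                passages.append(passage)
    return aggregate(question, passages)


# Toy stand-ins (assumptions): a real system would call an LLM and a search engine.
CORPUS = [
    "Ada Lovelace was born in London.",
    "London is the capital of the United Kingdom.",
]


def toy_decompose(question: str) -> List[str]:
    # A few-shot LLM would produce these simpler questions.
    return ["Where was Ada Lovelace born?", "Which country is London the capital of?"]


def toy_retrieve(query: str, k: int) -> List[str]:
    # Rank passages by the number of words shared with the query.
    terms = set(query.lower().rstrip("?").split())
    scored = [(len(terms & set(p.lower().rstrip(".").split())), p) for p in CORPUS]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [p for score, p in ranked[:k] if score > 0]


def toy_aggregate(question: str, passages: List[str]) -> str:
    # An LLM would write an explanation and a final answer from the passages.
    return " ".join(passages)


answer = answer_question(
    "Ada Lovelace was born in the capital of which country?",
    toy_decompose,
    toy_retrieve,
    toy_aggregate,
    k=1,
)
print(answer)
```

Passing the three steps as callables keeps the control flow visible and lets each stage be swapped independently, which matches the abstract's observation that the retriever, not the reader, is the main bottleneck.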

We evaluated our proposal on three datasets: IIRC, QASPER and StrategyQA.

Reproducing

Download datasets

sh setup.sh

IIRC

QASPER

Rerank paragraphs by question
Generate explanations for training examples
Testing
Compute metrics. To compute the metrics, run:

python qasper_evaluator.py --predictions PREDICTIONS_FILE --gold data/qasper-test-v0.3.json --text_evidence_only
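
For readers unfamiliar with answer-level scoring, the sketch below shows a SQuAD-style token-level F1 between a predicted and a gold answer. That the evaluator reports a metric of this kind is an assumption here; `normalize` and `token_f1` are hypothetical helpers written for illustration, and `qasper_evaluator.py` may normalize text or combine multiple references differently.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, strip punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Two empty answers match exactly; one empty answer scores zero.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(round(token_f1("the BERT model", "BERT"), 3))
```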

StrategyQA

Create indices
Decompose questions
Create lists for reranking
Rerank items (GPU required)
Testing
Compute metrics. To compute the metrics, clone this repository and run the evaluator.
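
The "create lists for reranking" and "rerank items" steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the repository's implementation: `build_rerank_list`, `rerank`, and `overlap_score` are hypothetical names, and the word-overlap scorer stands in for the neural reranker that the GPU step would run.

```python
from typing import Callable, Dict, List, Tuple


def build_rerank_list(candidates_per_subquestion: Dict[str, List[str]]) -> List[str]:
    """Merge the candidates of every sub-question, dropping duplicates in order."""
    merged: List[str] = []
    seen = set()
    for passages in candidates_per_subquestion.values():
        for passage in passages:
            if passage not in seen:
                seen.add(passage)
                merged.append(passage)
    return merged


def rerank(
    question: str,
    passages: List[str],
    score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (question, passage) pair and keep the top_k highest."""
    scored = [(passage, score(question, passage)) for passage in passages]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def overlap_score(question: str, passage: str) -> float:
    # Stand-in scorer (assumption): fraction of question words found in the passage.
    q_terms = set(question.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


candidates = {
    "what do lions eat": ["lions eat meat", "zebras eat grass"],
    "what do zebras eat": ["zebras eat grass", "grass is a plant"],
}
merged = build_rerank_list(candidates)
top = rerank("do zebras eat grass", merged, overlap_score, top_k=1)
print(top)
```

Scoring the merged list against the original question, rather than against each sub-question, is one possible design; the notebooks in this repository may pair passages and queries differently.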

How to Cite

@inproceedings{10.1007/978-3-031-28238-6_44,
author = {Pereira, Jayr and Fidalgo, Robson and Lotufo, Roberto and Nogueira, Rodrigo},
title = {Visconde: Multi-Document QA With GPT-3 And Neural Reranking},
year = {2023},
isbn = {978-3-031-28237-9},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
url = {https://doi.org/10.1007/978-3-031-28238-6_44},
doi = {10.1007/978-3-031-28238-6_44},
booktitle = {Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part II},
pages = {534--543},
numpages = {10},
location = {Dublin, Ireland}
}
