Abstract: Analyzing SEC filings is a highly specialized task in business and finance. However, existing general-purpose large language models (LLMs) perform poorly when analyzing SEC filings, e.g., GPT-4-Turbo (closed-book setting) only correctly answered
There are six major types of reasoning failure: information extraction errors, misuse of concepts, miscalculation of financial ratios, value comparison errors, misunderstanding of context, and wrong units. Our findings highlight the reasoning challenges that current LLMs still face when analyzing SEC filings, challenges that may directly affect valuation, compliance assessment, risk management, and related tasks.
This project uses the open-source dataset and SEC filings from FinanceBench.
@misc{islam2023financebench,
title={FinanceBench: A New Benchmark for Financial Question Answering},
author={Pranab Islam and Anand Kannappan and Douwe Kiela and Rebecca Qian and Nino Scherrer and Bertie Vidgen},
year={2023},
eprint={2311.11944},
archivePrefix={arXiv},
primaryClass={cs.CL}
}