This repository represents a study and a collection of LLM-powered reasoning frameworks for Targeted Sentiment Analysis. It contains the source code for the paper published in the LJoM journal: Large Language Models in Targeted Sentiment Analysis for Russian.
Update 23 February 2025: 🔥 Batching mode support. See the Flan-T5 provider for the bulk-chain project. A test is available here.
Update 01 November 2024: Implemented a separate bulk-chain project for handling massive amounts of prompts with CoT. This concept was used in this study.
Update 06 September 2024: Mentioned the related information about the project at the BU-research-blog.
Update 11 August 2024: Announcing the talk on this framework @ NLPSummit 2024, with the preliminary ad and details in the X/Twitter post.
Update 05 June 2024: The frameworkless launch of the THoR-tuned FlanT5 model for direct inference is now available in GoogleColab.
Update 02 June 2024: Models were uploaded to huggingface: 🤗 nicolay-r/flan-t5-tsa-prompt-xl, as well as models of smaller sizes. Check out the related section.
Update 22 May 2024: ⚠️ In GoogleColab you might find zero (0) evaluation results (see issue #8) on the `test` part, which is due to the locked availability of labels. In order to apply your model to the `test` part, please proceed with the official RuSentNE-2023 codalab competition page or Github.
Update 06 March 2024: `attrdict` represents the main limitation for launching the code in `Python 3.10`, and hence has been switched to `addict` (see Issue #7).
Update 24 April 2024: 🔥 We released fine-tuning logs for the prompt-based and THoR-based techniques applied to the `train` competition data, as well as checkpoints for downloading. More ...
Update 19 April 2024: We open the quick_cot code repository for launching quick CoT zero-shot-learning / few-shot-learning experiments with LLMs, utilized in this study. More ...
Update 19 April 2024: We open a separate RuSentNE-benchmark repository for LLM responses, including answers on reasoning steps in THoR CoT for the ChatGPT model series. More ...
- Installation
- Preparing Data
- Zero-Shot
- Chain-of-Thought fine-tuning
- Fine-tuned Flan-T5 Checkpoints 🔥
- Answers
- References
## Installation

We separate the dependencies necessary for the zero-shot and fine-tuning experiments:
```bash
pip install -r dependencies_zs.txt
pip install -r dependencies_ft.txt
```

## Preparing Data

Simply launch the following script to obtain both the original texts and their translated versions:
```bash
python rusentne23_download.py
```

You could launch manual data translation to the English language (`en`) via GoogleTrans:

```bash
python rusentne23_translate.py --src "data/train_data.csv" --lang "en" --label
python rusentne23_translate.py --src "data/valid_data.csv" --lang "en" --label
python rusentne23_translate.py --src "data/final_data.csv" --lang "en"This is a common script for launching LLM model inference in Zero-shot format using manual or predefined prompts:
## Zero-Shot

This is a common script for launching LLM model inference in zero-shot format, using manual or predefined prompts:

```bash
python zero_shot_infer.py \
--model "google/flan-t5-base" \
--src "data/final_data_en.csv" \
--prompt "rusentne2023_default_en" \
--device "cpu" \
--to "csv" \
--temp 0.1 \
--output "data/output.csv" \
--max-length 512 \
--hf-token "<YOUR_HUGGINGFACE_TOKEN>" \
--openai-token "<YOUR_OPENAI_TOKEN>" \
--limit 10000 \
--limit-prompt 10000 \
--bf16 \
    --l4b
```
Simply set up the model name and the device you wish to use for launching the model:

```bash
python zero_shot_infer.py --model google/flan-t5-base --device cpu
```

Use the `--prompt` argument for passing either a predefined prompt name or a textual prompt that involves the `{text}` placeholder:
```bash
python zero_shot_infer.py --model google/flan-t5-small \
    --device cpu --src data/final_data_en.csv --prompt 'rusentrel2023_default_en'
```
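For clarity, a textual prompt with the `{text}` placeholder is filled per input entry. A minimal sketch of this substitution (illustrative; the actual logic resides in `zero_shot_infer.py`):

```python
# Hypothetical custom prompt; `{text}` is replaced with each input row.
template = 'What is the sentiment towards the named entity in the text: "{text}"?'
entry = "The cooperation between the two companies has strengthened."
print(template.format(text=entry))
```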
Use the `model` parameter prefixed by `openai:`, followed by the model name, as follows:

```bash
python zero_shot_infer.py --model "openai:gpt-3.5-turbo-1106" \
--src "data/final_data_en.csv" --prompt "rusentrel2023_default_en_short" \
    --max-length 75 --limit 5
```
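For reference, the `openai:` provider roughly corresponds to a direct chat-completions call. A hedged sketch with the `openai` Python package (illustrative, not the repository's internal code; assumes `OPENAI_API_KEY` is set in the environment):

```python
from openai import OpenAI

# Illustrative equivalent of the openai: provider call above.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[{"role": "user", "content": 'What is the sentiment towards "X" in: "..."?'}],
    max_tokens=75,
)
print(response.choices[0].message.content)
```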
This functionality is out of scope of this repository. We release a tiny framework, dubbed quick_cot, for applying CoT schemas, with an API similar to the one in the Zero-Shot section, based on schemas written in JSON notation.
- Schema example: thor-zero-shot-cot-english-shema.json
- Tiny CoT framework: quick_cot
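To convey the idea behind such JSON-defined schemas: each reasoning step is a prompt whose answer is substituted into the next step. Below is a conceptual sketch in the spirit of THoR's three-hop reasoning (this is not the quick_cot API; function names and prompt wording are illustrative):

```python
# Conceptual sketch of a chained CoT schema in the spirit of THoR
# (three-hop reasoning: aspect -> opinion -> polarity). This is NOT
# the quick_cot API; prompts and names are illustrative.

def thor_chain(llm, text: str, target: str) -> str:
    """`llm` is any callable that maps a prompt string to a response string."""
    # Step 1: identify which aspect of the target the sentence mentions.
    aspect = llm(f'Given the sentence "{text}", which aspect of "{target}" is mentioned?')
    # Step 2: infer the latent opinion towards that aspect.
    opinion = llm(f'Given the sentence "{text}" and the aspect "{aspect}", '
                  f'what is the implied opinion towards "{target}"?')
    # Step 3: conclude the sentiment polarity.
    return llm(f'Given the opinion "{opinion}", what is the sentiment polarity '
               f'towards "{target}": positive, negative, or neutral?')
```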
## Fine-tuned Flan-T5 Checkpoints 🔥

| Model | prompt | THoR |
|---|---|---|
| FlanT5-base | - | 🤗 nicolay-r/flan-t5-tsa-thor-base |
| FlanT5-large | - | 🤗 nicolay-r/flan-t5-tsa-thor-large |
| FlanT5-xl | 🤗 nicolay-r/flan-t5-tsa-prompt-xl | 🤗 nicolay-r/flan-t5-tsa-thor-xl |
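Since these checkpoints are ordinary Flan-T5 weights, they can be launched for direct inference with the `transformers` library alone, in line with the 05 June 2024 update above. A minimal sketch (the prompt wording here is an assumption; consult the repository prompts for the exact format):

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load a fine-tuned checkpoint from HuggingFace.
model_name = "nicolay-r/flan-t5-tsa-thor-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Illustrative prompt; see the repository prompts for the exact format.
prompt = 'What is the attitude towards "Apple" in the sentence: "Apple surprised analysts with strong results"?'
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```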
## Chain-of-Thought fine-tuning

```bash
python thor_finetune.py -r "thor" -d "rusentne2023" \
-m "google/flan-t5-base" \
-li <PRETRAINED_STATE_INDEX> \
-bs <BATCH_SIZE> \
-es <EPOCH_SIZE> \
-f "./config/config.yaml" -c,--cuda_index: Index of the GPU to use for computation (default:0).-m,--model_path: Path to the model on hugging face.-d,--data_name: Name of the dataset (rusentne2023)-r,--reasoning: Specifies the reasoning mode (engine), with singlepromptor multi-stepthormode.-li,--load_iter: load a state on specific index from the samedata_nameresource (default:-1, not applicable.)-es,--epoch_size: amount of training epochs (default:1)-bs,--batch_size: size of the batch (default:None)-t,--temperature: temperature (default=gen_config.temperature)-z,--zero_shot: running zero-shot inference with chosen engine ontestdataset to form answers.-f,--config: Specifies the location of config.yaml file.
Configure more parameters in the `config.yaml` file.
## Answers

Results of the zero-shot models obtained during the experiments fall outside the scope of this repository. We open a separate repository for LLM responses, including answers on reasoning steps in THoR CoT for the ChatGPT model series:
RuSentNE-benchmark repository
## References

You can cite this work as follows:
```bibtex
@misc{rusnachenko2024large,
    title={Large Language Models in Targeted Sentiment Analysis},
    author={Nicolay Rusnachenko and Anton Golubev and Natalia Loukachevitch},
    year={2024},
    eprint={2404.12342},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```