Source code for the paper "The representation landscape of few-shot learning and fine-tuning in large language models".
This work has been accepted at the NeurIPS 2024 conference.
conda create -n repr_fs_ft python=3.11 pip
conda activate repr_fs_ft
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
pip install packaging==24.1
pip install -r requirements.txt
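As an optional sanity check (not part of the original instructions), you can verify that the expected PyTorch build sees the GPU:

```python
# Optional sanity check: confirm the PyTorch version and CUDA visibility.
import torch

print("torch:", torch.__version__)                 # expected: 2.3.1
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```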
Extracting the representations of Llama-3-8b in the 5-shot setting takes about 32 minutes on an A100 GPU and requires roughly 34 GB of VRAM.
sbatch submit_extract
The internal representations of the Llama-3-8b model will be saved in the folder `results/layers/mmlu/pretrained/test/llama-3-8b`.
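The extraction itself is driven by the submission script above. As a rough, hypothetical sketch of the general idea (not the repository's actual pipeline, whose prompt construction and file layout may differ), hidden states can be collected from a Hugging Face checkpoint with `output_hidden_states=True`:

```python
# Hypothetical sketch, not the repository's extraction code: collect the
# per-layer hidden states of the last prompt token for one few-shot example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B"  # assumes access to the gated checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16).to("cuda")
model.eval()

prompt = "Question: ...\nAnswer:"  # placeholder for a 5-shot MMLU prompt
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states holds the embedding output plus one tensor per layer,
# each of shape (batch, sequence_length, hidden_size); keep the last token.
reps = [h[0, -1].float().cpu().numpy() for h in out.hidden_states]
print(len(reps), reps[0].shape)  # 33 vectors of size 4096 for Llama-3-8b
```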
The following script takes as input the path where the representations are stored, for instance `results/layers/mmlu/pretrained/test/llama-3-8b/5shot` for the 5-shot representations. It computes the geometrical properties of the representation landscape described in the paper (intrinsic dimension, clusters, ARIs) and saves the metrics in `results/statistics/pretrained/llama-3-8b`.
layer_path="results/layers/mmlu/pretrained/test/llama-3-8b/5shot"
python analysis/repr_analysis.py \
--model_name "llama-3-8b" \
--results_path "./results" \
--layer_path $layer_path \
--mask_path "analysis/test_mask_200.npy" \
--eval_dataset "test" \
--num_shots 5
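The estimators themselves live in `analysis/repr_analysis.py`. Purely as an illustration of the kind of quantities computed, the sketch below estimates the TwoNN intrinsic dimension of one layer and an adjusted Rand index against the MMLU subjects; the file names and the use of k-means (in place of the script's own clustering) are assumptions, not the actual implementation.

```python
# Illustrative sketch only: TwoNN intrinsic dimension (Facco et al., 2017)
# and the ARI between cluster labels and the MMLU subjects for one layer.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score


def two_nn_id(X: np.ndarray) -> float:
    """Maximum-likelihood TwoNN estimate of the intrinsic dimension."""
    # Distances to each point's two nearest neighbors (column 0 is the point itself).
    dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    mu = dists[:, 2] / dists[:, 1]        # ratio of second to first neighbor distance
    return len(X) / np.sum(np.log(mu))    # MLE of the dimension


# Hypothetical inputs: last-token representations of one layer and the
# MMLU subject of each example.
X = np.load("layer_16.npy")          # shape (n_examples, hidden_size)
subjects = np.load("subjects.npy")   # shape (n_examples,)

print("intrinsic dimension:", two_nn_id(X))

# k-means stands in here for the clustering used by the analysis script.
labels = KMeans(n_clusters=np.unique(subjects).size, n_init=10).fit_predict(X)
print("ARI vs. subjects:", adjusted_rand_score(subjects, labels))
```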
The script takes as input the path where the statistics are stored (e.g. `results/statistics/pretrained/llama-3-8b`) and saves the plots in the `figures` directory.
statistics_path="results/statistics/pretrained/llama-3-8b"
python plot.py \
--figures_dir "figures" \
--statistics_dir $statistics_path
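If you only need a single curve, a per-layer profile can also be drawn directly with matplotlib. The pickle file name and metric key below are placeholders, since the real format of the saved statistics is defined by `analysis/repr_analysis.py` and `plot.py`.

```python
# Illustrative sketch with placeholder inputs (not the repository's plot.py).
import pickle

import matplotlib.pyplot as plt

# Hypothetical statistics file: a dict mapping metric names to per-layer values.
with open("results/statistics/pretrained/llama-3-8b/stats.pkl", "rb") as f:
    stats = pickle.load(f)

ari = stats["ari_subjects"]  # placeholder key: ARI with the MMLU subjects

plt.plot(range(len(ari)), ari, marker="o")
plt.xlabel("layer")
plt.ylabel("ARI with subjects")
plt.title("llama-3-8b, 5-shot")
plt.savefig("figures/ari_per_layer.png", dpi=200, bbox_inches="tight")
```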
The output is shown below.

*(example output figure)*
We will update the scripts to handle the fine-tuned representations and include other plots soon.