- Download the LLMs and datasets from huggingface.co and update their paths in `language_model.py` and `data.py`.
- Create a `logs` folder in the current path.
- Run `python main.py --fname EXPERIMENT_NAME --model-name all --abs-model rep --test-loaders all --test-data 1000 --threshold all --pca-dim 8 --state-num 32 --harmful-data 64`
You can customize the hyperparameters (for example `--pca-dim`, `--state-num`, and `--threshold`) through the command-line arguments.
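The setup and launch steps above can also be scripted. The following is a minimal sketch, not part of the repository: `build_command` and `DEFAULT_ARGS` are hypothetical names, the flag values are copied from the command above, and the override `pca_dim=16` is only an illustrative value that `main.py` is assumed to accept.

```python
import shlex
from pathlib import Path

# Flag values copied from the README command; change them to customize a run.
DEFAULT_ARGS = {
    "--model-name": "all",
    "--abs-model": "rep",
    "--test-loaders": "all",
    "--test-data": "1000",
    "--threshold": "all",
    "--pca-dim": "8",
    "--state-num": "32",
    "--harmful-data": "64",
}


def build_command(fname, **overrides):
    """Return the argv list for main.py (hypothetical helper, not in the repo).

    Keyword overrides use underscores, e.g. pca_dim=16 becomes --pca-dim 16.
    """
    args = dict(DEFAULT_ARGS)
    for key, value in overrides.items():
        flag = "--" + key.replace("_", "-")
        if flag not in args:
            raise KeyError(f"unknown flag: {flag}")
        args[flag] = str(value)
    cmd = ["python", "main.py", "--fname", fname]
    for flag, value in args.items():
        cmd += [flag, value]
    return cmd


# The README asks for a logs folder in the current path.
Path("logs").mkdir(exist_ok=True)

cmd = build_command("EXPERIMENT_NAME", pca_dim=16)
print(shlex.join(cmd))
# To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps each flag and its value separate, so a sweep over several values can call `build_command` in a loop without string formatting.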