PyTorch implementation of the paper.
Install Conda. The `cuda` environment (defined in `condaenv_cuda.yaml`) is used throughout, with one exception: voltron preprocessing (see below).
Install the environment using:

```bash
conda env create -f condaenv_cuda.yaml
pip install git+https://github.com/openai/CLIP.git
```
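The `pip install` above should be run inside that environment; a minimal sketch, assuming the environment created from `condaenv_cuda.yaml` is named `cuda`:

```bash
# Assumption: the environment defined in condaenv_cuda.yaml is named "cuda"
conda activate cuda
pip install git+https://github.com/openai/CLIP.git
```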
If you want to use the voltron-based models, preprocessing must be done in Python 3.8. Create a Python 3.8 environment and run:

```bash
pip install "cython<3.0.0" wheel
pip install voltron-robotics
```
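For example, with conda (the environment name `voltron_py38` is only illustrative):

```bash
# Any Python 3.8 environment works; the name is just an example
conda create -n voltron_py38 python=3.8
conda activate voltron_py38
pip install "cython<3.0.0" wheel
pip install voltron-robotics
```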
- Copy the `./data` folder to a location with enough disk space and specify its location in `.env` (see `.env.template`; a sketch of the entry follows the structure listing below).
- The full structure of the `./data` folder is as follows:

  ```
  data/
  ├── processed/      (filled during preprocessing)
  ├── raw/
  │   ├── coco/
  │   │   ├── train2017/
  │   │   └── val2017/
  │   ├── gqa/
  │   ├── psg/
  │   ├── vg/
  │   └── vqa/
  ├── unified_json/   (files as in ./data)
  └── psg_captions/   (files as in ./data)
  ```
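  A sketch of the `.env` entry; the variable name `DATA_DIR` is an assumption, check `.env.template` for the actual name:

  ```bash
  # Hypothetical variable name; see .env.template for the real one
  DATA_DIR=/path/to/large/disk/data
  ```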
- Download the datasets and place them in the respective folders under `raw/`. Depending on which version of TESA you want to train, you may not need all of them:
  - COCO (Common Objects in Context)
  - GQA
  - PSG
  - VG150 (Images: Part 1, Part 2; Annotations; links copied from SPAN)
  - VQA
- Preprocessing: run `preprocessor.py` for every combination of dataset and vision model that you want to use for training. This stores the image embeddings, which speeds up training. See the sketch below.
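  A hypothetical invocation; the `--dataset` and `--vision_model` flags are assumptions, check `preprocessor.py` for its actual interface:

  ```bash
  # Flag names are hypothetical; see preprocessor.py for the real CLI
  python preprocessor.py --dataset coco --vision_model clip
  python preprocessor.py --dataset coco --vision_model voltron
  ```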
- Adjust `config.yaml`, then run `main.py`.
- For evaluation only, use `--eval`. In that case, the old config (saved during training) is loaded and overridden by `config_eval.yaml`.
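  For example (assuming `main.py` needs no further arguments):

  ```bash
  # Train with the settings from config.yaml
  python main.py

  # Evaluation only: loads the config saved during training,
  # overridden by config_eval.yaml
  python main.py --eval
  ```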