Vision Transformers for Galaxy Morphology Classification:
Fine-Tuning Pre-Trained Networks vs. Training From Scratch


Abstract:

In recent years, the Transformer-based deep learning architecture has become extremely popular for downstream tasks, especially within the field of Computer Vision. However, Transformer models are very data-hungry, making them challenging to adopt in applications where data is scarce. Using transfer learning techniques, we explore the classic Vision Transformer (ViT) and its ability to transfer features from the natural image domain to classify images in the galactic image domain. Using the weights of models trained on ImageNet (a popular benchmark dataset for Computer Vision), we compare two distinct ViTs: a base ViT trained without pre-training and a fine-tuned ViT pre-trained on ImageNet. Our experiments on the Galaxy10 dataset show that the pre-trained ViT achieves higher accuracy than the ViT trained from scratch, with a shorter training time. Experimental data further shows that the fine-tuned ViT can match the accuracy of the model trained from scratch while using less training data.
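The comparison described above amounts to instantiating the same ViT architecture twice, once with ImageNet-1k weights and once with random initialization, and training both on Galaxy10. The sketch below illustrates that setup using the timm library and the 10-class Galaxy10 label set; this is an assumption for illustration, not necessarily how the repository's notebooks build the models.

```python
import timm
import torch.nn as nn

NUM_CLASSES = 10  # Galaxy10 contains ten morphology classes

# ViT fine-tuned from ImageNet-1k weights (transfer learning)
vit_pretrained = timm.create_model(
    "vit_base_patch16_224", pretrained=True, num_classes=NUM_CLASSES
)

# Identical architecture trained from scratch (random initialization)
vit_scratch = timm.create_model(
    "vit_base_patch16_224", pretrained=False, num_classes=NUM_CLASSES
)

# Both models are trained with the same loss and optimizer settings,
# so any difference in accuracy or training time comes from the initial weights.
criterion = nn.CrossEntropyLoss()
```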


Vision Transformers for Galaxy Morphology Classification: Fine-Tuning Pre-Trained Networks vs. Training From Scratch was written by Rahul Kumar under the supervision of Dr. Mohammed Kamruzzaman Sarker (advisor) and Dr. Sheikh Rabiul Islam (co-advisor). This thesis has been submitted to the honors committee at the University of Hartford.

PDF version

A rendered PDF version of the thesis can be found in the research_publication_vit.pdf file. This paper was submitted to the DeLTA 2023 conference, held in Rome, Italy.

Requirements

Once you have downloaded the project, navigate to the directory containing the requirements.txt file in your terminal. From there, run

```
pip install -r requirements.txt
```

and all of the required libraries will be installed.

Getting Started

There are two folders within this project: data and models. Each experiment with the Vision Transformer (ViT) is conducted and monitored in a Jupyter notebook. Once you have installed all the requirements, open the data.ipynb file and change the paths to where you want the data to be saved. Then, run the whole notebook to download the galactic images.
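Galaxy10 is distributed through the astroNN package, so the data-download step in data.ipynb is expected to look something like the sketch below. The save location is a hypothetical placeholder, and the exact loading and preprocessing in the notebook may differ.

```python
import numpy as np
from astroNN.datasets import load_galaxy10  # downloads Galaxy10 on first call

# Hypothetical save location; point this at the path you set in data.ipynb
SAVE_DIR = "./data"

# images is an array of RGB galaxy images, labels the corresponding class indices
images, labels = load_galaxy10()
np.save(f"{SAVE_DIR}/galaxy10_images.npy", images)
np.save(f"{SAVE_DIR}/galaxy10_labels.npy", labels)
print(images.shape, labels.shape)
```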

Next, you can examine model performance at each dataset size, both without pre-trained weights and with pre-trained weights. The ViT with no pre-trained weights is in the visiontransformer-scratch.ipynb file; all other models use pre-trained weights from a ViT trained on the ImageNet-1k dataset. The number at the end of each filename indicates the percentage of the dataset used to train that model (e.g., visiontransformer_90.ipynb uses 90% of the original dataset size).
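The per-notebook dataset percentages can be reproduced by drawing a stratified subset of the data before training. The sketch below assumes scikit-learn's train_test_split and reuses the `images` and `labels` arrays from the previous sketch; the notebooks may subsample differently.

```python
from sklearn.model_selection import train_test_split

def subsample(images, labels, fraction, seed=42):
    """Keep `fraction` of the dataset, stratified by class label."""
    if fraction >= 1.0:
        return images, labels
    subset_x, _, subset_y, _ = train_test_split(
        images, labels,
        train_size=fraction,
        stratify=labels,
        random_state=seed,
    )
    return subset_x, subset_y

# e.g. the visiontransformer_90.ipynb experiment corresponds to fraction=0.9
images_90, labels_90 = subsample(images, labels, fraction=0.9)
```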

Finally, the plot.ipynb file contains the graphs and charts used for the publication draft.

Authors

Contributors:

Rahul Kumar

Dr. Md Kamruzzaman Sarker

Dr. Sheikh Rabiul Islam

Acknowledgements

This research was conducted with the support of the University of Hartford through the Vincent Coffin Grant (ID:398131).

License

This thesis is made available under the GNU General Public License v3.0. A copy of the full license is available in the LICENSE file.
