Vision Transformers for Galaxy Morphology Classification:
Fine-Tuning Pre-Trained Networks vs. Training From Scratch
In recent years, Transformer-based deep learning architectures have become extremely popular for downstream tasks, especially in Computer Vision. However, Transformer models are very data-hungry, making them challenging to adopt in applications where data is scarce. Using transfer learning, we explore the classic Vision Transformer (ViT) and its ability to transfer features learned in the natural image domain to the classification of galactic images. We compare two distinct ViTs: a base ViT trained from scratch (without pre-training) and a ViT initialized with weights pre-trained on ImageNet (a popular benchmark dataset for Computer Vision) and then fine-tuned. Our experiments on the Galaxy10 dataset show that the pre-trained ViT achieves higher accuracy than the ViT trained from scratch, with a shorter training time. The experiments further show that the fine-tuned ViT can reach accuracy similar to the model trained from scratch while using less training data.
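For a concrete picture of the comparison, the snippet below is a minimal sketch of the two setups, assuming PyTorch and the timm library with the ViT-Base/16 variant as an example; it is illustrative only and not the code used in the thesis.

```python
# Illustrative sketch (not the thesis code): the two setups compared in this work,
# assuming PyTorch + timm and the ViT-Base/16 variant as an example.
import timm
import torch

NUM_CLASSES = 10  # Galaxy10 has 10 morphology classes

# ViT trained from scratch: randomly initialized weights
vit_scratch = timm.create_model("vit_base_patch16_224", pretrained=False,
                                num_classes=NUM_CLASSES)

# ViT initialized with ImageNet-1k pre-trained weights, then fine-tuned on Galaxy10
vit_finetuned = timm.create_model("vit_base_patch16_224", pretrained=True,
                                  num_classes=NUM_CLASSES)

optimizer = torch.optim.AdamW(vit_finetuned.parameters(), lr=1e-4)  # example hyperparameters
criterion = torch.nn.CrossEntropyLoss()
```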
Vision Transformers for Galaxy Morphology Classification: Fine-Tuning Pre-Trained Networks vs. Training From Scratch was written by Rahul Kumar under the supervision of Dr. Mohammed Kamruzzaman Sarker (advisor) and Dr. Sheikh Rabiul Islam (co-advisor). This thesis has been submitted to the honors committee at the University of Hartford.
A rendered PDF version of the thesis can be found in the research_publication_vit.pdf file. This paper was submitted to the DeLTA 2023 conference, held in Rome, Italy.
Once you have downloaded the project, locate the requirements.txt file in your terminal. From there, run `pip install -r requirements.txt` and all of the required libraries will be installed.
There are two folders within this project: data and models. We use Jupyter notebooks to conduct and monitor each of the experiments with the Vision Transformer (ViT). Once you have installed all the requirements, open the data.ipynb file and change the paths to where you want the data to be saved. Then, run the whole notebook to download the galactic images.
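As a rough illustration of what data.ipynb does, the sketch below shows one common way to obtain the Galaxy10 images via the astroNN package and save them to disk; the notebook may download and store the data differently, and the file paths here are placeholders.

```python
# Illustrative sketch: loading Galaxy10 with astroNN and saving it locally.
# The actual data.ipynb may obtain and store the images differently.
import numpy as np
from astroNN.datasets import load_galaxy10

images, labels = load_galaxy10()              # images: (N, H, W, 3), labels: (N,)
np.save("data/galaxy10_images.npy", images)   # placeholder paths; change to your data folder
np.save("data/galaxy10_labels.npy", labels)
```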
Next, you can examine the model performance at each dataset size, for models with no pre-trained weights and with pre-trained weights. The ViT with no pre-trained weights is in the file named visiontransformer-scratch.ipynb. All other models use pre-trained weights from a ViT trained on the ImageNet-1k dataset. The number at the end of each file name is the percentage of the dataset used to train the model (e.g., visiontransformer_90.ipynb uses 90% of the original dataset size), as illustrated in the sketch below.
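For reference, a stratified subsample like the one sketched below (a hypothetical helper, not code from these notebooks) is one way the reduced training sets could be produced:

```python
# Hypothetical helper showing one way to keep a fixed fraction of the training data
# (e.g. 0.9 for visiontransformer_90.ipynb) while preserving class balance.
from sklearn.model_selection import train_test_split

def subsample(images, labels, fraction, seed=42):
    """Return `fraction` (0 < fraction < 1) of the data via stratified sampling."""
    kept_images, _, kept_labels, _ = train_test_split(
        images, labels, train_size=fraction, stratify=labels, random_state=seed
    )
    return kept_images, kept_labels
```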
Finally, the plot.ipynb file contains the graphs and charts used for the publication draft.
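As an example of the kind of chart plot.ipynb produces, the hypothetical helper below plots test accuracy against the percentage of training data used; accuracy values are passed in rather than hard-coded, since the published numbers live in the thesis itself.

```python
# Hypothetical plotting helper; the real figures are generated in plot.ipynb.
import matplotlib.pyplot as plt

def plot_accuracy_vs_data(fractions, finetuned_acc, scratch_acc):
    """Plot test accuracy against the percentage of training data used."""
    plt.plot(fractions, finetuned_acc, marker="o", label="Pre-trained ViT (fine-tuned)")
    plt.plot(fractions, scratch_acc, marker="s", label="ViT trained from scratch")
    plt.xlabel("Training data used (%)")
    plt.ylabel("Test accuracy")
    plt.legend()
    plt.tight_layout()
    plt.show()
```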
Contributor names and contact info:
Rahul Kumar
- Email: [email protected]
- GitHub
Dr. Md Kamruzzaman Sarker
- Email: [email protected]
- GitHub
Dr. Sheikh Rabiul Islam
- Email: [email protected]
- GitHub
This research was conducted with the support of the University of Hartford through the Vincent Coffin Grant (ID:398131).
This thesis is made available under the GNU General Public License v3.0. A copy of the full license is available in the LICENSE file.
