IndianArms-ObjDetection-Classification

India's armed and police forces span a diverse set of groups that serve the country. Every state has its own State Police, which adds further types of Indian civil servicemen and hence a wider variety of uniforms. This variety creates a need for reliable uniform identification. Automating it requires a two-step approach: Object Detection (identifying humans in uniform) and Object Identification/Classification.

Dataset-Description

Owing to restrictions on taking pictures of personnel on duty, images are inherently scarce for every category. Here we deal with 3 broad categories of servicemen: CRPF, BSF and JK (Jammu & Kashmir). Since the problem statement requires simultaneous detection and classification, well-annotated data is even more limited. Of the publicly available datasets, the one used in this repo was chosen for the integrity of its annotations and its suitability as a starting point for model training.

Problems encountered in this dataset:

Lack of negative related samples (Type I)
Lack of negative non-related samples (Type II)

For the Type I issue, the major problem the model faced was ghost predictions (false positives). Because other objects in the image background had similar colour contrasts, the model increasingly tended to detect them. This behaviour is explainable: the model has to trade off between contextual (shape) information and colour information, and whenever colour overshadowed context, it steered towards this error. To solve this, a new class, BG, was introduced. BG stands for background (non-human), so that it coincides with the detector's default background class. The confusion matrix produced later shows many of these instances being predicted as background owing to their inherent similarity. As a solution to the Type I issue, 35 new instances of this 4th class (BG) were added across 35 different samples, spread uniformly over the 3 classes.

For the Type II issue, the challenges were entirely different. Unrelated but tricky data points that could challenge a model trained only on this data were missing. Sourcing such samples externally was important to make the model robust for real-world deployment. Examples are images matching descriptions such as "military vehicle interior empty" or "camouflage pattern fabric". 60 samples were scraped using a script with 15 different prompts, and the whole of each image was annotated as the BG class.
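Annotating a whole image as BG can be illustrated with a minimal sketch. It assumes the labels follow the usual YOLO text format (`class x_center y_center width height`, all normalised to the range 0 to 1) and that BG has class index 3 as the 4th class; both are assumptions, and `full_image_label` is a hypothetical helper rather than a function from this repo.

```python
def full_image_label(class_id: int) -> str:
    """Return one YOLO-format label line whose box covers the entire image.

    A full-image box is centred at (0.5, 0.5) and has width and height 1.0
    in normalised coordinates.
    """
    return f"{class_id} 0.500000 0.500000 1.000000 1.000000"


# Assumption: BG is the 4th class, after CRPF (0), BSF (1) and JK (2).
BG_CLASS_ID = 3

label_line = full_image_label(BG_CLASS_ID)
print(label_line)
```

One such line, saved in a `.txt` file named after the image, would mark the full frame as background.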

Alternative dataset - Lab's (not annotated)

Working Principle

Bounding boxes work on the core principle of maximising the probability that the object lies inside the box. The detector must therefore first correctly identify humans or partial human-like figures and then draw a box around the region of interest (in this case, a person wearing a uniform). Here we use YOLO frameworks, restricted to their object detection and classification capabilities.

Workflow

Step by step solution/configs for data preparation and model training:

Train-Test split

--The standard 80:20 split was followed, setting aside 60 data points for final evaluation on unseen test data.
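A hold-out split of this kind can be sketched as follows. The pool size of 302 is borrowed from the sample count quoted in the Data Augmentation section and is only an assumption about what was split; the file names, the seed and the `train_test_split` helper are hypothetical.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounds down
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for the real images.
samples = [f"img_{i:03d}.jpg" for i in range(302)]
train, test = train_test_split(samples)
print(len(train), len(test))
```

Fixing the seed keeps the test set identical across runs, so later experiments are evaluated on the same unseen images.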

Data Augmentation

--As discussed, data is scarce. 302 samples across 3 classes are not enough to train a model from scratch, so fine-tuning a pre-trained model remains the only viable option. The effective data count is raised further through data augmentation techniques that add variance to the original samples in terms of orientation, distribution and size.

The augmentation config is as follows:

'imgsz': 640,       # input image size
'hsv_h': 0.01,      # hue (HSV)
'hsv_s': 0.4,       # saturation (HSV)
'hsv_v': 0.3,       # brightness (HSV)
'degrees': 10,      # maximum degree of rotation
'translate': 0.1,   # translation
'scale': 0.3,       # maximum scaling factor
'flipud': 0.0,      # probability of upside-down flip
'fliplr': 0.5,      # probability of left-right flip
'mosaic': 0.5,      # mosaic: combines sub-regions of 4 different images
'mixup': 0.15       # mixup: pixel-wise combination of images
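How such a config might be handed to a trainer is sketched below. It assumes the Ultralytics `YOLO` Python API, whose `train` method accepts these keys as keyword arguments; the README only says "YOLO frameworks", so the library choice, the `yolo11n.pt` weights name, the epoch count and both helper functions are assumptions and may differ from what `train.py` does.

```python
AUG_CONFIG = {
    "imgsz": 640,
    "hsv_h": 0.01,
    "hsv_s": 0.4,
    "hsv_v": 0.3,
    "degrees": 10,
    "translate": 0.1,
    "scale": 0.3,
    "flipud": 0.0,
    "fliplr": 0.5,
    "mosaic": 0.5,
    "mixup": 0.15,
}

# Keys expected to lie in [0, 1]: HSV gains, translation and probabilities.
UNIT_RANGE_KEYS = ("hsv_h", "hsv_s", "hsv_v", "translate",
                   "flipud", "fliplr", "mosaic", "mixup")


def check_aug_config(cfg):
    """Raise ValueError if any unit-range value falls outside [0, 1]."""
    for key in UNIT_RANGE_KEYS:
        if not 0.0 <= cfg[key] <= 1.0:
            raise ValueError(f"{key}={cfg[key]} is outside [0, 1]")


def train_with_augmentation(data_yaml, weights="yolo11n.pt", epochs=100):
    """Fine-tune pre-trained weights with AUG_CONFIG (needs ultralytics)."""
    from ultralytics import YOLO  # imported lazily; optional dependency

    check_aug_config(AUG_CONFIG)
    model = YOLO(weights)
    return model.train(data=data_yaml, epochs=epochs, **AUG_CONFIG)


check_aug_config(AUG_CONFIG)  # passes silently for the values above
```

Keeping the values in one dictionary makes it easy to vary the augmentation per fold without touching the training call.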

K-Fold Validation

--Instead of a static train/validation split, dynamic splitting is essential owing to the small dataset. With so few samples, any single split captures only limited variation, so 5-fold cross-validation is used to train on several splits and pick the best resulting model. A script for K-Fold splitting, with custom yaml file generation, is provided as well.
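The idea of fold generation with a per-fold yaml file can be sketched with plain standard K-fold partitioning. This is not the code of `KFold_gen.py`, which may split differently; the helper names, the seed, the class order and the yaml keys (the dataset layout commonly used by Ultralytics) are all assumptions.

```python
import random


def kfold_indices(n_samples, n_splits=5, seed=0):
    """Return a list of (train_indices, valid_indices), one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits nearly equal groups.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    splits = []
    for k in range(n_splits):
        valid = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        splits.append((train, valid))
    return splits


def fold_yaml(fold_dir, class_names):
    """Build the text of a dataset yaml file for one fold directory."""
    names = "\n".join(f"  {i}: {name}" for i, name in enumerate(class_names))
    return (f"path: {fold_dir}\n"
            "train: train/images\n"
            "val: valid/images\n"
            f"names:\n{names}\n")


splits = kfold_indices(n_samples=10, n_splits=5)
# Assumed class order; the real mapping lives in the repo's yaml files.
yaml_text = fold_yaml("dataset_kfold_5/Fold1", ["CRPF", "BSF", "JK", "BG"])
print(yaml_text)
```

Each sample lands in the validation set of exactly one fold, so every image contributes to both training and validation across the five runs.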

--Fold wise data distribution:

 Metric               | Fold 1     | Fold 2     | Fold 3     | Fold 4     | Fold 5     
 -------------------- | ---------- | ---------- | ---------- | ---------- | ---------- 
 Training samples     | 252        | 252        | 252        | 252        | 252        
 Validation samples   | 28         | 28         | 28         | 28         | 28         
 Training CRPF        | 72 (28.6%) | 70 (27.7%) | 71 (28.1%) | 70 (27.8%) | 68 (26.9%) 
 Training BSF         | 81 (32.1%) | 78 (30.8%) | 76 (30.1%) | 78 (31.1%) | 89 (35.3%) 
 Training JK          | 67 (26.6%) | 70 (27.7%) | 73 (28.9%) | 72 (28.7%) | 65 (25.8%) 
 Training BG          | 32 (12.7%) | 34 (13.7%) | 32 (12.9%) | 31 (12.4%) | 30 (12.1%) 
 Validation CRPF      | 7 (24.4%)  | 8 (28.1%)  | 8 (26.7%)  | 8 (28.0%)  | 9 (31.9%)  
 Validation BSF       | 9 (31.1%)  | 10 (35.5%) | 11 (38.8%) | 10 (35.4%) | 5 (17.3%)  
 Validation JK        | 9 (31.6%)  | 8 (26.8%)  | 6 (22.3%)  | 6 (22.2%)  | 10 (35.1%) 
 Validation BG        | 3 (13.0%)  | 2 (9.5%)   | 3 (12.1%)  | 4 (14.3%)  | 4 (15.7%)

Model Development and Training

-- Models ranging from YOLOv8 to YOLOv11 were tested on the 5 folds with varied data augmentation configs. The best model was selected after careful fold-wise inspection across the hyperparameter search space (warm-up epochs, close_mosaic, freeze_params, etc.).

YOLOv11 fold4 yielded the best performing model

Metrics

Fold-wise Metric table:

 Fold | mAP50 | mAP50-95 | Precision | Recall | AP50_CRPF  | AP50_BSF  | AP50_JK 
 ---- | ----- | -------- | --------- | ------ | ---------- | --------- | -------- 
 1    | 0.714 | 0.420    | 0.787     | 0.592  | 0.761      | 0.755     | 0.878    
 2    | 0.721 | 0.362    | 0.785     | 0.609  | 0.858      | 0.777     | 0.740    
 3    | 0.631 | 0.396    | 0.774     | 0.540  | 0.555      | 0.625     | 0.745    
 4    | 0.794 | 0.472    | 0.899     | 0.660  | 0.890      | 0.936     | 0.859    
 5    | 0.762 | 0.421    | 0.839     | 0.656  | 0.749      | 0.738     | 0.893
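The fold comparison can be reproduced from the table with a short sketch. The numbers are copied from the rows above; selecting on mAP50 is an assumption about the ranking criterion, and the helper names are hypothetical.

```python
# Overall metrics per fold, copied from the fold-wise metric table.
FOLD_METRICS = {
    1: {"mAP50": 0.714, "mAP50-95": 0.420, "precision": 0.787, "recall": 0.592},
    2: {"mAP50": 0.721, "mAP50-95": 0.362, "precision": 0.785, "recall": 0.609},
    3: {"mAP50": 0.631, "mAP50-95": 0.396, "precision": 0.774, "recall": 0.540},
    4: {"mAP50": 0.794, "mAP50-95": 0.472, "precision": 0.899, "recall": 0.660},
    5: {"mAP50": 0.762, "mAP50-95": 0.421, "precision": 0.839, "recall": 0.656},
}


def rank_folds(metrics, key="mAP50"):
    """Return fold numbers sorted from best to worst on the given metric."""
    return sorted(metrics, key=lambda fold: metrics[fold][key], reverse=True)


def mean_metric(metrics, key):
    """Average one metric over all folds."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


ranking = rank_folds(FOLD_METRICS)
print("Folds ranked by mAP50:", ranking)
print("Mean mAP50 across folds:", round(mean_metric(FOLD_METRICS, "mAP50"), 4))
```

The cross-fold mean gives a less optimistic picture than the single best fold, which is worth keeping in mind when reporting results from a small dataset.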

Metric Curves:

Confusion Matrix of Test Data:

Model In Action

Refer to the Test Result dir for outputs on unseen samples from the same dataset (homogeneous source).

-> Unseen Sample_Same Dataset [Level - Individual]

-> Unseen Sample_Same Dataset [Level - Group]

-> Different Dataset

Refer to the Lab's Result dir for outputs on samples sourced independently from a different dataset (heterogeneous source).

Usage

Dir Organization

|
|----Lab's
|      |----BSF
|      |----CRPF
|      |----J&K
|
|----dataset_kfold_5
|      |----Fold1
|             |----train
|             |      |----images
|             |      |----labels
|             |
|             |----valid
|
|      |----Fold2
|      |----Fold3
|      |----Fold4
|      |----Fold5
|
|----dataset
|      |----test
|      |     |----images
|      |     |----labels
|      |
|      |----train      
|            |----images
|            |----labels
|            |----backgrounds
|                  |----images
|                  |----labels
|
|----weights
|      |
|      |----best.pt
|      
|
|----get_bg_patches.py
|----KFold_gen.py
|----outsource_negatives.py
|----train.py
|----predict.py

Model Weights

-> The weights of the top 2 performing folds, F4 and F5, are linked and available for download

F4 wts , F5 wts

Folds Generation

python KFold_gen.py --data_dir dataset --n_splits 5

Training

python train.py --fold_base_dir dataset_kfold_5 --n_splits 5 --project_name UniformDetection_KFold --fold_exist True

Inference

python predict.py --model_weights weights/best.pt --test_images_folder Lab\'s/BSF --results_folder Lab\'s/results/BSF
