RCNN

Implementation of Efficient Graph-Based Image Segmentation, Selective Search and RCNN

Steps

Region Proposal: Efficient Graph-Based Image Segmentation is used to divide the input image into smaller, arbitrarily shaped regions.
Obtain Bounding Boxes: Selective Search hierarchically groups the regions from the previous step into larger and larger bounding boxes by comparing their color, texture, size and fill (how well two regions "fill" the bounding box that encloses them). The bounding box from each intermediate merge is stored, with the final bounding box encapsulating the whole image.
Predictions: The image patches inside the bounding boxes returned from Selective Search are resized to (224, 224) and passed through AlexNet (a convolutional neural network architecture), which predicts the class of each patch.
Post-processing / Non-Maximum Suppression:
- Discard all bounding boxes predicted as "background"
- Overlapping boxes predicting the same object are filtered using confidence scores, keeping only the most confident box for each object.
- The remaining bounding boxes are our final object detections.
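The post-processing step above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: the function names, the `(box, label, score)` tuple layout, the `"background"` label string and the IoU threshold of 0.5 are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, background="background", iou_threshold=0.5):
    """detections: list of (box, label, score) tuples (hypothetical layout)."""
    # Step 1: discard every box predicted as background.
    candidates = [d for d in detections if d[1] != background]
    # Step 2: visit boxes from most to least confident.
    candidates.sort(key=lambda d: d[2], reverse=True)
    kept = []
    for box, label, score in candidates:
        # Keep a box only if no already-kept box of the same class
        # overlaps it by iou_threshold or more.
        if all(label != k_label or iou(box, k_box) < iou_threshold
               for k_box, k_label, _ in kept):
            kept.append((box, label, score))
    # Step 3: whatever survives is the final set of detections.
    return kept


detections = [
    ((0, 0, 10, 10), "cat", 0.9),
    ((1, 1, 11, 11), "cat", 0.8),         # overlaps the first cat box
    ((50, 50, 60, 60), "cat", 0.7),       # a separate cat elsewhere
    ((0, 0, 10, 10), "background", 0.95),
    ((1, 1, 11, 11), "dog", 0.6),         # different class, not suppressed
]
final = non_max_suppression(detections)
```

Suppression is applied per class here, so a "dog" box is never removed because of an overlapping "cat" box; whether the repository does the same is not stated in this README.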

Efficient Graph-Based Image Segmentation

A graph is created with 224x224 nodes, one for each pixel in the input image, with an edge between each pixel and each of its 8 neighbours. The graph is undirected (the edges Pixel 0 -> Pixel 1 and Pixel 1 -> Pixel 0 are identical), so each edge is included only once.
The weight of each edge is initialized to the difference in intensity between the two pixels it connects.
The edges are sorted in ascending order by weight (edges between the most similar pixels are at the front).
A disjoint-set (union-find) object is initialized with 224x224 items, one per pixel.
We loop through the sorted edges and look up the set that each of the edge's two vertices belongs to. If the two vertices belong to different sets, and the edge weight is below a computed threshold, we join the sets.
After looping through all the edges, the remaining sets in the disjoint-set object correspond to the proposed regions.
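The steps above can be sketched on a tiny grayscale image as follows. This README does not say how the threshold is computed, so the sketch assumes the rule from the original Felzenszwalb-Huttenlocher formulation: two components are merged when the edge weight is at most min(Int(C1) + k/|C1|, Int(C2) + k/|C2|), where Int(C) is the largest edge weight already inside component C. The names `DisjointSet`, `segment` and the constant `k` are illustrative, not taken from `efficient_graph_img_seg.py`.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0] * n  # largest edge weight inside each component

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, weight):
        # a and b must be roots; attach the smaller tree to the larger one.
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        # Edges arrive in ascending order, so this weight is the new maximum.
        self.internal[a] = weight


def segment(image, k=10.0):
    """image: 2D list of intensities. Returns a 2D list of region labels."""
    h, w = len(image), len(image[0])
    edges = []
    for y in range(h):
        for x in range(w):
            # Only "forward" neighbours, so each undirected edge appears once.
            for dy, dx in ((0, 1), (1, -1), (1, 0), (1, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    weight = abs(image[y][x] - image[ny][nx])
                    edges.append((weight, y * w + x, ny * w + nx))
    edges.sort()  # most similar pixel pairs first

    sets = DisjointSet(h * w)
    for weight, a, b in edges:
        ra, rb = sets.find(a), sets.find(b)
        if ra == rb:
            continue
        threshold = min(sets.internal[ra] + k / sets.size[ra],
                        sets.internal[rb] + k / sets.size[rb])
        if weight <= threshold:
            sets.union(ra, rb, weight)
    return [[sets.find(y * w + x) for x in range(w)] for y in range(h)]


# A 4x4 image whose left half is dark and right half is bright.
image = [[0, 0, 100, 100] for _ in range(4)]
labels = segment(image, k=10.0)
```

A larger `k` raises the threshold for small components and therefore favours larger regions; on the toy image the two halves end up as two separate regions.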

Selective Search

Run Efficient Graph-Based Image Segmentation, and create a color histogram and a texture histogram for each proposed region.
Compute the similarity of every pair of neighbouring regions. The similarity is obtained by comparing the color histograms, texture histograms and sizes of the two regions, and by computing the fill.
Sort the pairs of neighbouring regions by similarity, most similar first.
Loop through sorted regions and hierarchically join the two most similar, storing the bounding box that encapsulates both of them at each iteration. In the end, only one region will be left, consisting of the whole image.
Get rid of all bounding boxes smaller than a certain threshold. The remaining bounding boxes are returned.
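The grouping loop above can be sketched as follows, starting from regions that already have their histograms. The similarity terms (histogram intersection for color and texture, plus size and fill terms normalised by the image area, summed with equal weights) follow the commonly cited Selective Search formulation and are an assumption here, as are the dictionary layout and all function names; `selective_search.py` may differ. Instead of re-sorting after every merge, the sketch simply picks the most similar remaining pair each time.

```python
def bbox_union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def bbox_area(b):
    return (b[2] - b[0]) * (b[3] - b[1])


def similarity(r1, r2, image_area):
    color = sum(min(a, b) for a, b in zip(r1["color"], r2["color"]))
    texture = sum(min(a, b) for a, b in zip(r1["texture"], r2["texture"]))
    size = 1.0 - (r1["size"] + r2["size"]) / image_area  # favour small regions
    enclosing = bbox_area(bbox_union(r1["bbox"], r2["bbox"]))
    fill = 1.0 - (enclosing - r1["size"] - r2["size"]) / image_area
    return color + texture + size + fill


def merge(r1, r2):
    total = r1["size"] + r2["size"]

    def mix(h1, h2):
        # Size-weighted average keeps the merged histogram normalised.
        return [(a * r1["size"] + b * r2["size"]) / total for a, b in zip(h1, h2)]

    return {"size": total, "bbox": bbox_union(r1["bbox"], r2["bbox"]),
            "color": mix(r1["color"], r2["color"]),
            "texture": mix(r1["texture"], r2["texture"])}


def hierarchical_grouping(regions, neighbours, image_area, min_area=0):
    """regions: {id: region dict}; neighbours: set of frozenset({id_a, id_b})."""
    regions = dict(regions)
    sims = {pair: similarity(*(regions[i] for i in pair), image_area)
            for pair in neighbours}
    boxes = []
    next_id = max(regions) + 1
    while sims:
        i, j = tuple(max(sims, key=sims.get))  # most similar pair
        regions[next_id] = merge(regions[i], regions[j])
        boxes.append(regions[next_id]["bbox"])  # store the box of this merge
        # The merged region inherits the neighbours of both of its parts.
        touching = set()
        for pair in list(sims):
            if i in pair or j in pair:
                touching |= pair
                del sims[pair]
        touching -= {i, j}
        for n in touching:
            sims[frozenset((next_id, n))] = similarity(
                regions[next_id], regions[n], image_area)
        next_id += 1
    # Drop boxes smaller than the area threshold.
    return [b for b in boxes if bbox_area(b) >= min_area]


# Three side-by-side regions in a 6x2 image; 0 and 1 have similar colors.
regions = {
    0: {"size": 4, "bbox": (0, 0, 2, 2), "color": [1.0, 0.0], "texture": [1.0]},
    1: {"size": 4, "bbox": (2, 0, 4, 2), "color": [0.9, 0.1], "texture": [1.0]},
    2: {"size": 4, "bbox": (4, 0, 6, 2), "color": [0.0, 1.0], "texture": [1.0]},
}
neighbours = {frozenset((0, 1)), frozenset((1, 2))}
boxes = hierarchical_grouping(regions, neighbours, image_area=12)
```

In this toy case regions 0 and 1 merge first, and the last stored box covers the whole image, matching the description above.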

This basic implementation of the RCNN algorithm does not use the 21 class-specific SVMs or Bounding Box Regression. Furthermore, the images shown below used a model that was trained for only 7000 epochs instead of 50000 epochs.

Run time is approximately 2.5 to 3 seconds.
