MUR Blog - Stereo Vision Update

This post follows up on the previous one, which assumed that the pixel locations of the object of interest would be provided for triangulation.

However, in our use case an image frame will contain multiple objects of interest. How, then, do we pair an object located in the left frame with its corresponding position in the right frame?

In this article, a simple brute-force method is demonstrated as a proof of concept, before any algorithmic optimisations are applied.

Prerequisites/Assumptions

The following technique requires:

Location of the object in both left and right frames
Bounding boxes of said object in both frames

Both can be obtained from a trained neural network such as YOLOv3.

Proof of concept

Starting with the position of each cone and its bounding box, the region inside the bounding box is cropped from the original title image frame.
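The cropping step can be sketched with plain NumPy slicing. This is a minimal illustration only: the function name `crop_box` and the `(x, y, w, h)` box layout with a top-left origin are assumptions made here, and a detector such as YOLOv3 may report boxes in a different convention that would need converting first.

```python
import numpy as np

def crop_box(frame, box):
    """Return the pixels of `frame` that fall inside `box`.

    `box` is assumed to be (x, y, w, h) in pixels, with (x, y) the
    top-left corner. This layout is an assumption for illustration.
    """
    x, y, w, h = box
    height, width = frame.shape[:2]
    # Clamp to the frame so a box that spills over the edge still gives a valid crop.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return frame[y0:y1, x0:x1].copy()

# Stand-in for a 6x8 grayscale frame; a real frame would come from the camera.
frame = np.arange(6 * 8).reshape(6, 8)
patch = crop_box(frame, (2, 1, 3, 4))  # 3 pixels wide, 4 pixels tall
```

Rows are indexed before columns in NumPy, which is why the slice reads `frame[y0:y1, x0:x1]` rather than the other way round.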

Each crop from the left image frame is then compared with every crop from the right image frame.

Mathematically, there are multiple ways of calculating how similar two images are:

Sum of absolute differences (SAD)

Sum of squared differences (SSD)

Normalized Cross-correlation (NCC)
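For reference, the three measures are commonly written as follows for two equally sized crops. The symbols are chosen here for illustration: L and R are the left and right crops, (i, j) indexes pixels, and a bar denotes the mean over the crop. NCC is given in its zero-mean form.

```latex
\begin{align}
\mathrm{SAD}(L, R) &= \sum_{i,j} \left| L(i,j) - R(i,j) \right| \\
\mathrm{SSD}(L, R) &= \sum_{i,j} \left( L(i,j) - R(i,j) \right)^2 \\
\mathrm{NCC}(L, R) &= \frac{\sum_{i,j} \left( L(i,j) - \bar{L} \right)\left( R(i,j) - \bar{R} \right)}
                          {\sqrt{\sum_{i,j} \left( L(i,j) - \bar{L} \right)^2 \, \sum_{i,j} \left( R(i,j) - \bar{R} \right)^2}}
\end{align}
```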

The similarity measures above are listed in order of increasing computational complexity. Both SAD and SSD give a result in the range [0, ∞), where 0 means the two inputs are exactly the same. NCC, on the other hand, gives a result bounded to [−1, 1], where 0 represents no correlation between the images, 1 means the two images are identical, and −1 means one is the exact inverse of the other. Furthermore, by using mean-normalised data rather than the raw frame matrix, linear changes in colour intensity between the two cameras can be accounted for.
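A minimal NumPy sketch of the three measures makes these ranges concrete. The function names are illustrative, both inputs are assumed to have the same shape, and returning 0 for a perfectly flat patch is a choice made here, since the correlation is undefined in that case.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences; 0 when the inputs are identical."""
    # Cast to float first so uint8 images do not wrap around on subtraction.
    a, b = np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    return float(np.abs(a - b).sum())

def ssd(a, b):
    """Sum of squared differences; 0 when the inputs are identical."""
    a, b = np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    return float(((a - b) ** 2).sum())

def ncc(a, b):
    """Zero-mean normalised cross-correlation, bounded to [-1, 1]."""
    a, b = np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    a = a - a.mean()  # removing the mean cancels a constant brightness offset
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denom == 0:
        return 0.0  # flat patch: treated as uncorrelated (an assumption)
    return float((a * b).sum() / denom)

rng = np.random.default_rng(0)
patch = rng.random((8, 8))
brighter = 1.5 * patch + 0.2  # same content, different gain and offset
inverted = -patch             # same content, inverted intensities
```

Here `sad(patch, brighter)` and `ssd(patch, brighter)` are both non-zero even though the content is the same, whereas `ncc(patch, brighter)` stays at 1 up to floating-point error and `ncc(patch, inverted)` is −1.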

By comparing every possible pair across the left and right frames, a correlation table can be generated (the axes are the object IDs of the cones found in the left and right frames).
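The table can be sketched as a nested loop over the crops. Two things here are assumptions rather than details from the post: every crop is resized to a common size (here 32×32, using a simple nearest-neighbour resize) so that crops of different dimensions can be compared, and zero-mean NCC is used as the similarity score. The helper names are illustrative.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalised cross-correlation of two equally sized arrays."""
    a = np.asarray(a, dtype=np.float64) - np.mean(a)
    b = np.asarray(b, dtype=np.float64) - np.mean(b)
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def resize_nearest(img, shape):
    """Nearest-neighbour resize to `shape` = (rows, cols)."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[rows[:, None], cols]

def similarity_table(left_crops, right_crops, size=(32, 32)):
    """table[i, j] = NCC between left crop i and right crop j."""
    right_resized = [resize_nearest(c, size) for c in right_crops]
    table = np.zeros((len(left_crops), len(right_crops)))
    for i, left in enumerate(left_crops):
        left = resize_nearest(left, size)
        for j, right in enumerate(right_resized):
            table[i, j] = ncc(left, right)
    return table

# Synthetic stand-ins for cone crops: the right-hand crops are the same
# patterns in a different order, with a different brightness and contrast.
rng = np.random.default_rng(1)
patterns = [rng.random((20, 12)) for _ in range(3)]
left_crops = patterns
right_crops = [1.2 * patterns[2] + 0.1, 0.8 * patterns[0], patterns[1] + 0.3]
table = similarity_table(left_crops, right_crops)
```

With this synthetic data the largest entry in each row sits in columns 1, 2 and 0 respectively, which is the order in which the right-hand crops were shuffled.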

Next, the cones in the two images are paired using a linear assignment solver, which solves the optimisation problem and produces the final pairing.
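To illustrate what the assignment step optimises, the sketch below finds the pairing with the highest total similarity by exhaustively trying every permutation. This is not necessarily the solver used in the proof of concept; it is a brute-force stand-in that returns the same optimum for small tables, and the function name `best_pairing` is hypothetical.

```python
from itertools import permutations

def best_pairing(table):
    """Return the (left, right) index pairs that maximise total similarity.

    table[i][j] is the similarity between left object i and right object j.
    Each object is used at most once; when the frames contain different
    numbers of objects, the surplus objects are left unpaired.
    """
    n_left, n_right = len(table), len(table[0])
    if n_left <= n_right:
        # Choose a distinct right-hand column for every left-hand row.
        cols = max(
            permutations(range(n_right), n_left),
            key=lambda cs: sum(table[i][j] for i, j in enumerate(cs)),
        )
        return list(enumerate(cols))
    # More left objects than right: choose a distinct row for every column.
    rows = max(
        permutations(range(n_left), n_right),
        key=lambda rs: sum(table[i][j] for j, i in enumerate(rs)),
    )
    return sorted((i, j) for j, i in enumerate(rows))

# Picking the best match row by row would pair (0, 0) and then be forced
# into the poor match (1, 1); the global optimum swaps them.
table = [[0.90, 0.80],
         [0.85, 0.10]]
pairs = best_pairing(table)
```

The cost of trying every permutation grows factorially, so this only suits a handful of cones. For larger tables, `scipy.optimize.linear_sum_assignment(table, maximize=True)` solves the same problem in polynomial time.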

While the results of the proof of concept look promising, further heuristics and optimisations are possible to increase both execution speed and accuracy. Examples include only attempting to pair each cone with its k nearest neighbours (KNN), and taking the stereo setup geometry into account (little to no vertical disparity on the y-axis) to discard redundant or unlikely candidates.
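The geometric heuristic can be sketched as a filter applied before any crops are compared. It assumes a rectified stereo pair, in which a matching object sits on roughly the same image row in both frames and appears no further right in the right frame than in the left. The function name and the 5-pixel tolerance are placeholders, not tuned values.

```python
def candidate_pairs(left_centres, right_centres, max_dy=5.0):
    """Return (left, right) index pairs that are geometrically plausible.

    Centres are (x, y) pixel coordinates. `max_dy` is the tolerated
    vertical disparity in pixels (an illustrative default).
    """
    pairs = []
    for i, (xl, yl) in enumerate(left_centres):
        for j, (xr, yr) in enumerate(right_centres):
            same_row = abs(yl - yr) <= max_dy
            # In a rectified pair the horizontal disparity xl - xr is non-negative.
            valid_disparity = xl >= xr
            if same_row and valid_disparity:
                pairs.append((i, j))
    return pairs

left_centres = [(320, 240), (500, 300)]
right_centres = [(300, 242), (470, 298), (350, 241)]
plausible = candidate_pairs(left_centres, right_centres)
```

In this example only 2 of the 6 possible pairs survive, so only those two would need a similarity score. The third right-hand centre is on the correct row for the first cone but is rejected because it would imply a negative disparity.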

Next Steps

Real-world data testing
Robustness to Type 1 and Type 2 errors (false positives and false negatives)

Pipeline integration

About the Author:

Andrew Huang
Spatial & Perception Engineer, 2020

15 Dec 2020