Tuesday, February 17, 2015

Generating disparity images in batch from stereo sources with LIBELAS

If you want to generate disparity images from your stereo image sources, there are a few options. I tried OpenCV's StereoBM/StereoSGBM and LIBELAS. If you care about disparity image quality more than speed, I can recommend the latter library. In this post I describe the simple code modifications that worked for me on Ubuntu 14.04.

The LIBELAS (Library for Efficient Large-scale Stereo Matching) package comes with a bundled demo. If you want to batch process multiple stereo inputs, look at the src/main.cpp file: it contains the "main" function, which is easy to modify for batch processing.

Out of the box, LIBELAS reads and writes only the PGM (Portable GrayMap) image format, and the demo stores a single byte (0-255) per pixel. This is not a big problem for input, but it limits the quality of the output disparity image: you cannot store float disparity values this way, so you have to cast them to an integer (or rather a byte) and you lose some depth precision. Still, PGM output is a good start.

Batch processing input files

In main.cpp, locate the "main" function. If you want to batch process files, you can delete the entire body of that function and insert a simple FOR loop like this instead:

int main() {

  // process pairs 0..99; adjust the upper bound to your number of files
  for (int i = 0; i < 100; i++) {

    // left
    std::stringstream streamLeft;
    streamLeft << "/your_path/file_left_" << i << ".pgm";
    std::string stringLeft = streamLeft.str();
    const char * filenameLeft = stringLeft.c_str();

    // right
    std::stringstream streamRight;
    streamRight << "/your_path/file_right_" << i << ".pgm";
    std::string stringRight = streamRight.str();
    const char * filenameRight = stringRight.c_str();

    process(filenameLeft, filenameRight);
  }

  return 0;
}

The code above appends the FOR loop index to the filename. It works for files numbered without zero padding, such as file_left_0.pgm or file_left_100.pgm. It will not work if your files are numbered with a fixed number of digits, such as file_left_00001.pgm.
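
If your files do use a fixed number of digits, one option is to pad the index with std::setw and std::setfill from <iomanip>. This is only a minimal sketch: paddedFilename is a hypothetical helper (it is not part of LIBELAS), and the five-digit width in the usage below is an assumption about your naming scheme.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper, not part of LIBELAS:
// builds "<prefix><index zero-padded to `digits` characters>.pgm".
std::string paddedFilename(const std::string& prefix, int index, int digits) {
  assert(index >= 0 && digits > 0);
  std::ostringstream stream;
  // setw applies only to the next item written (the index);
  // an index with more digits than `digits` is printed in full, not truncated.
  stream << prefix << std::setw(digits) << std::setfill('0') << index << ".pgm";
  return stream.str();
}
```

Inside the batch loop you would then write something like std::string stringLeft = paddedFilename("/your_path/file_left_", i, 5); and pass stringLeft.c_str() to process, keeping the string alive for as long as the pointer is in use.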

You will also need to add the following include at the beginning of main.cpp for the stringstream to work:

#include <sstream>

Disable scaling of output images

The demo code in main.cpp also "scales" the pixels of each image to fill the 0-255 byte range. This means the maximum disparity found in the source image pair becomes 255: if your maximum disparity is 100 pixels, that 100 is changed to 255 and all other values are scaled up by the same ratio so that the image looks "nice".

If you process a set of files that you want to compare with each other, or that you want to turn into a video, this per-image scaling is unwanted. The maximum disparity is likely to be at least slightly different for each stereo image pair, so every resulting disparity image would get a different scaling ratio and the final set of disparity images would not be comparable.
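
A small sketch of the arithmetic shows the effect. It assumes the demo scales each value as disparity * 255 / disp_max, as described above; scaledGray is a hypothetical name used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Illustration only: per-image scaling of one disparity value to a gray level.
// dispMax is the maximum disparity found in that particular image pair.
std::uint8_t scaledGray(float disparity, float dispMax) {
  assert(dispMax > 0.0f);
  float scaled = 255.0f * disparity / dispMax;
  // clamp to the byte range before truncating to an integer
  return static_cast<std::uint8_t>(std::min(std::max(scaled, 0.0f), 255.0f));
}
```

With this formula, a disparity of 50 pixels becomes gray level 127 in an image whose maximum disparity is 100, but gray level 91 in an image whose maximum is 140, so the same depth ends up with two different brightness values.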

To disable the scaling find the following piece of code and modify it accordingly:

// copy float to uchar
image<uchar> *D1 = new image<uchar>(width,height);
image<uchar> *D2 = new image<uchar>(width,height);
for (int32_t i=0; i<width*height; i++) {
    // disable scaling to 0-255
    D1->data[i] = (uint8_t) D1_data[i];
    D2->data[i] = (uint8_t) D2_data[i];
}

The two assignment lines above simply cast each float value to a byte (0-255) without rescaling it. However, there is one consequence to this: you will get bad results if your maximum disparity is higher than 255. Converting a float that does not fit into a uint8_t (above 255, or below zero) is undefined behavior in C++, so the result cannot be relied on. The maximum disparity on the piece of the KITTI dataset I tried was about 140 pixels, so this simple solution worked for me.
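
If you would rather not depend on the data staying inside the byte range, you can clamp before casting. The following is a minimal sketch under the assumption that the disparity buffer may contain negative or oversized values; disparityToByte is a hypothetical helper, not something LIBELAS provides.

```cpp
#include <cstdint>

// Hypothetical helper: converts a float disparity to a byte without scaling,
// clamping to [0, 255] so that the cast is always well defined.
std::uint8_t disparityToByte(float disparity) {
  if (!(disparity > 0.0f)) return 0;     // zero, negative values and NaN map to 0
  if (disparity >= 255.0f) return 255;   // saturate instead of overflowing
  return static_cast<std::uint8_t>(disparity);  // truncates the fractional part
}
```

In the copy loop this would read D1->data[i] = disparityToByte(D1_data[i]); and likewise for D2. Values above 255 are then saturated rather than preserved, so the clamp only makes the output predictable, it does not recover the lost range.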

To find the maximum disparity across all your files, you can create a global variable that tracks the largest "disp_max" value seen during your batch loop. At the very beginning of main.cpp, create the variable and set it to zero:

float totalMaxDisparity = 0;

After "disp_max" has been found in the "process" function, check whether it is higher than what you already have and, if it is, store it:

// find maximum disparity for scaling output disparity images to [0..255]
float disp_max = 0;
for (int32_t i=0; i<width*height; i++) {
    if (D1_data[i]>disp_max) disp_max = D1_data[i];
    if (D2_data[i]>disp_max) disp_max = D2_data[i];
}
// remember the largest disp_max seen across the whole batch
if (disp_max > totalMaxDisparity) totalMaxDisparity = disp_max;

At the very end of the "main" function, just after your batch FOR loop, you can print the value:

cout << "totalMaxDisparity: " << totalMaxDisparity << endl;
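
The same bookkeeping can be tried outside of LIBELAS. The sketch below is a stand-alone illustration that needs C++11 or newer: the vectors stand in for the D1_data/D2_data buffers, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for the disp_max search inside process():
// returns the largest value in one disparity buffer, or 0 if none is positive.
float maxDisparity(const std::vector<float>& disparities) {
  float dispMax = 0.0f;
  for (float d : disparities) dispMax = std::max(dispMax, d);
  return dispMax;
}

// Running maximum over a whole batch of disparity buffers,
// playing the role of the global totalMaxDisparity variable.
float batchMaxDisparity(const std::vector<std::vector<float>>& batch) {
  float totalMaxDisparity = 0.0f;
  for (const auto& image : batch) {
    totalMaxDisparity = std::max(totalMaxDisparity, maxDisparity(image));
  }
  return totalMaxDisparity;
}
```

If the value reported for your batch stays below 255, the plain cast without scaling does not overflow the byte range.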

If you use the KITTI dataset, chances are that your maximum disparity will be lower than 255.

I hope you enjoyed the post! My modified code is released under the GPL to comply with the copyleft requirement.

Friday, September 12, 2014

Reprojecting stereo images to 3D point cloud

I recently started to experiment with computer vision. There are still very few code examples out on the Internet, but I found Martin Peris's blog, where he has published a few examples on computer vision. His post on 3D reconstruction with OpenCV and Point Cloud Library (PCL) is a particularly good one, and I used its code to do a 3D reconstruction from the Karlsruhe dataset by Andreas Geiger and his colleagues. The image pair I used was the first one from the sequence named "2010_03_09_drive_0019".

Here are the source images from 2010_03_09_drive_0019 sequence:

Left image:

Right image:

Disparity image obtained with LIBELAS library:

The disparity image shown is actually the LEFT disparity image. The LIBELAS library generates a pair of disparity images, and this one was probably computed as the left-to-right disparity. I did not use the right disparity image that was generated.

Please excuse the orientation of the point cloud: I am a real beginner with PCL and I find its viewer hard to control (it behaves a bit weird :)).

The resulting point cloud images from various angles are below:

All the images can also be found in a single image gallery.

Please note that the original Karlsruhe dataset is licensed under the Creative Commons BY-NC-SA 3.0 license, so all my derivative images have the same licensing terms.