OpenCV Transparent API

By | May 27, 2015

A master wordsmith can tell a heartbreaking story in just a few words.

For sale: baby shoes, never worn.

A great artist can do so much with so little! The same holds true for great programmers and engineers. They always seem to eke out that extra ounce of performance from their machines. This is what often differentiates a great product from a mediocre one and an exceptional programmer from a run-of-the-mill coder. Such mastery appears magical, but dig a bit deeper and you will notice that the knowledge was available to everyone. Few chose to utilize it.

In this post we will unlock the easiest and probably the most important performance trick you can use in OpenCV 3. It is called the Transparent API (T-API or TAPI).

What is the Transparent API (T-API or TAPI)?

As of June 1, 2016, the Transparent API is still not available via the Python bindings.

The Transparent API is an easy way to seamlessly add hardware acceleration to your OpenCV code with minimal change to existing code. On suitable hardware, you can make your code almost an order of magnitude faster by making a laughably small change.

Using the Transparent API is super easy. You can get a significant performance boost by changing ONE keyword.

Don’t trust me? Here is an example of standard OpenCV code that does not utilize the Transparent API. It reads an image, converts it to grayscale, applies Gaussian blur, and finally does Canny edge detection.

#include "opencv2/opencv.hpp"
using namespace cv;

int main(int argc, char** argv)
{
    Mat img, gray;
    img = imread("image.jpg", 1);
    
    cvtColor(img, gray, COLOR_BGR2GRAY);
    GaussianBlur(gray, gray,Size(7, 7), 1.5);
    Canny(gray, gray, 0, 50);
    
    imshow("edges", gray);
    waitKey();
    return 0;
}

Let’s see how the same code looks with the Transparent API.

OpenCV Transparent API example

I have modified the code above slightly to utilize the Transparent API. Only two lines differ from the standard OpenCV code: the declaration uses UMat instead of Mat, and the image returned by imread is copied into the UMat. Notice that all we had to do was copy the Mat image to the UMat (Unified Matrix) class and use standard OpenCV functions thereafter.

#include "opencv2/opencv.hpp"
using namespace cv;

int main(int argc, char** argv)
{
    UMat img, gray;
    imread("image.jpg", 1).copyTo(img);
    
    cvtColor(img, gray, COLOR_BGR2GRAY);
    GaussianBlur(gray, gray,Size(7, 7), 1.5);
    Canny(gray, gray, 0, 50);
    
    imshow("edges", gray);
    waitKey();
    return 0;
}

On my MacBook Pro this small change makes the code run 5x faster.

Let us quickly summarize the steps needed to use the Transparent API:

  1. Convert Mat to UMat. There are a couple of ways of doing this.
    Mat mat = imread("image.jpg", IMREAD_COLOR); 
    // Copy Mat to UMat
    UMat umat; 
    mat.copyTo(umat);
    

    Alternatively, you can use getUMat:

    Mat mat = imread("image.jpg", IMREAD_COLOR); 
    // Get umat from mat. 
    UMat umat = mat.getUMat( flag );
    

    flag can take values ACCESS_READ, ACCESS_WRITE, ACCESS_RW and ACCESS_FAST. At this point it is not clear what ACCESS_FAST does, but I will update this post once I figure it out.

  2. Use standard OpenCV functions that you would use with Mat.
  3. If necessary, convert UMat back to Mat. Most of the time you do not need to do this. Here is how you do it in case you need to.
    Mat mat = umat.getMat( flag );
    

    where umat is a UMat image. flag is the same as described above.

Now we know how to use the Transparent API. So what is under the hood that magically improves performance? The answer is OpenCL. In the section below I briefly explain OpenCL.

What is Open Computing Language (OpenCL)?

If you are reading this article on a laptop or a desktop computer, it has a graphics card (either integrated or discrete) connected to the CPU, which in turn has multiple cores. On the other hand, if you are reading this on a cell phone or tablet, your device probably has a CPU, a GPU, and a Digital Signal Processor (DSP). So you have multiple processing units that you can use. The fancy industry term for your computer or mobile device is “heterogeneous platform”.

OpenCL is a framework for writing programs that execute on these heterogeneous platforms. The developers of an OpenCL library utilize all OpenCL-compatible devices (CPUs, GPUs, DSPs, FPGAs, etc.) they find on a computer or device and assign the right tasks to the right processor. Keep in mind that as a user of the OpenCV library you are not developing any OpenCL library. In fact, you are not even a user of the OpenCL library, because all the details are hidden behind the Transparent API.

What is the difference between the OCL Module and the Transparent API?

Short answer: The OCL module is dead. Long live the Transparent API!

OpenCL was supported in OpenCV 2.4 via the OCL module. There was a set of functions defined under the ocl namespace that you could use to call the underlying OpenCL code. Below is an example of reading an image and using OpenCL to convert it to grayscale.

 
// Example of using OpenCL in OpenCV 2.4
// In OpenCV 3 the OCL module is gone. 
// It is replaced by the much nicer Transparent API

// Initialize OpenCL
std::vector<ocl::Info> param;
ocl::getDevice(param, ocl::CVCL_DEVICE_TYPE_GPU);

// Read image
Mat im = imread("image.jpg"); 

// Convert it to oclMat 
ocl::oclMat ocl_im(im);

// Container for OpenCL gray image. 
ocl::oclMat ocl_gray; 

// BGR2GRAY using OpenCL. 
cv::ocl::cvtColor( ocl_im, ocl_gray, CV_BGR2GRAY );

// Container for OpenCV Mat gray image. 
Mat gray; 

// Convert back to OpenCV Mat
ocl_gray.download(gray);

As you can see, it was a lot more cumbersome. With OpenCV 3 the OCL module is gone! All this complexity is hidden behind the so-called Transparent API: all you need to do is use UMat instead of Mat, and the rest of the code remains unchanged. You just need to write the code once!


Category: how-to OpenCV 3

About Satya Mallick

I am an entrepreneur with a love for Computer Vision and Machine Learning with a dozen years of experience (and a Ph.D.) in the field. In 2007, right after finishing my Ph.D., I co-founded TAAZ Inc. with my advisor Dr. David Kriegman and Kevin Barnes. The scalability and robustness of our computer vision and machine learning algorithms have been put to a rigorous test by more than 100M users who have tried our products.

  • Neogeo Sotis

    Doesn’t seem to have any difference for detectMultiScale but hope it works for other functions…

    • What kind of graphics card do you have on your machine ?

      • Neogeo Sotis

        Onboard Intel graphics hd 2000, probably very weak, so that’s the answer?

        • That may be the case. When I use the onboard graphics card, the improvement is not huge.

          • Neogeo Sotis

            I read on the documentation that the Transparent API uses gpu optimizations, and cuda, so maybe it needs some cuda cores to run better.

          • PRATYUSH KUMAR

            Intel HD 2000 does not support openCl. That could be the reason.

  • eric

    Do I have to install OpenCL on my machine or is it bundled with OpenCV? I have a GeForce GTS 240. I tested some feature detection code and GPU load went from 0% to 2%, while GPU memory went up about 20MB, calculation time is about the same.

  • eric

    Is there a list of what OpenCV features support the T-API? I’ve tested some code using features2d and the GPU load only went up 2% (the card is an old GeForce GTS 240).

    In the sources\modules\features2d\src\opencl folder there are only three files: brute_force_matcher.cl, fast.cl and orb.cl, I’m guessing only these have OpenCL implementations, am I right?

    • I don’t think there is a list. There are 67 OpenCL (.cl) files in different modules, and these are used by many different functions. So it is tough to say. BTW you can force OpenCV to use a particular device by setting the environment variable OPENCV_OPENCL_DEVICE. E.g. export OPENCV_OPENCL_DEVICE=:GPU:0

  • Arqu

    Do you know how to use T-API in Python, I’ve also posted the question over on SO but it’s getting barely any attention.
    http://stackoverflow.com/questions/31990646/using-opencl-accelerated-functions-with-opencv3-in-python

  • Pawan Mathur

    Satya, thanks for the explanation. To use this feature, I suppose, the CMake makefile generator should enable compiling WITH_OPENCL, though you don’t need to directly invoke OpenCL from code.

  • yode
    • haha… that is very odd indeed. But I have seen such behavior before. Could be a variety of factors like GPU is busy etc.