Invisibility Cloak using Color Detection and Segmentation with OpenCV

If you are a Harry Potter fan like me, you would know what an Invisibility Cloak is. Yes! It’s the cloak which Harry Potter uses to become invisible. Of course, we all know that an invisibility cloak is not real — it’s all graphics trickery.

In this post, we will learn how to create our own ‘Invisibility Cloak’ using simple computer vision techniques in OpenCV. We are sharing the code in C++ and Python.

That’s Harry Potter trying out his invisibility cloak! Did you ever have a childhood fantasy of using such a cloak?

Well, it turns out that you can create this magical experience using an image processing technique called color detection and segmentation. And the good news is, you don’t need to be part of Hogwarts for that! All you need is a red colored cloth and to follow along with this post.

Check out the video below where I am trying out my own Invisibility Cloak!

How does it work?

The algorithm is very similar in principle to green screening. But unlike green screening where we remove the background, in this application, we remove the foreground!

We are using a red colored cloth as our cloak. Why red? Why not green? Sure, we could have used green, but isn’t red the magician’s color? Jokes aside, colors like green or blue will also work fine with a little bit of tweaking.

The basic idea is given below:

  1. Capture and store the background frame.
  2. Detect the red colored cloth using color detection algorithm.
  3. Segment out the red colored cloth by generating a mask.
  4. Generate the final augmented output to create the magical effect.

This post has been tested on OpenCV 4.2

The GIF above explains all the mentioned stages of the algorithm in brief. Now we will discuss each step in detail.


Step 1: Capture and store a background frame

As explained above, the key idea is to replace the current frame pixels corresponding to the cloth with the background pixels to generate the effect of an invisibility cloak. For this, we need to store a frame of the background.


// Create a VideoCapture object and open the input file
// If the input is the web camera, pass 0 instead of the video file name
VideoCapture cap("video4.mp4");

// Check if the camera opened successfully
if(!cap.isOpened())
{
  cout << "Error opening video stream or file" << endl;
  return -1;
}

Mat background;
for(int i=0;i < 30;i++)
  cap >> background;

// Laterally invert the image / flip the image.
flip(background, background, 1);


# Creating a VideoCapture object
# This will be used for image acquisition later in the code.
cap = cv2.VideoCapture("video.mp4")

# We give some time for the camera to warm-up!


for i in range(30):
  ret,background = cap.read()

# Laterally invert the image / flip the image.
background = np.flip(background,axis=1)

In the above code, the cap.read() method captures the latest frame from the camera (stored in the variable ‘background’) and also returns a boolean (True/False, stored in ‘ret’), which is True if the frame was read correctly. You can therefore check for the end of the video by checking this return value.

Why capture background image using a ‘for loop’ ?

As the background is static, can’t we simply use a single frame? We could, but a frame captured right at startup tends to be a bit dark, because the camera is just getting started and its parameters (such as auto-exposure) have not stabilized yet. Capturing multiple frames of the static background with a for loop, and keeping the last one, does the trick.

Averaging over multiple frames also reduces noise.
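The loop above keeps only the last of the 30 frames. As a hedged variant (the variable names and the synthetic noise model below are illustrative, not from the original code), taking a per-pixel median over the captured frames actually averages away sensor noise:

```python
import numpy as np

# Simulate 30 noisy captures of a static 4x4 scene (illustrative values).
rng = np.random.default_rng(0)
true_background = np.full((4, 4, 3), 100.0)
frames = [true_background + rng.normal(0, 10, true_background.shape)
          for _ in range(30)]

# Error of a single frame vs. error of the per-pixel median of all frames.
single_error = np.abs(frames[0] - true_background).mean()
background = np.median(np.stack(frames), axis=0)
median_error = np.abs(background - true_background).mean()
```

The per-pixel median is much closer to the true static scene than any single noisy frame, at the cost of a little extra memory to hold the frames.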

Step 2: Red color detection

Since we are using a red colored cloth to turn into an invisibility cloak, we will focus on detecting red color in the frame.

Sounds simple? We have an RGB (Red-Green-Blue) image, and it is tempting to simply threshold the R channel to get our mask. It turns out this does not work well, because RGB values are highly sensitive to illumination. Even though the cloak is red, there may be areas where, due to shadow, the Red channel values of the corresponding pixels are quite low.

The right approach is to transform the color space of our image from RGB to HSV (Hue – Saturation – Value).

What is HSV color space?

The HSV color space represents colors using three values

  1. Hue : This channel encodes color information. Hue can be thought of as an angle where 0 degrees corresponds to red, 120 degrees corresponds to green, and 240 degrees corresponds to blue.
  2. Saturation : This channel encodes the intensity/purity of color. For example, pink is less saturated than red.
  3. Value : This channel encodes the brightness of color. Shading and gloss components of an image appear in this channel.

Unlike RGB which is defined in relation to primary colors, HSV is defined in a way that is similar to how humans perceive color.

For our application, the major advantage of using the HSV color space is that the color/tint/wavelength is represented by just the Hue component.

To understand different color spaces refer to our detailed blog on color spaces.

So when I need a particular color, I select its hue component; the saturation component then gives different shades of that color, and the value component gives different intensities of a particular shade.

In the below code we first capture a live frame, convert the image from RGB to HSV color space and then define a specific range of H-S-V values to detect red color.


Mat frame;
// Capture frame-by-frame
cap >> frame;

// Laterally invert the image / flip the image
flip(frame, frame, 1);

//Converting image from BGR to HSV color space.
Mat hsv;
cvtColor(frame, hsv, COLOR_BGR2HSV);

Mat mask1,mask2;
// Creating masks to detect the upper and lower red color.
inRange(hsv, Scalar(0, 120, 70), Scalar(10, 255, 255), mask1);
inRange(hsv, Scalar(170, 120, 70), Scalar(180, 255, 255), mask2);

// Generating the final mask
mask1 = mask1 + mask2;


# Capturing the live frame
ret, img = cap.read()

# Laterally invert the image / flip the image
img = np.flip(img, axis=1)

# converting from BGR to HSV color space
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)

# Range for lower red
lower_red = np.array([0,120,70])
upper_red = np.array([10,255,255])
mask1 = cv2.inRange(hsv, lower_red, upper_red)

# Range for upper red
lower_red = np.array([170,120,70])
upper_red = np.array([180,255,255])
mask2 = cv2.inRange(hsv,lower_red,upper_red)

# Generating the final mask to detect red color
mask1 = mask1+mask2

The inRange function returns a binary mask: white pixels (255) mark input pixels that fall within the lower and upper limits, and black pixels (0) mark those that do not.

Hue values are actually distributed over a circle (ranging between 0-360 degrees), but OpenCV rescales the range to 0-180 so that it fits in an 8-bit value. Red is represented by values of roughly 0-30 as well as 150-180.

We use the ranges 0-10 and 170-180 to avoid detecting skin as red. A high saturation range of 120-255 is used because our cloth should be a highly saturated red. The lower bound of 70 on value lets us detect red even in the wrinkles and shadows of the cloth.

mask1 = mask1 + mask2

Using the above line, we combine masks generated for both the red color range. It is basically doing an OR operation pixel-wise. It is a simple example of operator overloading of +.

Now that you understand how color detection is done, you can change the H-S-V range and use some other mono-colored cloth in place of red. In fact, a green cloth would work better than a red one because green is farthest from human skin tones.

Step 3: Segmenting out the detected red colored cloth

In the previous step, we generated a mask to determine the region in the frame corresponding to the detected color. We refine this mask and then use it for segmenting out the cloth from the frame. The code below illustrates how it is done.


Mat kernel = Mat::ones(3,3, CV_8U);

// Refining the mask: remove noise, then dilate to smooth the boundary
morphologyEx(mask1, mask1, MORPH_OPEN, kernel);
morphologyEx(mask1, mask1, MORPH_DILATE, kernel);

// creating an inverted mask to segment out the cloth from the frame
Mat res1, res2, final_output;
Mat mask2;
bitwise_not(mask1, mask2);

// Segmenting the cloth out of the frame using bitwise and with the inverted mask
bitwise_and(frame, frame, res1, mask2);


mask1 = cv2.morphologyEx(mask1, cv2.MORPH_OPEN, np.ones((3,3),np.uint8))
mask1 = cv2.morphologyEx(mask1, cv2.MORPH_DILATE, np.ones((3,3),np.uint8))

#creating an inverted mask to segment out the cloth from the frame
mask2 = cv2.bitwise_not(mask1)

#Segmenting the cloth out of the frame using bitwise and with the inverted mask
res1 = cv2.bitwise_and(img,img,mask=mask2)


Step 4: Generating the final augmented output to create a magical effect.

Finally, we replace the pixel values of the detected red region with the corresponding pixel values of the static background, generating an augmented output that creates the magical effect of turning our cloth into an invisibility cloak. To do this, we first use a bitwise_and operation to create an image whose pixels, in the detected region, equal the pixel values of the static background, and then add this output to the image (res1) from which we had segmented out the red cloth.


// creating image showing static background frame pixels only for the masked region
bitwise_and(background, background, res2, mask1);

// Generating the final augmented output.
addWeighted(res1, 1, res2, 1, 0, final_output);
imshow("magic", final_output);


# creating image showing static background frame pixels only for the masked region
res2 = cv2.bitwise_and(background, background, mask = mask1)

#Generating the final output
final_output = cv2.addWeighted(res1,1,res2,1,0)

So now you are all ready to create your own invisibility cloak. Enjoy the magical experience.


Kaustubh Sadekar