It turns out we can build our own application with Snapchat-like image overlays using Python, OpenCV, and Dlib.
Snapchat-like Image Overlays with Dlib, OpenCV, and Python
So, how do we build it?
- We'll first load the Webcam feed using OpenCV.
- We'll load an image (in our example, an image of an 'eye') to be used as the overlay.
- Use Dlib's face detection to localize the faces, and then use facial landmarks to find where the eyes are.
- Calculate the size and the position of the overlay for each eye.
- Finally, place the overlay image over each eye, resized to the correct size.
Let's start.
We'll start by loading all the required libraries,
import numpy as np
import cv2
import dlib
from scipy.spatial import distance as dist
from scipy.spatial import ConvexHull
Apart from OpenCV and Dlib, we import two helpers from the scipy.spatial package (the euclidean distance function and ConvexHull), which will help us with the distance and size calculations.
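As a quick illustration of what these two helpers do (a standalone snippet with made-up points, not part of the tutorial code),
points = np.array([[0, 0], [4, 0], [4, 3], [0, 3], [2, 1]])
print(dist.euclidean(points[0], points[2]))   # 5.0 - the straight-line distance between (0, 0) and (4, 3)
hull = ConvexHull(points)                     # convex hull of the point set
print(points[hull.vertices])                  # only the four outer corners; the interior point (2, 1) is dropped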
Next, we set up the Dlib face detector and the face landmark predictor. We also define the index lists that let us extract individual facial features out of the 68 landmarks Dlib returns (see Extracting individual Facial Features from Dlib Face Landmarks).
PREDICTOR_PATH = "path/to/your/shape_predictor_68_face_landmarks.dat"
FULL_POINTS = list(range(0, 68))
FACE_POINTS = list(range(17, 68))
JAWLINE_POINTS = list(range(0, 17))
RIGHT_EYEBROW_POINTS = list(range(17, 22))
LEFT_EYEBROW_POINTS = list(range(22, 27))
NOSE_POINTS = list(range(27, 36))
RIGHT_EYE_POINTS = list(range(36, 42))
LEFT_EYE_POINTS = list(range(42, 48))
MOUTH_OUTLINE_POINTS = list(range(48, 61))
MOUTH_INNER_POINTS = list(range(61, 68))
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(PREDICTOR_PATH)
Next, we'll load the image for the overlay.
I'll be using the following image as the overlay for the eyes,
The overlay image for the eyes
Feel free to download and use this image with your code as well.
#---------------------------------------------------------
# Load and pre-process the eye-overlay
#---------------------------------------------------------
# Load the image to be used as our overlay
imgEye = cv2.imread('path/to/your/Eye.png',-1)
# Create the mask from the overlay image
orig_mask = imgEye[:,:,3]
# Create the inverted mask for the overlay image
orig_mask_inv = cv2.bitwise_not(orig_mask)
# Convert the overlay image to BGR
# and save the original image size
imgEye = imgEye[:,:,0:3]
origEyeHeight, origEyeWidth = imgEye.shape[:2]
Notice the '-1' parameter in cv2.imread (it is the value of the cv2.IMREAD_UNCHANGED flag). It tells OpenCV to load the alpha channel (i.e. the transparency channel) of the image along with the BGR channels.
We take the alpha channel and create a mask from it. We also create an inverse of that mask, which will be used to select the pixels outside of the eye overlay.
We then convert the overlay image back to a regular BGR image, removing the alpha channel.
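If the path is wrong or the PNG has no transparency, the channel slicing above fails with a cryptic error. A small optional check placed right after the cv2.imread() call (not part of the original code) makes the problem obvious,
# optional check - place it right after cv2.imread(), before slicing the channels
if imgEye is None:
    raise IOError("Could not load the overlay image - check the path")
if imgEye.ndim != 3 or imgEye.shape[2] != 4:
    raise ValueError("The overlay image has no alpha channel - use a transparent PNG")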
We now start capturing frames from the webcam, detect the faces in each frame, and detect the face landmarks. We then extract the landmarks for the left and right eyes separately from the landmark array.
# Start capturing the WebCam
video_capture = cv2.VideoCapture(0)

while True:
    ret, frame = video_capture.read()

    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        rects = detector(gray, 0)

        for rect in rects:
            x = rect.left()
            y = rect.top()
            x1 = rect.right()
            y1 = rect.bottom()

            landmarks = np.matrix([[p.x, p.y] for p in predictor(frame, rect).parts()])

            left_eye = landmarks[LEFT_EYE_POINTS]
            right_eye = landmarks[RIGHT_EYE_POINTS]
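If you want to visually verify the detections while developing, you can draw the detection rectangles onto the frame inside the loop (this line appears commented out in the full listing at the end),
cv2.rectangle(frame, (x, y), (x1, y1), (0, 255, 0), 2)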
To place the overlay onto the eyes of the face image, we need to find the size and the center of each eye. We define a function to calculate them,
def eye_size(eye):
    eyeWidth = dist.euclidean(eye[0], eye[3])
    hull = ConvexHull(eye)
    eyeCenter = np.mean(eye[hull.vertices, :], axis=0)
    eyeCenter = eyeCenter.astype(int)

    return int(eyeWidth), eyeCenter
We use the euclidean function to calculate the width of the eye (the distance between its two corner landmarks), and ConvexHull to calculate the center (the mean of the hull vertices of the eye landmarks).
We pass each of the eyes separately to get their sizes individually,
leftEyeSize, leftEyeCenter = eye_size(left_eye)
rightEyeSize, rightEyeCenter = eye_size(right_eye)
Now it's time to place the overlay onto the face image. We define the place_eye function for that,
def place_eye(frame, eyeCenter, eyeSize):
    eyeSize = int(eyeSize * 1.5)

    x1 = int(eyeCenter[0,0] - (eyeSize/2))
    x2 = int(eyeCenter[0,0] + (eyeSize/2))
    y1 = int(eyeCenter[0,1] - (eyeSize/2))
    y2 = int(eyeCenter[0,1] + (eyeSize/2))

    h, w = frame.shape[:2]

    # check for clipping
    if x1 < 0:
        x1 = 0
    if y1 < 0:
        y1 = 0
    if x2 > w:
        x2 = w
    if y2 > h:
        y2 = h

    # re-calculate the size to avoid clipping
    eyeOverlayWidth = x2 - x1
    eyeOverlayHeight = y2 - y1

    # resize the overlay image and its masks to the calculated size
    eyeOverlay = cv2.resize(imgEye, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)
    mask = cv2.resize(orig_mask, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)
    mask_inv = cv2.resize(orig_mask_inv, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)

    # take the ROI for the overlay from the background, equal to the size of the overlay image
    roi = frame[y1:y2, x1:x2]

    # roi_bg contains the original image only where the overlay is not, in the region that is the size of the overlay
    roi_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)

    # roi_fg contains the image pixels of the overlay only where the overlay should be
    roi_fg = cv2.bitwise_and(eyeOverlay, eyeOverlay, mask=mask)

    # join roi_bg and roi_fg
    dst = cv2.add(roi_bg, roi_fg)

    # place the joined image, saved in dst, back over the original image
    frame[y1:y2, x1:x2] = dst
Here, we calculate the bounding box for the overlay based on the eye's size and center, enlarging it by a factor of 1.5 so the overlay extends a little beyond the eye itself. For example, an eye measured at 40 pixels wide and centered at (200, 150) gets a 60 x 60 overlay box spanning x = 170 to 230 and y = 120 to 180.
We also need to check for clipping. Otherwise, when part of the overlay box falls outside of the image frame, the ROI and the mask end up with different sizes, and you'll get an error like the following.
OpenCV Error: Assertion failed ((mtype == CV_8U || mtype == CV_8S) && _mask.sameSize(*psrc1)) in cv::binary_op, file C:\bld\opencv_1492084805480\work\opencv-3.2.0\modules\core\src\arithm.cpp, line 241
Traceback (most recent call last):
  File "WebCam-Overlay.py", line 135, in <module>
    place_eye(frame, leftEyeCenter, leftEyeSize)
  File "WebCam-Overlay.py", line 51, in place_eye
    roi_bg = cv2.bitwise_and(roi,roi,mask = mask_inv)
cv2.error: C:\bld\opencv_1492084805480\work\opencv-3.2.0\modules\core\src\arithm.cpp:241: error: (-215) (mtype == CV_8U || mtype == CV_8S) && _mask.sameSize(*psrc1) in function cv::binary_op
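The clipping check prevents this error in normal use. As an extra, purely optional safety net that is not part of the original code, a small guard placed right after eyeOverlayWidth and eyeOverlayHeight are calculated skips an eye whose clipped box has collapsed entirely, instead of calling cv2.resize with a zero-sized box,
    # optional guard (not in the original code): nothing left to draw after clipping
    if eyeOverlayWidth <= 0 or eyeOverlayHeight <= 0:
        return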
In essence, we calculate the size of the overlay and take a box of pixels of that size out of the face image, centered on where the overlay should go. We then replace the pixels in that box with the pixels from the overlay image, excluding the transparent pixels (using the masks we calculated), and finally put the modified box of pixels back into the face image.
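If the masking logic feels abstract, here is a minimal, self-contained sketch of the same idea on synthetic images (illustrative only, not part of the tutorial code),
import numpy as np
import cv2

bg = np.full((100, 100, 3), 50, dtype=np.uint8)                 # stand-in for the face ROI
fg = np.zeros((100, 100, 3), dtype=np.uint8)
cv2.circle(fg, (50, 50), 30, (0, 0, 255), -1)                   # stand-in for the overlay image
alpha = np.zeros((100, 100), dtype=np.uint8)
cv2.circle(alpha, (50, 50), 30, 255, -1)                        # mask: 255 where the overlay is opaque

roi_bg = cv2.bitwise_and(bg, bg, mask=cv2.bitwise_not(alpha))   # background kept outside the overlay shape
roi_fg = cv2.bitwise_and(fg, fg, mask=alpha)                    # overlay pixels kept inside the shape
combined = cv2.add(roi_bg, roi_fg)                              # the same composite place_eye builds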
We need to do this for each eye individually,
place_eye(frame, leftEyeCenter, leftEyeSize)
place_eye(frame, rightEyeCenter, rightEyeSize)
Finally, we just need to show the resulting frame,
cv2.imshow("Faces with Overlay", frame)
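To keep the window responsive and to be able to quit with the 'q' key, the display loop also needs the usual waitKey check (it is included in the full listing below),
ch = 0xFF & cv2.waitKey(1)
if ch == ord('q'):
    break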
And here's the result,
The image overlay working
Since we calculate the size of each eye individually, the overlays resize correctly even when you turn your head,
The overlays resize correctly when you turn your head
And it even works when you're wearing glasses,
The overlays work with glasses too
Check the video to see the image overlays in action,
Here's the full code for your convenience,
import numpy as np
import cv2
import dlib
from scipy.spatial import distance as dist
from scipy.spatial import ConvexHull

PREDICTOR_PATH = "path/to/your/shape_predictor_68_face_landmarks.dat"

FULL_POINTS = list(range(0, 68))
FACE_POINTS = list(range(17, 68))
JAWLINE_POINTS = list(range(0, 17))
RIGHT_EYEBROW_POINTS = list(range(17, 22))
LEFT_EYEBROW_POINTS = list(range(22, 27))
NOSE_POINTS = list(range(27, 36))
RIGHT_EYE_POINTS = list(range(36, 42))
LEFT_EYE_POINTS = list(range(42, 48))
MOUTH_OUTLINE_POINTS = list(range(48, 61))
MOUTH_INNER_POINTS = list(range(61, 68))

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(PREDICTOR_PATH)

def eye_size(eye):
    eyeWidth = dist.euclidean(eye[0], eye[3])
    hull = ConvexHull(eye)
    eyeCenter = np.mean(eye[hull.vertices, :], axis=0)
    eyeCenter = eyeCenter.astype(int)

    return int(eyeWidth), eyeCenter

def place_eye(frame, eyeCenter, eyeSize):
    eyeSize = int(eyeSize * 1.5)

    x1 = int(eyeCenter[0,0] - (eyeSize/2))
    x2 = int(eyeCenter[0,0] + (eyeSize/2))
    y1 = int(eyeCenter[0,1] - (eyeSize/2))
    y2 = int(eyeCenter[0,1] + (eyeSize/2))

    h, w = frame.shape[:2]

    # check for clipping
    if x1 < 0:
        x1 = 0
    if y1 < 0:
        y1 = 0
    if x2 > w:
        x2 = w
    if y2 > h:
        y2 = h

    # re-calculate the size to avoid clipping
    eyeOverlayWidth = x2 - x1
    eyeOverlayHeight = y2 - y1

    # resize the overlay image and its masks to the calculated size
    eyeOverlay = cv2.resize(imgEye, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)
    mask = cv2.resize(orig_mask, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)
    mask_inv = cv2.resize(orig_mask_inv, (eyeOverlayWidth, eyeOverlayHeight), interpolation=cv2.INTER_AREA)

    # take the ROI for the overlay from the background, equal to the size of the overlay image
    roi = frame[y1:y2, x1:x2]

    # roi_bg contains the original image only where the overlay is not, in the region that is the size of the overlay
    roi_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)

    # roi_fg contains the image pixels of the overlay only where the overlay should be
    roi_fg = cv2.bitwise_and(eyeOverlay, eyeOverlay, mask=mask)

    # join roi_bg and roi_fg
    dst = cv2.add(roi_bg, roi_fg)

    # place the joined image, saved in dst, back over the original image
    frame[y1:y2, x1:x2] = dst

#---------------------------------------------------------
# Load and pre-process the eye-overlay
#---------------------------------------------------------
# Load the image to be used as our overlay
imgEye = cv2.imread('path/to/your/Eye.png', -1)

# Create the mask from the overlay image
orig_mask = imgEye[:,:,3]

# Create the inverted mask for the overlay image
orig_mask_inv = cv2.bitwise_not(orig_mask)

# Convert the overlay image to BGR
# and save the original image size
imgEye = imgEye[:,:,0:3]
origEyeHeight, origEyeWidth = imgEye.shape[:2]

# Start capturing the WebCam
video_capture = cv2.VideoCapture(0)

while True:
    ret, frame = video_capture.read()

    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        rects = detector(gray, 0)

        for rect in rects:
            x = rect.left()
            y = rect.top()
            x1 = rect.right()
            y1 = rect.bottom()

            landmarks = np.matrix([[p.x, p.y] for p in predictor(frame, rect).parts()])

            left_eye = landmarks[LEFT_EYE_POINTS]
            right_eye = landmarks[RIGHT_EYE_POINTS]

            # cv2.rectangle(frame, (x, y), (x1, y1), (0, 255, 0), 2)

            leftEyeSize, leftEyeCenter = eye_size(left_eye)
            rightEyeSize, rightEyeCenter = eye_size(right_eye)

            place_eye(frame, leftEyeCenter, leftEyeSize)
            place_eye(frame, rightEyeCenter, rightEyeSize)

        cv2.imshow("Faces with Overlay", frame)

    ch = 0xFF & cv2.waitKey(1)

    if ch == ord('q'):
        break

cv2.destroyAllWindows()
We only tried an overlay for the eyes here, but using the same techniques you can create overlays for anything. So unleash your creativity, and see what you can come up with.
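For instance, here is a rough, hypothetical sketch of how the mouth landmarks could anchor a different overlay (say, a mustache). You would load a separate transparent PNG and build its masks exactly as we did for imgEye, and either generalize place_eye to take the overlay and its masks as parameters or write a twin of it for the new image,
# hypothetical sketch - reuses the 'landmarks' matrix from the main loop
mouth = np.asarray(landmarks[MOUTH_OUTLINE_POINTS])
mouthWidth = np.linalg.norm(mouth[6] - mouth[0])            # mouth corner to corner (landmarks 48 and 54)
mouthCenter = np.matrix(mouth.mean(axis=0).astype(int))     # the same 1x2 shape that place_eye expects
place_eye(frame, mouthCenter, int(mouthWidth))              # or a generalized place_overlay(...) for the new image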
Related posts:
Extracting individual Facial Features from Dlib Face Landmarks
Related links:
https://sublimerobots.com/2015/02/dancing-mustaches/