Wink Detection Running with Dlib and OpenCV |
We start by importing all the necessary packages,
import numpy as np
import cv2
import dlib
from scipy.spatial import distance as dist
I'll be using the same Eye Aspect Ratio method to detect when an eye gets closed, so I'm going to use the same eye_aspect_ratio function by Adrian,
def eye_aspect_ratio(eye):
# compute the euclidean distances between the two sets of
# vertical eye landmarks (x, y)-coordinates
A = dist.euclidean(eye[1], eye[5])
B = dist.euclidean(eye[2], eye[4])
# compute the euclidean distance between the horizontal
# eye landmark (x, y)-coordinates
C = dist.euclidean(eye[0], eye[3])
# compute the eye aspect ratio
ear = (A + B) / (2.0 * C)
# return the eye aspect ratio
return ear
Note: This code of the eye_aspect_ratio function is directly taken from here. All credits for it should go to Adrian at PyImageSearch.
In order to calculate the Eye Aspect Ratio, we need to detect the outline points of the eyes first. And, in order to detect the eye points, we first need to detect the face, and then detect the face landmarks. So, we'll be using the same steps we used when extracting individual facial features from Dlib Face Landmarks.
We first define the ranges of the Face Landmarks,
FULL_POINTS = list(range(0, 68))
FACE_POINTS = list(range(17, 68))
JAWLINE_POINTS = list(range(0, 17))
RIGHT_EYEBROW_POINTS = list(range(17, 22))
LEFT_EYEBROW_POINTS = list(range(22, 27))
NOSE_POINTS = list(range(27, 36))
RIGHT_EYE_POINTS = list(range(36, 42))
LEFT_EYE_POINTS = list(range(42, 48))
MOUTH_OUTLINE_POINTS = list(range(48, 61))
MOUTH_INNER_POINTS = list(range(61, 68))
We also define a couple of variables which helps us with the EAR calculations,
EYE_AR_THRESH = 0.25
EYE_AR_CONSEC_FRAMES = 3
COUNTER_LEFT = 0
TOTAL_LEFT = 0
COUNTER_RIGHT = 0
TOTAL_RIGHT = 0
The EYE_AR_THRESH defines the threshold for the EAR value where we consider the eye to be closed. You may need to experiment a bit with this value, as it may depend on your camera and the background lighting.
The EYE_AR_CONSEC_FRAMES defines the number of consecutive frames the eye needs to be 'closed' in order for us to detect it as a 'wink'. Again, experiment with this value a bit to get a good value for your application.
The COUNTER_* and TOTAL_* variables holds the consecutive 'eye closed' frames and the total number of 'winks' respectively.
With the variables ready, we start detecting the face, and its landmarks, to isolate the eyes.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(PREDICTOR_PATH)
# Start capturing the WebCam
video_capture = cv2.VideoCapture(0)
while True:
ret, frame = video_capture.read()
if ret:
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
rects = detector(gray, 0)
for rect in rects:
x = rect.left()
y = rect.top()
x1 = rect.right()
y1 = rect.bottom()
landmarks = np.matrix([[p.x, p.y] for p in predictor(frame, rect).parts()])
left_eye = landmarks[LEFT_EYE_POINTS]
right_eye = landmarks[RIGHT_EYE_POINTS]
Before we calculate the EAR values of the eyes, we want to draw the outline of the eyes. We use the convexHull and drawContours functions of OpenCV for this.
left_eye_hull = cv2.convexHull(left_eye)
right_eye_hull = cv2.convexHull(right_eye)
cv2.drawContours(frame, [left_eye_hull], -1, (0, 255, 0), 1)
cv2.drawContours(frame, [right_eye_hull], -1, (0, 255, 0), 1)
Now, we calculate the EAR for each eye, and draw the value on the video frame.
ear_left = eye_aspect_ratio(left_eye)
ear_right = eye_aspect_ratio(right_eye)
cv2.putText(frame, "E.A.R. Left : {:.2f}".format(ear_left), (300, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.putText(frame, "E.A.R. Right: {:.2f}".format(ear_right), (300, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
With the EAR values of each eye calculated, we check whether the value for each eye has gone below the threshold we set earlier, for the minimum duration of frames we set, and increase the counters if it has done so.
if ear_left < EYE_AR_THRESH:
COUNTER_LEFT += 1
else:
if COUNTER_LEFT >= EYE_AR_CONSEC_FRAMES:
TOTAL_LEFT += 1
print("Left eye winked")
COUNTER_LEFT = 0
if ear_right < EYE_AR_THRESH:
COUNTER_RIGHT += 1
else:
if COUNTER_RIGHT >= EYE_AR_CONSEC_FRAMES:
TOTAL_RIGHT += 1
print("Right eye winked")
COUNTER_RIGHT = 0
cv2.putText(frame, "Wink Left : {}".format(TOTAL_LEFT), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.putText(frame, "Wink Right: {}".format(TOTAL_RIGHT), (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.imshow("Faces found", frame)
Check out the video below to see the Wink Detector in action,
Here's the complete code to get you started,
import numpy as np
import cv2
import dlib
from scipy.spatial import distance as dist
PREDICTOR_PATH = "path to your shape_predictor_68_face_landmarks.dat file"
FULL_POINTS = list(range(0, 68))
FACE_POINTS = list(range(17, 68))
JAWLINE_POINTS = list(range(0, 17))
RIGHT_EYEBROW_POINTS = list(range(17, 22))
LEFT_EYEBROW_POINTS = list(range(22, 27))
NOSE_POINTS = list(range(27, 36))
RIGHT_EYE_POINTS = list(range(36, 42))
LEFT_EYE_POINTS = list(range(42, 48))
MOUTH_OUTLINE_POINTS = list(range(48, 61))
MOUTH_INNER_POINTS = list(range(61, 68))
EYE_AR_THRESH = 0.25
EYE_AR_CONSEC_FRAMES = 3
COUNTER_LEFT = 0
TOTAL_LEFT = 0
COUNTER_RIGHT = 0
TOTAL_RIGHT = 0
def eye_aspect_ratio(eye):
# compute the euclidean distances between the two sets of
# vertical eye landmarks (x, y)-coordinates
A = dist.euclidean(eye[1], eye[5])
B = dist.euclidean(eye[2], eye[4])
# compute the euclidean distance between the horizontal
# eye landmark (x, y)-coordinates
C = dist.euclidean(eye[0], eye[3])
# compute the eye aspect ratio
ear = (A + B) / (2.0 * C)
# return the eye aspect ratio
return ear
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(PREDICTOR_PATH)
# Start capturing the WebCam
video_capture = cv2.VideoCapture(0)
while True:
ret, frame = video_capture.read()
if ret:
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
rects = detector(gray, 0)
for rect in rects:
x = rect.left()
y = rect.top()
x1 = rect.right()
y1 = rect.bottom()
landmarks = np.matrix([[p.x, p.y] for p in predictor(frame, rect).parts()])
left_eye = landmarks[LEFT_EYE_POINTS]
right_eye = landmarks[RIGHT_EYE_POINTS]
left_eye_hull = cv2.convexHull(left_eye)
right_eye_hull = cv2.convexHull(right_eye)
cv2.drawContours(frame, [left_eye_hull], -1, (0, 255, 0), 1)
cv2.drawContours(frame, [right_eye_hull], -1, (0, 255, 0), 1)
ear_left = eye_aspect_ratio(left_eye)
ear_right = eye_aspect_ratio(right_eye)
cv2.putText(frame, "E.A.R. Left : {:.2f}".format(ear_left), (300, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.putText(frame, "E.A.R. Right: {:.2f}".format(ear_right), (300, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
if ear_left < EYE_AR_THRESH:
COUNTER_LEFT += 1
else:
if COUNTER_LEFT >= EYE_AR_CONSEC_FRAMES:
TOTAL_LEFT += 1
print("Left eye winked")
COUNTER_LEFT = 0
if ear_right < EYE_AR_THRESH:
COUNTER_RIGHT += 1
else:
if COUNTER_RIGHT >= EYE_AR_CONSEC_FRAMES:
TOTAL_RIGHT += 1
print("Right eye winked")
COUNTER_RIGHT = 0
cv2.putText(frame, "Wink Left : {}".format(TOTAL_LEFT), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.putText(frame, "Wink Right: {}".format(TOTAL_RIGHT), (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
cv2.imshow("Faces found", frame)
ch = 0xFF & cv2.waitKey(1)
if ch == ord('q'):
break
cv2.destroyAllWindows()
Once you detect the wink, then it's just a matter of assigning an action to it (just like real life ;) )Thanks Shirish Ranade for the idea of wink detection, and Adrian at PyImageSearch for the excellent tutorial that helped get me started.
Related posts:
http://www.codesofinterest.com/2017/04/extracting-individual-facial-features-dlib.html
http://www.codesofinterest.com/2016/10/getting-dlib-face-landmark-detection.html
Related links:
http://www.pyimagesearch.com/2017/04/24/eye-blink-detection-opencv-python-dlib/
Build Deeper: Deep Learning Beginners' Guide is the ultimate guide for anyone taking their first step into Deep Learning.
Get your copy now!
Unable to open path to your shape_predictor_68_face_landmarks.dat file
ReplyDelete