OpenCV Object Tracking: A Good Partner for Face Recognition

- March 14, 2021

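OpenCV, short for Open Source Computer Vision Library, is a cross-platform computer vision library. It was initiated and co-developed by Intel Corp. and released under the BSD license (versions 4.5.0 and later use Apache 2.0), so it is free for both commercial and research use. OpenCV can be used to build real-time image processing, computer vision, and pattern recognition programs. The library can also be accelerated with Intel's IPP (Integrated Performance Primitives).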

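The project set out with three goals:

Advance machine vision research by providing an open-source, optimized base library, so that nobody has to reinvent the wheel.

Provide a common base library so that developers' code is easier to read and transfer, which helps spread knowledge.

Promote commercial application development by offering a license that does not require software built on the library to be open source or free.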

OpenCV now also integrates support for CUDA.

The first preview release of OpenCV was presented publicly at the IEEE Conference on Computer Vision and Pattern Recognition in 2000, followed by five beta releases. Version 1.0 was released in 2006.

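The second major release, OpenCV 2.0, came in October 2009. Its main updates included a C++ interface, easier and more type-safe patterns, new functions, and optimizations of existing implementations (especially for multi-core systems). Official releases now come out every six months, developed by an independent team sponsored by a commercial company.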

In August 2012, operation of OpenCV passed to a non-profit organization (OpenCV.org), which maintains a developer site and a user site.

Object Tracking

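The simplest way to identify every person in a video is to run face detection and recognition on every frame, but this is also the most resource-hungry approach and badly hurts performance. Instead, once a particular person has been recognized in the video, we can follow them with object tracking, which greatly reduces the load that detection and recognition impose.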
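
The control flow behind this idea can be sketched without any vision library. In the minimal sketch below, `detect` and `make_tracker` are hypothetical callables standing in for a real face detector and a real OpenCV tracker; the function name and the status strings are also assumptions made for illustration. The expensive `detect` step runs only while nothing is being tracked.

```python
def detect_then_track(frames, detect, make_tracker):
    """Yield (frame_index, box, status), calling detect() only when no target is tracked.

    detect(frame) -> box or None                  (hypothetical, the expensive step)
    make_tracker(frame, box) -> update callable   (hypothetical, the cheap step)
    update(frame) -> (ok, box)
    """
    update = None
    for index, frame in enumerate(frames):
        if update is None:
            box = detect(frame)
            if box is None:
                yield index, None, "no face"
                continue
            # Start following the detected box on the next frames
            update = make_tracker(frame, box)
            yield index, box, "detected"
        else:
            ok, box = update(frame)
            if ok:
                yield index, box, "tracking"
            else:
                # Target lost: fall back to detection on the next frame
                update = None
                yield index, None, "lost"
```

With a real detector and tracker plugged in, detection runs once per acquired target instead of once per frame, which is where the savings come from.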

From Haar Features to MTCNN

The most commonly used face detection method today is OpenCV's Haar cascade. Usage is simple:

import cv2
import imutils

img = cv2.imread("peoples.jpg")
# The cascade XML file must be present in the working directory
cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Returns one (x, y, w, h) box per detected face
faces = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=6)
for face in faces:
    (x, y, w, h) = face
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)

cv2.imshow("test", imutils.resize(img, width=800))
cv2.waitKey(0)

Detection Is a Heavy Load on the System

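Although the Haar cascade is convenient and fast, its detection quality is not very good. To highlight how heavily face detection loads the system, the demonstration that follows uses MTCNN in place of the traditional Haar cascade. MTCNN is a deep-learning face detector; its results are far better than Haar's, but on a CPU it runs roughly ten times slower. Installation and usage are covered in the next sections.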
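
The speed gap will vary from machine to machine, so it may be worth measuring rather than assuming. The following is a minimal sketch of a timing helper; the function name `average_seconds_per_frame` and the `repeats` parameter are assumptions for illustration, and `detect` is any callable that takes one frame.

```python
import time


def average_seconds_per_frame(detect, frames, repeats=3):
    """Return the best-of-`repeats` average time detect() spends on one frame."""
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame to time")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for frame in frames:
            detect(frame)
        # Keep the fastest run to reduce noise from other processes
        best = min(best, time.perf_counter() - start)
    return best / len(frames)
```

Calling it once with a Haar-based detection function and once with an MTCNN-based one on the same frames, then dividing the two results, gives the slowdown factor for a particular setup.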

Install MTCNN

MTCNN module for Python: https://pypi.org/project/mtcnn/
Install:
pip3 install mtcnn

As the project page demonstrates, it returns the bounding box of every face in the image, along with five facial landmarks: left_eye, right_eye, nose, mouth_left, and mouth_right.

MTCNN usage

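>>> from mtcnn.mtcnn import MTCNN
>>> import cv2
>>>
>>> # MTCNN expects RGB input, while cv2.imread returns BGR
>>> img = cv2.cvtColor(cv2.imread("ivan.jpg"), cv2.COLOR_BGR2RGB)
>>> detector = MTCNN()
>>> # One dict per face, with "box", "confidence" and "keypoints" entries
>>> faces = detector.detect_faces(img)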
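
The detector's `detect_faces` method returns one dictionary per face, with `"box"` as `[x, y, width, height]`, a `"confidence"` score, and the `"keypoints"`. A minimal sketch of turning that list into plain `(x, y, w, h)` tuples follows. The function name `boxes_from_mtcnn`, the 0.9 confidence threshold, and the clamping of boxes that start slightly outside the image are assumptions for illustration, not part of the mtcnn package.

```python
def boxes_from_mtcnn(results, min_confidence=0.9):
    """Convert MTCNN detect_faces() output into a list of (x, y, w, h) int tuples."""
    boxes = []
    for result in results:
        # Skip weak detections (threshold is an assumed value; tune as needed)
        if result["confidence"] < min_confidence:
            continue
        x, y, w, h = result["box"]
        right, bottom = x + w, y + h
        # A box can start slightly outside the image; clamp the top-left corner
        # to zero while keeping the right and bottom edges where they were
        x, y = max(0, x), max(0, y)
        boxes.append((int(x), int(y), int(right - x), int(bottom - y)))
    return boxes
```

Tuples in this form can be handed directly to cv2.rectangle or to a tracker's init call.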

Detect the Head and Track It

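Because facial features alone do not form a complete object, object tracking on the face region easily loses the target or drifts onto the wrong one, since the shape is incomplete. In the demo video, for example, tracking could not be sustained.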

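So we try enlarging the detected face region by about one third so that it covers the whole head, then use this head region as the tracking target. The results improve considerably.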

Code:

Just modify the code that obtains the face area so that its width and height grow by a fixed ratio. For example:

head = (face[0], face[1], int(face[2]*1.3), int(face[3]*1.3))
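
The one-liner above keeps the top-left corner fixed, so the box grows only to the right and downward. If the goal is to cover the whole head, one possible variation is to grow the box around its center and clip it to the frame. This is a minimal sketch under that assumption; the function name `expand_box` and its parameters are made up for illustration.

```python
def expand_box(box, frame_width, frame_height, scale=1.3):
    """Scale an (x, y, w, h) box around its center and clip it to the frame."""
    x, y, w, h = box
    center_x, center_y = x + w / 2, y + h / 2
    half_w, half_h = w * scale / 2, h * scale / 2
    # Clip each edge so the expanded box never leaves the frame
    left = max(0, int(round(center_x - half_w)))
    top = max(0, int(round(center_y - half_h)))
    right = min(frame_width, int(round(center_x + half_w)))
    bottom = min(frame_height, int(round(center_y + half_h)))
    return (left, top, right - left, bottom - top)
```

For a frame read with OpenCV, `frame_width` and `frame_height` would be `frame.shape[1]` and `frame.shape[0]`.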

Recognize the Face, Then Track

After detecting a face, we insert the recognition step before tracking begins, which avoids running recognition on every frame. The code is as follows:

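from mtcnn.mtcnn import MTCNN
import cv2
import imutils

video_file = "0316_DaiZuYing.mp4"
face_detect = "mtcnn"   # "mtcnn", "dlib", or anything else for the Haar cascade
displayWidth = 500
min_faceSize = (30, 30)
# Haar cascade XML path, used only when face_detect is neither "mtcnn" nor "dlib"
# (assumed value; the original listing never defines cascade_path)
cascade_path = "haarcascade_frontalface_default.xml"
#tracker_type = "MEDIANFLOW" #BOOSTING, CSRT, TLD, MIL, KCF, MEDIANFLOW, MOSSE
tracker_type = "KCF"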

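if face_detect == "mtcnn":
    detector = MTCNN()
elif face_detect == "dlib":
    # dlib is needed only for this branch; rect_to_bb converts a dlib
    # rectangle into an (x, y, w, h) tuple
    import dlib
    from imutils.face_utils import rect_to_bb
    detector = dlib.get_frontal_face_detector()
else:
    detector = cv2.CascadeClassifier(cascade_path)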

def get_faces(img):
    # Return a list of (x, y, w, h) face boxes, or None if no face is found
    faces = []
    if face_detect == "mtcnn":
        # MTCNN expects RGB input, while OpenCV frames are BGR
        allfaces = detector.detect_faces(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        for face in allfaces:
            print("face", face["box"])
            (x, y, w, h) = face["box"]
            faces.append((int(x), int(y), int(w), int(h)))
    elif face_detect == "dlib":
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        rects = detector(gray, 2)   # upsample twice to find smaller faces
        for rect in rects:
            (x, y, w, h) = rect_to_bb(rect)
            faces.append((int(x), int(y), int(w), int(h)))
    else:
        allfaces = detector.detectMultiScale(img, scaleFactor=1.10, minNeighbors=5)
        for face in allfaces:
            (x, y, w, h) = face
            faces.append((int(x), int(y), int(w), int(h)))

    if len(faces) > 0:
        return faces
    else:
        return None

def draw_face(img, bbox, txt):
    # Scale the font size and line thickness with the image height
    fontSize = round(img.shape[0] / 930, 1)
    if fontSize < 0.35: fontSize = 0.35
    boldNum = int(img.shape[0] / 500)
    if boldNum < 1: boldNum = 1

    if bbox is not None:
        x = int(bbox[0])
        y = int(bbox[1])
        w = int(bbox[2])
        h = int(bbox[3])
        cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), boldNum)
        print("draw:", bbox)
        cv2.putText(img, txt, (x, y-(boldNum*3)), cv2.FONT_HERSHEY_COMPLEX, fontSize, (255, 0, 255), boldNum)

    return img

def display_frame(frame, face, txt):
    # Note: draw_face draws directly on the frame passed in
    displayImg = draw_face(frame, face, txt)
    cv2.imshow("frame", imutils.resize(displayImg, width=displayWidth))
    cv2.waitKey(1)

TRACKER_FACTORIES = {
    "BOOSTING": "TrackerBoosting_create",
    "MIL": "TrackerMIL_create",
    "KCF": "TrackerKCF_create",
    "TLD": "TrackerTLD_create",
    "MEDIANFLOW": "TrackerMedianFlow_create",
    "GOTURN": "TrackerGOTURN_create",
    "MOSSE": "TrackerMOSSE_create",
    "CSRT": "TrackerCSRT_create",
}

def create_tracker(name):
    # The trackers need opencv-contrib-python. Recent OpenCV 4.x builds move
    # several of them into cv2.legacy, so look in both places.
    factory = TRACKER_FACTORIES[name]
    for module in (cv2, getattr(cv2, "legacy", None)):
        if module is not None and hasattr(module, factory):
            return getattr(module, factory)()
    raise RuntimeError("Tracker %s is not available in this OpenCV build" % name)

VIDEO_IN = cv2.VideoCapture(video_file)

while True:
    hasFrame, frame = VIDEO_IN.read()
    if not hasFrame:
        break

    faceBoxes = get_faces(frame)
    if faceBoxes is None:
        display_frame(frame, None, "No face")
        continue

    # Recognition of the detected face would go here, once per detection,
    # before tracking starts
    face = faceBoxes[0]
    tracker = create_tracker(tracker_type)   # fresh tracker for each detected face
    tracker.init(frame, face)                # init on the clean frame, before drawing on it
    display_frame(frame, face, "")

    # Track until the target is lost or the video ends, then go back to detection
    while True:
        hasFrame, frame = VIDEO_IN.read()
        if not hasFrame:
            break
        trackStatus, face = tracker.update(frame)
        if not trackStatus:
            display_frame(frame, None, "lost...")
            break
        display_frame(frame, face, "tracking")

VIDEO_IN.release()
cv2.destroyAllWindows()

mtcnn github: MTCNN github

That's it.

機器視覺雜話(ZA WARUDO)