Abstract This thesis targets the detection of pedestrians in images and videos. Our focus is on developing robust, fast, and accurate feature extraction algorithms that encode image regions as high-dimensional feature vectors supporting high-accuracy object/non-object decisions. We focus on pedestrian detection because people are one of the most challenging object classes and have many applications, for example in visual surveillance, traffic control, and video analysis. Nevertheless, we do not assume any pedestrian-specific characteristics, so the framework is quite general and can be applied to other object detection problems, including cars, motorbikes, and cows. We propose a cascade of two complementary features to detect pedestrians in static images and video quickly and accurately. “Co-occurrence Histograms of Oriented Gradients (CoHOG)” descriptors have a strong classification capability. They use co-occurrences of gradient orientations at various positional offsets to express complex object shapes through local and global distributions of gradient orientations. They are designed to be robust to small changes in image contour locations and directions and to significant changes in image illumination and color, while remaining highly discriminative for overall visual form; however, they are extremely high dimensional. Simple Haar-like features, on the other hand, are fast to compute but not discriminative enough to deal with highly varying texture and shape information, such as pedestrians with different clothing and stances. Combining both features therefore enables fast and accurate pedestrian detection. Our framework comprises a cascade of Haar-like features with AdaBoost followed by a CoHOG feature descriptor with a linear SVM classifier. Additionally, we propose integrating two of our proposed cascades: one detects the full body and the other detects the upper body (head, torso, and arms). We then fuse both to create a global pedestrian detector.
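To make the co-occurrence idea concrete, the following is a minimal sketch of a CoHOG-style feature, not the thesis implementation. It assumes a single image region, eight orientation bins, central-difference gradients, and only three offsets; the function names `quantized_orientations` and `cohog` are hypothetical. A full descriptor would typically tile the window into small regions and use many more offsets, which is where the very high dimensionality comes from.

```python
import math

def quantized_orientations(img, bins=8):
    """Return a grid of orientation bin indices (None where the gradient is zero)."""
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences; border pixels are left as None.
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            if gx == 0 and gy == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            out[y][x] = int(angle / (2 * math.pi) * bins) % bins
    return out

def cohog(img, offsets=((1, 0), (0, 1), (1, 1)), bins=8):
    """Concatenate one bins x bins co-occurrence matrix per (dx, dy) offset."""
    ori = quantized_orientations(img, bins)
    h, w = len(ori), len(ori[0])
    feature = []
    for dx, dy in offsets:
        mat = [[0] * bins for _ in range(bins)]
        for y in range(h):
            for x in range(w):
                y2, x2 = y + dy, x + dx
                if not (0 <= y2 < h and 0 <= x2 < w):
                    continue
                a, b = ori[y][x], ori[y2][x2]
                if a is not None and b is not None:
                    mat[a][b] += 1  # count the orientation pair (a, b)
        feature.extend(v for row in mat for v in row)
    return feature

# Toy 6x6 image containing a single vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
descriptor = cohog(image)
print(len(descriptor))  # 3 offsets * 8 * 8 bins
```

Even in this toy setting each offset contributes 64 dimensions, so the vector grows quickly with the number of offsets and regions.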
The experimental results are evaluated on two well-known pedestrian detection benchmark datasets, the “DaimlerChrysler pedestrian classification benchmark dataset” and the “INRIA person dataset”. They show that our first proposed method reaches accuracy very close to that of the most accurate CoHOG-only classifier at less than 1/200 of its computational cost, while our second proposed method reaches higher accuracy than the standalone full-body CoHOG-only classifier at about 1/100 of its computational cost. The feasibility of the proposed approaches is demonstrated on three challenging video sequences from the ETHZ pedestrian video dataset.
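The cost reduction follows from the cascade structure: the expensive stage runs only on windows that the cheap stage accepts. The sketch below illustrates that control flow under stated assumptions. The function `cascade_detect` and the two toy predicates are hypothetical stand-ins for the Haar-like/AdaBoost stage and the CoHOG/linear SVM stage, and the numbers it produces are illustrative, not results from the thesis.

```python
def cascade_detect(windows, cheap_stage, expensive_stage):
    """Run the expensive stage only on windows the cheap stage accepts."""
    detections = []
    expensive_calls = 0
    for win in windows:
        if not cheap_stage(win):
            continue  # rejected early, no expensive features computed
        expensive_calls += 1
        if expensive_stage(win):
            detections.append(win)
    return detections, expensive_calls

# Hypothetical stand-ins: windows are plain integers and the classifiers
# are simple predicates, chosen only to show the control flow.
windows = list(range(100))
cheap = lambda w: w % 10 == 0       # fast filter, passes a small fraction
expensive = lambda w: w % 20 == 0   # slow but more selective check

found, calls = cascade_detect(windows, cheap, expensive)
print(found, calls)
```

In this toy run the second stage is evaluated on 10 of the 100 windows, so the average cost per window is dominated by the cheap first stage.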