Advanced Multimedia Processing (AMP) Lab, Cornell University

Toshihiko YAMASAKI

Research Interests

Computer Vision and Pattern Recognition
Image Processing, Analysis, and Understanding
Hardware Design

Current Research

Smart Video Surveillance Systems

The goal of this work is to make video surveillance systems smarter so that they can be used not only for safety and security but also for "better quality of life." Current work focuses on human attribute analysis, for example whether a pedestrian is male or female, or whether he/she is carrying a bag. Applications of such analysis include digital signage, detecting people who need help, and avoiding accidents in crowded spaces.

We are now developing algorithms using surveillance data captured at an airport with a top-view camera. For privacy reasons, no faces are recorded in the top-view images, and the image resolution for each pedestrian tends to be low; both are challenges we have to address. Our latest approach is a two-stage classification scheme. In the first stage, HOG (histogram of oriented gradients) features are extracted with optimized parameters, and feature vectors over the frames are computed with a bag-of-features (BoF) representation. Several SVM classifiers are generated by changing the number of clusters in the k-means clustering. The second-stage classifier takes the output values of the first-stage classifiers and produces the final classification result. For the second stage, an SVM classifier, a majority voting classifier, or a probability-based classifier was employed. For gender classification, the best performance was achieved with an SVM classifier in the second stage (accuracy = 96.3%, FPR = 3.7%, FNR = 2.8%). For with/without-bag classification, the majority voting classifier performed best, with 96.6% accuracy, 4.7% FPR, and 2.4% FNR.
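As a rough illustration of the second stage, the sketch below combines signed output values from several first-stage classifiers by majority voting and then computes accuracy, FPR, and FNR. The function names, the +1/-1 label convention, and the sample scores are assumptions made for this example and are not taken from the original system.

```python
def majority_vote(first_stage_scores):
    """Combine signed first-stage outputs into one label (+1 or -1).

    Each score is assumed to be a signed decision value, where a
    positive value is a vote for the positive class. Ties (possible
    with an even number of classifiers) fall back to -1 here.
    """
    votes = sum(1 if score > 0 else -1 for score in first_stage_scores)
    return 1 if votes > 0 else -1


def evaluate(y_true, y_pred):
    """Return (accuracy, false positive rate, false negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fpr, fnr


# Hypothetical outputs of three first-stage classifiers for four samples.
sample_scores = [
    [0.8, 0.3, -0.1],
    [-0.5, -0.2, 0.4],
    [0.1, 0.2, 0.9],
    [-0.7, 0.6, -0.3],
]
sample_truth = [1, -1, -1, -1]
sample_pred = [majority_vote(scores) for scores in sample_scores]
sample_metrics = evaluate(sample_truth, sample_pred)
```

An SVM or probability-based second stage would replace `majority_vote` with a trained model that takes the same vector of first-stage outputs as its input.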

We are also developing algorithms using Fourier shape descriptors, GMM-based bag-of-feature representation, and soft-voting-based feature representation.
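To make the difference between hard and soft voting in a bag-of-features histogram concrete, here is a minimal sketch. It assumes that "soft voting" means each descriptor spreads its vote over all codewords with Gaussian weights, which is one common reading and may differ from the representation actually used; `sigma`, the codebook, and the descriptors are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_bof(descriptors, codebook):
    """Each descriptor casts one full vote for its nearest codeword."""
    hist = [0.0] * len(codebook)
    if not descriptors:
        return hist
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist]


def soft_bof(descriptors, codebook, sigma=1.0):
    """Each descriptor spreads one vote over all codewords.

    Weights follow a Gaussian kernel on the distance to each codeword
    (an assumed form of soft voting, not the source's definition).
    """
    hist = [0.0] * len(codebook)
    if not descriptors:
        return hist
    for d in descriptors:
        dists = [squared_distance(d, c) for c in codebook]
        nearest = min(dists)  # subtracted to avoid underflow in exp
        weights = [math.exp(-(dist - nearest) / (2.0 * sigma ** 2))
                   for dist in dists]
        norm = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / norm
    total = sum(hist)
    return [h / total for h in hist]


# Hypothetical 2-D codebook and descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.5, 0.5]]
hard_hist = hard_bof(descriptors, codebook)
soft_hist = soft_bof(descriptors, codebook)
```

With equal mixture weights and a shared isotropic covariance, the Gaussian weights above coincide with GMM posterior probabilities, which is why soft voting and a GMM-based representation are closely related.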

Aesthetic Visual Quality Assessment

Figure: Top-view image sample and the proposed multi-stage classifier.

Medical Image Processing

The purpose of this work is to develop efficient algorithms for 3D medical image processing, analysis, and diagnosis. Many algorithms for medical image analysis already exist, but because of the computational complexity most of them process either individual image slices (2D images) or 3D images at a lower resolution than the original data.

We are now working on a fast segmentation algorithm for 3D magnetic resonance images (MRI) based on the GrowCut algorithm. By combining four contributions (hierarchical segmentation, super-voxelization, a skipping method, and parallelization), the average computation time for tumor segmentation of 256 x 256 x 200 MRI volumes is reduced from 523 seconds to 10.8 seconds, roughly a 48-fold speedup.
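For readers unfamiliar with GrowCut, the following is a minimal sketch of the basic cellular-automaton update on a small 3D volume. It shows only the baseline idea and includes none of the four speedups described above; the 6-neighborhood, the scalar-intensity similarity function, and the label convention (0 = unlabeled, 1 and 2 = seed classes) are assumptions chosen for illustration.

```python
from itertools import product

# 6-connected neighborhood in (z, y, x) order.
NEIGHBORS = [(-1, 0, 0), (1, 0, 0), (0, -1, 0),
             (0, 1, 0), (0, 0, -1), (0, 0, 1)]


def growcut_3d(volume, seeds, max_iters=100):
    """Label every voxel by letting seed labels compete for neighbors.

    volume: nested lists indexed as volume[z][y][x] with scalar intensities.
    seeds:  dict mapping (z, y, x) to a label (1 or 2 in this sketch).
    Returns a dict mapping (z, y, x) to the final label (0 if never reached).
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    flat = [v for plane in volume for row in plane for v in row]
    max_diff = (max(flat) - min(flat)) or 1.0  # avoid division by zero

    labels, strength = {}, {}
    for p in product(range(depth), range(height), range(width)):
        labels[p] = seeds.get(p, 0)
        strength[p] = 1.0 if p in seeds else 0.0

    for _ in range(max_iters):
        new_labels, new_strength = dict(labels), dict(strength)
        changed = False
        for p in labels:
            z, y, x = p
            for dz, dy, dx in NEIGHBORS:
                q = (z + dz, y + dy, x + dx)
                if q not in labels:
                    continue  # neighbor lies outside the volume
                # Similar intensities give g close to 1, so the attack is strong.
                diff = abs(volume[z][y][x] - volume[q[0]][q[1]][q[2]])
                attack = (1.0 - diff / max_diff) * strength[q]
                # The strongest attacker in this iteration conquers the voxel.
                if attack > new_strength[p]:
                    new_labels[p] = labels[q]
                    new_strength[p] = attack
                    changed = True
        labels, strength = new_labels, new_strength
        if not changed:
            break  # converged: no voxel changed in this pass
    return labels


# Tiny 1 x 1 x 6 volume: a dark region followed by a bright region.
toy_volume = [[[0.0, 0.1, 0.0, 1.0, 0.9, 1.0]]]
toy_seeds = {(0, 0, 0): 1, (0, 0, 5): 2}
toy_result = growcut_3d(toy_volume, toy_seeds)
toy_labels = [toy_result[(0, 0, x)] for x in range(6)]
```

Every iteration visits all voxels and all of their neighbors, which is why the plain algorithm becomes slow on full-resolution volumes and why reducing the number of cells or skipping settled ones can pay off.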


Figure: Segmentation results of 3D MRIs. The processing time is around 10 seconds whereas the original GrowCut algorithm takes about 10 minutes.