Advanced Multimedia Processing (AMP) Lab, Cornell University

Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models

People

Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen

Abstract

In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, and object detection) operate on the same raw data and produce correlated outputs. It is desirable to have an algorithm that captures this correlation without requiring any changes to the inner workings of the individual classifiers. A recent method called Cascaded Classification Models (CCM) attempts to do so by repeatedly instantiating the individual classifiers in a cascade; however, its learning algorithm does not maximize the joint likelihood of the sub-tasks and instead learns only in a feed-forward way.

We propose Feedback Enabled Cascaded Classification Models (FE-CCM), which maximize the joint likelihood of the sub-tasks using an iterative algorithm. A feedback step allows later classifiers to tell earlier stages which error modes to focus on. We have shown that our method significantly improves performance on all the sub-tasks in multiple domains, including general scene understanding, robotic assistant systems, and video surveillance analysis.

Algorithm

[Figure: FECCM]
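
The idea described in the abstract (a cascade of repeated classifiers, trained by alternating a feed-forward step with a feedback step) can be illustrated with a toy sketch. This is not the authors' implementation: the ridge-regression sub-models, the squared-error objective, and the names `fit_linear`, `train_cascade`, `predict`, `lam`, and `ridge` are all assumptions chosen to keep the example short. Each sub-task has its own features; the second layer for each sub-task sees those features plus the first-layer outputs of every sub-task.

```python
import numpy as np


def fit_linear(A, B, ridge=1e-3):
    """Ridge least squares: W minimizing ||A W - B||^2 + ridge * ||W||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + ridge * np.eye(d), A.T @ B)


def train_cascade(Xs, Y, n_iters=5, lam=1.0):
    """Toy two-layer cascade with a feedback step (illustrative assumption).

    Xs: list of k feature matrices, Xs[i] of shape (n, d_i), one per sub-task.
    Y:  (n, k) array of ground-truth targets, one column per sub-task.
    """
    k = len(Xs)
    dims = [X.shape[1] for X in Xs]
    Z = Y.astype(float)  # first-layer targets, initialised to ground truth
    for _ in range(n_iters):
        # Feed-forward step: with the first-layer targets Z fixed,
        # fit layer 1, then fit layer 2 on [own features, all layer-1 outputs].
        W1 = [fit_linear(Xs[i], Z[:, [i]]) for i in range(k)]
        Z_hat = np.hstack([Xs[i] @ W1[i] for i in range(k)])
        W2 = [fit_linear(np.hstack([Xs[i], Z_hat]), Y[:, [i]]) for i in range(k)]

        # Feedback step: with all parameters fixed, pick new first-layer
        # targets Z minimizing ||Z Wz - R||^2 + lam * ||Z - Y||^2, i.e. targets
        # that would help the second layer while staying near the labels.
        Wz = np.hstack([W2[i][dims[i]:] for i in range(k)])  # (k, k)
        R = np.hstack([Y[:, [i]] - Xs[i] @ W2[i][:dims[i]] for i in range(k)])
        M = Wz @ Wz.T + lam * np.eye(k)
        Z = np.linalg.solve(M, (R @ Wz.T + lam * Y).T).T
    return W1, W2


def predict(Xs, W1, W2):
    """Run both layers of the toy cascade and return an (n, k) prediction."""
    k = len(Xs)
    Z_hat = np.hstack([Xs[i] @ W1[i] for i in range(k)])
    return np.hstack([np.hstack([Xs[i], Z_hat]) @ W2[i] for i in range(k)])
```

Calling `train_cascade([X0, X1], Y)` with one feature matrix per sub-task returns the two layers of weights, and `predict` applies them. In this sketch the sub-models are only refit on new targets, never modified internally, which mirrors the stated goal of leaving each classifier's inner workings unchanged.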

Applications

General scene understanding

Robotic Assistant Systems

Object-grasping Robot	Object-finding Robot

Background Subtraction and Object Detection in Surveillance Videos

Video Emotion Classification

Publication

Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen. "Feedback Enabled Cascaded Classification Models for Scene Understanding." In Neural Information Processing Systems Conference 2010 (NIPS 2010). [pdf | pdf-full version | supplementary]

Congcong Li*, Adarsh Kowdle*, Ashutosh Saxena, Tsuhan Chen. "A generic model to compose vision modules for holistic scene understanding." Workshop on Parts and Attributes, European Conference on Computer Vision, 2010 (ECCV 2010). [* indicates equal contribution] [pdf | slides]

Congcong Li, Chih-Wei Lin, Shiaw-Shian Yu, Tsuhan Chen. "Joint Optimization of Background Subtraction and Object Detection for Night Surveillance." To appear in IEEE International Conference on Image Processing (ICIP 2011). [pdf]

Congcong Li, TP Wong, Norris Xu, Ashutosh Saxena. "FeCCM for Scene Understanding: Helping the Robot to Learn Multiple Tasks." In International Conference on Robotics and Automation 2011 (ICRA 2011). [pdf | mp4 | youtube]

Talk

Congcong Li. Feedback Enabled Cascaded Classification Models: Algorithm and Applications. Cornell Artificial Intelligence Seminar, Spring, 2011.