Advanced Multimedia Processing (AMP) Lab, Cornell University

Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models


Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen





In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, and object detection) operate on the same raw data and produce correlated outputs. It is desirable to have an algorithm that captures this correlation without requiring any changes to the inner workings of the individual classifiers. A recent method called Cascaded Classification Models (CCM) attempts to do so by repeatedly instantiating the individual classifiers in a cascade; however, its learning algorithm does not maximize the joint likelihood of the sub-tasks and instead learns only in a feed-forward way.

We propose Feedback Enabled Cascaded Classification Models (FE-CCM), which maximize the joint likelihood of the sub-tasks using an iterative algorithm. A feedback step lets later classifiers tell earlier stages which error modes to focus on. We have shown that our method significantly improves performance on all the sub-tasks in multiple domains, including general scene understanding, robotic assistant systems, and video surveillance analysis.
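To make the cascade-with-feedback idea concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation. It assumes two binary sub-tasks, logistic units as stand-in classifiers, and a simple gradient nudge as the feedback step. All names (`fit_logistic`, `train_cascade`, `cascade_predict`, `task_a`, `task_b`) and all hyperparameters are hypothetical choices made for illustration. The first layer sees only raw features; the second layer sees raw features plus the first-layer outputs of every task. In each feedback round, the second layer is held fixed, each first-layer output is moved in the direction that lowers the summed second-layer loss, and the first layer is refit to those adjusted targets.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def predict(w, x):
    """Probability from a logistic unit; w[-1] is the bias."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Batch gradient descent on cross-entropy; y may hold soft targets in [0, 1]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = predict(w, x) - t
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def train_cascade(X, labels, feedback_rounds=2, step=0.5):
    """Two-layer cascade with a simplified feedback step (illustrative only)."""
    tasks = sorted(labels)
    d = len(X[0])
    # Layer 1: each task is trained independently on the raw features.
    layer1 = {t: fit_logistic(X, labels[t]) for t in tasks}
    for r in range(feedback_rounds + 1):
        # Feed-forward: layer 2 sees raw features plus all layer-1 outputs.
        Z = [[predict(layer1[t], x) for t in tasks] for x in X]
        X2 = [x + z for x, z in zip(X, Z)]
        layer2 = {t: fit_logistic(X2, labels[t]) for t in tasks}
        if r == feedback_rounds:
            break
        # Feedback: with layer 2 fixed, nudge each layer-1 output against the
        # gradient of the summed layer-2 cross-entropy, then refit layer 1.
        targets = {t: [] for t in tasks}
        for i, (x2, z) in enumerate(zip(X2, Z)):
            resid = {s: predict(layer2[s], x2) - labels[s][i] for s in tasks}
            for k, t in enumerate(tasks):
                g = sum(resid[s] * layer2[s][d + k] for s in tasks)
                targets[t].append(min(1.0, max(0.0, z[k] - step * g)))
        layer1 = {t: fit_logistic(X, targets[t]) for t in tasks}
    return tasks, layer1, layer2


def cascade_predict(tasks, layer1, layer2, x):
    z = [predict(layer1[t], x) for t in tasks]
    return {t: predict(layer2[t], x + z) for t in tasks}


# Toy usage: two correlated binary tasks defined on the same 2-D input.
random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(40)]
labels = {
    "task_a": [1.0 if a + b > 0 else 0.0 for a, b in X],
    "task_b": [1.0 if a > 0 else 0.0 for a, b in X],
}
tasks, layer1, layer2 = train_cascade(X, labels)
print(cascade_predict(tasks, layer1, layer2, [0.8, 0.5]))
```

Setting `feedback_rounds=0` reduces this sketch to a purely feed-forward cascade, which makes it easy to compare the two training regimes on the same toy data. The individual classifiers are only called through `fit_logistic` and `predict`, so they could be swapped for other models without changing the cascade logic.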






  • General Scene Understanding



  • Robotic Assistant Systems

Object-grasping Robot

Object-finding Robot



  • Background Subtraction and Object Detection in Surveillance Videos



  • Video Emotion Classification




Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen. "Feedback Enabled Cascaded Classification Models for Scene Understanding." In Neural Information Processing Systems Conference 2010 (NIPS 2010).

Congcong Li*, Adarsh Kowdle*, Ashutosh Saxena, Tsuhan Chen. "A generic model to compose vision modules for holistic scene understanding." Workshop on Parts and Attributes, European Conference on Computer Vision, 2010 (ECCV 2010). [* indicates equal contribution]

Congcong Li, Chih-Wei Lin, Shiaw-Shian Yu, Tsuhan Chen. "Joint Optimization of Background Subtraction and Object Detection for Night Surveillance." To appear in IEEE International Conference on Image Processing (ICIP 2011).

Congcong Li, TP Wong, Norris Xu, Ashutosh Saxena. "FeCCM for Scene Understanding: Helping the Robot to Learn Multiple Tasks." In International Conference on Robotics and Automation 2011 (ICRA 2011).



Congcong Li. Feedback Enabled Cascaded Classification Models: Algorithm and Applications. Cornell Artificial Intelligence Seminar, Spring, 2011.