Advanced Multimedia Processing (AMP) Lab, Cornell University

Projects

Current projects

Structure from Motion with Refractive Distortion Model

We reformulate the structure-from-motion framework with a refractive distortion model to reconstruct 3D scenes underwater.

Exploiting Adaptive Context from Unlabeled Regions

We discover contextual regions that automatically adapt to the object category of interest in order to capture contextual interactions at varying granularities.

Holistic Scene Understanding

We propose Feed-back Enabled Cascaded Classification Models (FE-CCM) to jointly optimize multiple tasks for holistic scene understanding.

Active Learning for Piecewise Planar 3D Reconstruction

We propose an active-learning algorithm for piecewise planar 3D reconstruction of a scene.

Pictorial Structures for Object Recognition and Part Labeling in Drawings

We propose a pictorial structure model to detect objects and label their parts in general drawings.

Kinship Verification

We propose a computational model for family member recognition.

Scribble Based Interactive 3D Reconstruction via Scene Co-segmentation

We propose an interactive algorithm for planar reconstruction of scenes.

Photo Aesthetics

We automatically assess the aesthetic visual quality of photos with faces.

Interactive Cosegmentation by Touch

We develop a user-friendly system which enables a user to cut out objects of interest from a collection of images by providing scribbles on a few images.

iModel: Interactive Co-segmentation for Object of Interest 3D Modeling

We propose an interactive algorithm for 3D modeling of an object of interest.

Mean Shift Feature Space Warping for Relevance Feedback

We propose mean shift feature space warping as an effective relevance feedback approach for bridging the gap between high-level semantics and low-level features.

Object-Driven Image Group Annotation

We annotate image groups in real-world photo databases.

Previous projects

3D Model Retrieval

We propose a new set of features that view the 3D model as a solid binary region. Ten features, such as volume-surface ratio and moment invariants, are extracted.

3D Reassembly

We propose a local feature-based approach to determine compatibility between parts for 3D reassembly. We use a spectral technique to compute the compatibility score.

Active IBR

We study how to capture a scene more efficiently. Our approach combines algorithms from different fields, which is challenging but very interesting.

All Focused Light Field Rendering by Fusion

We propose a novel IBR method that renders a novel view with sufficient quality using fewer images than non-aliased rendering requires.

From Appearance to Context Based Recognition

Does contextual information really improve recognition performance if appearance information is adequate?

Content-aware Adaptive Retry for Video Streaming on 802.11 LANs

We propose a content-aware adaptive retry (CAR) mechanism for video streaming over 802.11 WLANs, where the retry decision is made dynamically according to a packet's importance and its play-out schedule.

DSP for Biomolecular Sequences and Structures

We propose new techniques for matching, classifying, and searching 3D protein structures. We use classic signal processing and information theory to discover the "information" in human genes.

Dynamic Route Planning

We introduce a "Boids" concept to model intelligent vehicles. Each intelligent vehicle follows simple rules/operations and communicates locally with neighboring vehicles for dynamic traffic-aware route planning.

Face In Action (FIA) Face Video Database

To evaluate video-based face recognition, we are collecting a face video database called the Face In Action (FIA) database.

Face Recognition

We have been actively pursuing algorithms for more human-like face recognition, in terms of the ability to generalize to unseen conditions and to employ different representations for different claimants.

Hand Drawn Sketch Retrieval

In this system, the user draws a query and the system searches a collection of free-form hand drawings for similar sketches.

Hand Tracking using Spatial Gesture Modeling as Visual Feedback

We present a complete real-time hand tracking and 3D modeling system based on a single camera.

Hierarchical Semantics of Objects (hSO)

We introduce hSOs (Hierarchical Semantics of Objects), which capture the interactions between objects that tend to co-occur in a scene. We learn hSOs in an unsupervised manner from a collection of images of a scene.

IBR Sampling

We study the IBR sampling problem with classic methods in signal processing.

ICTrack: A Flexible Toolbox for Tracking

For the novice, we create a flexible interface that can be used without knowing the details. For the expert, we provide a framework for developing new algorithms and testing them against existing techniques.

Image-Based Relighting

We develop a statistical framework for image-based relighting. Using image-based relighting, one can render realistic relit images of a scene without prior knowledge of the objects in it.

Illumination Normalization for Face Recognition

Face recognition works poorly under bad illumination. To solve this problem, we propose an illumination normalization algorithm as a pre-processing stage before face recognition.

Key Generation based on Biometrics

We build a key generation system based on biometrics. We also analyze the security problem of user information associated key generation (UIAKG) systems.

Lightweight Arithmetic IP

The goal of this project is to create a library of parameterizable lightweight-arithmetic IP blocks, suitable for both media and silicon designers.

Light Weight Multi-View Capturing

This project introduces a multi-view capturing system that uses a single camera and multiple mirrors. From the image of the mirrors, we extract multi-view images of the scene reflected by the mirrors.

Low-level Contextual Patch Saliency

We propose a contextual measure for saliency, where a patch is considered to be salient if it is consistent with the context of the rest of the image.

Low Power VLC Decoder

One aspect of multimedia application systems is variable length decoding, which is used for decompression. By designing custom hardware to decode the bitstream, a low-power design can be realized.

Mobile Camera Array

We build a self-reconfigurable, low-cost camera array for capturing and rendering dynamic scenes in real time.

Multimodal Biometrics

The synergetic fusion of multiple biometric traits has been shown to benefit fraud detection and to improve the performance and robustness of existing biometric technologies.

Pattern Recognition Tools for Intrusion Detection

We propose a novel pattern-recognition-based strategy for adaptive intrusion detection that can evolve with changing network environments. We exploit an ensemble-of-classifiers approach to combine information from multiple sources and tune the system towards minimizing the cost of errors.

Pedestrian Detection

We introduce a real-time learning algorithm to detect moving pedestrians from a stationary camera based on motion patterns in the form of Eigenflow.

Personal Authentication Based on SMMS

Using the concept of Symmetric Maximized Minimal distance in Subspace (SMMS), we build a new authentication classifier with simple computational requirements for verification.

Portable 3D Faces for Field Identification of Suspects

We build a high-resolution 3D face model from a single low-resolution (2D) face image and, using a portable computer (e.g., a PDA), synthesize possible changes (facial hair, hair style, expressions, etc.) in the face's appearance.

Probabilistic Relevance Feedback for Image Retrieval

The goal is to retrieve images in a database that are similar to the query the user has in mind. Relevance feedback lets the user interact with the system by giving examples, so that the system has more information about what the user needs.

Profile-Frontal Audio-Visual Speech Recognition

We introduce profile-view lipreading and show improved word recognition accuracy with the profile view over the traditionally popular front-view lipreading.

Robot-based Imaging Test-bed

Should we make automatic PR more human?

The Pattern Recognition (PR) community is largely split on whether (and how) automatic PR systems should try to mimic humans. Much of our research is centered on answering this question.

Speech-in-Silicon

We developed a hardware reference model for a speech recognition engine, Lynx, written in C++.

Stop Sign Detection

An automatic stop sign detector could help reduce the number of missed stop sign detections. Will an automatic stop sign detector that has a non-zero error rate benefit drivers?

Trademark Retrieval

The user inputs a query by providing a rough sketch, and the system automatically extracts features from this query sketch to search for similar trademarks in the database.

Video Based Face Recognition

To better model the variations of human faces, a face mosaic model is obtained from training face sequences and used for recognition.

Voice Recognition

Estimating robust and accurate speaker models from modest amounts of training utterances remains an elusive task. We are actively developing techniques to address this important problem.