|
|
Tutorial Sessions |
The following six tutorial sessions will be held on December 14, 2010.
Tutorials 1-3 will be held in the morning, while Tutorials 4-6 will be held in the afternoon.
|
Morning session (9.00 am to 12.00 noon)
|
| Tutorial 1: |
3D Video Processing Techniques for Free-viewpoint Television |
| Speaker: |
Prof. Yo-Sung Ho
Gwangju Institute of Science and Technology (GIST), Korea |
| Abstract: |
In recent years, various multimedia services have become available and the demand for
three-dimensional television (3DTV) is growing rapidly. Since 3DTV is considered the
next-generation broadcasting service that can deliver real and immersive experiences by
supporting user-friendly interactions, a number of advanced 3D video processing technologies
have been studied. Among them, multi-view video coding (MVC) is the key technology for
various applications including free-viewpoint television (FVT). In order to support
free-viewpoint video services, we need to develop efficient techniques for 3D video
processing.
The main objective of this tutorial lecture is to provide comprehensive coverage of
the fundamental principles of 3D video processing, including leading algorithms for
free-viewpoint video applications. After reviewing the basic techniques for multiple camera
calibration, image rectification, illumination compensation and color correction, we will
explain different approaches for obtaining depth information of the 3D scene. We will
also cover the current state-of-the-art technologies for multi-view video and depth coding,
including several spatio-temporal prediction structures, and explain how to generate
intermediate images at virtual viewpoints for free-viewpoint video services. In this tutorial
lecture, we will discuss the MPEG activities for 3D video coding, including depth map estimation
and intermediate view synthesis software.
|
| Tutorial 2: |
Human-vision Friendly Processing for Images and Graphics |
| Speaker: |
Prof. Weisi Lin
Nanyang Technological University, Singapore |
| Abstract: |
Since the human visual system (HVS) is the ultimate receiver and appreciator for
the majority (if not all) of naturally captured images and computer generated graphics,
it would be better to use a perceptual criterion in the system design, implementation
and optimization, instead of the traditional, mathematically defined one (e.g., MSE,
SNR, PSNR, QoS or their relatives). After millions of years of evolution, the HVS has developed
unique characteristics, which can be turned into advantages for system design.
Making the machine perceive as the HVS does can result in resource savings (for instance,
bandwidth, memory space, computing power) and performance enhancement (such as the
resultant visual quality, and new functionalities). Significant research effort has been
made during the past decade toward modelling the mechanisms of the HVS and applying the
resultant models to various situations (quality evaluation, image/video compression,
watermarking, channel coding, signal restoration/enhancement, computer graphics, visual
content retrieval, etc.).
In this tutorial, we will first introduce the problem formulation, the relevant
physiological/psychological knowledge, and the work so far in the related fields. The
basic engineering modules (such as signal decomposition, visual attention, and visibility
determination) are then discussed. The issues and difficulties related to the
two major mechanisms in most current systems (i.e., feature detection and pooling) are
highlighted and explored. Afterward, different perceptually-driven techniques will
be presented for picture quality evaluation, signal compression, enhancement, communication,
and computer graphics, with proper case studies whenever possible. The last part of the
tutorial is devoted to a summary, points of further discussion and possible future research
directions, based upon our experience in both academic and industrial pursuits.
|
| Tutorial 3: |
Brain-Computer Interface Technology and Applications |
| Speakers: |
Kai Keng Ang, Fabien Pierre Robert Lotte, Cuntai Guan
Institute for Infocomm Research, A*STAR, Singapore |
| Abstract: |
A brain-computer interface (BCI), sometimes called a brain-machine interface,
is a device that responds to neural processes in the brain to provide a direct
communication pathway between the brain and an external device. Research on BCIs
began in the 1970s, and recent advances in BCI technology have produced devices that
augment or even help human functions in ways that were possible only in science fiction a
few years ago. This tutorial will present an overview of the current BCI technologies,
ranging from invasive techniques, through semi-invasive techniques using ECoG, to
non-invasive techniques using EEG, MEG, NIRS and fMRI. Recently, there has been much
interest in using BCI technology to help improve the quality of life and to restore
function for people with severe motor disabilities. One of the strategies is to use a BCI
to translate brain signals that involve motor or mental imagery into commands for
controlling a robot, bypassing the normal motor output neural pathways. This tutorial
will focus on the signal processing and machine learning techniques used to detect motor
imagery. Finally, this tutorial will present how recent BCI technology can help to
improve the lives of people with neurological disorders such as advanced amyotrophic
lateral sclerosis, and to help restore more effective motor control to people after
stroke or other traumatic brain disorders.
The first part of the tutorial will focus on an overview of BCI
technologies: Invasive techniques, semi-invasive techniques using ECoG,
and non-invasive techniques using EEG, MEG, NIRS and fMRI.
The second part of the tutorial will focus on the neurophysiological background
on motor imagery, how to apply machine learning and signal processing algorithms
to detect motor imagery from EEG signals, and how to interpret the computed solution.
The last part of the tutorial will focus on how BCI technology can help to improve the lives
of people with advanced amyotrophic lateral sclerosis. It will also describe how BCI
technology can help to restore more effective motor control to people after stroke or
other traumatic brain disorders by helping to guide activity-dependent brain plasticity.
|
|
Afternoon session (2.00 pm - 5.00 pm)
|
| Tutorial 4: |
Image Denoising - The SURE-LET Methodology |
| Speaker: |
Prof. Thierry Blu
The Chinese University of Hong Kong |
| Abstract: |
The goal of this tutorial is to introduce attendees to a new approach for
dealing with noisy data - typically, images or videos here.
Image denoising consists of approximating the noiseless image by performing some,
usually non-linear, processing of the noisy image. Most standard techniques involve
assumptions on the result of this processing (sparsity, low high-frequency content,
etc.); i.e., on the denoised image.
Instead, the SURE-LET methodology that we promote consists of approximating the
processing itself (seen as a function) as a linear combination of elementary
non-linear processings (LET: Linear Expansion of Thresholds), and optimizing the
coefficients of this combination by minimizing a statistically unbiased estimate of
the Mean-Square Error (SURE: Stein's Unbiased Risk Estimate, for additive Gaussian
noise).
This tutorial will introduce the technique to attendees and outline its
advantages (fast, noise-robust, flexible, image adaptive). A very complete set of
results will be shown and compared with the state-of-the-art.
Extensions of the approach to Poisson noise reduction with application to
fluorescence microscopy imaging will also be shown.
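The SURE-LET idea described above can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes a 1D signal processed sample by sample (practical uses typically work on transform coefficients), additive Gaussian noise of known standard deviation, and an illustrative pair of elementary functions. The names `sure_let_denoise` and `mse` are hypothetical. Because the estimate is linear in the coefficients, minimizing SURE reduces to solving a small linear system.

```python
import math
import random


def mse(a, b):
    """Mean-square error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def sure_let_denoise(y, sigma):
    """Pointwise SURE-LET sketch with two elementary functions.

    Assumed (illustrative) elementary processings:
        F1(y) = y
        F2(y) = y * exp(-y^2 / (12 sigma^2))
    The estimate is a1*F1(y) + a2*F2(y); (a1, a2) minimize SURE.
    """
    s2 = sigma * sigma
    g = [math.exp(-v * v / (12.0 * s2)) for v in y]
    f1 = list(y)
    f2 = [v * e for v, e in zip(y, g)]

    # Divergence of each elementary processing (sum of pointwise derivatives).
    div1 = float(len(y))
    div2 = sum(e * (1.0 - v * v / (6.0 * s2)) for v, e in zip(y, g))

    def dot(p, q):
        return sum(u * v for u, v in zip(p, q))

    # Setting the gradient of SURE to zero gives M a = c with
    # M[k][l] = <Fk, Fl> and c[k] = <Fk, y> - sigma^2 * div Fk.
    m11, m12, m22 = dot(f1, f1), dot(f1, f2), dot(f2, f2)
    c1 = dot(f1, y) - s2 * div1
    c2 = dot(f2, y) - s2 * div2

    # Solve the 2x2 system by Cramer's rule.
    det = m11 * m22 - m12 * m12
    a1 = (c1 * m22 - m12 * c2) / det
    a2 = (m11 * c2 - m12 * c1) / det

    estimate = [a1 * u + a2 * w for u, w in zip(f1, f2)]
    return estimate, (a1, a2)


if __name__ == "__main__":
    random.seed(0)
    sigma = 1.0
    # Synthetic sparse signal: mostly zeros with a few large samples.
    clean = [10.0 if random.random() < 0.05 else 0.0 for _ in range(4000)]
    noisy = [v + random.gauss(0.0, sigma) for v in clean]
    denoised, coeffs = sure_let_denoise(noisy, sigma)
    print("coefficients:", coeffs)
    print("MSE noisy   :", mse(noisy, clean))
    print("MSE denoised:", mse(denoised, clean))
```

Note that the coefficients are computed from the noisy data alone. The clean signal is used only to report the error in the demo.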
|
| Tutorial 5: |
Emotion Recognition and Cognitive Load Measurement from Speech |
| Speakers: |
Dr. Julien Epps, Dr. Fang Chen, Dr. Bo Yin
National ICT Australia |
| Abstract: |
Research in speech processing has seen a gradual movement in attention from speech
recognition and related applications towards paralinguistic speech processing problems
in recent years. A wide range of paralinguistic classification problems have been
considered, relating for example to the recognition of speaker identity, language,
emotion, mental state, gender and age. In the general area of emotion recognition from
speech, the number of papers published annually has increased by an order of magnitude
over the past decade.
One application area of interest in paralinguistic speech classification is the
measurement of cognitive load or mental workload. It is about a century since the proposal
of the Yerkes-Dodson law, which states that there is an optimum mental arousal for
performing a task, below and above which performance will deteriorate. Despite this,
there are few methods that have been demonstrated to measure cognitive load in practice,
and fewer still in real time. Speech-based methods are attractive because they are
non-intrusive, inexpensive and can be real-time.
Like other paralinguistic classification tasks, cognitive load measurement is a
challenging problem, and one that must account for variability posed by linguistic,
contextual and speaker-specific characteristics. Unlike some other paralinguistic
classification tasks, cognitive load measurement requires classification along an ordinal
scale, motivating the use of very specific machine learning techniques.
This tutorial introduces and examines some of the key research problems for emotion
recognition and cognitive load measurement from speech: understanding the psychophysiological
basis of emotion and cognitive load during speech production, extracting suitable features
from the speech signal, reducing feature variability due to speaker and linguistic content,
developing machine learning methods applicable to the task, comparing and evaluating diverse
methods, robustness, and constructing suitable databases. The discussion of cognitive load is
framed in the wider context of emotion recognition from speech, and some key insights from
this area will be covered. The tutorial will also briefly discuss the use of other biomedical
signals for cognitive load measurement. Participants will be exposed to likely future
challenges, both during the tutorial presentation and during the ensuing discussion.
|
| Tutorial 6: |
Human Biometrics: Will it be Reality or Fantasy? |
| Speaker: |
Dr. Waleed H. Abdulla
The University of Auckland, New Zealand |
| Abstract: |
The 2001 MIT Technology Review indicated that biometrics is one of the emerging
technologies that will change the world. Biometrics technology was initially treated
as an exotic topic, but it has recently become a fast-growing industry owing to the urgent
need to secure people's property, from goods to information.
Human Biometrics is the automated recognition of a person using inherent distinctive
physiological and/or involuntary behavioral features. Physiological features include
facial characteristics, fingerprints, palm prints, iris patterns, and many more.
Examples of behavioral features are signature writing dynamics, gait, speaker
recognition, and keyboard typing dynamics. However, most biometric identifiers are
a combination of physiological and behavioral features and they should not be
exclusively classified into either physiological or behavioral characteristics. For
example, speech is partially determined by the biological structure of the speaker's
vocal tract and partially by the way that person speaks. Also, fingerprints may be
physiological in nature but the usage of the input device (e.g., how a user touches
the fingerprint scanner and the pressure on the sensor) depends on the person's
behavior. A car mechanic has a different touch from a computer geek! Thus, the input
to the recognition engine is a combination of physiological and behavioral
characteristics. Behavior can help resolve the confusion that arises when
distinguishing parents, children, and siblings by their voice, gait, signature, etc. The
same argument applies to facial recognition. Faces of identical twins may completely
match at birth, but during growth the facial features change based on the person's
behavior developed from profession, way of living, environment, etc.
In this tutorial we will cover all the main aspects of this fast-growing
technology. We will discuss whether we are about to enter an era in which
people need not carry any identity or credit cards and can still purchase things
and travel to other countries. Throughout this tutorial, attendees will be
introduced to the following:
- The fundamentals of Human Biometrics.
- Types of biometrics.
- Biometric systems structure.
- Assessment of the performance of the biometric systems.
|
|
|