ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 13 Issue 2, May 2017

Multichannel-Kernel Canonical Correlation Analysis for Cross-View Person Reidentification
Giuseppe Lisanti, Svebor Karaman, Iacopo Masi
Article No.: 13
DOI: 10.1145/3038916

In this article, we introduce a method to overcome one of the main challenges of person reidentification in multicamera networks, namely cross-view appearance changes. The proposed solution addresses the extreme variability of person appearance in...

A Temporal Order Modeling Approach to Human Action Recognition from Multimodal Sensor Data
Jun Ye, Hao Hu, Guo-Jun Qi, Kien A. Hua
Article No.: 14
DOI: 10.1145/3038917

From wearable devices to depth cameras, researchers have exploited various multimodal data to recognize human actions for applications such as video gaming, education, and healthcare. Although many successful techniques have been presented...

Multi-Class Latent Concept Pooling for Computer-Aided Endoscopy Diagnosis
Shuai Wang, Yang Cong, Huijie Fan, Baojie Fan, Lianqing Liu, Yunsheng Yang, Yandong Tang, Huaici Zhao, Haibin Yu
Article No.: 15
DOI: 10.1145/3051481

Successful computer-aided diagnosis systems typically rely on training datasets containing sufficient and richly annotated images. However, detailed image annotation is often time consuming and subjective, especially for medical images, which...

Machine Learning–Based Parametric Audiovisual Quality Prediction Models for Real-Time Communications
Edip Demirbilek, Jean-Charles Grégoire
Article No.: 16
DOI: 10.1145/3051482

In order to mechanically predict audiovisual quality in interactive multimedia services, we have developed machine learning–based no-reference parametric models. We have compared Decision Trees–based ensemble methods, Genetic Programming and...

Congestion Control for Network-Aware Telehaptic Communication
Vineet Gokhale, Jayakrishnan Nair, Subhasis Chaudhuri
Article No.: 17
DOI: 10.1145/3052821

Telehaptic applications involve delay-sensitive multimedia communication between remote locations with distinct Quality of Service (QoS) requirements for different media components. These QoS constraints pose a variety of challenges, especially...

A Video Bitrate Adaptation and Prediction Mechanism for HTTP Adaptive Streaming
Ashkan Sobhani, Abdulsalam Yassine, Shervin Shirmohammadi
Article No.: 18
DOI: 10.1145/3052822

The Hypertext Transfer Protocol (HTTP) Adaptive Streaming (HAS) has now become ubiquitous and accounts for a large amount of video delivery over the Internet. But since the Internet is prone to bandwidth variations, HAS's up and down switching...

Crowd Scene Understanding from Video: A Survey
Jason M. Grant, Patrick J. Flynn
Article No.: 19
DOI: 10.1145/3052930

Crowd video analysis has applications in crowd management, public space design, and visual surveillance. Example tasks potentially aided by automated analysis include anomaly detection (such as a person walking against the grain of traffic or...

V-JAUNE: A Framework for Joint Action Recognition and Video Summarization
Fairouz Hussein, Massimo Piccardi
Article No.: 20
DOI: 10.1145/3063532

Video summarization and action recognition are two important areas of multimedia video analysis. While these two areas have been tackled separately to date, in this article, we present a latent structural SVM framework to recognize the action and...

A Multiplexing Scheme for Multimodal Teleoperation
Burak Cizmeci, Xiao Xu, Rahul Chaudhari, Christoph Bachhuber, Nicolas Alt, Eckehard Steinbach
Article No.: 21
DOI: 10.1145/3063594

This article proposes an application-layer multiplexing scheme for teleoperation systems with multimodal feedback (video, audio, and haptics). The available transmission resources are carefully allocated to avoid delay-jitter for the haptic signal...

A Dual-Domain Perceptual Framework for Generating Visual Inconspicuous Counterparts
Zhuo Su, Kun Zeng, Hanhui Li, Xiaonan Luo
Article No.: 22
DOI: 10.1145/3068427

For a given image, it is a challenging task to generate a corresponding counterpart with visually inconspicuous modifications. The complexity of this problem stems from the high correlation between the editing operations and visual perception....