Tag: Computer Vision

M1-V Deep Learning in Video Compression

Advance the state of the art in image/video compression by adopting deep learning methods in prediction, transform, entropy coding, and post-processing. Develop new deep-learning-based coding tools for post-processing and reconstruction enhancement. Investigate new pipelines that use deep learning for end-to-end image/video compression. Achieve significant coding improvements at practical computational complexity, and deliver insights into deep-learning-based video compression for machine consumption, e.g., tracking, segmentation, and recognition.

F1-M Bidirectional Deep Learning Architecture for Scene Understanding

This project aims to create deep architectures, inspired by the cognitive sciences, for understanding visual scenes in images or videos. The distinguishing characteristic of the proposed architecture is that it simplifies inference by using biologically plausible marginals (object type and spatial location), which can be learned in an unsupervised way directly from data (i.e., without labels).

F6-V Machine-Learning-Enabled Video Coding Strategy for Object Detection

The goal of this project is to develop a machine-learning-enabled video coding strategy for object detection. Most existing video encoders minimize distortion under a rate constraint. For surveillance video, however, it is more desirable for the encoder to maximize detection probability under a rate constraint. To address this, we will design a new video coding strategy that maximizes object detection probability under a rate constraint. We will locate the information that is important to the object detector, develop a rate-detection-optimized framework for mode selection, and design an optimized bit rate allocation method.
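
The contrast between the two objectives can be illustrated with a minimal sketch. It assumes a Lagrangian formulation, which the summary above does not specify: a conventional encoder picks the mode minimizing D + λR, while a detection-oriented encoder might instead minimize (1 − P_det) + λR. The class `ModeCandidate`, the two selection functions, and all numeric values are hypothetical and exist only for illustration; in particular, how the detection probability of a candidate mode would be estimated is left open here.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    """Hypothetical per-block coding mode with illustrative statistics."""
    name: str
    rate_bits: float       # bits needed to code the block in this mode
    distortion: float      # e.g. sum of squared errors against the source
    detection_prob: float  # assumed estimate of detector success, in [0, 1]


def select_mode_rd(candidates, lam):
    """Conventional choice: minimize J = D + lam * R."""
    return min(candidates, key=lambda c: c.distortion + lam * c.rate_bits)


def select_mode_rdet(candidates, lam):
    """Detection-oriented choice (assumed form): minimize J = (1 - P_det) + lam * R."""
    return min(candidates,
               key=lambda c: (1.0 - c.detection_prob) + lam * c.rate_bits)


# Made-up numbers, chosen only so that the two criteria disagree.
candidates = [
    ModeCandidate("skip", rate_bits=2, distortion=900.0, detection_prob=0.55),
    ModeCandidate("inter", rate_bits=40, distortion=300.0, detection_prob=0.90),
    ModeCandidate("intra", rate_bits=120, distortion=100.0, detection_prob=0.92),
]

# The multipliers differ in scale because the two costs have different units.
rd_choice = select_mode_rd(candidates, lam=2.0)
rdet_choice = select_mode_rdet(candidates, lam=0.002)

print("rate-distortion choice:", rd_choice.name)
print("rate-detection choice:", rdet_choice.name)
```

With these illustrative values, the distortion-based criterion spends extra bits on the intra mode for fidelity, whereas the detection-based criterion settles for the cheaper inter mode because the additional bits barely change the assumed detection probability.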