- Light Fields - Building the Core Immersive Photo and Video Format for VR and AR
by Ryan Overbeck (Google, USA)
Light fields provide transportive immersive experiences with a level of realism unsurpassed by any other imaging technology. Within a limited viewing volume, light fields accurately reproduce stereo parallax, motion parallax, reflections, refractions, and volumetric effects for real-world scenes. As such, many believe light fields will become the fundamental immersive media format for VR and AR. Although light fields have been explored in computer graphics since the mid-90’s, practical systems for recording, processing, and delivering high quality light field experiences have largely remained out of reach.
Our team at Google is developing hardware and software solutions that finally make it possible to acquire and view light fields on current, affordable hardware. In this talk, I will describe our work on Welcome to Light Fields, the first immersive light field experience that can be downloaded and viewed on VR-ready Windows computers with headsets such as the HTC Vive, Oculus Rift, or Windows Mixed Reality HMDs. This piece won a Lumière Technology Award in 2018; it lets the user step into a collection of panoramic light field still photographs and be guided through a number of light field environments, including the flight deck of Space Shuttle Discovery. I will also present our more recent work on light field video, which uses machine learning to compress the light field data into a format that can be viewed even on low-power standalone mobile headsets like the Oculus Quest.
- AI Powered Internet Video Streaming: Trends, Challenges and Practices
by Lifeng Sun (Tsinghua University, China)
The rapid development of networking technologies and multimedia services has imposed unprecedented demands on today's video streaming infrastructure. Meeting these demands is challenging, as real-world network conditions are increasingly complex and heterogeneous. The recent success of deep learning, however, provides new and powerful tools that can help tame these problems. In this talk, I will explore the impact of deep learning methods on today's Internet video streaming systems, including representative cases where deep learning methods outperform their traditional counterparts, challenges we have encountered during real-world deployment, and recommended directions for further research.
Dr. Lifeng Sun is a full professor in the Department of Computer Science and Technology at Tsinghua University. His professional interests lie in the areas of video streaming, 3D video signal processing and coding, virtual reality, multimedia big data, social media, and multimedia edge computing. He has published over 200 papers in these areas, in venues including IEEE JSAC, TMM, TPDS, TIP, TCSVT, TMC, CVPR, INFOCOM, ACM TOMM, ACM Multimedia, AAAI, and WWW. Prof. Sun received the Annual Best Paper Award of IEEE TCSVT (2010), the Best Paper Award of ACM Multimedia (2012), and the Best Student Paper Award at Multimedia Modeling (2015), IEEE Multimedia Big Data (2017), and ACM NOSSDAV (2019). He is a member of VSPC_TC (the IEEE Visual Signal Processing and Communication Technical Committee of the IEEE Circuits and Systems Society) and MMC_TC (the IEEE Multimedia Communications Technical Committee of the IEEE Communications Society). He served as Co-Chair of the IEEE MMTC Media Streaming Interest Group (2010-2011) and as TPC Co-Chair of the IEEE ICC Symposium on Communications Software, Services and Multimedia Applications (2018).
- Preserving Video Truth: an Anti-Deepfakes Narrative
by Roderick Hodgson (Amber Video, UK)
The prevalence of video, the ease of creating fake video, and the speed and power to distribute these fakes globally at scale have come together to create the perfect landscape for the growth of malicious deepfakes. This talk will survey the state of synthetic media technology, highlighting where fear of it is overblown (on social media) and where we are critically failing to pay enough attention (evidence workflows in due process). The talk will also focus on the only durable technical approach to tackling deepfakes and why we need a shared framework and standard for trusted media.
Roderick Hodgson is co-founder and VP of Engineering at Amber Video, a startup creating a 'truth layer' for video. He holds a degree in Artificial Intelligence and Computer Science from the University of Edinburgh and has spent the past decade working at the confluence of artificial intelligence, computer security, and video engineering. His career in video engineering began in 2009 when he joined the BBC Research and Development department. He has led the research teams of several startups and served as a Board Director of Secure Chorus, a not-for-profit membership organisation addressing data security requirements. He has contributed to several standards and holds several patents.
- Mind the Gap: Contextual Video Streaming and Transport
by Mohammad Alizadeh (MIT, USA)
Mohammad Alizadeh is an Associate Professor of Computer Science at MIT. His research interests are in the areas of computer networks and systems, and applied machine learning. His current research focuses on learning-augmented systems, video streaming, and congestion control algorithms for datacenter and wide-area networks. Mohammad's research has garnered significant industry interest. His work on datacenter transport protocols has been implemented in Linux and Windows, and has been deployed by large network operators; his work on adaptive network load balancing algorithms has been implemented in Cisco’s flagship datacenter switching products. Mohammad received his Ph.D. from Stanford University and then spent two years at Insieme Networks (a datacenter networking startup) and Cisco before joining MIT. He is a recipient of the Microsoft Research Faculty Fellowship, VMware Systems Research Award, NSF CAREER Award, SIGCOMM Rising Star Award, Alfred P. Sloan Research Fellowship, and multiple best paper awards.
- 360-Degree Video Streaming
by Yao Wang (NYU, USA)
360-degree video streaming is an emerging application that places significant demands on bandwidth-limited, dynamically changing networks. This talk will present our recent work on 360-degree video streaming applications with different latency requirements, from on-demand streaming, to live streaming, to interactive streaming. We will first present the proposed system architecture for each application, addressing its particular challenges and exploiting its unique opportunities. We will then discuss how to use deep learning to overcome two challenges shared by these systems: predicting where the viewer will look (field-of-view, or FoV, prediction), and adapting the total video rate based on network conditions and buffer status. For FoV prediction, we will present several LSTM-based methods, ranging from using the viewer's past FoV trajectory alone, to leveraging other viewers' FoV distributions, to exploiting the actual video content. For rate adaptation, we will describe a deep reinforcement learning framework that aims to maximize the long-term average quality of experience.
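To make the rate-adaptation objective concrete, learning-based adaptive bitrate work (e.g., the Pensieve line of research) commonly scores a streaming session with a QoE metric that rewards per-chunk quality while penalizing rebuffering and abrupt quality switches. The sketch below illustrates that standard formulation; the penalty weight and the identity quality mapping are illustrative assumptions, not necessarily the speaker's exact formulation.

```python
def qoe(bitrates, rebuffers, mu=4.3, q=lambda r: r):
    """Illustrative session QoE: total chunk quality, minus a
    rebuffering penalty (weight mu), minus a smoothness penalty
    for quality changes between consecutive chunks.

    bitrates  -- chosen bitrate per chunk (e.g., Mbps)
    rebuffers -- rebuffering time incurred per chunk (seconds)
    q         -- maps bitrate to perceptual quality (identity here)
    """
    quality = sum(q(r) for r in bitrates)
    rebuf_penalty = mu * sum(rebuffers)
    smooth_penalty = sum(abs(q(a) - q(b))
                         for a, b in zip(bitrates[1:], bitrates))
    return quality - rebuf_penalty - smooth_penalty
```

A reinforcement-learning agent trained to maximize the long-term average of such a score must trade off picking high bitrates against the risk of stalls and oscillation, which is precisely the tension the talk's framework addresses.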
Yao Wang is a Professor at the New York University Tandon School of Engineering (formerly Polytechnic University, Brooklyn, NY), with a joint appointment in the Departments of Electrical and Computer Engineering and Biomedical Engineering. She has also served as Associate Dean for Faculty Affairs at NYU Tandon since June 2019. Her research areas include video coding and streaming, multimedia signal processing, computer vision, and medical imaging. She is the lead author of the textbook Video Processing and Communications and has published over 250 papers in journals and conference proceedings. She received the New York City Mayor's Award for Excellence in Science and Technology in the Young Investigator category in 2000. She was elected Fellow of the IEEE in 2004 for contributions to video processing and communications. She received the IEEE Communications Society Leonard G. Abraham Prize Paper Award in the Field of Communications Systems in 2004, and the IEEE Communications Society Multimedia Communication Technical Committee Best Paper Award in 2011. She was a keynote speaker at the 2010 International Packet Video Workshop, the 2014 INFOCOM Workshop on Contemporary Video, and the 2018 Picture Coding Symposium. She received the NYU Tandon Distinguished Teacher Award in 2016.