Events

Oct 5 - Oct 11, 2025

FIFA Research Institute

Silvio Giancola, Research Scientist, Image and Video Understanding Lab

Oct 6, 12:00 - 13:00

B9 L2 R2325

AI Sports FIFA

This talk will trace the journey from SoccerNet to the creation of the FRI, highlight ongoing projects that bridge cutting-edge research with real-world impact, and discuss what lies ahead at the intersection of AI and sports.

Jul 13 - Jul 19, 2025

Towards Scalable and Efficient Semantic Video Search

Mattia Soldan, Ph.D. Student, Electrical and Computer Engineering

Jul 13, 18:00 - 19:00

B4 L5 R5209

video-language grounding semantic video retrieval multimodal alignment

This dissertation advances fine-grained, content-aware video retrieval by developing novel models and frameworks for Video-Language Grounding, enabling accurate alignment between natural language queries and specific temporal segments in unstructured video content.

Jun 22 - Jun 28, 2025

Computer Vision for Video Editing: Learning to Cut, Classify, Assemble, and Generate

Alejandro Pardo, Ph.D. Student, Electrical and Computer Engineering

Jun 23, 10:00 - 12:00

Ph.D. defense via Zoom

This thesis advances video editing by developing a suite of computer vision models for understanding and generating editorial decisions, including a method for ranking video cuts, a dataset for classifying cut types, a language-guided timeline assembler, and a diffusion-based technique for creating match cuts.

Apr 30 - May 6, 2023

Query Localization in Long-form Videos

Mengmeng Xu (Frost), Ph.D., Electrical and Computer Engineering

May 4, 07:30 - 09:00

KAUST

The growth of digital cameras and data communication has led to an exponential increase in video production and dissemination. As a result, automatic video analysis and understanding has become a crucial research topic in the computer vision community. However, the localization problem, which involves identifying a specific event in a large volume of data, particularly in long-form videos, remains a significant challenge.

Apr 9 - Apr 15, 2023

Towards Designing Robust Deep Learning Models for 3D Understanding

Abdullah Hamdi, Ph.D., Electrical and Computer Engineering

Apr 10, 17:00 - 19:00

B3 L5 R5220

deep neural networks

Deep Neural Networks (DNNs) have been highly successful over the years at solving many 2D computer vision tasks, driven by massive labeled 2D datasets and advances in 2D vision models, but they have been less successful on 3D vision tasks. This dissertation proposes innovative approaches to enhance the robustness of DNNs for 3D understanding and in 3D settings. The research focuses on two main areas: adversarial robustness on 3D data and setups, and the robustness of DNNs to realistic 3D scenarios. Two paradigms for 3D understanding are discussed: representing 3D as a set of 3D points and performing 2D processing of multiple images of the 3D data.

Jan 22 - Jan 28, 2023

Towards Richer Video Representation for Action Understanding

Humam Alwassel, Ph.D., Computer Science

Jan 23, 18:30 - 20:30

B2 L5 R5209

Computer Vision machine learning Human Activity Recognition

With video data dominating internet traffic, it is crucial to develop automated models that can analyze and understand what humans do in videos. Such models must solve tasks such as action classification, temporal activity localization, spatiotemporal action detection, and video captioning. This dissertation aims to identify the challenges hindering progress in human action understanding and to propose novel solutions to overcome them.

Nov 29 - Dec 5, 2020

Research at the Image and Video Understanding Lab (IVUL) - Graduate Seminar - CS

Bernard Ghanem, Professor, Electrical and Computer Engineering

Nov 30, 12:00 - 13:00

KAUST

In this talk, I will give an overview of research done in the Image and Video Understanding Lab (IVUL) at KAUST. At IVUL, we work on topics that are important to the computer vision (CV) and machine learning (ML) communities, with emphasis on three research themes: Theme 1 (Video Understanding), Theme 2 (Visual Computing for Automated Navigation), Theme 3 (Fundamentals/Foundations).

May 24 - May 30, 2020

Indoor 3D Scene Understanding Using Depth Sensors

Jean Lahoud, Ph.D., Electrical and Computer Engineering

May 28, 16:00 - 18:00

KAUST

Computer Vision 3D object detection Deep learning

One of the main goals in computer vision is to achieve a human-like understanding of images. This understanding has recently been represented in various forms, including image classification, object detection, and semantic segmentation, among many others. Nevertheless, image understanding has mainly been studied in the 2D image frame, so more information is needed to relate this understanding to the 3D world. With the emergence of 3D sensors (e.g. the Microsoft Kinect), which provide depth along with color information, the task of propagating 2D knowledge into 3D becomes more attainable and enables interaction between a machine (e.g. a robot) and its environment. This dissertation focuses on three aspects of indoor 3D scene understanding: (1) 2D-driven 3D object detection for single-frame scenes with inherent 2D information, (2) 3D object instance segmentation for 3D reconstructed scenes, and (3) using room and floor orientation for automatic labeling of indoor scenes that could be used for self-supervised object segmentation. These methods allow capturing the physical extents of 3D objects, such as their sizes and actual locations within a scene.

Mar 29 - Apr 4, 2020

Understanding a Block of Layers in Deep Neural Networks: Optimization, Probabilistic and Tropical Geometric Perspectives

Adel Bibi, Ph.D., Electrical and Computer Engineering

Mar 30, 18:00 - 20:00

KAUST

Computer Vision machine learning optimization

In this dissertation, we aim to study and analyze deep learning models theoretically. Since deep models vary substantially in their shapes and sizes, we restrict our work to a single fundamental block of layers that is common to almost all architectures. The block of layers of interest is the composition of an affine layer, followed by a nonlinear activation function, followed by another affine layer. We study this block of layers from three different perspectives. (i) An Optimization Perspective. We address the following question: Is it possible that the output of the forward pass through the block of layers highlighted above is an optimal solution to a certain convex optimization problem? As a result, we show an equivalency between the forward pass through this block of layers and a single iteration of certain types of deterministic and stochastic algorithms solving a particular class of tensor-formulated convex optimization problems.

Jun 9 - Jun 15, 2019

Efficient Localization of Human Actions and Moments in Videos

Victor Escorcia, Ph.D., Electrical and Computer Engineering

Jun 11, 15:00 - 16:00

B3 L5 R5220

Computer Vision machine learning artificial intelligence

We are stumbling across a video tsunami flooding our communication channels. The ubiquity of digital cameras and social networks has increased the amount of visual media content generated and shared by people, in particular videos. Cisco reports that 82% of internet traffic would be in the form of videos by 2022. The computer vision community has embraced this challenge by offering the first building blocks to translate the visual data in segmented video clips into semantic tags. However, users usually need to go beyond tagging at the video level. For example, someone may want

May 12 - May 18, 2019

Sim-to-Real Transfer for Autonomous Navigation

Matthias Mueller, Ph.D., Electrical and Computer Engineering

May 14, 16:00 - 17:00

B2 L5 R5220

Computer Vision UAV robotics machine learning

This work investigates the problem of transfer from simulation to the real world in the context of autonomous navigation. To this end, we first present a photo-realistic training and evaluation simulator Sim4CV which enables several applications across various fields of computer vision. Built on top of the Unreal Engine, the simulator features cars and unmanned aerial vehicles (UAVs) with a realistic physics simulation and diverse urban and suburban 3D environments. We demonstrate the versatility of the simulator with two case studies: autonomous UAV-based tracking of moving objects and autonomous driving using supervised learning.

Feb 10 - Feb 16, 2019

ML Hub Seminar Series | The Machine Learning (ML) Hub

Bernard Ghanem, Professor, Electrical and Computer Engineering

Feb 13, 12:00 - 13:00

B9 H2 R2325

machine learning

The Machine Learning Hub @ KAUST is designed to be the one-stop shop for machine learning (ML) and artificial intelligence (AI) at KAUST. It is an informal forum for exchanging ideas in these areas, including (but not limited to) theoretical foundations, systems, tools, and applications. It will provide several offerings to the KAUST community interested in ML and AI, including a regular seminar series where new research in the field is presented, an online social forum dedicated to AI and ML discussions, announcements, brainstorming, collaborations, and hands-on activities (e.g