KEYNOTES
What do image generators know?
Abstract
Intrinsic images are maps of surface properties, like depth, normal and albedo.
I will show the results of simple experiments that suggest that very good modern depth, normal and albedo predictors are strongly sensitive to lighting – if you relight a scene in a reasonable way, the reported depth will change. This is intolerable. To fix this problem, we need to be able to produce many different lightings of the same scene. I will describe a method to do so. First, one learns a method to estimate albedo from images without any labelled training data (which turns out to perform well under traditional evaluations). Then, one forces an image generator to produce many different images that have the same albedo — with care, these are relightings of the same scene. I will show some interim results suggesting that learned relightings might genuinely improve estimates of depth, normal and albedo.
But if an image generator can relight a scene, it likely has a representation of depth, normal, albedo and other useful scene properties somewhere. I will show strong evidence that depth, normal and albedo can be extracted from two kinds of image generator, with minimal inconvenience or training data. Furthermore, all these intrinsics are much less sensitive to lighting changes. This suggests that the right way to obtain intrinsic images might be to recover them from image generators. It also suggests image generators might “know” more about scene appearance than we realize.
About the speaker
David Forsyth holds the Fulton-Watson-Copp Chair in Computer Science at the University of Illinois at Urbana-Champaign, a position he has occupied since 2014. He moved to Illinois from U.C. Berkeley, where he was also a full professor. He has published over 170 papers on computer vision, computer graphics, and machine learning and served as program co-chair for IEEE Computer Vision and Pattern Recognition in 2000, 2011, 2018, and 2021; general co-chair for CVPR 2006 and 2015 and ICCV 2019; and program co-chair for the European Conference on Computer Vision 2008. He is a regular program committee member of all major international conferences on computer vision and a member of several scientific advisory boards. He has served six years on the SIGGRAPH program committee and is a regular reviewer for that conference. He has served two terms as Editor-in-Chief of IEEE TPAMI. He has received best paper awards at the International Conference on Computer Vision and at the European Conference on Computer Vision. He also received an IEEE Technical Achievement award in 2005 for his research. He became an IEEE Fellow in 2009 and an ACM Fellow in 2014. His textbook, “Computer Vision: A Modern Approach” (joint with J. Ponce and published by Prentice Hall), is now widely adopted as a course text (adoptions include MIT, U. Wisconsin-Madison, UIUC, Georgia Tech, and U.C. Berkeley). A further textbook, “Probability and Statistics for Computer Science”, is in print; yet another (“Applied Machine Learning”) has just appeared.
David Forsyth
A journey through hierarchies, watersheds, and minimum spanning trees for image segmentation
Jean Cousty
Abstract
A segmentation is a mid-level representation that divides an image (or, more generally, a data set) into regions whose elements share similar features, such as spatial, spectral or semantic proximity. When each picture element belongs to a single region, the segmentation is a partition of the space and is called crisp; when the segmentation contains nested regions, it is a hierarchy that provides a multiscale representation. The watershed is one of the most commonly used notions for image segmentation (both crisp and hierarchical), because of its accuracy and explainability and because fast algorithms exist to compute it. Nowadays, it is often combined with deep learning models that are trained to predict excellent contour images from which watersheds are computed. In this talk, we present a set of remarkable results and algorithms for watersheds and hierarchies in edge-weighted graphs. In particular, we highlight, through equivalence theorems, the relations between these notions and the well-known combinatorial optimization problem of finding a minimum spanning tree. This provides us with both optimality conditions and an efficient algorithmic framework. Finally, we give an overview of recent works where these results are used to merge hierarchies, to compute watersheds in differential and interactive processes, to segment gigabyte-sized images out of core, to learn the seeds for interactive segmentation, and to learn hierarchies. For the last two works, the hierarchy algorithm is differentiated in order to train a neural network in an end-to-end manner with gradient descent algorithms.
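As a small illustration of the link between hierarchies and minimum spanning trees mentioned in the abstract, the sketch below may help. It is not the speaker's algorithm: it only demonstrates the classical fact that thresholding an edge-weighted graph and thresholding its minimum spanning tree give the same connected components at every level. The toy graph and the names `DisjointSets`, `minimum_spanning_tree` and `partition_at` are assumptions made for this example.

```python
class DisjointSets:
    """Union-find structure over the vertices 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Walk up to the representative, halving the path on the way.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        # Merge the sets of a and b; return True if they were distinct.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def minimum_spanning_tree(num_vertices, edges):
    """Kruskal's algorithm; edges are (weight, u, v) tuples."""
    sets = DisjointSets(num_vertices)
    tree = []
    for weight, u, v in sorted(edges):
        if sets.union(u, v):
            tree.append((weight, u, v))
    return tree


def partition_at(num_vertices, edges, threshold):
    """Crisp segmentation: components linked by edges of weight <= threshold."""
    sets = DisjointSets(num_vertices)
    for weight, u, v in edges:
        if weight <= threshold:
            sets.union(u, v)
    regions = {}
    for x in range(num_vertices):
        regions.setdefault(sets.find(x), set()).add(x)
    return {frozenset(region) for region in regions.values()}


# Hypothetical toy graph: 4 "pixels", weights play the role of dissimilarities.
edges = [(1, 0, 1), (4, 1, 2), (2, 2, 3), (6, 0, 3), (5, 0, 2)]
mst = minimum_spanning_tree(4, edges)

# Each threshold gives one level of the hierarchy; the tree alone suffices.
for level in range(7):
    assert partition_at(4, edges, level) == partition_at(4, mst, level)
```

Raising the threshold from 0 to the largest weight yields a nested sequence of partitions, that is, a hierarchy in the sense of the abstract, and only the spanning tree's edges are needed to build it.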
About the speaker
Jean Cousty received the engineering degree from ESIEE Paris, France in 2004, the Ph.D. degree from Université de Marne-la-Vallée in 2007 and the Habilitation à Diriger des Recherches from Université Paris-Est in 2018. After a one-year post-doctoral period in the ASCLEPIOS research team at INRIA (Sophia-Antipolis, France), he has been teaching and doing research at the Computer Science Department, ESIEE Paris, and at Laboratoire d’Informatique Gaspard-Monge, Université Gustave Eiffel. From 2015 to 2017, he was an invited professor in Brazil at UFMG and PUC Minas. His current research interests include graph-based approaches to image analysis and computer vision, hierarchical analysis, mathematical morphology, and discrete topology.
Enhancing Realistic Rendering for Mixed and Virtual Reality Games
Esteban Walter Gonzalez Clua
Abstract
The video game industry continuously advances real-time rendering techniques, with an increasing focus on features like ray-tracing and global illumination. Additionally, VR/MR/AR games are pushing for high-quality rendering despite constraints such as high-definition displays (requiring many pixels), less powerful processors, and higher frequency requirements. This talk will present key optimization strategies, including hybrid denoising, foveated culling methods, optimization for foveated displays, and the usage of neural rendering approaches.
About the speaker
Esteban is a full professor at Universidade Federal Fluminense and coordinator of UFF Medialab, a CNPq researcher (level 1D), and has been a Scientist of the State of Rio since 2019. He holds an undergraduate degree in Computer Science from Universidade de São Paulo and master's and doctoral degrees from PUC-Rio. His main research and development areas are real-time rendering, digital games, virtual reality, and GPUs. He is one of the founders of SBGames (Brazilian Symposium of Games and Digital Entertainment) and was the president of the Game Committee of the Brazilian Computer Society from 2010 through 2014. He is the general chair of the IFIP TC14 (Entertainment Computing). Esteban is also one of the founders of ABRAGAMES. In 2015 he was named an NVIDIA CUDA Fellow. Esteban is a member of the program committee of most digital entertainment conferences. He has published 66 journal papers and 224 conference papers to date. In 2024 he is the Program chair of the ACM High Performance Computing and General chair of the IFIP International Conference on Entertainment Computing. In 2023 Esteban received the SBGames Award for his lifetime career.
Seeing is learning in high dimensions
Alexandru C. Telea
Abstract
Multidimensional projections (MPs) are one of the techniques of choice for visually exploring large high-dimensional data. Machine learning (ML) applications, and deep learning applications in particular, are among the most prominent generators of large, high-dimensional, and complex datasets that need visual exploration. In this talk, I will explore the connections, challenges, and potential synergies between these two fields. These involve “seeing to learn”, or how to use MP techniques to open the black box of ML models, and “learning to see”, or how to use ML to create better MP techniques for visualizing high-dimensional data. Specific topics include selecting suitable MP methods from the wide arena of available techniques; using ML to create faster and simpler-to-use MP methods; assessing projections from the novel perspectives of stability and ability to handle time-dependent data; using projections to create dense representations of classifiers; and revisiting the question of what makes a high-quality projection.
About the speaker
Alexandru Telea is Professor of Visual Data Analytics at the Department of Information and Computing Sciences, Utrecht University. He holds a PhD from Eindhoven University and has been active in the visualization field for over 25 years. He has been the program co-chair, general chair, or steering committee member of several conferences and workshops in visualization, including EuroVis, VISSOFT, SoftVis, and EGPGV. His main research interests cover unifying information visualization and scientific visualization, high-dimensional visualization, and visual analytics for machine learning. He has authored over 350 papers. He is the author of the textbook “Data Visualization: Principles and Practice” (CRC Press, 2014), a worldwide reference in teaching data visualization.
Graph-based image segmentation subject to high-level constraints
Abstract
Graph-based frameworks that rely on combinatorial optimization can handle image segmentation as a graph partition problem subject to soft and hard constraints. Recent methods use optimal cuts in directed weighted graphs, which lets them support several high-level priors, including global properties such as connectedness, shape constraints, boundary polarity, maximum allowable size, closeness constraints, and hierarchical constraints. These priors allow the segmentation to be customized to a given target object. In this talk, I will discuss some of our recent research on image segmentation subject to high-level constraints expected for the objects of interest, including methods in layered graphs, which lie at the intersection of the Generalized Graph Cut and General Fuzzy Connectedness frameworks, and methods in Component Trees.