Spyros Gidaris
of the Laboratoire d’Informatique Gaspard-Monge (LIGM)
for his thesis entitled “effective and annotation-efficient deep learning for image understanding”, on a subject that many researchers are working on today, but whose scientific quality was judged to be world-class.
1. Can you summarize your thesis?
My PhD thesis is divided into two parts. The subject of the first part was to use deep learning to devise methods that can semantically interpret the scene depicted in an image, i.e., recognize the types of objects that make up the scene and localize those objects in it. For example, given an image taken from the front view of a car, such a method should be able to estimate where the road and the pavement are in the scene, as well as localize and recognize objects of interest, such as other cars, humans, or obstacles on the road. The goal of the first part was to advance the state of the art on this very interesting and practical type of image understanding problem.
Deep learning-based image understanding models, such as those I developed for the first part of my thesis, have proven very successful. However, they have a major limitation: to successfully learn to perform such image understanding tasks, they require millions of manually annotated training images. By manual annotation I mean that, for each training image, a human must specify the output that an image understanding system should produce for that image. So, in the scene understanding case, the human should annotate the objects that exist in the image with bounding boxes and pixel-wise labels. This is a very tedious and error-prone task that might take several minutes per image. As a result, it is very difficult and expensive to deploy deep learning-based systems for real-life applications, such as self-driving cars, automatic diagnosis from medical images, etc. So, the goal of the second part of my thesis was to explore and propose methods that would allow deep learning to be applied to image understanding problems using a very limited amount of manually annotated training data.
2. What does receiving the "Best Thesis of the Year" award mean to you?
Being awarded for my thesis is a very proud moment for me. To me, it means that the result of all my hard work during my PhD is recognized as valuable and interesting by Fondation des Ponts. It also motivates me to continue working hard on the challenging and interesting problems of AI (Artificial Intelligence). I am very grateful to Fondation des Ponts for that.
3. In what respects do you think your thesis made the difference?
I think what made the difference is that my PhD thesis contributed to some very challenging and practical problems of automatic image understanding, a capability that is essential to the development of numerous AI applications, such as self-driving cars, devices that provide assistance to visually impaired people, or intelligent transportation systems (e.g., vehicle tracking, traffic anomaly detection, …).
4. Is there something that struck you during the jury session, an anecdote or a memorable moment?
I gave my presentation via teleconferencing, which unfortunately did not allow for much interaction with the jury members. So, I don’t have such a moment.
5. What's next? What are you going to do with this money?
I am thinking of spending the money on a trip to somewhere in the world where I have never been before.
6. Two words?
Patience and perseverance.