Dimitris Samaras: On Human Behavior Analysis and Biomedical Imaging

Part ten of our AI Researcher Profile series invites Professor Dimitris Samaras, SUNY Empire Innovation Professor and Director of the Computer Vision Lab at the Department of Computer Science, Stony Brook University, to discuss his research interests and his insights into human behavior analysis and its role in artificial intelligence.

AI Institute: What sparked your interest in computer vision, computer graphics, and their applications in biomedical imaging and human behavior analysis?
Professor Samaras: This goes a long way back, to the late 80s and early 90s, when computer graphics had just started to become a cool thing. The first computer-animated short films were being made, early 3D video games were being released, and images, especially digital images, were becoming more widely available. I was always fascinated by how these things were made and whether physics and mathematics were involved in the process. Soon, I found out that most computer graphics and computer vision tools relied heavily on physics and mathematics. That's what drew me in. As for human behavior analysis, it is a very rich application domain that allows us to understand how the human brain works, and so, as soon as I realized I was interested in computer graphics, it became an obvious subject to pursue.

AI: You have been the Director of the Computer Vision Lab at Stony Brook for 20 years. Can you share some of the more exciting developments we're witnessing in computer vision and human behavior analysis today?
DS: There have been several exciting developments in the past two decades. One is the emergence of cameras in cell phones, wearables, and many other devices that we interact with on a day-to-day basis. This has provided us with a lot of visual data to process and analyze, making it easier to train machine learning models. Another big step was the advent of machine learning in the 2000s and of deep learning in the 2010s, which allowed us to make significant progress in solving basic computer vision problems, to the point where we now have useful tools for recognizing and naming objects that appear in a scene, or accurately tracking the way people move and interact with their environment, or studying human behavior. And as we speak, there is another evolution coming — as large language models have improved, we have seen multimodal vision language models progress, allowing us to use natural language to address a number of computer vision tasks. In other words, we’re getting close to being able to ask questions about the visual world around us, like “Get me that book from the second shelf of this bookshelf,” or “Show me how I would look if I wore that dress,” or “Create a picture of me facing that mountain,” and have the machine understand and respond to our requests.

AI: Please tell us about your recent projects surrounding Computational Pathology.
DS: This is a project that we’ve been working on for eight years now. Right now, we’re trying to build large-scale foundation models for Computational Pathology (a branch of pathology that applies computational methods to the analysis of patient specimens for the study of disease), because the models that are out there haven’t been trained on a sufficient breadth of data, and so they don’t translate well. We believe that spending more time on the pathology data will allow us to leverage some of the large language models, but this remains an open question. We’re also interested in the insights this will give us into human behavior, because training these models will involve analyzing how experts and trainees look at such images and understanding what draws their attention. We’re hoping to use this information to come up with a decision-support system that can be used by healthcare professionals.

AI: You were recently published in Cancer Research for your work on Keratin 17's impact on immune response in PDAC (Pancreatic Ductal Adenocarcinoma). How have AI's contributions to PDAC research evolved over the years, and what are some of the greatest challenges it's facing today?
DS: We’ve seen tremendous progress in the analysis of digital images of PDAC, so much so that, soon, we’ll have more tools for studying the details in these pictures. We’ll be able to measure sizes and volumes, draw comparisons, segment and zoom into specific regions, and count structures of interest more easily, all of which were technically infeasible or unscalable until a few years ago. This will also allow us to conduct studies and derive the knowledge needed to provide decision support, helping doctors understand what possible diagnoses exist, which areas to focus on, and what questions to ask their patients. The greatest challenges these projects face today are the lack of expert-annotated data (which is costly to accrue), privacy concerns, ensuring the data comes from a diverse range of institutions, and of course, computational resources, which we can never have enough of.

AI: There's an interesting paper about your work on creating 3D human portraits from casual smartphone videos. Can you talk about that?
DS: Absolutely. The technology surrounding casual portraits is at a point where we have a reliable amount of data on 3D faces. There’s a field called neural rendering, or implicit rendering, which now allows us to create three-dimensional representations of that data, so we can render multiple views by capturing and processing enough 3D information collected by moving a camera around a face. We’ve come a long way with this project, but we’re still facing challenges in creating these portraits fast enough while also dealing with the effects of light and shadows in the source images.

AI: How have researchers' and students' perceptions toward the field of computer vision changed over the past few years?
DS: There was a time when researchers pursued computer vision out of pure intellectual curiosity; there wasn’t much of an industry for it. This changed about ten years ago, allowing Ph.D. students and researchers to understand and play around with the technology, while also making a living out of it. Today, computer vision has become a much more applied field, and a much more real one in many ways.

AI: You recently won Stony Brook's Outstanding Mentor Award. What advice would you share with those who are mentoring or wish to mentor future generations in the fields of AI and computer vision?
DS: Just that it’s important to think about your interests and what you need to achieve, and use your common sense as you work toward your goals.

AI: What kind of future are you hoping to see concerning computer vision and human behavior analysis?
DS: Human behavior analysis is a tricky field. In the future, I hope we’re able to gain more fundamental insights into how the human brain works so that we can understand people better. At the same time, these are dangerous tools. On the one hand, they can help us analyze psychopathology and track and see if people are getting better or worse, and on the other, they can become tools of surveillance. We need to be aware of how these tools are being used and how they’re released. Data should be collected and analyzed with consent, and people should be aware of how it is being used. My hope is that we arrive at a future where we’re able to gain from these technologies without giving up personal liberties.

 

Communications Assistant
Ankita Nagpal