SAI RAJESWAR

I am Sai Rajeswar, a Senior Research Scientist at ServiceNow Research in Montreal, engaged in the exploration and development of advanced machine learning models. My research interests are rooted in Representation Learning and large-scale learning of Generative Models that form the backbone of Multimodal Foundation Models.

My work also spans building generalist agent models that combine Sequential Decision Making and World Models, giving agents the ability to understand and interact with their environments. With a focus on integrating perception and action, my research aims to bridge the gap to real-world applicability.

Previously, I obtained my Ph.D. at MILA, University of Montreal, supervised by Prof. Aaron Courville, where I was included in the Dean’s Honor List for the graduating year 2022-23. During my Ph.D., I had the opportunity to work as a Research Scientist Intern at Google DeepMind and Google Research.

See my Google Scholar page for my research.

  • If my research interests align with yours, whether in multimodal representation learning or sequential decision-making systems, I am open to collaborations.

Recent Research

EQUIVARIANT ADAPTATION OF LARGE PRETRAINED MODELS

Equivariant networks are specifically designed to ensure consistent behavior with respect to a set of input transformations, leading to higher sample efficiency and more accurate and robust predictions.

CAPTURE THE FLAG: UNCOVERING DATA INSIGHTS WITH LARGE LANGUAGE MODELS

The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor.

THE UNSOLVED CHALLENGES OF LLMS AS GENERALIST WEB AGENTS: A CASE STUDY

In this work, we investigate the challenges associated with developing goal-driven AI agents capable of performing novel tasks in a web environment using zero-shot learning. Our primary focus is on harnessing the capabilities of large language models (LLMs) as generalist web agents interacting with HTML-based user interfaces (UIs).
