Rinon Gal is a Ph.D. student at Tel Aviv University, where he is supervised by Prof. Daniel Cohen-Or and Dr. Amit Bermano. His research focuses on generative models, few-shot and unsupervised approaches, and combining vision and language. Recently, Rinon has been interning at NVIDIA Research, where he is working on personalization of vision and language models.
Text-to-image models offer unprecedented freedom to guide creation through natural language. Yet it is unclear how such freedom can be exercised to generate images of specific, unique concepts, modify their appearance, or compose them in new roles and novel scenes. In this talk, I will outline Textual Inversion and other recent methods that enable such control. We will discuss their strengths and limitations, and use them to gain some insight into the structure of the word embedding space.