Collecting and Leveraging Data without Crowd Workers

Speaker

Yuval Kirstain: During my PhD in generative AI under Professor Omer Levy, I specialized in natural language processing and text-to-image generation. My research focuses on acquiring and using task-specific data without traditional crowd workers, relying on self-supervised training techniques and gamification instead. My practical experience includes two internships at Facebook AI Research (FAIR), where I focused on text-to-image generation and editing. Prior to FAIR, I worked on end-to-end task-oriented dialogue during an internship at IBM Research, and I gained extensive experience in training and evaluating large language models while working at AI21 Labs. Homepage: https://www.yuvalkirstain.com

Abstract

In this presentation, we will cover three papers that explore methods for collecting and using task-specific data without relying on traditional crowd workers, spanning both natural language processing and text-to-image generation. The first two papers center on self-supervised training techniques. The first paper introduces a pretraining scheme called “recurring span selection,” which yields dramatic improvements in few-shot question answering. The second paper presents a self-supervised training approach for subject-driven text-to-image generation. By combining this approach with a new mechanism for conditioning on images, we generate images that align with both the user prompt and the subject images, without any optimization at inference time. The third paper takes a different route, introducing a gamification method for collecting supervised task-specific data. This approach allowed us to collect and open-source a large dataset of human preferences in text-to-image generation, which led to PickScore, a scoring function that exhibits superhuman performance in predicting human preferences.
