Yoav Levine

About me

  • Until recently, I conducted research on large language models as co-Chief Scientist at AI21 Labs.
  • I earned my Ph.D. at the School of Computer Science and Engineering at the Hebrew University of Jerusalem, under the supervision of Prof. Amnon Shashua.
  • I completed my M.Sc. degree in the Condensed Matter Physics Department at The Weizmann Institute of Science, under the supervision of Prof. Yuval Oreg.
  • I have a double B.Sc. in Physics and Electrical Engineering from Tel Aviv University.

Research

I am fascinated by the theoretical and practical aspects of using deep learning for natural language processing applications. My doctoral work was mainly theoretical, and revolved around the application of deep networks in natural language processing and many-body quantum physics. At AI21 Labs, I empirically investigated the inner workings of large language models and developed practical methods for improving their capabilities and reliability.

News

  • Sep’2023: Our paper on PAC learnability for in-context learning was accepted to NeurIPS 2023.

  • Aug’2023: In-Context Retrieval-Augmented Language Models was accepted for journal publication at TACL.

  • July’2023: We released a paper proposing a new automatic method for evaluating an LLM’s propensity to generate correct facts from a given corpus.

  • May’2023: We released a paper presenting the results of the largest Turing-style experiment to date, with over 10 million conversations in which humans tried to determine whether their partner was a human or a bot.

  • May’2023: Parallel Context Windows was accepted to ACL 2023.

  • Apr’2023: We released a theoretical paper proposing a model of adversarial prompting and misalignment attacks on LLMs.

  • Apr’2023: I am excited to announce that I’ll be joining the Blavatnik School of Computer Science at Tel Aviv University as a faculty member in Fall 2024!

  • Mar’2023: We released a theoretical paper proposing a PAC learning framework for in-context learning in LLMs.

  • Feb’2023: We released In-Context Retrieval-Augmented Language Models, a paper studying the benefits of including retrieved documents in the LLM context window. Code is available here.

  • Jan’2023: Our paper on the theoretical benefits of sub-task decomposition in seq-to-seq models was accepted to ICLR 2023.

  • Dec’2022: We released Parallel Context Windows, a technique for enlarging an LLM’s context window without training. Code is available here.

  • Jul’2022: Our paper on frozen large language models as readers for open domain question answering was accepted for presentation at the ICML 2022 Workshop on Knowledge Retrieval and Language Models.

  • May’22: We released a paper which proposes a modular, neuro-symbolic architecture that combines large language models, external knowledge sources, and discrete reasoning.

  • Apr’22: We released Standing on the Shoulders of Giant Frozen LMs, a paper which proposes using a huge LM as a backbone, surrounded by supporting networks that are 1000x smaller (BERT scale) and able to externally specialize it without sacrificing performance and without changing its weights.

  • Apr’22: We released a paper which proves a first positive theoretical result for learning with intermediate supervision. Our paper theoretically motivates trending approaches such as Chain of Thought Prompting, which tackle compound problems in natural language by introducing intermediate supervision in a seq2seq manner.

  • Jan’22: Our paper on the in-context learning bias in Transformer pretraining was accepted as a spotlight paper at ICLR 2022.

  • Oct’21: We released a paper that theoretically establishes an in-context learning bias in Transformer architectures, and proposes kNN-Pretraining: a new paradigm for pretraining-example design.

  • May’21: Our paper on expressivity bottlenecks in self-attention and their impact on Transformer architecture design across data modalities was accepted to ICML 2021.

  • Apr’21: I am a recipient of the Blavatnik Prize for PhD students.

  • Jan’21: Our paper on PMI-Masking was accepted as a spotlight paper at ICLR 2021.

  • Sep’20: Our paper on the depth-to-width trade-off in self-attention was accepted to NeurIPS 2020.

  • June’20: We released a paper shedding light on the interplay between depth and width in self-attention architectures. See the blog post for an overview of the results.

  • Apr’20: Our SenseBERT paper was accepted to ACL 2020.
© 2023 Yoav Levine. All rights reserved.