Sep’2023: Our paper on PAC learnability for in-context learning was accepted to NeurIPS 2023.
Aug’2023: In-Context Retrieval-Augmented Language Models was accepted for publication in TACL.
Jul’2023: We released a paper proposing a new automatic method for evaluating an LLM’s propensity to generate correct facts from a given corpus.
May’2023: We released a paper concluding the largest Turing-style experiment to date, with over 10 million conversations in which humans tried to guess whether their partner was a human or a bot.
May’2023: Parallel Context Windows was accepted to ACL 2023.
Apr’2023: We released a theoretical paper proposing a model of adversarial prompting and misalignment attacks on LLMs.
Apr’2023: I am excited to announce that I’ll be joining the Blavatnik School of Computer Science at Tel Aviv University as a faculty member in Fall 2024!
Mar’2023: We released a theoretical paper proposing a PAC learning framework for in-context learning in LLMs.
Feb’2023: We released In-context Retrieval-Augmented Language Models, a paper studying the benefits of including retrieved documents in the LLM context window. Code available here.
Jan’2023: Our paper on the theoretical benefits of sub-task decomposition in seq-to-seq models was accepted to ICLR 2023.
Dec’2022: We released Parallel Context Windows, a technique for enlarging an LLM’s context window without training. Code is available here.
Jul’2022: Our paper on frozen large language models as readers for open domain question answering was accepted for presentation at the ICML 2022 Workshop on Knowledge Retrieval and Language Models.
May’22: We released a paper which proposes a modular, neuro-symbolic architecture that combines large language models, external knowledge sources, and discrete reasoning.
Apr’22: We released Standing on the Shoulders of Giant Frozen LMs, a paper which proposes using a huge LM as a backbone surrounded by supporting networks that are 1000x smaller (BERT scale) and that externally specialize it, without sacrificing performance and without changing its weights.
Apr’22: We released a paper which proves a first positive theoretical result for learning with intermediate supervision. Our paper theoretically motivates trending approaches such as Chain of Thought prompting, which tackle compound problems in natural language by introducing intermediate supervision in a seq2seq manner.
Jan’22: Our paper on the in-context learning bias in Transformer pretraining was accepted as a spotlight paper at ICLR 2022.
Oct’21: We released a paper that theoretically establishes an in-context learning bias in Transformer architectures, and proposes kNN-Pretraining: a new paradigm for pretraining-example design.
May’21: Our paper on expressivity bottlenecks in self-attention and their impact on Transformer architecture design across data modalities was accepted to ICML 2021.
Apr’21: I am a recipient of the Blavatnik Prize for PhD students.
Jan’20: Deep Autoregressive Models for the Efficient Variational Simulation of Many-Body Quantum Systems was accepted to Physical Review Letters.