One of the most striking findings in modern research on large language models (LLMs) is that, given a model and dataset of sufficient scale, scaling up compute at training time leads to better final results. However, there is another, less often discussed scaling phenomenon: adopting more sophisticated methods and/or scaling compute at inference time can significantly improve the output of LLMs. We will present a tutorial on past and present classes of algorithms for generating text from autoregressive LLMs, ranging from greedy decoding to sophisticated meta-generation algorithms used to power compound AI systems. We place special emphasis on techniques for making these algorithms efficient, in terms of both token costs and generation speed. Our tutorial unifies perspectives from three research communities: traditional natural language processing, modern LLMs, and machine learning systems. In turn, we aim to make attendees aware of (meta-)generation algorithms as a promising direction for improving quality, increasing diversity, and enabling resource-constrained research on LLMs.
Our tutorial will be held on Tuesday, December 10, from 1:30pm to 4:00pm (all times are Vancouver local time).
Time | Section | Presenter |
---|---|---|
1:30pm - 1:40pm | Section 1: Introduction [Slides] | Sean |
1:40pm - 2:10pm | Section 2: Generation algorithms [Slides] | Matthew |
2:10pm - 2:50pm | Section 3: Meta-generation algorithms [Slides] | Sean |
2:50pm - 3:20pm | Section 4: Efficient generation [Slides] | Hailey |
3:20pm - 3:25pm | Section 5: Conclusion [Slides] | Sean |
3:25pm - 3:55pm | Panel discussion | Ilia |
Join us for a panel discussion featuring a select group of experts in research on large language models (LLMs) and meta-generation algorithms.
@article{welleck2024metageneration,
title={From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models},
author={Sean Welleck and Amanda Bertsch and Matthew Finlayson and Hailey Schoelkopf and Alex Xie and Graham Neubig and Ilia Kulikov and Zaid Harchaoui},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2024},
url={https://openreview.net/forum?id=eskQMcIbMS},
note={Survey Certification}
}