Cohere: Unlocking the Potential of Large Language Models (LLMs)

Conference: Transform X 2022

2022-10-19

Authors: Aidan Gomez


Summary

The speaker discusses the practical applications and limitations of large language models, and emphasizes the importance of making the technology accessible to developers. They also address concerns around bias in data and the need for monitoring and mitigation.
  • Large language models can power creative applications, such as world-building in games, and pro-social technology that fosters healthier online communities
  • These applications require deep contextual understanding of language, including sentiment analysis
  • To drive adoption, the interfaces to the technology need to be made easier for developers to use
  • Data filtration and monitoring are necessary to mitigate bias and prevent misuse
  • The speaker is excited about the potential for models to use external tools and reference sources to improve efficiency
The speaker gives the example of a game character that can answer questions about its life, which is currently difficult to achieve without writing a complex dialogue tree. They also discuss the need for a middle ground between hiring an ML expert and buying an off-the-shelf solution when implementing smart features such as search. The speaker emphasizes the importance of making the technology accessible to all developers, including high school students learning to code.

Abstract

Aidan Gomez is the co-founder and CEO of Cohere, a provider of cutting-edge NLP models, and coauthor of "Attention is All You Need," one of the most-cited machine learning papers of all time, which introduced the world to the transformer architecture. Gomez and Scale CEO Alexandr Wang discuss this paper's impact on the world and how it has led to the creation of large language models (LLMs) such as GPT-3 and BLOOM. While the current generation of LLMs is impressive, Gomez explains why they are not yet good enough for production applications and are inaccessible to most developers. He believes the industry must make it easy to access models and incorporate humans-in-the-loop to improve model performance so that the average developer without ML expertise can enrich their applications. In this fireside chat, Wang and Gomez also discuss trends to watch for, including providing AI with tools such as knowledge bases to augment its capabilities and the advent of multimodal models with text, images, video, and audio fused in one model. Gomez draws from his experience focusing on large-scale machine learning during his time at Google Brain, where he collaborated with many AI luminaries, including Geoff Hinton and Jeff Dean.
