Over the past decade, improvements in the ability of machines to generate images and text have been staggering. As is often the case with innovation, progress is not linear but happens in leaps and bounds, surprising and delighting researchers and users alike. 2022 was a banner year for generative AI innovation, built on the advent of diffusion models for image generation and increasingly large transformers for text generation.
And while this progress has provided a major leap forward for the entire natural language processing (NLP) industry, there are three reasons why generative AI models were the first to garner audience enthusiasm, and why they will remain the main entry points into what language AI can do for the time being.
What’s behind the excitement around generative AI?
The most obvious reason is that generative models belong to a very intuitive class of AI systems. These models are not used to produce high-dimensional vectors or uninterpretable code, but rather natural-looking images or smooth, coherent text – something everyone can see and understand. People outside of machine learning need no specific expertise to judge the naturalness or fluency of a system’s output, which makes this part of AI research much more accessible (and, perhaps equally important, more engaging) than other fields.
Second, there is a direct link between generation and how we assess intelligence: when examining students in school, we value the ability to generate answers over the ability to discriminate between answers by selecting the correct one. We believe that having students explain things in their own words demonstrates a deeper understanding of the topic – eliminating the possibility that they have simply guessed the correct answer or memorized it.
So when artificial systems produce natural images or coherent prose, we feel compelled to ascribe to them something like human knowledge or understanding, although whether this is too generous to the actual capabilities of artificial systems is an open question in the research community. What is technically clear is that the ability of models to produce novel but plausible images and text shows that these models contain rich internal representations of the underlying domain (e.g., the task at hand, or the kinds of things the images or text are about).
Moreover, these representations are useful in a far wider range of areas than generation for generation’s sake. In short, while generative models were the first to catch the public eye, many more valuable use cases are still to come.
One thing from another
Third, the latest generative models show an ability to generate conditionally. Rather than merely sampling new images or text snippets unconditionally, they can create text, videos, images, or other modalities conditioned on something else – such as partial text or images.
To understand why this is important, just look at most human activities, which consist of generating something based on something else. To give some examples:
- Writing an essay means generating text conditioned on a question or topic, and on the knowledge and viewpoints contained in our own experience and in books, articles and other documents.
- Having a conversation means generating responses conditioned on our knowledge of the world, on our understanding of the pragmatics the situation calls for, and on what has been said so far in the conversation.
- Drawing architectural plans means generating an image conditioned on our knowledge of architectural and structural engineering principles, on sketches or images of the terrain and its topology/environment, and on the (often under-specified) requirements provided by the client.
Most intelligent behavior follows this pattern of generating something conditioned on something else as context. The fact that artificial systems now have this capability means we are likely to see more automation in our work, or at least a more symbiotic relationship between humans and computers in getting things done. We can already see this in new tools that help humans code, like CodeWhisperer, or write marketing copy, like Jasper.
Today we have systems capable of creating text, images or videos from other information we provide to them. This means we can apply these systems to problems and processes for which we once needed human experts. The result will be further automation, or more symbiotic forms of support between humans and artificial systems, with both practical and economic consequences.
The new fundamental tools
For the rest of 2023, the big question will be what all of this progress actually means in terms of potential applications and usefulness. It’s a hugely exciting time to be in the industry, as we seek nothing less than to create fundamental tools for creating intelligent systems and processes, making them as intuitive and applicable as possible, and putting them in the hands of the widest possible class of developers, builders and innovators. It’s something that motivates my team and fuels our mission to help computers communicate better with us and use language to do so.
While there is more to human intelligence than the processes this technology will enable, I have no doubt that – coupled with the boundless ability of humans to constantly innovate on the back of new tools and technologies – the innovation we will see in 2023 will change the way we use computers in disruptive and wonderful ways.
Ed Grefenstette is Head of Machine Learning at Cohere.
Welcome to the VentureBeat community!
DataDecisionMakers is where experts, including the technical people doing data work, can share data-related insights and innovation.