Google CEO Sundar Pichai speaks at the Google I/O developer conference.
Google on Tuesday hosted its annual I/O developer conference, and rolled out a range of artificial intelligence products, from new search and chat features to AI hardware for cloud customers. The announcements underscore the company’s focus on AI as it fends off competitors, such as OpenAI.
Many of the features or tools Google unveiled are only in a testing phase or limited to developers, but they give an idea of how the tech giant is thinking about AI and where it’s investing. Google makes money from AI by charging developers who use its models and from customers who pay for Gemini Advanced, its competitor to ChatGPT, which costs $19.99 per month and can help users summarize PDFs, Google Docs and more.
Tuesday’s announcements follow similar events held by its AI competitors. Earlier this month, Amazon-backed Anthropic announced its first-ever enterprise offering and a free iPhone app. Meanwhile, OpenAI on Monday launched a new AI model and desktop version of ChatGPT, along with a new user interface.
Here’s what Google announced.
Gemini AI updates
Google announced a new Gemini 1.5 Flash AI model, which the company said is more cost-effective and designed for smaller tasks, such as quickly summarizing conversations, captioning images and videos, and pulling data from large documents.
Google CEO Sundar Pichai highlighted improvements to Gemini’s translations, adding that the model will be available to all developers worldwide in 35 languages. Within Gmail, Gemini 1.5 Pro will analyze attached PDFs and videos, giving summaries and more, Pichai said. That means that if you missed a long email thread on vacation, Gemini will be able to summarize it along with any attachments.
The new Gemini updates are also helpful for searching Gmail. One example the company gave: If you’ve been comparing prices from different contractors to fix your roof and are looking for a summary to help you decide who to pick, Gemini could return three quotes along with the anticipated start dates offered in the different email threads.
Google said Gemini will eventually replace Google Assistant on Android phones, suggesting it’s going to be a more powerful competitor to Apple’s Siri on iPhone.
Google Veo, Imagen 3 and Audio Overviews
Google announced “Veo,” its latest model for generating high-definition video, and Imagen 3, its highest quality text-to-image model, which promises lifelike images and “fewer distracting visual artifacts than our prior models.”
The tools will be available for select creators on Monday and will come to Vertex AI, Google’s machine learning platform that lets developers train and deploy AI applications.
The company also showcased “Audio Overviews,” the ability to generate audio discussions based on text input. For instance, if a user uploads a lesson plan, the chatbot can speak a summary of it. Or, if you ask for an example of a science problem in real life, it can do so through interactive audio.
Separately, the company showcased “Music AI Sandbox,” a range of generative AI tools for creating music and sounds from scratch, based on user prompts.
Generative AI tools such as chatbots and image creators continue to have issues with accuracy, however.
Earlier this year, Google introduced the Gemini-powered image generator. Users discovered historical inaccuracies that went viral online, and the company pulled the feature, saying it would relaunch it in the coming weeks. The feature has still not been re-released.
New search features
The tech giant is launching “AI Overviews” in Google Search on Monday in the U.S. AI Overviews show a quick summary of answers to the most complex search questions, according to Liz Reid, head of Google Search. For example, if a user searches for the best way to clean leather boots, the results page may display an “AI Overview” at the top with a multi-step cleaning process, gleaned from information it synthesized from around the web.
The company said it plans to introduce assistant-like planning capabilities directly within search. It explained that users will be able to search for something like “create a 3-day meal plan for a group that’s easy to prepare” and get a starting point with a wide range of recipes from across the web.
As for its progress on “multimodality,” or integrating more images and video within generative AI tools, Google said it will begin testing the ability for users to ask questions through video, such as filming a problem with a product they own, uploading the clip and asking the search engine to diagnose it. In one example, Google showed someone filming a broken record player while asking why it wasn’t working. Google Search identified the model of the record player and suggested it could be malfunctioning because it wasn’t properly balanced.
Another new feature being tested is called “AI Teammate,” which will integrate into a user’s Google Workspace. It can build a searchable collection of work from messages, email threads, PDFs and documents. For instance, a founder-to-be could ask the AI Teammate, “Are we ready for launch?” and the assistant will provide an analysis and summary based on the information it can access in Gmail, Google Docs and other Workspace apps.
Project Astra
Project Astra is Google’s latest advancement toward an AI assistant, built by its DeepMind AI unit. It’s just a prototype for now, but you can think of it as Google’s aim to develop its own version of J.A.R.V.I.S., Tony Stark’s all-knowing AI assistant from the Marvel Universe.
In the demo video presented at Google I/O, the assistant — working through video and audio rather than a chatbot interface — was able to help the user remember where they left their glasses, review code and, when a speaker was shown on camera, answer questions about what a certain part of it is called.
Google said a truly useful chatbot needs to let users “talk to it naturally and without lag or delay.” The conversation in the demo video happened in real time, without lags. The demo followed OpenAI’s Monday showcase of a similar audio back-and-forth conversation with ChatGPT.
DeepMind CEO Demis Hassabis said onstage that “getting response time down to something conversational is a difficult engineering challenge.”
Pichai said he expects Project Astra to launch in Gemini later this year.
AI hardware
Google also announced Trillium, its sixth-generation TPU, or tensor processing unit — a piece of hardware integral to running complex AI operations — which will be available to cloud customers in late 2024.
The TPUs aren’t meant to compete with other chips, like Nvidia’s graphics processing units. Pichai noted during I/O, for example, that Google Cloud will begin offering Nvidia’s Blackwell GPUs in early 2025.
Nvidia said in March that Google will be using the Blackwell platform for “various internal deployments and will be one of the first cloud providers to offer Blackwell-powered instances,” and that access to Nvidia’s systems will help Google offer large-scale tools for enterprise developers building large language models.
In his speech, Pichai highlighted Google’s “longstanding partnership with Nvidia.” The companies have been working together for more than a decade, and Pichai has said in the past that he expects them to still be doing so a decade from now.