At its Google I/O 2024 event, Google unveiled several updates and additions to its offerings, focusing on improving user experiences and providing new tools for developers. Here's a breakdown of what was announced:

Gemini 1.5 Pro and 1.5 Flash

Gemini 1.5 Pro has received significant quality improvements in various tasks such as translation and encoding. These enhancements aim to empower users to handle broader and more complex tasks effectively.

Meanwhile, the introduction of Gemini 1.5 Flash caters to tasks that require quick responses, thanks to its optimized design. Both models support native multimodal capabilities, allowing users to seamlessly embed text, images, audio, and video.

While the standard version offers a context window of 1 million tokens, users can access the 2 million token window by joining the waiting list in Google AI Studio or Vertex AI for Google Cloud customers.

Availability: Preview versions of both models are available today in more than 200 countries and territories, with general availability expected in June.

New features for developers

In response to user feedback, Google is introducing two new features to the Gemini API: video frame extraction and parallel function calls, which enable more efficient processing of tasks.

In June, Google plans to implement the context cache for Gemini 1.5 Pro, reducing the need to repeatedly forward requests and large files to the model, thus improving efficiency and accessibility.

See also  Launch of the Crossbeats RoadEye 2.0 Dual Dash Camera for Car

Price: While access to the Gemini API remains free in qualifying regions through Google AI Studio, Google is expanding its pay-as-you-go service, increasing rate caps to allow for greater usage.

Additions to the Gemma Family

PaliGemma, Google's inaugural open visual language model optimized for tasks such as image captioning and visual Q&A, debuts, joining existing Gemma variants CodeGemma and RecurrentGemma.

  • Launching in June, Gemma 2 represents the next generation of Gemma models, delivering industry-leading performance in developer-friendly sizes.
  • Designed specifically to balance size and performance, the Gemma 27B model outperforms larger models while remaining GPU and TPU compatible.

Gemini API Developer Contest

Google has introduced its inaugural Gemini API Developer Competition, encouraging developers to showcase their creativity. The submission period ends on August 12, with a grand prize of a custom electric DeLorean.

Gemini 1.5 Pro Updates

At Google I/O 2024, Google unveiled several updates to Gemini 1.5 Pro, including a longer context window, improved data analysis capabilities, and better integration with Google apps.

Analyze documents with a long context window

Gemini 1.5 Pro, part of Gemini Advanced, now supports a context window of 1 million tokens, the longest for any consumer chatbot.

This allows you to process large documents of up to 1,500 pages or summarize 100 emails. You'll soon be handling an hour of video or code bases with over 30,000 lines.

Users can upload files via Google Drive or directly to Gemini Advanced, making it easy to extract information from dense documents like leases or research papers.

In addition, Gemini will soon analyze data files, creating custom visualizations from spreadsheets. All files remain private and are not used for model training.

See also  Endefo launches a range of Enbuds TWS earphones and a Wireless Pro 10 power bank in India

Gemini 1.5 Pro also improves image understanding. Users can take photos of dishes to get recipes or do math problems to get step-by-step solutions.

Availability: Gemini 1.5 Pro will be available to Gemini Advanced subscribers in more than 150 countries and 35 languages.

Natural conversations with Gemini Live

Gemini Live enhances interaction by allowing users to chat by text or voice. On Google Messages, users can chat with Gemini directly. With Gemini Live, users can choose from multiple natural-sounding voices, speak at their own pace, and interrupt for clarification.

For example, users can prepare for job interviews by rehearsing with Gemini, which can suggest skills to stand out. Later this year, users will be able to use their camera during live sessions for more interactive conversations.

Availability: Gemini Live will roll out to Gemini Advanced subscribers in the coming months.

Simplified trip planning

Gemini Advanced now offers a planning feature that creates custom itineraries. Users can ask Gemini to plan trips by pulling flight and hotel information from Gmail, taking into account dining preferences and suggesting local activities.

For example, a request for a trip to Miami might generate a schedule that includes flight details, restaurant recommendations, museum tours, and more. The itinerary is automatically updated with any changes or additional details.

Availability: This planning feature will be available soon to Gemini Advanced subscribers.

Customize Gemini with Gems

Gemini Advanced subscribers will soon be able to create Gems, customized versions of Gemini for specific tasks, such as being a gym buddy, kitchen assistant, coding buddy, or writing guide. Users describe their desired role and behavior and Gemini creates the gem accordingly.

See also  HUAWEI Pura 70 Ultra, Pura 70 Pro+ and Pura 70 Pro Announced with 6.8″ FHD+ 120Hz LTPO OLED Display, Variable Aperture Camera and Pura 70

Integration with More Google Apps

Gemini will connect with more Google tools, such as Google Calendar, Tasks and Keep. This allows users to automate tasks like creating calendar entries from school syllabus photos or turning recipe photos into shopping lists.

Gemini Model Updates

  • Flash 1.5 was introduced, optimized for speed and efficiency. It's lighter than 1.5 Pro, but excels at tasks like summarizing, chat applications, and extracting data from documents.

  • Significant improvements have been made to 1.5 Pro, improving its performance in various tasks such as code generation, logical reasoning and audio understanding.

  • Gemini Nano now understands multimodal input, including text, images, sound, and spoken language.

Gemma Family Updates

  • Gemma 2: Google introduced Gemma 2, the next generation of open models designed for breakthrough performance and efficiency, available in new sizes.
  • PaliGemma: The Gemma family expands with PaliGemma, the first visual language model inspired by PaLI-3.
  • Responsible Generative AI Toolkit: Updated with LLM Comparator to assess the quality of model responses.
Progress with the Astra Project

Google has shared the progress of Project Astra, aimed at developing universal AI agents for everyday assistance.

These agents aim to understand and respond to the world like humans, being proactive, teachable and personable. Google has worked to improve perception, reasoning and conversation to improve the quality and pace of interaction.

Google he stressed their continued pursuit of innovation with Gemini models, exploring new ideas while unlocking exciting use cases.

For the Latest Jobs And Information Visit: Mksjobs.Com

Aslo Read:

Mobile Updates:

Daily Update: