Cryptopolitan
2025-06-25 02:06:39

Gemini’s Robotics On-Device outperforms Google’s other models

Google DeepMind on Tuesday introduced a new AI model called Gemini Robotics On-Device. The firm said the model can run tasks locally on robots without an internet connection.
The new model, which builds on the company's previous Gemini Robotics AI model released in March, can control a robot's movements. Google also said that the vision-language-action (VLA) model is small and efficient enough to run directly on a robot. According to the company, developers can control and fine-tune the model to suit various needs using natural language prompts.

Robotics On-Device outperforms Google's other models

"We're bringing powerful AI directly onto robots with Gemini Robotics On-Device. It's our first vision-language-action model to help make robots faster, highly efficient, and adaptable to new tasks and environments, without needing a constant internet connection." -Google DeepMind (@GoogleDeepMind), June 24, 2025
Carolina Parada, head of robotics at Google DeepMind, said that the original Gemini Robotics model uses a hybrid approach, allowing it to operate both on-device and in the cloud. She said that with the new device-only model, users get offline performance that is almost as good as the flagship's.
The tech company claims that the model performs at a level close to the cloud-based Gemini Robotics model in benchmarks. Google also said it outperforms other on-device models in general benchmarks, though it didn't name those models.
"The Gemini Robotics hybrid model is still more powerful, but we're actually quite surprised at how strong this on-device model is. I would think about it as a starter model or as a model for applications that just have poor connectivity." -Carolina Parada, Head of Robotics at Google DeepMind
In a demo, the firm showed robots running the local model unzipping bags and folding clothes.
Google said that while the model was trained for ALOHA robots, it later adapted it to work on a bi-arm Franka FR3 robot and the Apollo humanoid robot by Apptronik. The tech company claims the bi-arm Franka FR3 successfully handled scenarios and objects it hadn't seen before, such as doing assembly on an industrial belt.
The firm said that developers can train robots on new tasks by showing them 50 to 100 demonstrations, using the models on the MuJoCo physics simulator.
Google DeepMind also announced a software development kit called the Gemini Robotics SDK. The company said the SDK provides the full lifecycle tooling needed to use Gemini Robotics models, including accessing checkpoints, serving a model, evaluating the model on the robot and in simulation, uploading data, and fine-tuning it.
The firm said that its on-device Gemini Robotics model and its SDK will be available to a group of trusted testers while Google continues to work toward minimizing safety risks.

Tech companies join the robotics race

Other companies that use AI models are also showing interest in robotics. Nvidia is building a platform to create foundation models for humanoids. The firm's CEO, Jensen Huang, noted that building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today.
Huang argued that the humanoid form factor is one of the most contested topics in robotics at the moment. He acknowledged that it is raising venture capital by the boatload while generating massive skepticism along the way.
Nvidia has also been championing robotic innovation through initiatives like Isaac and Jetson. In March last year, at its annual GTC developer conference, the company joined the humanoid race with Project GR00T. Nvidia described the new platform as a general-purpose foundation model for humanoid robots. The firm said GR00T will support new hardware from Nvidia as well.
Hugging Face is not only developing open models and datasets for robotics, but is also working on robots itself. Earlier this month, the firm released an open AI model for robotics called SmolVLA. The company claims the model is trained on community-shared datasets and outperforms much larger robotics models in both virtual and real-world environments.
Hugging Face also said that SmolVLA aims to democratize access to vision-language-action (VLA) models and accelerate research toward generalist robotic agents. Last year, the firm launched LeRobot, a collection of robotics-focused models, datasets, and tools. More recently, Hugging Face acquired Pollen Robotics, a robotics startup based in France, and unveiled several inexpensive robotics systems, including humanoids, for purchase.
