Vision-language foundation models for robotic manipulation
Title: Vision-language foundation models for robotic manipulation
DNr: Berzelius-2024-124
Project Type: LiU Berzelius
Principal Investigator: Sichao Liu <sicliu@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-03-20 – 2024-10-01
Classification: 10207
Homepage: https://www.kth.se/
Keywords:

Abstract

The main task of this project is to use state-of-the-art foundation models, including large language models and vision-language models, to conduct research toward general-purpose robotic manipulation, with a focus on autonomous mobile robot systems. The project first fine-tunes robotics foundation models on datasets that include text, images, and videos. Both the fine-tuning and the algorithms and approaches that we plan to develop or apply rely on GPUs. The project combines robotics, vision, and AI research.