
Hyperplane Lab

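  The Hyperplane Lab is led by Hao Dong. Its research spans computer vision, intelligent robotics, embodied AI, and open-source AI software. Current research directions include generalizable robotic manipulation, navigation, and embodied large models. The goal is to build general-purpose embodied intelligence systems that are independent of robot model and capable of autonomous decision-making, in order to accelerate the broad adoption of robots. For more information, see: https://zsdonghao.github.io/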

 

Lab Members

 

Hao Dong

                                                                                   

Representative Work

 

1. Unsupervised Scene Understanding

  

  

Enabling machines to understand the world in a human-like way is an important topic in AI and cognitive science. An interactive physical scene contains a controllable agent and other objects in the environment. We model and generate the interaction between the agent and the environment without action labels, which enables action-based visual forecasting: a human-like ability to "understand and imagine" interactive scenes.

Specifically, humans are good at modelling the real-world physical environment from visual observations. How an interactive scene changes depends on the agent's actions and on the interaction between the agent and the environment. By observing an agent interacting with the environment, humans develop a sense of the rules governing that interaction, and thus acquire the ability to control the agent and to predict future events, which helps them make accurate decisions and act on them.

For example, one of our studies addresses environment modelling. Although existing methods achieve impressive results when trained with action supervision, modelling an environment from visual observations alone remains challenging. To learn from unlabelled data as humans do, we propose an end-to-end method that learns to identify the agent within a deterministic environment using only visual observations (videos) as training data, and that models the agent's actions without supervision by clustering them into discrete actions. Then, for each action, a single demonstration showing the transition caused by that action is used to model the interaction between the agent and the environment. The result is accurate video prediction (future scene prediction) conditioned on the input action.
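The idea of grouping observed transitions into discrete actions, without any action labels, can be illustrated with a toy example. The sketch below is not the lab's method: it assumes the agent's position has already been extracted from each frame, assumes four possible actions, and uses a plain k-means with hand-picked initial centers. All names and values (`positions`, `initial_centers`, `kmeans`) are hypothetical.

```python
# Illustrative sketch only: cluster frame-to-frame agent displacements into
# discrete "actions" with a tiny k-means. No action labels are used.

def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, centers, iterations=10):
    """Plain k-means on 2-D points, starting from the given centers."""
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Move every center to the mean of the points assigned to it.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(centers[k])  # leave an empty cluster in place
        centers = new_centers
    return labels, centers

# Hypothetical agent positions (x, y), one per consecutive video frame.
positions = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# Displacement of the agent between consecutive frames.
transitions = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(positions, positions[1:])]

# Assumption: four actions, with rough initial guesses for their displacements.
initial_centers = [(0.9, 0.1), (-0.9, 0.1), (0.1, 0.9), (0.1, -0.9)]

labels, centers = kmeans(transitions, initial_centers)
print(labels)   # cluster index (discovered action) for each transition
print(centers)  # mean displacement associated with each discovered action
```

Each cluster center then acts as a stand-in for one discrete action, which a predictive model could take as input when forecasting the next frame.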

 

2. Generative Models and Computer Vision

 

  

 

Controllable data generation is important to both computer vision and machine learning. We study how to accomplish generative tasks, such as image-to-image translation and language-based image manipulation, with less supervision (only a small number of labels). This research matters for both domain adaptation and generative models.

For example, one of our studies disentangles the semantic information shared by two modalities (image and text) and synthesizes new images from the combined semantics, without needing to know the target image. Another study proposes a replacement for the cycle loss and successfully performs image translation that involves geometric changes, removal of large objects, or ignoring irrelevant textures. These methods are relevant to domain adaptation and Sim2Real.
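For readers unfamiliar with the cycle loss mentioned above, the following minimal sketch shows what the standard cycle-consistency loss (as used in CycleGAN-style image translation) computes. It is the baseline that the study replaces, not the lab's proposed method. The toy "translators" `g`, `f_exact`, and `f_rough` and the two-pixel "images" are hypothetical stand-ins for neural networks and real images.

```python
# Illustrative sketch only: the standard cycle-consistency loss,
# L_cyc = mean |F(G(x)) - x| + mean |G(F(y)) - y|.

def l1_mean(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def cycle_consistency_loss(g, f, x_batch, y_batch):
    """Round-trip reconstruction error for translators g: X -> Y and f: Y -> X."""
    forward = sum(l1_mean(f(g(x)), x) for x in x_batch) / len(x_batch)
    backward = sum(l1_mean(g(f(y)), y) for y in y_batch) / len(y_batch)
    return forward + backward

def g(x):
    """Toy translator from domain X to domain Y."""
    return [2.0 * v for v in x]

def f_exact(y):
    """Toy translator from Y back to X that inverts g exactly."""
    return [0.5 * v for v in y]

def f_rough(y):
    """Toy translator from Y back to X that inverts g only approximately."""
    return [0.5 * v + 0.25 for v in y]

x_batch = [[0.0, 1.0], [2.0, 3.0]]  # hypothetical flattened images from X
y_batch = [[0.0, 2.0], [4.0, 6.0]]  # hypothetical flattened images from Y

print(cycle_consistency_loss(g, f_exact, x_batch, y_batch))  # 0.0
print(cycle_consistency_loss(g, f_rough, x_batch, y_batch))  # 0.75
```

Because this loss rewards translations that can be undone pixel by pixel, it tends to discourage large shape changes or object removal, which is one motivation for seeking an alternative.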

 

3. Small-Data Healthcare

 

  

 

  

 

 

In health care, we study how to use less data and low-cost sensors to achieve better diagnostic performance and more functionality. Our research reduces MRI scanning time and the amount of data required for tumour segmentation and sleep scoring. Specifically, we design new algorithms for MRI reconstruction and for MRI tumour segmentation. Beyond medical imaging, we also work on other data modalities. For example, an earlier line of work designed a new flexible electrode material for recording EEG inside the ear canal, together with a single-channel EEG analysis algorithm for low-cost sleep stage scoring and monitoring.
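One common way to shorten an MRI scan is to acquire fewer frequency-domain (k-space) samples, which is why the reconstruction algorithm matters. The sketch below illustrates the problem on a 1-D toy signal using a naive zero-filled reconstruction. It is not the lab's algorithm, and the signal, the sampling pattern, and the function names are all hypothetical.

```python
# Illustrative sketch only: undersampling k-space and reconstructing by
# filling the missing samples with zeros (a naive baseline).
import cmath

def dft(signal):
    """Discrete Fourier transform: image domain -> k-space."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse discrete Fourier transform: k-space -> image domain."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def zero_filled_recon(signal, keep):
    """Keep only the k-space indices in `keep`, zero the rest, and invert."""
    kspace = dft(signal)
    sampled = [v if k in keep else 0 for k, v in enumerate(kspace)]
    return [v.real for v in idft(sampled)]

def max_abs_error(a, b):
    """Largest absolute difference between two signals."""
    return max(abs(u - v) for u, v in zip(a, b))

signal = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical 1-D "image"

# Full acquisition: all 8 k-space samples.
full = zero_filled_recon(signal, keep=set(range(8)))
# Faster acquisition: only the 3 lowest-frequency samples (indices 0, 1, 7).
low = zero_filled_recon(signal, keep={0, 1, 7})

print(max_abs_error(full, signal))  # numerically close to zero
print(max_abs_error(low, signal))   # visibly larger: detail has been lost
```

Closing the gap between the undersampled and fully sampled results, without acquiring the missing samples, is the job of a learned or model-based reconstruction method.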