2025-08-09

Text Conditioned Physics-based Character Control

Introduction

In this blog, we try to train a text-conditioned robot control policy. It extends a naive adversarial imitation policy by supporting user text instructions to generate motions with different styles. Specifically, we integrate a BERT language model into an adversarial imitation RL pipeline.

…

Read more ⟶

2025-07-06

Manipulator Demo: Smooth action chunck sequence generation with predictive flow matching denoising decoder for VLA

Introduction

VLA (Vision-Language-Action) models have shown promising results for robots to perform tasks in complex environments based on visual input and textual instructions. Specifically, chunk-based action generation is crucial for its long-horizon planning and short-term per-chunk smooth motion. However, naive chunk-by-chunk action execution strategy may result in jerky robot motion when switching chunks. In this blog, a predictive flow-matching-based action decoder design is proposed to generate smooth action sequences, which forms a key component of our lightweight VLA variant implementation.

**Franka Panda Emika Robot**
*(RoboGroove©)*

Task: turn on the stove and put the moka pot on it

…

Read more ⟶

2025-06-09

Humanoid Demo: Experiments with Unitree G1

Introduction

In this blog, we will conduct some experiments on Unitree G1. Specifically, using RL + IL and Motion Retargeting from the CMU MoCap dataset, we will attempt to teach Unitree G1 to run, jump, ...

…

Read more ⟶

2025-03-01

Humanoid Demo: Game2D-to-Sim3D Cross Domain Skill Adaptation From CPG Expert Demonstrations

Introduction

In previous blog posts, we have shown successful applications of CPG-based RL for robot locomotion in both 2D (game) and 3D (world) physical simulation environments. While this approach offers the flexibility to adjust parameters for walking gait on-the-fly, it necessitates heavy reward tuning and prolonged training time. On the other hand, imitation learning from human demonstrations has shown more rapid convergence for natural gait cloning. Here we will try to combine these two techniques and explore the feasibility of using a 2D CPG expert to guide a 3D humanoid robot in learning to walk.

BipedalWalker: Finally has learned to walk~

Unitree/H1-2: Done copying!

…

Read more ⟶

2025-01-26

Humanoid Demo: Porting CPG to Unitree H1

Introduction

In this blog, we will attempt to adapt CPG + RL, which was successfully applied to the 2D Gym BipedalWalker, to a real 3D bipedal humanoid model: the Unitree H1-2.

…

Read more ⟶

2024-12-06

BipedalWalker Demo: Experiments with CPG-based RL

Background

In the previous blog, we provided a brief introduction to central pattern generators (CPGs) and how they can be used as quadrupedal locomotion controllers. CPG-based methods offer advantages such as parameterizable gait behavior generation, dynamic motion pattern adjustment on the fly, etc. Therefore it is interesting to see how this method performs when applied to bipedal robots.

…

Read more ⟶

2024-11-30

Accelerating Simulation of Stable Baselines3's VecEnv in Multi-Core Processor

Background

Stable-baselines3 (SB3) is a popular library for rapid prototype development of RL algorithm, it ships with many common model-free algorithms out of the box. SB3 requires vectorized environment VecEnv as input interface to its built-in RL algorithms for allowing training on multiple environments, it provides two class DummyVecEnv and SubprocVecEnv. However both DummyVecEnv and SubprocVecEnv has obvious drawbacks for leveraging the capacity of multi-core processor for simulating large number of environments. In this blog we implement a new PoolVecEnv which draws inspiration from ThreadPool/ProcessPool for accelerating large number of environments simulation. Modified Code will be available on Github later

…

Read more ⟶