
[FEATURE] remote policy with server/client#394

Merged
xyao-nv merged 9 commits into main from socket_for_policy
Feb 26, 2026
Conversation

@kanghui0204
Collaborator

Summary

The old MR is MR 310.

Hi Team,

I added a remote server/client mechanism for IsaacLab-Arena that allows the IsaacLab environment and the policy model environment to run separately. Alex and I wrote a design document (design doc) for this feature, which includes our discussions about the design. Below is an overview of this PR.

Functionality

Previously, the IsaacLab-Arena pipeline ran entirely in a single process:

IsaacLab env → Env policy class (e.g., Gr00tClosedloopPolicy) → local policy model (e.g., Gr00tPolicy), all in one environment.

This PR adds remote policy/server–client support and decouples the env policy class from the local policy model. The new pipeline becomes:

Client: IsaacLab env → Env policy class (e.g., ActionChunkingClientPolicy)

Server: VLA/VLN policy model (e.g., Gr00tRemoteServerSidePolicy)

The client and server exchange observations and actions via sockets.

The server/client implementation lives in isaaclab_arena/remote_policy inside IsaacLab-Arena. Users can copy this directory and import it on the server side directly, without needing to install it as a separate package.
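The PR's actual wire format isn't reproduced here, but the obs-in/actions-out round trip over a socket can be sketched roughly as follows. All names and the length-prefixed pickle framing are illustrative assumptions, not the PR's protocol:

```python
import pickle
import socket
import struct
import threading

def send_msg(sock, obj):
    # Length-prefix each pickled message so the receiver knows where it ends.
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_msg(sock):
    # Read the 4-byte length header, then the full payload.
    header = b""
    while len(header) < 4:
        header += sock.recv(4 - len(header))
    (length,) = struct.unpack("!I", header)
    buf = b""
    while len(buf) < length:
        buf += sock.recv(length - len(buf))
    return pickle.loads(buf)

def serve_one_request(server_sock, policy_fn):
    # Server side: receive one observation, reply with the policy's action.
    conn, _ = server_sock.accept()
    with conn:
        obs = recv_msg(conn)
        send_msg(conn, policy_fn(obs))

# Demo with a trivial stand-in "policy" that negates the observation.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # pick an ephemeral port
server.listen(1)
port = server.getsockname()[1]
t = threading.Thread(target=serve_one_request,
                     args=(server, lambda obs: [-x for x in obs]))
t.start()

client = socket.create_connection(("127.0.0.1", port))
send_msg(client, [1.0, 2.0, 3.0])   # client sends an observation...
action = recv_msg(client)           # ...and receives an action back
client.close()
t.join()
server.close()
```

In the real pipeline the client side would live in the env policy class and the server side would wrap the VLA/VLN model; only the framing idea carries over.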

@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 5 times, most recently from dba65cc to aaf15f2 Compare February 9, 2026 15:27
@kanghui0204
Collaborator Author

Hi @alexmillane @xyao-nv, could you please review the latest code? I've tested it repeatedly and also added some unit tests and documentation. If everything looks good to you, I think we can merge it now: the Chinese New Year holidays are coming up, and if we wait it might have to be postponed until March.

@kanghui0204 kanghui0204 requested review from alexmillane and xyao-nv and removed request for alexmillane February 9, 2026 15:52
@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 2 times, most recently from 5942d26 to a4a975f Compare February 10, 2026 14:26

@alexmillane alexmillane left a comment


A couple of small comments. Feel free to address them and merge.

One thing that this would benefit from is a design diagram. There are a few classes that have similar names. For example, policy_server.py, server_side_policy.py, etc. A diagram would help understand which role each of these classes plays.

Comment thread docker/setup/install_gr00t_deps.sh Outdated
$PYTHON_CMD -m pip install --no-deps --ignore-requires-python \
-e ${WORKDIR}/submodules/Isaac-GR00T/

# Install GR00T main dependencies (part 1, with build isolation)
Collaborator

with build isolation -> without build isolation

Collaborator Author

done

Comment thread docker/setup/install_gr00t_deps.sh Outdated
Comment on lines +33 to +37
if [ -d "/usr/local/cuda" ]; then
export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi
Collaborator

Given that this script is used by our docker, is there a situation where we end up without CUDA? In other words, is there a reason we have this if condition?

Collaborator Author

Removed.

Comment thread docker/setup/install_gr00t_deps.sh Outdated
Comment on lines +31 to +51
if [[ "$USE_SERVER_ENV" -eq 1 ]]; then
# Server environment: use system CUDA if available
if [ -d "/usr/local/cuda" ]; then
export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi
echo "[SERVER] CUDA_HOME=${CUDA_HOME:-unset}"
echo "[SERVER] PATH=$PATH"
echo "[SERVER] LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"
else
# Isaac Sim Docker: fixed CUDA 12.8 toolchain
export CUDA_HOME=/usr/local/cuda-12.8
export PATH=/usr/local/cuda-12.8/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX
echo "[ISAACSIM] CUDA_HOME=$CUDA_HOME"
echo "[ISAACSIM] PATH=$PATH"
echo "[ISAACSIM] LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "[ISAACSIM] TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
fi
Collaborator

I believe this whole conditional can be avoided by setting

export CUDA_HOME=/usr/local/cuda
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX

Both dockers, pytorch and isaac_arena_gr00t, have CUDA installed at /usr/local/cuda. In pytorch this folder is a symlink to /usr/local/cuda-12.5; in isaac_arena_gr00t it is a symlink to /usr/local/cuda-12.8.

Collaborator Author

done

Comment on lines +35 to +38
In practice, you implement a ``ServerSidePolicy`` in the remote
environment and a matching ``ClientSidePolicy`` inside Isaac Lab Arena.
As long as they agree on an ``ActionProtocol``, the environments and
evaluation scripts can remain unchanged.
Collaborator

I'm wondering if we can reword this. It makes it sound like we need both a ServerSidePolicy and a ClientSidePolicy per model. If I'm not mistaken, the situation is much better than that: we need a ServerSidePolicy per model, but the ClientSidePolicy is policy-independent. For example, the ActionChunkingClientSidePolicy is, in my opinion, likely to work for many existing VLAs. For those compatible policies, no further implementation is needed on the client side.
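To illustrate this asymmetry under assumed names (none of these signatures are taken from the PR): the server side carries the model-specific code, while a single chunk-consuming client can front any such server.

```python
from abc import ABC, abstractmethod

class ServerSidePolicy(ABC):
    """One subclass per model: wraps a specific VLA's inference."""

    @abstractmethod
    def get_action_chunk(self, observation: dict) -> list:
        ...

class FakeVLAServerSidePolicy(ServerSidePolicy):
    """Stand-in for a model-specific server policy (e.g. a GR00T wrapper)."""

    def get_action_chunk(self, observation: dict) -> list:
        # A real implementation would run model inference here.
        return [observation["state"] + i for i in range(4)]

class ActionChunkingClientSidePolicy:
    """Policy-independent client: send an observation, drain the chunk."""

    def __init__(self, transport):
        self.transport = transport  # anything with .request(obs) -> list
        self._chunk = []

    def get_action(self, observation: dict):
        if not self._chunk:  # chunk exhausted: query the server again
            self._chunk = list(self.transport.request(observation))
        return self._chunk.pop(0)

class InProcessTransport:
    """Test double standing in for the socket transport."""

    def __init__(self, server_policy):
        self.server_policy = server_policy

    def request(self, observation):
        return self.server_policy.get_action_chunk(observation)

client = ActionChunkingClientSidePolicy(
    InProcessTransport(FakeVLAServerSidePolicy()))
actions = [client.get_action({"state": 10}) for _ in range(5)]
```

Swapping in a different VLA only means writing another ServerSidePolicy subclass; the client and transport stay as they are.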

Collaborator Author

I changed the doc to:

"In this configuration, the environment and evaluation logic run inside
Isaac Lab Arena, while GR00T inference runs in the separate server
process, connected through the remote policy interface. The client-side
policy only depends on Isaac Lab Arena; it does not require GR00T code
or GR00T-specific Python dependencies to be installed in the Arena
container."

What do you think?

Collaborator

Thanks for adding such clear and comprehensive docs!

Collaborator

What do you think about adding a diagram of the design. I think that this would go a long way to explaining the structure.

Collaborator Author

Added a diagram.

Comment on lines +164 to +167
If the models directory was already configured in previous steps (for
example via the ``MODELS_DIR`` environment variable), the ``-m`` flag
can be omitted. Otherwise, use ``-m`` to point to the directory that
contains the GR00T model files on the host.
Collaborator

As above, you can omit this.

Collaborator Author

done

Comment on lines +175 to +190
python isaaclab_arena/evaluation/policy_runner.py \
--policy_type isaaclab_arena.policy.action_chunking_client.ActionChunkingClientSidePolicy \
--remote_host 127.0.0.1 \
--remote_port 5555 \
--num_steps 1500 \
--num_envs 5 \
--enable_cameras \
--device cpu \
--remote_kill_on_exit \
galileo_g1_locomanip_pick_and_place \
--object brown_box \
--embodiment g1_wbc_joint

In this configuration, the environment and evaluation logic run inside
Isaac Lab Arena, while GR00T inference runs in the separate server
process, connected through the remote policy interface.
Collaborator

Very cool!!

Out of interest, I presume that this policy (client) can be run without the gr00t docker?

Collaborator Author

Yes, it runs without the gr00t docker: you can build the docker with bash docker/run_docker.sh without -g.

Comment on lines +34 to +35
if protocol_cls is None:
raise ValueError("protocol_cls is required.")
Collaborator

You can omit this check. The constructor requires it to be passed, so checking that the user hasn't passed None is superfluous.

Collaborator Author

done

Comment on lines +66 to +68
# Mode-specific field.
action_chunk_length: int = 0
action_horizon: int = 0
Collaborator

Could we describe, in a comment, the difference between these two variables? They're quite similar. I'm not sure I understand the difference 😅

Collaborator

Could we omit the defaults (= 0) to ensure that a user passes values?

Collaborator Author

My understanding is that action_horizon is the length of the action
sequence produced by the model for one forward pass, while
action_chunk_length is the length that the client-side post-processing
actually consumes from each chunk at a time. In the current GN1.6
configuration these two values are the same, but GN1.6 has already
split them conceptually and we expect future use cases with asynchronous
execution where the model’s planning horizon and the client’s chunk size
may differ. I’ve updated the comments accordingly and removed the = 0
defaults so both fields have to be provided explicitly.
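The distinction can be made concrete with a toy loop (run_chunked and the lambda model are invented for illustration, not code from the PR):

```python
def run_chunked(model_fn, observations, action_horizon, action_chunk_length):
    """Execute action_chunk_length steps out of each action_horizon-step plan.

    action_horizon: how many steps the model predicts per forward pass.
    action_chunk_length: how many of those the client actually executes
    before re-querying the model (<= action_horizon).
    """
    executed = []
    for obs in observations:
        plan = model_fn(obs, action_horizon)          # len(plan) == horizon
        executed.extend(plan[:action_chunk_length])   # consume a prefix only
    return executed

# Hypothetical model: from state `obs`, plan actions obs, obs+1, obs+2, ...
model = lambda obs, horizon: [obs + i for i in range(horizon)]

# Current GN1.6-style setting: the two values coincide.
same = run_chunked(model, [0, 16],
                   action_horizon=16, action_chunk_length=16)

# Re-plan twice as often: execute only the first 8 steps of each 16-step plan.
faster = run_chunked(model, [0, 8, 16, 24],
                     action_horizon=16, action_chunk_length=8)
```

Both runs execute the same 32 actions here, but the second re-queries the model four times instead of twice, which is where an asynchronous setup could diverge.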

Collaborator

What is the difference here between gr00t_remote_policy and gr00t_closed_loop_policy?

Perhaps naively, I wouldn't expect any difference in the gr00t policy whether it's running on a server or running locally.

What are the differences?

Collaborator Author

Functionally, Gr00tClosedLoopPolicy and the combination
Gr00tRemoteServerSidePolicy + ActionChunkingClientSidePolicy are
equivalent:

@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 3 times, most recently from d2057c6 to 5a7501b Compare February 13, 2026 13:15
- The **Base** Isaac Lab Arena container, started via
``docker/run_docker.sh``. This container does not need to install
GR00T when you run policies in remote mode.
- A separate **GR00T policy server** container, started via
Collaborator

Mark it as optional. Add gr00t with a docker option.

Collaborator Author

I have some changes for this

# - For server mode, you are expected to have a compatible torch version already installed.

echo "Installing flash-attn 2.7.4.post1..."
$PYTHON_CMD -m pip install --no-build-isolation --use-pep517 flash-attn==2.7.4.post1
Collaborator

Only $PYTHON_CMD is changed here.

Collaborator Author

Thanks for the review! You're right that $PYTHON_CMD is the primary difference for the Python install commands. However, the --server flag also controls a few other things beyond that:

• $SUDO: the server environment may not be running as root, so we need sudo for apt-get. Inside the Isaac Sim Docker container we're always root, so sudo is not needed.
• Environment finalization (lines 115-124): the torchrun PATH fix and typing_extensions cleanup are Isaac Sim-specific and should be skipped in server mode.

The CUDA setup has already been simplified per Alex's suggestion: both environments now use /usr/local/cuda (a symlink), removing the previous conditional block.



def build_gr00t_policy_inputs_np(
rgb_np: np.ndarray, # (N, H, W, C)
Collaborator

I don't like this. Multi-RGB inputs are expected.

Collaborator

I don't like all those standalone function calls. Need to think about it.

Collaborator Author

Thanks for the feedback! I actually think the current standalone-function approach works better than a class-based one for our use case. The reason is that the remote policy and the local policy each use a different subset of these functions — if we bundle them into a single class, that class would end up being partially used by each side, which feels awkward and violates the single-responsibility principle.

Standalone functions are also stateless and easier to compose — each caller picks exactly the functions it needs without carrying unused baggage.

That said, I'm open to revisiting this in a follow-up PR. In particular, once we have a clearer picture of whether the local (single-container) path is still needed long-term, we can reconsider the abstraction. Happy to discuss further!
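As a toy illustration of that composition argument (these helpers are invented for the example, not the PR's actual functions), each caller imports only the stateless pieces it needs:

```python
import numpy as np

# Standalone, stateless helpers; no shared class state to drag along.
def normalize_rgb(rgb: np.ndarray) -> np.ndarray:
    """uint8 HWC image -> float32 in [0, 1]."""
    return rgb.astype(np.float32) / 255.0

def flatten_state(state: dict) -> np.ndarray:
    """Concatenate named state fields in a fixed (sorted) key order."""
    return np.concatenate(
        [np.asarray(state[k], dtype=np.float32) for k in sorted(state)])

def pack_observation(rgb: np.ndarray, state: dict) -> dict:
    """Compose the two helpers into one observation payload."""
    return {"rgb": normalize_rgb(rgb), "state": flatten_state(state)}

# A remote client might call only pack_observation, while a local policy
# could call normalize_rgb directly on batched frames; neither carries
# unused methods from a shared class.
obs = pack_observation(np.zeros((4, 4, 3), dtype=np.uint8),
                       {"arm": [0.1, 0.2], "gripper": [1.0]})
```

Bundling these into one class would force each side to instantiate something it only half uses, which is the single-responsibility concern above.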

Collaborator

Sorry for leaving my self-review todos on it. The local path is still needed for the v0.2 release at least, and I'm assuming we will have both options available for some time.

camera_name = self.policy_config.pov_cam_name_sim

# 1) Shared numpy-based preprocessing
policy_observations = self._build_policy_observations(observation, camera_name)
Collaborator

Client-server is numpy; one-docker is torch. We need an explicit data conversion function.

Collaborator Author

I have some changes for this

@xyao-nv
Collaborator

xyao-nv commented Feb 25, 2026

Changes I applied so far:

1. Refactored the policy core and action-chunking functions so that they consume the same set of base functions.
2. Fixed bugs in the inference pipeline regarding chunking reset logic and task description setting in remote mode.
3. Verified locally on our 3 examples, achieving similar metrics between remote & local policy inference.
4. Refactored obs/action packing logic so that data conversion between numpy/torch is modular and reusable by others.

In general, this gives remote & local a single source of truth for the GR00T policy inference modules, which makes maintenance and bug fixes easier.

@kanghui0204 Up to this point, I'm done with the code refactoring; once CI passes, I'll merge it. Thanks for your contribution!
Also thanks to @alexmillane for the great reviews!

@xyao-nv xyao-nv merged commit 1f62bd6 into main Feb 26, 2026
7 of 10 checks passed