
[FEATURE] remote policy with server/client#394

Merged
xyao-nv merged 9 commits into main from socket_for_policy
Feb 26, 2026
Conversation

@kanghui0204
Collaborator

Summary

The old MR is MR 310.

Hi Team,

I added a remote server/client mechanism for IsaacLab-Arena that allows the IsaacLab environment and the policy model environment to run separately. Alex and I wrote a design document (design doc) for this feature, which includes our discussions about the design. Below is an overview of this PR.

Functionality

Previously, the IsaacLab-Arena pipeline ran entirely in a single process:

IsaacLab env → Env policy class (e.g., Gr00tClosedloopPolicy) → local policy model (e.g., Gr00tPolicy), all in one environment.

This PR adds remote policy/server–client support and decouples the env policy class from the local policy model. The new pipeline becomes:

Client: IsaacLab env → Env policy class (e.g., ActionChunkingClientPolicy)

Server: VLA/VLN policy model (e.g., Gr00tRemoteServerSidePolicy)

The client and server exchange observations and actions via sockets.

The server/client implementation lives in isaaclab_arena/remote_policy inside IsaacLab-Arena. Users can copy this directory and import it on the server side directly, without needing to install it as a separate package.
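The PR's actual wire format isn't reproduced here, but the obs-in/actions-out round trip over a socket can be sketched roughly as follows. All names and the length-prefixed pickle framing are illustrative assumptions, not the PR's protocol:

```python
import pickle
import socket
import struct
import threading

def send_msg(sock, obj):
    # Length-prefix each pickled message so the receiver knows where it ends.
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_msg(sock):
    # Read the 4-byte length header, then the full payload.
    header = b""
    while len(header) < 4:
        header += sock.recv(4 - len(header))
    (length,) = struct.unpack("!I", header)
    buf = b""
    while len(buf) < length:
        buf += sock.recv(length - len(buf))
    return pickle.loads(buf)

def serve_one_request(server_sock, policy_fn):
    # Server side: receive one observation, reply with the policy's action.
    conn, _ = server_sock.accept()
    with conn:
        obs = recv_msg(conn)
        send_msg(conn, policy_fn(obs))

# Demo with a trivial stand-in "policy" that negates the observation.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # pick an ephemeral port
server.listen(1)
port = server.getsockname()[1]
t = threading.Thread(target=serve_one_request,
                     args=(server, lambda obs: [-x for x in obs]))
t.start()

client = socket.create_connection(("127.0.0.1", port))
send_msg(client, [1.0, 2.0, 3.0])   # client sends an observation...
action = recv_msg(client)           # ...and receives an action back
client.close()
t.join()
server.close()
```

In the real pipeline the client side would live in the env policy class and the server side would wrap the VLA/VLN model; only the framing idea carries over.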

@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 5 times, most recently from dba65cc to aaf15f2 Compare February 9, 2026 15:27
@kanghui0204
Collaborator Author

Hi @alexmillane @xyao-nv, could you please review the latest code? I've tested it repeatedly and also added some unit tests and documentation. If everything looks good to you, I think we can merge it now: the Chinese New Year holidays are coming up, and if we wait it might have to be postponed until March.

@kanghui0204 kanghui0204 requested review from alexmillane and xyao-nv and removed request for alexmillane February 9, 2026 15:52
@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 2 times, most recently from 5942d26 to a4a975f Compare February 10, 2026 14:26

@alexmillane alexmillane left a comment


A couple of small comments. Feel free to address them and merge.

One thing that this would benefit from is a design diagram. There are a few classes that have similar names. For example, policy_server.py, server_side_policy.py, etc. A diagram would help understand which role each of these classes plays.

Comment thread docker/setup/install_gr00t_deps.sh Outdated
$PYTHON_CMD -m pip install --no-deps --ignore-requires-python \
-e ${WORKDIR}/submodules/Isaac-GR00T/

# Install GR00T main dependencies (part 1, with build isolation)
Collaborator

with build isolation -> without build isolation

Collaborator Author

done

Comment thread docker/setup/install_gr00t_deps.sh Outdated
Comment on lines +33 to +37
if [ -d "/usr/local/cuda" ]; then
export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi
Collaborator

Given that this script is used by our docker, is there a situation where we end up without CUDA? In other words, is there a reason we have this if condition?

Collaborator Author

Removed.

Comment thread docker/setup/install_gr00t_deps.sh Outdated
Comment on lines +31 to +51
if [[ "$USE_SERVER_ENV" -eq 1 ]]; then
# Server environment: use system CUDA if available
if [ -d "/usr/local/cuda" ]; then
export CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
fi
echo "[SERVER] CUDA_HOME=${CUDA_HOME:-unset}"
echo "[SERVER] PATH=$PATH"
echo "[SERVER] LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-unset}"
else
# Isaac Sim Docker: fixed CUDA 12.8 toolchain
export CUDA_HOME=/usr/local/cuda-12.8
export PATH=/usr/local/cuda-12.8/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX
echo "[ISAACSIM] CUDA_HOME=$CUDA_HOME"
echo "[ISAACSIM] PATH=$PATH"
echo "[ISAACSIM] LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
echo "[ISAACSIM] TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
fi
Collaborator

I believe this whole conditional can be avoided by setting

export CUDA_HOME=/usr/local/cuda
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}
export TORCH_CUDA_ARCH_LIST=8.0+PTX

Both dockers, pytorch and isaac_arena_gr00t, have CUDA installed at /usr/local/cuda. In pytorch this folder is a symlink to /usr/local/cuda-12.5; in isaac_arena_gr00t it is a symlink to /usr/local/cuda-12.8.

Collaborator Author

done

Comment on lines +35 to +38
In practice, you implement a ``ServerSidePolicy`` in the remote
environment and a matching ``ClientSidePolicy`` inside Isaac Lab Arena.
As long as they agree on an ``ActionProtocol``, the environments and
evaluation scripts can remain unchanged.
Collaborator

I'm wondering if we can reword this. It makes it sound like we need both a ServerSidePolicy and a ClientSidePolicy per model. If I'm not mistaken, the situation is much better than that: we need a ServerSidePolicy per model, but the ClientSidePolicy is policy-independent. For example, the ActionChunkingClientSidePolicy is, in my opinion, likely to work for many existing VLAs. For those compatible policies, no further implementation is needed on the client side.
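To illustrate this asymmetry under assumed names (none of these signatures are taken from the PR): the server side carries the model-specific code, while a single chunk-consuming client can front any such server.

```python
from abc import ABC, abstractmethod

class ServerSidePolicy(ABC):
    """One subclass per model: wraps a specific VLA's inference."""

    @abstractmethod
    def get_action_chunk(self, observation: dict) -> list:
        ...

class FakeVLAServerSidePolicy(ServerSidePolicy):
    """Stand-in for a model-specific server policy (e.g. a GR00T wrapper)."""

    def get_action_chunk(self, observation: dict) -> list:
        # A real implementation would run model inference here.
        return [observation["state"] + i for i in range(4)]

class ActionChunkingClientSidePolicy:
    """Policy-independent client: send an observation, drain the chunk."""

    def __init__(self, transport):
        self.transport = transport  # anything with .request(obs) -> list
        self._chunk = []

    def get_action(self, observation: dict):
        if not self._chunk:  # chunk exhausted: query the server again
            self._chunk = list(self.transport.request(observation))
        return self._chunk.pop(0)

class InProcessTransport:
    """Test double standing in for the socket transport."""

    def __init__(self, server_policy):
        self.server_policy = server_policy

    def request(self, observation):
        return self.server_policy.get_action_chunk(observation)

client = ActionChunkingClientSidePolicy(
    InProcessTransport(FakeVLAServerSidePolicy()))
actions = [client.get_action({"state": 10}) for _ in range(5)]
```

Swapping in a different VLA only means writing another ServerSidePolicy subclass; the client and transport stay as they are.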

Collaborator Author

I changed the doc to:

"In this configuration, the environment and evaluation logic run inside
Isaac Lab Arena, while GR00T inference runs in the separate server
process, connected through the remote policy interface. The client-side
policy only depends on Isaac Lab Arena; it does not require GR00T code
or GR00T-specific Python dependencies to be installed in the Arena
container."

What do you think?

Collaborator

Thanks for adding such clear and comprehensive docs!

Collaborator

What do you think about adding a diagram of the design. I think that this would go a long way to explaining the structure.

Collaborator Author

Added a diagram.

Comment on lines +164 to +167
If the models directory was already configured in previous steps (for
example via the ``MODELS_DIR`` environment variable), the ``-m`` flag
can be omitted. Otherwise, use ``-m`` to point to the directory that
contains the GR00T model files on the host.
Collaborator

As above, you can omit this.

Collaborator Author

done

Comment on lines +175 to +190
python isaaclab_arena/evaluation/policy_runner.py \
--policy_type isaaclab_arena.policy.action_chunking_client.ActionChunkingClientSidePolicy \
--remote_host 127.0.0.1 \
--remote_port 5555 \
--num_steps 1500 \
--num_envs 5 \
--enable_cameras \
--device cpu \
--remote_kill_on_exit \
galileo_g1_locomanip_pick_and_place \
--object brown_box \
--embodiment g1_wbc_joint

In this configuration, the environment and evaluation logic run inside
Isaac Lab Arena, while GR00T inference runs in the separate server
process, connected through the remote policy interface.
Collaborator

Very cool!!

Out of interest, I presume that this policy (client) can be run without the gr00t docker?

Collaborator Author

Yes, it runs without the gr00t docker: you can build the docker with bash docker/run_docker.sh without -g.

Comment on lines +34 to +35
if protocol_cls is None:
raise ValueError("protocol_cls is required.")
Collaborator

You can omit this check. The constructor requires it to be passed, so checking that the user hasn't passed None is superfluous.

Collaborator Author

done

Comment on lines +66 to +68
# Mode-specific field.
action_chunk_length: int = 0
action_horizon: int = 0
Collaborator

Could we describe, in a comment, the difference between these two variables? They're quite similar. I'm not sure I understand the difference 😅

Collaborator

Could we omit the defaults (= 0) to ensure that a user passes values?

Collaborator Author

My understanding is that action_horizon is the length of the action
sequence produced by the model for one forward pass, while
action_chunk_length is the length that the client-side post-processing
actually consumes from each chunk at a time. In the current GN1.6
configuration these two values are the same, but GN1.6 has already
split them conceptually and we expect future use cases with asynchronous
execution where the model’s planning horizon and the client’s chunk size
may differ. I’ve updated the comments accordingly and removed the = 0
defaults so both fields have to be provided explicitly.
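The distinction can be made concrete with a toy loop (run_chunked and the lambda model are invented for illustration, not code from the PR):

```python
def run_chunked(model_fn, observations, action_horizon, action_chunk_length):
    """Execute action_chunk_length steps out of each action_horizon-step plan.

    action_horizon: how many steps the model predicts per forward pass.
    action_chunk_length: how many of those the client actually executes
    before re-querying the model (<= action_horizon).
    """
    executed = []
    for obs in observations:
        plan = model_fn(obs, action_horizon)          # len(plan) == horizon
        executed.extend(plan[:action_chunk_length])   # consume a prefix only
    return executed

# Hypothetical model: from state `obs`, plan actions obs, obs+1, obs+2, ...
model = lambda obs, horizon: [obs + i for i in range(horizon)]

# Current GN1.6-style setting: the two values coincide.
same = run_chunked(model, [0, 16],
                   action_horizon=16, action_chunk_length=16)

# Re-plan twice as often: execute only the first 8 steps of each 16-step plan.
faster = run_chunked(model, [0, 8, 16, 24],
                     action_horizon=16, action_chunk_length=8)
```

Both runs execute the same 32 actions here, but the second re-queries the model four times instead of twice, which is where an asynchronous setup could diverge.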

Collaborator

What is the difference here between gr00t_remote_policy and gr00t_closed_loop_policy?

Perhaps naively, I wouldn't expect any difference in the gr00t policy whether it's running on a server or running locally.

What are the differences?

Collaborator Author

Functionally, Gr00tClosedLoopPolicy and the combination
Gr00tRemoteServerSidePolicy + ActionChunkingClientSidePolicy are
equivalent:

@kanghui0204 kanghui0204 force-pushed the socket_for_policy branch 3 times, most recently from d2057c6 to 5a7501b Compare February 13, 2026 13:15
- The **Base** Isaac Lab Arena container, started via
``docker/run_docker.sh``. This container does not need to install
GR00T when you run policies in remote mode.
- A separate **GR00T policy server** container, started via
Collaborator

Mark it as optional. Add gr00t with a docker option.

Collaborator Author

I have some changes for this

# - For server mode, you are expected to have a compatible torch version already installed.

echo "Installing flash-attn 2.7.4.post1..."
$PYTHON_CMD -m pip install --no-build-isolation --use-pep517 flash-attn==2.7.4.post1
Collaborator

Only $PYTHON_CMD is changed here.

Collaborator Author

Thanks for the review! You're right that $PYTHON_CMD is the primary difference for the Python install commands. However, the --server flag also controls a few other things beyond that:

• $SUDO: the server environment may not be running as root, so we need sudo for apt-get. Inside the Isaac Sim Docker container we're always root, so sudo is not needed.
• Environment finalization (lines 115-124): the torchrun PATH fix and typing_extensions cleanup are Isaac Sim-specific and should be skipped in server mode.

The CUDA setup has already been simplified per Alex's suggestion: both environments now use /usr/local/cuda (a symlink), removing the previous conditional block.



def build_gr00t_policy_inputs_np(
rgb_np: np.ndarray, # (N, H, W, C)
Collaborator

I don't like this. Multi-RGB inputs are expected.

Collaborator

I don't like all those standalone function calls. Need to think about it.

Collaborator Author

Thanks for the feedback! I actually think the current standalone-function approach works better than a class-based one for our use case. The reason is that the remote policy and the local policy each use a different subset of these functions — if we bundle them into a single class, that class would end up being partially used by each side, which feels awkward and violates the single-responsibility principle.

Standalone functions are also stateless and easier to compose — each caller picks exactly the functions it needs without carrying unused baggage.

That said, I'm open to revisiting this in a follow-up PR. In particular, once we have a clearer picture of whether the local (single-container) path is still needed long-term, we can reconsider the abstraction. Happy to discuss further!
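As a toy illustration of that composition argument (these helpers are invented for the example, not the PR's actual functions), each caller imports only the stateless pieces it needs:

```python
import numpy as np

# Standalone, stateless helpers; no shared class state to drag along.
def normalize_rgb(rgb: np.ndarray) -> np.ndarray:
    """uint8 HWC image -> float32 in [0, 1]."""
    return rgb.astype(np.float32) / 255.0

def flatten_state(state: dict) -> np.ndarray:
    """Concatenate named state fields in a fixed (sorted) key order."""
    return np.concatenate(
        [np.asarray(state[k], dtype=np.float32) for k in sorted(state)])

def pack_observation(rgb: np.ndarray, state: dict) -> dict:
    """Compose the two helpers into one observation payload."""
    return {"rgb": normalize_rgb(rgb), "state": flatten_state(state)}

# A remote client might call only pack_observation, while a local policy
# could call normalize_rgb directly on batched frames; neither carries
# unused methods from a shared class.
obs = pack_observation(np.zeros((4, 4, 3), dtype=np.uint8),
                       {"arm": [0.1, 0.2], "gripper": [1.0]})
```

Bundling these into one class would force each side to instantiate something it only half uses, which is the single-responsibility concern above.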

Collaborator

Sorry for leaving my self-review todos on it. The local path is still needed for the v0.2 release at least, and I'm assuming we will have both options available for some time.

camera_name = self.policy_config.pov_cam_name_sim

# 1) Shared numpy-based preprocessing
policy_observations = self._build_policy_observations(observation, camera_name)
Collaborator

Client-server is numpy; one-docker is torch. We need an explicit data conversion function.

Collaborator Author

I have some changes for this

@xyao-nv
Collaborator

xyao-nv commented Feb 25, 2026

Changes I applied so far:

1. Refactored the policy core and action-chunking functions so that they consume the same set of base functions.
2. Fixed bugs in the inference pipeline regarding chunking reset logic and task description setting in remote mode.
3. Verified locally on our 3 examples, achieving similar metrics between remote & local policy inference.
4. Refactored obs/action packing logic so that data conversion between numpy/torch is modular and reusable by others.

In general, this gives remote & local a single source of truth for the GR00T policy inference modules, which makes maintenance and bug fixes easier.

@kanghui0204 Up to this point, I'm done with the code refactoring; once CI passes, I'll merge it. Thanks for your contribution!
Also thanks to @alexmillane for the great reviews!

@xyao-nv xyao-nv merged commit 1f62bd6 into main Feb 26, 2026
7 of 10 checks passed