20 changes: 20 additions & 0 deletions README.md
@@ -209,6 +209,16 @@ Every Ansible deployment automatically deploys an observability stack alongside
    - Example: Without flag, a random node will be selected automatically
13. `--checkpoint-sync-url` specifies the URL to fetch finalized checkpoint state from for checkpoint sync. Default: `https://leanpoint.leanroadmap.org/lean/v0/states/finalized`. Only used when `--restart-client` is specified.
14. `--restart-client` comma-separated list of client node names (e.g., `zeam_0,ream_0`). When specified, those clients are stopped, their data cleared, and restarted using checkpoint sync. Genesis is skipped. Use with `--checkpoint-sync-url` to override the default URL.
15. `--replace-with` comma-separated list of replacement node names, positionally matched 1:1 with `--restart-client`. Swaps client implementations while keeping the same validator slot, keys, and server. Updates `validator-config.yaml`, `validators.yaml`, `annotated_validators.yaml`, and renames `.key` files. Leanpoint is re-synced when replacements occur.
    - Example: `--restart-client zeam_0 --replace-with ream_0` → replaces zeam_0 with ream_0
    - Example: `--restart-client zeam_0,ream_0 --replace-with ream_1` → ream_1 replaces zeam_0, ream_0 just restarts
    - Empty entries skip that position: `--restart-client zeam_0,ream_0 --replace-with ,ream_1` → zeam_0 restarts, ream_0 replaced with ream_1
16. `--logs` enables run logging. When specified:
    - Appends UTC-timestamped START/END entries with duration and log file path to `tmp/devnet.log`
    - Duplicates console output to a timestamped log file in `tmp/`:
        - `tmp/local-run-DD-MM-YYYY-HH-MM.log` for local deployments
        - `tmp/ansible-run-DD-MM-YYYY-HH-MM.log` for Ansible deployments
    - Example: `NETWORK_DIR=local-devnet ./spin-node.sh --node all --logs`
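
The log file naming for `--logs` can be illustrated with a short Python sketch. This is not part of the scripts: `run_log_path` is a hypothetical helper, and the real name is built in shell with `date -u '+%d-%m-%Y-%H-%M'`. The sketch only mirrors the `local-run` / `ansible-run` prefixes and the UTC `DD-MM-YYYY-HH-MM` timestamp described above.

```python
from datetime import datetime, timezone


def run_log_path(deployment_mode: str, now: datetime) -> str:
    """Hypothetical helper: build the per-run log path described for --logs."""
    # Ansible runs get their own prefix; everything else is treated as local.
    prefix = "ansible-run" if deployment_mode == "ansible" else "local-run"
    # The timestamp is always rendered in UTC, minute resolution.
    stamp = now.astimezone(timezone.utc).strftime("%d-%m-%Y-%H-%M")
    return f"tmp/{prefix}-{stamp}.log"


example = datetime(2026, 3, 11, 14, 30, tzinfo=timezone.utc)
print(run_log_path("ansible", example))  # tmp/ansible-run-11-03-2026-14-30.log
```

Because the timestamp has minute resolution, two runs of the same mode started within the same minute would resolve to the same file name.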

### Checkpoint sync

@@ -238,6 +248,16 @@ NETWORK_DIR=local-devnet ./spin-node.sh --restart-client zeam_0 \
2. Data directories are cleared
3. Clients are started with `--checkpoint-sync-url` so they sync from the remote checkpoint instead of genesis

**Replacing a client during restart:**

```sh
# Replace zeam_0 with ream_0 (same validator slot, keys, and server)
NETWORK_DIR=ansible-devnet ./spin-node.sh --restart-client zeam_0 --replace-with ream_0 --useRoot

# Replace first node, just restart second
NETWORK_DIR=local-devnet ./spin-node.sh --restart-client zeam_0,ream_0 --replace-with qlean_0
```
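
The positional matching between `--restart-client` and `--replace-with` can be sketched in Python as below. This is an illustration under assumptions, not the script's code: `pair_replacements` is a hypothetical name, and the actual logic lives in shell inside `spin-node.sh`. It shows that an empty or missing entry, or an entry equal to the old name, leaves that node as a plain restart.

```python
def pair_replacements(restart_clients: str, replace_with: str) -> dict:
    """Hypothetical helper: map each restarted node to its replacement by position."""
    old_names = [name.strip() for name in restart_clients.split(",")]
    new_names = [name.strip() for name in replace_with.split(",")] if replace_with else []
    pairs = {}
    for index, old_name in enumerate(old_names):
        # Positions beyond the end of --replace-with have no replacement.
        new_name = new_names[index] if index < len(new_names) else ""
        # An empty entry or an identical name means "restart only".
        if new_name and new_name != old_name:
            pairs[old_name] = new_name
    return pairs


print(pair_replacements("zeam_0,ream_0", "qlean_0"))  # {'zeam_0': 'qlean_0'}
print(pair_replacements("zeam_0,ream_0", ",ream_1"))  # {'ream_0': 'ream_1'}
```

Extra `--replace-with` entries beyond the number of restarted nodes are simply never looked at in this sketch; the script additionally prints a warning in that case.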

**Deployment modes:**
- **Local** (`NETWORK_DIR=local-devnet`): Uses Docker directly
- **Ansible** (`NETWORK_DIR=ansible-devnet`): Uses Ansible to deploy to remote hosts
3 changes: 2 additions & 1 deletion ansible/playbooks/site.yml
@@ -30,11 +30,12 @@
genesis_dir: "{{ network_dir }}/genesis"
when: not (skip_genesis | default(false) | bool)

-# Always run copy-genesis - it targets all:!local (remote hosts only)
+# Copy genesis to all remote hosts (skipped when restarting a single node; deploy-nodes.yml syncs to the target)
- name: Copy Genesis Files to Remote Hosts
import_playbook: copy-genesis.yml
vars:
genesis_dir: "{{ network_dir }}/genesis"
when: not (skip_genesis | default(false) | bool)

- name: Deploy Nodes
import_playbook: deploy-nodes.yml
18 changes: 10 additions & 8 deletions convert-validator-config.py
@@ -37,6 +37,7 @@ def convert_validator_config(
output_path: str,
base_port: int = 8081,
docker_host: bool = False,
quiet: bool = False,
):
"""
Convert validator-config.yaml to upstreams.json.
@@ -83,17 +84,18 @@ def convert_validator_config(
with open(output_path, 'w') as f:
json.dump(output, f, indent=2)

-    print(f"✅ Converted {len(upstreams)} validators to {output_path}")
-    print(f"\nGenerated upstreams:")
-    for u in upstreams:
-        print(f"  - {u['name']}: {u['url']}{u['path']}")
-
-    print(f"\n💡 To use: leanpoint --upstreams-config {output_path}")
+    if not quiet:
+        print(f"✅ Converted {len(upstreams)} validators to {output_path}")
+        print(f"\nGenerated upstreams:")
+        for u in upstreams:
+            print(f"  - {u['name']}: {u['url']}{u['path']}")
+        print(f"\n💡 To use: leanpoint --upstreams-config {output_path}")


def main():
-    args = [a for a in sys.argv[1:] if a != "--docker"]
+    args = [a for a in sys.argv[1:] if a not in ("--docker", "--quiet")]
docker_host = "--docker" in sys.argv
quiet = "--quiet" in sys.argv

if len(args) < 2:
if len(args) == 0:
@@ -109,7 +111,7 @@ def main():
output_path = args[1]

try:
-        convert_validator_config(yaml_path, output_path, docker_host=docker_host)
+        convert_validator_config(yaml_path, output_path, docker_host=docker_host, quiet=quiet)
except FileNotFoundError as e:
print(f"Error: File not found: {e}", file=sys.stderr)
sys.exit(1)
17 changes: 17 additions & 0 deletions parse-env.sh
@@ -104,6 +104,15 @@ while [[ $# -gt 0 ]]; do
skipLeanpoint=true
shift
;;
--replace-with)
replaceWith="$2"
shift
shift
;;
--logs)
enableLogs=true
shift
;;
*) # unknown option
shift # past argument
;;
@@ -117,6 +126,12 @@ then
exit
fi;

# Validate --replace-with requires --restart-client
if [[ -n "$replaceWith" ]] && [[ ! -n "$restartClient" ]]; then
echo "Warning: --replace-with requires --restart-client. Ignoring --replace-with."
replaceWith=""
fi

# When using --restart-client with checkpoint sync, set default checkpoint URL if not provided
if [[ -n "$restartClient" ]] && [[ ! -n "$checkpointSyncUrl" ]]; then
checkpointSyncUrl="https://leanpoint.leanroadmap.org/lean/v0/states/finalized"
@@ -144,3 +159,5 @@ echo "coreDumps = ${coreDumps:-disabled}"
echo "checkpointSyncUrl = ${checkpointSyncUrl:-<not set>}"
echo "restartClient = ${restartClient:-<not set>}"
echo "skipLeanpoint = ${skipLeanpoint:-false}"
echo "replaceWith = ${replaceWith:-<not set>}"
echo "enableLogs = ${enableLogs:-false}"
124 changes: 122 additions & 2 deletions spin-node.sh
@@ -68,6 +68,26 @@ if [ "$deployment_mode" == "ansible" ] && ([ "$validatorConfig" == "genesis_boot
echo "Using Ansible deployment: configDir=$configDir, validator config=$validator_config_file"
fi

# Set up logging if --logs flag is enabled
if [ "$enableLogs" == "true" ]; then
_log_dir="$scriptDir/tmp"
mkdir -p "$_log_dir"
_log_start=$(date -u +%s)
_log_args="$*"
if [ "$deployment_mode" == "ansible" ]; then
_log_prefix="ansible-run"
else
_log_prefix="local-run"
fi
_log_file="$_log_dir/${_log_prefix}-$(date -u '+%d-%m-%Y-%H-%M').log"
echo "$(date -u '+%Y-%m-%d %H:%M:%S') START spin-node.sh $_log_args" >> "$_log_dir/devnet.log"
# Store relative log path for devnet.log END entry (e.g. tmp/ansible-run-11-03-2026-14-30.log)
_log_file_rel="${_log_file#$scriptDir/}"
trap 'echo "$(date -u '\''+%Y-%m-%d %H:%M:%S'\'') END spin-node.sh ($(( $(date -u +%s) - _log_start ))s) -> '"$_log_file_rel"'" >> "'"$_log_dir"'/devnet.log"' EXIT
exec > >(tee -a "$_log_file") 2>&1
echo "Logging to $_log_file"
fi

#1. setup genesis params and run genesis generator
source "$(dirname $0)/set-up.sh"
# ✅ Genesis generator implemented using PK's eth-beacon-genesis tool
@@ -163,6 +183,104 @@ if [[ -n "$restartClient" ]]; then
echo "Restarting with checkpoint sync: ${spin_nodes[*]} from $checkpointSyncUrl"
cleanData=true # Clear data when restarting with checkpoint sync
node_present=true

# --- Handle --replace-with: swap client implementations ---
if [[ -n "$replaceWith" ]]; then
IFS=',' read -r -a replace_nodes <<< "$replaceWith"

# Build replacement pairs: replace_map[i] corresponds to requested_nodes[i]
declare -A replace_map
has_replacements=false
for i in "${!requested_nodes[@]}"; do
old_name="${requested_nodes[$i]}"
new_name=""
if [ $i -lt ${#replace_nodes[@]} ]; then
new_name=$(echo "${replace_nodes[$i]}" | xargs) # trim whitespace
fi
if [ -n "$new_name" ] && [ "$new_name" != "$old_name" ]; then
replace_map["$old_name"]="$new_name"
has_replacements=true
echo "Will replace: $old_name → $new_name"
fi
done

# Warn about extra --replace-with entries beyond --restart-client count
if [ ${#replace_nodes[@]} -gt ${#requested_nodes[@]} ]; then
echo "Warning: --replace-with has more entries (${#replace_nodes[@]}) than --restart-client (${#requested_nodes[@]}). Extra entries ignored."
fi

if [ "$has_replacements" = true ]; then
# 1. Stop old containers BEFORE config changes (inventory still resolves to old names)
echo "Stopping old containers before replacement..."
for old_name in "${!replace_map[@]}"; do
if [ "$deployment_mode" == "ansible" ]; then
echo "Stopping $old_name via Ansible..."
"$scriptDir/run-ansible.sh" "$configDir" "$old_name" "" "$validatorConfig" "$validator_config_file" "$sshKeyFile" "$useRoot" "stop" "" "true" "" || {
echo "Warning: Failed to stop $old_name via Ansible, continuing..."
}
else
echo "Stopping local container $old_name..."
if [ -n "$dockerWithSudo" ]; then
sudo docker rm -f "$old_name" 2>/dev/null || true
else
docker rm -f "$old_name" 2>/dev/null || true
fi
fi
done

# 2. Remove old data directories (different clients have different data structures)
# New data dirs will be created fresh by the spin loop below (mkdir -p + cleanData)
for old_name in "${!replace_map[@]}"; do
old_data_dir="$dataDir/$old_name"
if [ -d "$old_data_dir" ]; then
rm -rf "$old_data_dir"
echo " Removed data dir $old_name"
fi
done

# 3. Update config files (rename old → new)
for old_name in "${!replace_map[@]}"; do
new_name="${replace_map[$old_name]}"
echo "Updating config files: $old_name → $new_name"

# validator-config.yaml: rename .validators[].name
yq eval -i "(.validators[] | select(.name == \"$old_name\") | .name) = \"$new_name\"" "$validator_config_file"

# validators.yaml: rename top-level key
validators_file="$configDir/validators.yaml"
if [ -f "$validators_file" ]; then
yq eval -i ".[\"$new_name\"] = .[\"$old_name\"] | del(.[\"$old_name\"])" "$validators_file"
fi

# annotated_validators.yaml: rename top-level key
annotated_file="$configDir/annotated_validators.yaml"
if [ -f "$annotated_file" ]; then
yq eval -i ".[\"$new_name\"] = .[\"$old_name\"] | del(.[\"$old_name\"])" "$annotated_file"
fi

# Rename key file
if [ -f "$configDir/$old_name.key" ]; then
mv "$configDir/$old_name.key" "$configDir/$new_name.key"
echo " Renamed $old_name.key → $new_name.key"
fi
done

# 4. Update spin_nodes array with new names
for i in "${!spin_nodes[@]}"; do
old="${spin_nodes[$i]}"
if [[ -n "${replace_map[$old]+x}" ]]; then
spin_nodes[$i]="${replace_map[$old]}"
fi
done

# Re-read nodes from updated config (needed for aggregator and downstream logic)
nodes=($(yq eval '.validators[].name' "$validator_config_file"))

echo "Updated spin_nodes: ${spin_nodes[*]}"
echo "Config files updated successfully."
fi
fi

# Parse comma-separated or space-separated node names or handle single node/all
elif [[ "$node" == "all" ]]; then
# Spin all nodes
@@ -232,6 +350,7 @@ if [ "$deployment_mode" == "ansible" ]; then
fi

# Determine skip_genesis for Ansible (true when restarting with checkpoint sync)
# deploy-nodes.yml syncs config files to the target host, so copy-genesis to all hosts is not needed
ansible_skip_genesis="false"
[[ "$restart_with_checkpoint_sync" == "true" ]] && ansible_skip_genesis="true"

@@ -256,7 +375,7 @@ if [ "$deployment_mode" == "ansible" ]; then
exit 1
fi

-if [ -z "$skipLeanpoint" ]; then
+if [ -z "$skipLeanpoint" ] && { [ "$restart_with_checkpoint_sync" != "true" ] || [ "${has_replacements:-false}" = "true" ]; }; then
# Sync leanpoint upstreams to tooling server and restart remote container (no 5th arg = remote)
if ! "$scriptDir/sync-leanpoint-upstreams.sh" "$validator_config_file" "$scriptDir" "$sshKeyFile" "$useRoot"; then
echo "Warning: leanpoint sync failed. If the tooling server requires a specific SSH key, run with: --sshKey <path-to-key>"
@@ -479,8 +598,9 @@ if [ -n "$enableMetrics" ] && [ "$enableMetrics" == "true" ]; then
fi

# Deploy leanpoint: locally (local devnet) or sync to tooling server (Ansible), unless --skip-leanpoint
# Skip leanpoint during checkpoint sync restart (node list hasn't changed)
local_leanpoint_deployed=0
-if [ -z "$skipLeanpoint" ]; then
+if [ -z "$skipLeanpoint" ] && { [ "$restart_with_checkpoint_sync" != "true" ] || [ "${has_replacements:-false}" = "true" ]; }; then
if "$scriptDir/sync-leanpoint-upstreams.sh" "$validator_config_file" "$scriptDir" "$sshKeyFile" "$useRoot" "$dataDir"; then
local_leanpoint_deployed=1
else
2 changes: 1 addition & 1 deletion sync-leanpoint-upstreams.sh
@@ -84,7 +84,7 @@ fi

out_file=$(mktemp)
trap "rm -f $out_file" EXIT
-python3 "$convert_script" "$validator_config_file" "$out_file" || {
+python3 "$convert_script" "$validator_config_file" "$out_file" --quiet || {
echo "Warning: convert-validator-config.py failed, skipping leanpoint sync."
exit 0
}