Node data persistence in persistent volume claims #787

Open
Jhoyola wants to merge 8 commits into bitcoin-dev-project:main from Jhoyola:node-persistence

Conversation

Jhoyola commented Feb 17, 2026

At Torq / Mion we intend to move to using warnet for a persistent regtest lightning network for local development. This requires the nodes to keep their state indefinitely, so that we don't need to constantly recreate the network, reconnect nodes to our software, etc. I tried to do it by defining volumes and volumeMounts, but for bitcoin core the .bitcoin directory is already mounted, so it conflicted when making a new mount to the same directory. This PR aims to add a built-in, easy, and configurable persistence option.

Overview

  • Uses persistent volume claims (PVCs), either auto-generated or pre-defined
  • Adds a new helm configuration section, persistence (see the sketch below)
    • Simply add enabled: true to automatically create a PVC for the node
    • Use existingClaim: <existing-pvc-name> to use a pre-created PVC instead of creating a new one
    • If not set, everything should work as it did before
  • The down command prompts to delete the PVCs if they exist
  • Added documentation on how to use this
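
A minimal sketch of what the two modes could look like for a node, assuming the persistence block sits under a node entry in the network's YAML (the exact key placement and the claim name below are illustrative, not quoted from the PR):

```yaml
nodes:
  - name: tank-0000
    persistence:
      enabled: true                       # auto-generate a PVC for this node's data

  - name: tank-0001
    persistence:
      enabled: true
      existingClaim: "my-precreated-pvc"  # reuse a PVC created ahead of time
```

Leaving persistence out entirely keeps the previous ephemeral behaviour.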

Tested in a Windows WSL environment.

Contributor

@pinheadmz left a comment


Concept ACK so far, a few questions below.

It would be great if you could add a test that demonstrates the data recovery. I feel like there's one step missing in the docs (see below) that I'd need in order to test the feature.

Have you also tried the persistent volumes with a cloud provider like Digital Ocean? I'm just wondering if there are any obstacles with that. I imagine the volumes persist as long as the cluster is running.

Also - tested using Windows WSL! I know we connected on #655 already, I'm just impressed you are using windows ;-)

And finally thanks for the contribution! Your PR is clean, well organized, and the commits show understanding of the repo which is really great. Please keep in touch about how you're using Warnet at Mion and if there's anything else you need.

- configMap:
    name: {{ include "cln.fullname" . }}
  name: config
- name: shared-volume
Contributor


e966d99

This can be named something more specific, like cln-datadir here and, below, lnd-datadir. I think the reason it is called "shared volume" on main is that for lnd it is shared with circuit breaker and other containers that read the macaroons, etc. But that was a poor choice of name and we can fix it now.
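
A sketch of the suggested rename, assuming the excerpt above comes from the volumes list of the cln pod template (the lnd chart would use lnd-datadir in the same way):

```yaml
volumes:
  - configMap:
      name: {{ include "cln.fullname" . }}
    name: config
  - name: cln-datadir        # renamed from shared-volume
```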

persistence:
  enabled: false
  storageClass: ""
  accessMode: ReadWriteOnce
Contributor


e966d99

I'm not super familiar with these options, but is there a reason you chose ReadWriteOnce, which is accessible by the entire Kubernetes node, rather than ReadWriteOncePod, which seems a bit more specific to the use case?

Author


I'm not very good with Kubernetes, but that's a good suggestion; ReadWriteOncePod is better. I changed it to that.
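
For reference, a sketch of the chart default from the earlier excerpt with only the access mode swapped (the other keys are assumed unchanged):

```yaml
persistence:
  enabled: false
  storageClass: ""
  accessMode: ReadWriteOncePod   # only a single pod may mount the volume read-write
```

ReadWriteOncePod restricts the mount to one pod, whereas ReadWriteOnce only restricts it to one node.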

persistence:
  enabled: true
  existingClaim: "bitcoin-node-001-data"
Contributor


763d940

How do you get the name of the claim from the last lifecycle?

Author


The name is derived from the tank name, namespace, and node type. I added it to the documentation.

Jhoyola commented Feb 18, 2026

I didn't test with any cloud setup. I'm actually quite new to Kubernetes and have never run it in the cloud. I would imagine this works nicely there too.

One thing I'm still unsure about: it's now possible to leave the volumes intact even after the warnet down command, but I don't think there is currently a way to restart the network gracefully with existing data. The deploy command kind of works, but returns some errors because it expects a first startup. And anyway, if wallets, channels, etc. already exist, I don't think it's correct to recreate them. Would it make sense to create some kind of "restart" command?

I at least added a small fix to ensure_miner so it can load an existing miner wallet without throwing an error.

inquirer.Confirm(
    pvc_confirmed,
    message=click.style(
        "Do you also want to delete all PVCs containing persistent data? This will result in permanent loss of data.",
Contributor


Also, I just noticed this message is too long for click, I guess? Just barely.

[Screenshot of the truncated prompt message]

Author


Shortened it in rebase

Jhoyola commented Feb 18, 2026

Now, by adding restartPolicy: Always and loading the unloaded miner wallet in ensure_miner, the nodes seem operational after each computer restart. The ensure_miner fix is important for being able to start the miner after a pod restart.
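
restartPolicy is a standard Kubernetes pod-spec field; whether the PR sets it through the chart's values or directly in the pod template isn't shown in this thread, but in a bare pod manifest it is just one line (the names and image tag below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tank-0000                           # illustrative tank name
spec:
  restartPolicy: Always                     # kubelet restarts containers whenever they exit,
                                            # e.g. when a local cluster comes back after a reboot
  containers:
    - name: bitcoincore
      image: bitcoindevproject/bitcoin:27.0 # illustrative image tag
```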

Jhoyola commented Feb 19, 2026

For me, the ln_test passes locally on this branch, so maybe there is some race condition or similar?

@pinheadmz
Contributor

> For me, the ln_test passes locally on this branch, so maybe there is some race condition or similar?

Our CI is notoriously flaky; working with Kubernetes has this "eventual consistency" to it, which isn't easy to reproduce all the time. Anyway, I restarted CI on your PR and everything looks good.

I'm going to add a test myself and push to your branch (if you don't mind) then we'll be RFM.

Jhoyola commented Feb 19, 2026

> I'm going to add a test myself and push to your branch (if you don't mind) then we'll be RFM.

Thanks, yes, sounds good.
