Bump OVN northd probe interval to 20 seconds #2904

Open
pperiyasamy wants to merge 1 commit into openshift:master from pperiyasamy:bump-northd-probe-interval

@pperiyasamy (Member) commented Feb 11, 2026

Large OVN NB/SB databases can delay probe responses during initial sync. In that case the 10s probe interval is too short and leads to repeated reconnects and failures, as the log below shows. This PR bumps the interval to 20s to improve startup stability.

2026-02-04T16:03:33.603510916Z 2026-02-04T16:03:33.603Z|00554|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connecting...
2026-02-04T16:03:33.603534674Z 2026-02-04T16:03:33.603Z|00555|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connected
2026-02-04T16:03:53.611519611Z 2026-02-04T16:03:53.611Z|00556|reconnect|ERR|unix:/var/run/ovn/ovnnb_db.sock: no response to inactivity probe after 10 seconds, disconnecting
2026-02-04T16:03:53.611548880Z 2026-02-04T16:03:53.611Z|00557|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connection dropped
2026-02-04T16:03:53.611558696Z 2026-02-04T16:03:53.611Z|00558|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: waiting 2 seconds before reconnect
2026-02-04T16:03:55.613540420Z 2026-02-04T16:03:55.613Z|00559|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connecting...
2026-02-04T16:03:55.613568133Z 2026-02-04T16:03:55.613Z|00560|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connected
2026-02-04T16:04:09.483100968Z 2026-02-04T16:04:09.483Z|00561|reconnect|ERR|unix:/var/run/ovn/ovnsb_db.sock: no response to inactivity probe after 10 seconds, disconnecting
2026-02-04T16:04:09.483140599Z 2026-02-04T16:04:09.483Z|00562|reconnect|INFO|unix:/var/run/ovn/ovnsb_db.sock: connection dropped
2026-02-04T16:04:09.483158393Z 2026-02-04T16:04:09.483Z|00563|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
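
To make the timing in the log above easier to read, here is a minimal sketch that measures how long the NB connection survived before the probe timeout. It uses only two lines copied from the log (with the outer container timestamp dropped); the helper names `parse_line` and `connection_lifetimes` are hypothetical and not part of OVN or this PR.

```python
from datetime import datetime

# Two lines from the log above, keeping only the inner OVN timestamp.
LOG = """\
2026-02-04T16:03:33.603Z|00555|reconnect|INFO|unix:/var/run/ovn/ovnnb_db.sock: connected
2026-02-04T16:03:53.611Z|00556|reconnect|ERR|unix:/var/run/ovn/ovnnb_db.sock: no response to inactivity probe after 10 seconds, disconnecting
"""


def parse_line(line):
    """Split 'timestamp|seq|module|level|message' into (time, target, event)."""
    stamp, _seq, _module, _level, message = line.split("|", 4)
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The socket path contains ':' but not ': ', so this splits after the path.
    target, _, event = message.partition(": ")
    return when, target, event


def connection_lifetimes(log):
    """Return (target, seconds) for each connection that ended in a probe timeout."""
    connected_at = {}
    lifetimes = []
    for line in log.splitlines():
        when, target, event = parse_line(line)
        if event == "connected":
            connected_at[target] = when
        elif event.startswith("no response to inactivity probe") and target in connected_at:
            started = connected_at.pop(target)
            lifetimes.append((target, (when - started).total_seconds()))
    return lifetimes


for target, seconds in connection_lifetimes(LOG):
    print(f"{target}: dropped {seconds:.3f}s after connecting")
```

On these two lines the NB connection is dropped 20.008 seconds after it was established. One plausible reading, which the PR does not state, is about 10 seconds of inactivity before the probe is sent, followed by 10 seconds of waiting for a reply.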

Large OVN NB/SB databases can delay probe responses during initial
sync. The 10s interval is not sufficient and can lead to repeated
client reconnects and failures. Bump it to 20s to improve startup
stability.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>

coderabbitai bot commented Feb 11, 2026

Walkthrough

Updated the OVN_NORTHD_PROBE_INTERVAL environment variable value from "10000" to "20000" in cluster-network-operator deployment manifests for both standard and IBM Cloud managed environments. No other configuration changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Cluster Network Operator Deployments: manifests/0000_70_cluster-network-operator_03_deployment.yaml, manifests/0000_70_cluster-network-operator_03_deployment-ibm-cloud-managed.yaml | Updated OVN_NORTHD_PROBE_INTERVAL environment variable from "10000" to "20000". |
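
The PR title and the log speak in seconds, while the manifests carry the values "10000" and "20000". A small sketch of the conversion follows, assuming the variable holds a count of milliseconds (an inference from "10000" matching the "10 seconds" in the log, not something the PR states). The helper `probe_interval_seconds` is hypothetical.

```python
def probe_interval_seconds(env):
    """Convert OVN_NORTHD_PROBE_INTERVAL to seconds.

    Assumption: the value is a string holding milliseconds.
    """
    return int(env["OVN_NORTHD_PROBE_INTERVAL"]) / 1000


# Values before and after this PR, as reported in the walkthrough.
before = {"OVN_NORTHD_PROBE_INTERVAL": "10000"}
after = {"OVN_NORTHD_PROBE_INTERVAL": "20000"}

print(probe_interval_seconds(before), probe_interval_seconds(after))
```

Under that assumption, the change doubles the probe interval from 10 to 20 seconds.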

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

No actionable comments were generated in the recent review. 🎉

@pperiyasamy (Member, Author)

/retest

@danwinship (Contributor)

seems plausible i guess
/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Feb 12, 2026

openshift-ci bot commented Feb 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, pperiyasamy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Feb 12, 2026

openshift-ci bot commented Feb 12, 2026

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | 961f551 | link | false | /test security |
| ci/prow/e2e-aws-ovn-windows | 961f551 | link | true | /test e2e-aws-ovn-windows |
| ci/prow/4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade | 961f551 | link | false | /test 4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade |
| ci/prow/e2e-aws-ovn-serial-1of2 | 961f551 | link | true | /test e2e-aws-ovn-serial-1of2 |
| ci/prow/hypershift-e2e-aks | 961f551 | link | true | /test hypershift-e2e-aks |

