Skip to content

apc_modbus: Add quirk for devices with word swapped bitfields#3381

Closed
EchterAgo wants to merge 2 commits intonetworkupstools:masterfrom
EchterAgo:apc_modbus_word_swap_quirk
Closed

apc_modbus: Add quirk for devices with word swapped bitfields#3381
EchterAgo wants to merge 2 commits intonetworkupstools:masterfrom
EchterAgo:apc_modbus_word_swap_quirk

Conversation

@EchterAgo
Copy link
Copy Markdown
Contributor

@EchterAgo EchterAgo commented Mar 27, 2026

General points

  • Described the changes in the PR submission or a separate issue, e.g.
    known published or discovered protocols, applicable hardware (expected
    compatible and actually tested/developed against), limitations, etc.

  • There may be multiple commits in the PR, aligned and commented with
    a functional change. Notably, coding style changes better belong in a
    separate PR, but certainly in a dedicated commit to simplify reviews
    of "real" changes in the other commits. Similarly for typo fixes in
    comments or text documents.

  • Use of coding helper tools and AI should be disclosed in the commit
    or PR comments (it is interesting to know which ones do a decent job).
    As with other contributions, a human is responsible and thanked for the
    quality and content of the change, and is presumed to have the right to
    post that code to be published further under the project's license terms.

  • Please star NUT on GitHub, this helps with sponsorships! ;)

Frequent "underwater rocks" for driver addition/update PRs

  • Revised existing driver families and added a sub-driver if applicable
    (nutdrv_qx, usbhid-ups...) or added a brand new driver in the other
    case.

  • Did not extend obsoleted drivers with new hardware support features
    (notably blazer and other single-device family drivers for Qx protocols,
    except the new nutdrv_qx which should cover them all).

  • For updated existing device drivers, bumped the DRIVER_VERSION macro
    or its equivalent.

  • For USB devices (HID or not), revised that the driver uses unique
    VID/PID combinations, or raised discussions when this is not the case
    (several vendors do use same interface chips for unrelated protocols).

  • For new USB devices, built and committed the changes for the
    scripts/upower/95-upower-hid.hwdb file

  • Proposed NUT data mapping is aligned with existing docs/nut-names.txt
    file. If the device exposes useful data points not listed in the file, the
    experimental.* namespace can be used as documented there, and discussion
    should be raised on the NUT Developers mailing list to standardize the new
    concept.

  • Updated data/driver.list.in if applicable (new tested device info)

Frequent "underwater rocks" for general C code PRs

  • Did not "blindly assume" default integer type sizes and value ranges,
    structure layout and alignment in memory, endianness (layout of bytes and
    bits in memory for multi-byte numeric types), or use of generic int where
    language or libraries dictate the use of size_t (or ssize_t sometimes).
  • Progress and errors are handled with upsdebugx(), upslogx(),
    fatalx() and related methods, not with direct printf() or exit().
    Similarly, NUT helpers are used for error-checked memory allocation and
    string operations (except where customized error handling is needed,
    such as unlocking device ports, etc.)

  • Coding style (including whitespace for indentations) follows precedent
    in the code of the file, and examples/guide in docs/developers.txt file.

  • For newly added files, the Makefile.am recipes were updated and the
    make distcheck target passes.

General documentation updates

  • Added a bullet point into NEWS.adoc, possibly also UPGRADING.adoc
    if there is something packagers or custom-build users should take into
    account (new driver categories, configuration options, dependencies...)

  • Updated docs/acknowledgements.txt (for vendor-backed device support)

  • Added or updated manual page information in docs/man/*.txt files
    and corresponding recipe lists in docs/man/Makefile.am for new pages

  • Passed make spellcheck, updated spell-checking dictionary in the
    docs/nut.dict file if needed (did not remove any words -- the make
    rule printout in case of changes suggests how to maintain it).

Additional work may be needed after posting this PR

  • Propose a PR for NUT DDL with detailed device data dumps from tests
    against real hardware (the more models, the better).

  • Address NUT CI farm build failures for the PR: testing on numerous
    platforms and toolkits can expose issues not seen on just one system.

  • Revise suggestions from LGTM.COM analysis about "new issues" with
    the changed codebase.

Fixes #2338

@EchterAgo
Copy link
Copy Markdown
Contributor Author

I'm not sure we need the 0xAA check, I had the quirk test in upsdrv_updateinfo before.

@EchterAgo
Copy link
Copy Markdown
Contributor Author

I'd also like to test how a USB Modbus connection reacts to this, although I have reason to believe it will be exactly like RS232.

@EchterAgo EchterAgo force-pushed the apc_modbus_word_swap_quirk branch 2 times, most recently from 303ccd6 to 49db177 Compare March 27, 2026 15:29
@AppVeyorBot
Copy link
Copy Markdown

Signed-off-by: Axel Gembe <axel@gembe.net>
Signed-off-by: Axel Gembe <axel@gembe.net>
@EchterAgo EchterAgo force-pushed the apc_modbus_word_swap_quirk branch from 49db177 to 95dbdb6 Compare March 27, 2026 18:49
@cardil
Copy link
Copy Markdown

cardil commented Mar 30, 2026

@EchterAgo I deployed this on my SMT1500RMI2U (FW 15.1, serial). The quirk detection doesn't trigger, and load.on still fails.

Driver startup at debug level 3 shows no quirk-related messages at all: no "Detected word swap quirk", no "Failed to detect word swap quirk", nothing. Here's the version string reported:

nut-driver@apc[176583]: Network UPS Tools 2.8.4.1689.2-1691+g95dbdb614 (development iteration after 2.8.4) - NUT APC Modbus driver without USB support (libmodbus link type: dynamic) 0.19

I re-ran my pymodbus probe to double-check; same results as before: BE [0x0000, 0x0007] gives exception 4, LE [0x0007, 0x0000] succeeds.

The PR probe sends LE order [0x0007, 0x0000], which succeeds on my unit (firmware accepts it silently), so the if (... < 0) block is never entered and needs_word_swap_quirk stays 0.

I think the detection logic is inverted. On your unit, the probe fails with EMBXILVAL because the firmware validates reg 1538 properly, meaning BE order already works for you. On mine, the probe succeeds because the firmware doesn't validate the combo, but that's exactly the unit that needs the swap.

The fix should be: set needs_word_swap_quirk = 1 when the probe succeeds, not when it fails.

Also the EMBXSFAIL comment mentioning SMT1500RMI2U comes from my BE-order test, not from the LE probe. The LE probe returns SUCCESS on my unit.

@EchterAgo
Copy link
Copy Markdown
Contributor Author

EchterAgo commented Mar 30, 2026

 resrvd  flags
[0x0007, 0x0000]

is what is supposed to succeed because according to spec the flags are all in the reserved area and there is no command, so this succeeds on my unit.

_apc_modbus_from_uint64 produces [0x0000, 0x0007] from the same flags, which fails with EMBXILVAL on my unit because the combination of flags is invalid.

It seems that your unit behaves the same way mine does. Are you sure swapping the words actually makes the commands work on your unit or does it just accept them because all the bits are then in the reserved area?

For example shutdown.return will fail with EMBXILVAL on my unit because it does not have a main outlet group the switched outlet groups have to be turned off first. I'm using outlet.3.load.off and outlet.3.load.on to test because I don't have anything connected to that outlet group. I'm actually not sure what to do there anyway, my device has 3 switched outlet groups, what to do in that case? I don't think we can even detect what outlet groups there are, and if I shut them down sequentially, what happens if the first outlet group is the one that powers the machine running the apc_modbus driver?

@cardil
Copy link
Copy Markdown

cardil commented Apr 1, 2026

@EchterAgo Fair question about whether LE word order actually executes commands or just gets silently accepted. I haven't proven that yet - my pymodbus test sent LE load.on while the load was already on, so the UPS accepted it without error but I can't confirm the command actually did anything. I need to move my production workloads off this UPS first so I can safely toggle outlets. Once that's done I'll connect a laptop via serial, test both word orders with actual state changes (outlet OFF -> send load.on -> verify it turns ON), and report back here.

On your question about what to switch off and what happens if the driver's outlet loses power - that's the normal FSD scenario. By the time upsdrvctl shutdown runs, the OS has already halted everything. The driver sends shutdown.return to tell the UPS "turn off all outlets now (using off-delay), turn back on when mains returns." The machine running the driver losing power is the desired behavior - it preserves remaining charge on the UPS, and when mains comes back the UPS cleanly starts up all servers. The off-delay gives the OS time to finish halting before the UPS cuts power.

Looking at the code, the apc_modbus command map already has shutdown.return registered with CMD_OUTPUT_SHUTDOWN | TARGET_MAIN_OUTLET_GROUP | MOD_USE_OFF_DELAY, and it goes through _apc_modbus_instcmd() which does the proper 2-register FC 0x10 write plus your word swap. Both usbhid-ups and snmp-ups use shutdown.return as the first command in their default sdcommands list: "shutdown.return,shutdown.reboot,load.off.delay,shutdown.stayoff" - the framework tries them in order until one succeeds. The apc_modbus driver should probably do the same instead of its current upsdrv_shutdown() which does its own FC 0x06 single-register write bypassing both the command map and the word swap logic.

For your unit with only SOGs and no MOG, the spec section 3.3 you quoted says that issuing the Shutdown command in SimpleSignalCommand_BF will "issue an on command to every SOG (and MOG) to turn off using its specified OffDelay program" and "when AC is present the outlets that were on prior to issuing this command will be turned on automatically." That's exactly what shutdown.return needs to do for FSD, and it works regardless of how many SOGs or MOGs the unit has. Could be a better target for upsdrv_shutdown() than manually targeting individual outlet groups.

I'll do the proper outlet toggle tests and upsdrvctl shutdown testing as soon as I have the UPS free.

@EchterAgo
Copy link
Copy Markdown
Contributor Author

EchterAgo commented Apr 2, 2026

I think my unit acutally does have an MOG but you need to have switched off all SOGs before operating it. I have a good idea now how to correctly map the commands, but I need to figure out whether we can just specify all SOGs in shutdown.* and load.* or whether we need a probe mechanism and only specify the SOGs that actually exist on the unit. I just got out my old RS232 / USB SMT1500I that has only 1 SOG, hopefully I can do some testing with it today. I think this specific discussion should continue in #2338 because it has nothing to do with the word swap quirk.

@EchterAgo
Copy link
Copy Markdown
Contributor Author

See #3395

@EchterAgo
Copy link
Copy Markdown
Contributor Author

Closing, see #2338 (comment)

@EchterAgo EchterAgo closed this Apr 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

apc_modbus: follow-up wanted regarding Main Outlet Group operations

3 participants