Skip to content

Add PWM sensor to thermal task#2430

Open
mkeeter wants to merge 4 commits intomasterfrom
mkeeter/thermal-pwm-logging
Open

Add PWM sensor to thermal task#2430
mkeeter wants to merge 4 commits intomasterfrom
mkeeter/thermal-pwm-logging

Conversation

@mkeeter
Copy link
Collaborator

@mkeeter mkeeter commented Mar 9, 2026

(staged on top of #2428)

This adds a new fan_ctrl sensor, which is updated by the thermal loop. In combination with oxidecomputer/humility#602, we can log both fan speed (which was already possible) and commanded PWM values with humility dashboard -o out.csv

Here's an example while running stress-ng:

Screenshot 2026-03-09 at 10 54 42 AM

The hysteresis from the fan controller is quite noticeable if we plot PWM vs RPM:

Screenshot 2026-03-09 at 10 55 01 AM

@mkeeter mkeeter requested review from hawkw and labbott March 9, 2026 15:06
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 59b81b4 to 7e0e7af Compare March 9, 2026 15:08
Comment on lines +1564 to +1569
self.set_pwm(target_pwm)?;
self.sensor_api.post(
OUTPUT_PWM_SENSOR,
target_pwm.0 as f32,
now_ms,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda wonder if maybe we ought to just make self.set_pwm also post the new duty cycle to the sensors task, so that it's impossible to set the duty cycle without also posting value to the sensors task?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done – it's a little awkward (because we need to represent the NoData::DeviceOff case), so I switched the input argument to a Result<PWMDuty, NoData>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, did we have to do it like that? I was imagining that in the DeviceOff state we would just call self.sensor_api.post directly. Rather than making set_pwm the only way we ever post a sensor reading, I was just thinking that any time we set the PWM duty cycle we also post a sensor reading (but this would not be the only place we posted a sensor reading). set_pwm(Err(NoData::DeviceOff)) feels a bit awkward to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Result<..> shenanigans are the simplest way to do it. In our previous world where set_pwm just takes a PwmDuty, when we go uncontrollable, we'd still call set_pwm(PwmDuty(0)) – and would therefore post 0 to the sensors task because we don't know any better.

The Result<..> argument lets us distinguish between "there's an error, so post the error but set the PWM to 0" versus "situation normal, post a pwm of 0 and set it".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, yeah, okay. I don't hate it.

@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch 2 times, most recently from a14b7c2 to 918ce72 Compare March 9, 2026 22:31
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from c48f4a1 to a27d9e8 Compare March 9, 2026 22:32
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 918ce72 to 33651a8 Compare March 9, 2026 22:32
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from a27d9e8 to 2d6751c Compare March 9, 2026 22:33
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch 2 times, most recently from 2b5aa4f to 50a6a4c Compare March 10, 2026 13:54
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from bde98a4 to ffdf316 Compare March 10, 2026 14:00
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 50a6a4c to 130b41d Compare March 10, 2026 14:00
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from ffdf316 to 014d2be Compare March 10, 2026 14:07
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 130b41d to b5dd092 Compare March 10, 2026 14:07
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from 014d2be to fdd03a0 Compare March 10, 2026 14:24
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from b5dd092 to 63b3e35 Compare March 10, 2026 14:42
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch 2 times, most recently from 588cbe9 to 7010013 Compare March 10, 2026 14:50
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 63b3e35 to f6c3cc5 Compare March 10, 2026 14:50
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from 7010013 to c64bf6a Compare March 11, 2026 14:07
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from f6c3cc5 to 9f4b4a6 Compare March 11, 2026 14:07
Copy link
Collaborator

@labbott labbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few minor questions but LGTM

ringbuf_entry!(Trace::PowerDownFailed(e));
}
self.set_pwm(PWMDuty(0))?;
self.set_pwm(Err(task_sensor_api::NoData::DeviceOff), now_ms)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels strange to treat DeviceOff as an error. Under what circumstances would we hit PowerDown?

[[config.sensor.devices]]
name = "fan_ctrl"
device = "thermal_loop"
description = "Thermal loop PWM command"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the command part of this description feels weird. Just drop it?

@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from c64bf6a to 8e2c3a1 Compare March 11, 2026 20:17
@mkeeter mkeeter force-pushed the mkeeter/thermal-pwm-logging branch from 9f4b4a6 to 47218f6 Compare March 11, 2026 20:17
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from 8e2c3a1 to 234e459 Compare March 12, 2026 13:13
@mkeeter mkeeter force-pushed the mkeeter/shrink-thermal-stack branch from 234e459 to 323f754 Compare March 12, 2026 13:28
Base automatically changed from mkeeter/shrink-thermal-stack to master March 12, 2026 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants