[RFC] Uniform service spawning with external 'resources'#740
[RFC] Uniform service spawning with external 'resources'#740williampMSFT wants to merge 17 commits intoOpenDevicePartnership:v0.2.0from
Conversation
There was a problem hiding this comment.
Pull request overview
Introduces an RFC service initialization/spawning pattern in embedded-services based on explicit Resources storage and a Runner handle, and applies it to the time-alarm and eSPI services (plus updating an example and tests) to reduce per-service task boilerplate and make service startup uniform.
Changes:
- Add
embedded_services::service::{Service, ServiceResources, ServiceRunner}traits and aspawn_service!()macro to standardize service construction and task spawning. - Refactor
time-alarm-serviceandespi-serviceto use(Service, Runner)returned fromService::new(...)with externalResources, removing the oldtaskmodules. - Update the RT685 example and time-alarm tokio tests to use the new APIs.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
embedded-service/src/service.rs |
Adds the new service traits and spawn_service!() macro for uniform initialization/spawning. |
embedded-service/src/lib.rs |
Exposes the new service module publicly. |
time-alarm-service/src/lib.rs |
Refactors time-alarm into Resources + Runner + control-handle Service implementing the new trait. |
time-alarm-service/src/task.rs |
Removes the old dedicated task wrapper API. |
time-alarm-service/tests/tad_test.rs |
Updates tokio tests to construct resources and drive the new Runner. |
examples/rt685s-evk/src/bin/time_alarm.rs |
Switches the example to spawn_service!() and updates relay handler wiring accordingly. |
espi-service/src/espi_service.rs |
Refactors eSPI service into Resources + Runner + control-handle Service implementing the new trait. |
espi-service/src/task.rs |
Removes the old task wrapper API. |
espi-service/src/lib.rs |
Stops exporting the removed task module. |
Comments suppressed due to low confidence (4)
embedded-service/src/service.rs:27
- Docs here still refer to an
init()API, but the trait method is namednew(...). Please update the wording so it matches the current public API and avoids sending users to a non-existentinit()method.
/// A trait for a service that can be run on the EC.
/// Implementations of Service should have an init() function to construct the service that
/// returns a Runner, which the user is expected to spawn a task for.
pub trait Service<'hw>: Sized {
/// A type that can be used to run the service. This is returned by the init() function and the user is
/// expected to call its run() method in an embassy task (or similar parallel execution context on other
/// async runtimes).
type Runner: ServiceRunner<'hw>;
/// Any memory resources that your service needs. This is typically an opaque types that is only used by the service
/// and is not interacted with by users of the service. Must be default-constructible for spawn_service!() to work.
type Resources: ServiceResources;
/// The error type that your `init` function can return on failure.
type ErrorType;
/// Any initialization parameters that your service needs to run.
type InitParams;
/// Initializes an instance of the service using the provided storage and returns a control handle for the service and
/// a runner that can be used to run the service.
fn new(
storage: &'hw mut Self::Resources,
params: Self::InitParams,
) -> impl core::future::Future<Output = Result<(Self, Self::Runner), Self::ErrorType>>;
embedded-service/src/service.rs:51
spawn_service!docs mention creating aOnceLockand callinginit(), but the macro body usesStaticCell+Service::new(...). Please update this doc block to reflect the actual storage mechanism and call flow.
/// Initializes a service, creates an embassy task to run it, and spawns that task.
///
/// This macro handles the boilerplate of:
/// 1. Creating a `static` [`OnceLock`](embassy_sync::once_lock::OnceLock) to hold the service
/// 2. Calling the service's `init()` method
/// 3. Defining an embassy_executor::task to run the service
/// 4. Spawning the task on the provided executor
time-alarm-service/src/lib.rs:300
Resources::new()is only provided via theServiceResourcestrait impl, so callers must import that trait just to construct resources. Consider adding an inherentpub fn new()(orDefault) and have the trait impl forward to it, to make manual instantiation ergonomic.
impl<'hw> embedded_services::service::ServiceResources for Resources<'hw> {
/// Allocate storage for the service's resources.
fn new() -> Self {
Resources { inner: None }
espi-service/src/espi_service.rs:45
Resources::new()is only available via theServiceResourcestrait method, so callers need that trait in scope (or UFCS) to construct resources. If you expect manual instantiation, consider adding an inherentpub fn new()/Defaultand forwarding the trait method to it.
{
fn new() -> Self {
Self { inner: None }
}
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@asasine Robert mentioned you might have an interest in this - would love to get your feedback :) |
|
Hey @williampMSFT, he's right, this is interesting. I think it has a lot of merit and we've been experimenting with a similar idea internally. We ultimately walked back on something as predefined like We found that the perceived rigidity of a The main benefit we did discover was adding new traits for the behavior, which effectively makes the It's possible that a consolidated |
This makes sense, but I think the tradeoffs might be different in this repo than in Surface / an OEM's code? Embedded-services is sort of a 'library' repo, intended for consumption by many different applications. I'm definitely not suggesting that we require Surface / other OEMs to adopt this pattern in their code, but I think there's value in this sort of consistency in the public API surface for embedded-services, even if it's a bit more annoying for ODP devs to implement - thoughts?
Interesting - I think I may be going for something slightly different here? I'm trying to simplify and standardize how we initialize ODP services so it's easier for users of those services to reason about them, mostly by moving all init parameters for a service to a single place, hiding the implementation details of worker tasks from end users, and coming up with a standard pattern for starting those. It's not strictly necessary to use traits at all to achieve this, though - it could just be a convention we adopt for ODP services that they do all the following:
It seems to me like in this case, the main benefit of using a trait rather than a convention is for the authors of the ODP services - it's an easy way for the compiler to check that they've upheld these conventions - but the client of the API is unlikely to try to be generic over
Not 100% sure I'm parsing this correctly, but I think you're saying that you found value in implementing traits for the behavior of individual services rather than in the common behaviors between all services (e.g. a
What are your thoughts on trying to move ODP services to function this way now? I'm not trying to impose this pattern on Surface code, I'm just trying to push for a set of API conventions that make ODP code more easily usable by Surface (and other OEMs') code :) |
|
I see two facets to this RFC that I want to address:
These overlap when APIs are public but I think they have many orthogonalities. Gonna start with the second one because it's simpler from my perspective. I'm totally on board with Now, the first one: embedded-services as a library. This particular RFC doesn't bring much for me as a consumer of this library. Regarding this question:
Yeah, that's what I mean. It is mostly orthogonal but there's overlap as mentioned because it influences the scope of each Writing tests over a generic If embedded-services is vying to be a library that OEMs adopt, these types of traits are the key for us. There's so much customization that happens internally that rigid tasks become difficult to extend, which leads to extension-through-modification (e.g., #734), a violation of the open-closed principle in SOLID. When the only way to consume embedded-services is through If we can get the right boundaries into ODP, we can start building up a comprehensive suite of crates on both sides of those traits. We can write testabe tasks (less contrived than
What are your thoughts on trying to move ODP's embedded-services crate (and friends) to function this way now? |
Great, will proceed with that, thanks! I think the intended benefits to users of the crates are
Totally on board with adding traits to wrap interfaces for services/logical devices and linking between services via traits rather than direct references.
I bugchecked a little on this one - it definitely makes sense to me to have our interfaces take e.g.
Definitely on board with using traits for interfaces and getting rid of private statics and the like. I think a lot of the private-statics stuff came out of some of the lifetime requirements from the comms service. It required 'static for everything that talked over it, which lead to this pattern of declaring a module-private static instance and having an init() function that returned a handle to it. This also made testing a bit of a nightmare, since you couldn't really blow up an instance of your service and recreate it for the next test.
Bugchecking here as well - when you say "we can write testable tasks", where do you envision those tasks existing? I'm totally fine with and expect that OEMs will write their own tasks to drive their services, but I'd ideally like to not have any tasks as part of the public interface for services in embedded-services, because I think those are implementation details of individual services that we're currently failing to encapsulate, and one of the goals of this change is to improve encapsulation of those details.
I'm definitely on board with ensuring that we have the extension points we need for services such that OEMs don't need to modify any code in embedded-services. I think if an OEM ever does have to do that to achieve their use case, that's indicative of a defect in our API design that we need to address. I think allowing user-provided task replacements is probably not the way, though? It seems like if we need to allow OEMs to control some behavior on a task, the right way to do that would be to take some sort of OEM-provided trait impl at service construction time that our internal, implementation-detail task(s) call into to achieve whatever customization is needed rather than having the OEM control the outer loop? i.e. rather than saying "you must write a task that calls these functions in this order or the service will misbehave:" We'd write a trait that enables extension points as appropriate or something like that (probably with domain-specific names rather than just post_bar, but you get the idea) - that way, the user can't call foo()/bar()/baz() out of order, and if we need to add a quux() in there or something later we can do it without the user needing to update their task.
Definitely on board with using traits for interfaces, getting rid of statics / moving all allocations outside of embedded-services, and increasing flexibility as much as possible. I do want to make sure we're aligned on the task thing, though - I think we do need extensibility points, and some of those extensibility points will need to be invoked on worker tasks, but we need to be careful about making those extensibility points difficult to accidentally use incorrectly, and I think making the user responsible for providing the full task implementation rather than a trait that the service's code calls into is very easy to accidentally use incorrectly. |
👍
Just an easy signature to demonstrate, I don't think tasks should generally be part of the public signature of a library. The primary thing I was advocating for was this:
Whether those other tasks (ODP or OEM) are using "real hardware" or arbitrary extensions is unimportant to the task's implementation.
Agreed on this that tasks are implementation details and using trait objects as API boundaries and extension points feels right and jives wel with Rust's language, type safety, and the above // embedded_services
pub trait MyService {
async fn foo();
async fn bar();
async fn baz();
}
#[derive(Default)]
pub struct MyCanonicalService;
impl MyService for MyCanonicalService {
async fn foo() {
// do the canonical foo things for this service
}
async fn bar() {
// --snip--
}
async fn baz() {
// --snip--
}
}
pub fn init(spawner: Spawner, service: impl Service) {
spawner.spawn(task(service)).unwrap()
}
#[embassy_executor::task]
async fn service_task(service: impl Service) {
loop {
service.foo().await;
service.bar().await;
service.baz().await;
}
}
// OEM
/// Does special OEM things before and after doing standard MyService things.
struct DoSpecialOemThings<S>(S);
impl<S: MyService> MyService for DoSpecialOemThings<S> {
async fn foo() {
// do OEM things before foo
self.0.foo().await; // do other foo things
// do OEM thigns after foo
}
async fn bar() {
// this OEM doesn't want to do bar at all so just no-op
}
async fn baz() {
// --snip--
}
}
embedded_services::init(DoSpecialOemThings(embedded_services::MyCanonicalService::default()));The goal would be to write tests over Hit send too early, will send another comment shortly. |
|
Since we can't write // embedded_services: mostly the same minus init() and now with this new service_task()
pub async fn service_task(impl MyService) -> ! {
loop {
service.foo().await;
service.bar().await;
service.baz().await;
}
}
// OEM: mostly the same, plus the new init() and service_task()
pub async fn init(spawner: Spawner) {
let service = DoSpecialOemThings(embedded_services::MyCanonicalService::default());
spawner.spawn(service_task(service)).unwrap();
}
#[embasy_executor::task]
async fn service_task(service: DoSpecialOemThings<embedded_services::MyCanonicalService>) -> ! {
// e x t e n s i o n woo
embedded_services::service_task(service).await
}And this brings me full-circle to the |
Exactly! This is one of the things I'm trying to solve for with this change. In a world where memory is allocated in a it's something like The It requires a few lines of boilerplate when you instantiate the service, but you were going to need that boilerplate if OEMs are writing their own more-than-one-line tasks anyway, and in the product case, the
This approach means that the client doesn't even see the details of |
|
Totally onboard with the goals here: sidestepping AFIT/dyn-compat with OEM-allocated objects is goodness and I think consistency is good. My concern is that pub trait FooService: Service<Runner: FooRunner> {}
pub trait FooRunner {
fn foo(&self);
}(removed async and storage for clarity) This is what I came up with, and it's got some rough edges. The canoncial impl resembles this: pub struct FooRunnerLoop<R>(R);
impl<R: FooRunner> FooRunnerLoop<R> {
pub fn new(runner: R) -> Self {
FooRunnerLoop(runner)
}
}
impl<R: FooRunner> Runner for FooRunnerLoop<R> {
fn run(self) -> ! {
loop {
self.0.foo();
std::thread::sleep(std::time::Duration::from_secs(1));
}
}
}
impl<R: FooRunner> FooRunner for FooRunnerLoop<R> {
fn foo(&self) {
self.0.foo();
}
}
pub struct Canonical;
pub struct CanonicalRunner;
impl Service for Canonical {
type InitParams = ();
type Runner = FooRunnerLoop<CanonicalRunner>;
fn new(_params: ()) -> (Self, Self::Runner) {
(Canonical, FooRunnerLoop::new(CanonicalRunner))
}
}
impl FooService for Canonical {}
impl FooRunner for CanonicalRunner {
fn foo(&self) {
eprintln!("foo: canonical");
}
}and extension looks like this: pub struct Customized<S>(S);
pub struct CustomizedRunner<R>(R, &'static str);
impl<S: FooService> Service for Customized<S> {
type InitParams = (S::InitParams, &'static str);
type Runner = FooRunnerLoop<CustomizedRunner<S::Runner>>;
fn new(params: (S::InitParams, &'static str)) -> (Self, Self::Runner) {
let (service, runner) = S::new(params.0);
(Customized(service), FooRunnerLoop::new(CustomizedRunner(runner, params.1)))
}
}
impl<S: FooService> FooService for Customized<S> {}
impl<R: FooRunner> FooRunner for CustomizedRunner<R> {
fn foo(&self) {
eprintln!("before foo: {}", self.1);
self.0.foo();
}
}All to insert some logic that runs before the default |
I think the idea is that ideally, an OEM wouldn't have to reach for implementing their own We have an example of this in the form of the eSPI service in this PR. The OEM can define the 'addresses' and types used to refer to and handle serialization/deserialization for each service, including any custom OEM-only services that they want to use - this is the 'RelayHandler' trait - and they pass a type that implements that trait in at service declaration time and an instance of that type at service construction time. The eSPI service calls into that instance to handle that service. A simplified non-Service-trait mostly-non-async version looks something like this - pub trait OemPlugin {
fn foo(&self) -> ()
}
struct ServiceInner<P: OemPlugin> {
oem_plugin: P,
// TODO - additional state as necessary
}
impl<P: OemPlugin> ServiceInner<P> {
fn new(oem_plugin: P) -> Self {
Self { oem_plugin }
}
fn run_first_worker_operation_returns_never(&self) -> ! {
loop {
self.oem_plugin.foo();
self.do_the_actual_thing();
}
}
// TODO implement service logic
}
pub struct Resources<P: OemPlugin> {
inner: Option<ServiceInner<P>>
}
impl<P: OemPlugin> Resources<P> {
pub fn new() -> Self {
Self { inner: None }
}
}
pub struct Runner<'hw, P: OemPlugin> {
inner: &'hw Resources<P>
}
impl<'hw, P: OemPlugin> {
pub async fn run(self) -> ! {
select2(
inner.run_first_worker_operation_returns_never(),
inner.run_second_worker_operation_returns_never());
}
}
pub struct Service<'hw, P: OemPlugin> {
inner: &'hw Resources<P>
}
impl<'hw, P: OemPlugin> Service<'hw, P> {
fn new(resources: &'hw mut Resources, oem_plugin: P) {
let inner = resources.inner.insert(ServiceInner::new(oem_plugin));
(Self {inner}, Runner {inner} )
}
// TODO impl the public-facing API for the service
}Extension is just // in OEM code
struct MyOemPlugin {}
impl OemPlugin for MyOemPlugin {
fn foo(&self) {
eprintln!("before foo: {}")
}
}and then instantiating the service with The
If you want to be able to make direct calls against the service once it's started, you can additionally define a trait for those and impl those against Service - let's call this If an OEM has some customization that can't be achieved via the extension points in In this case, I have a strong preference for (A), but (B) provides an escape valve for anything really crazy / tight scheduling. |
e111f4a to
c79b241
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 18 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 18 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ner-service pattern
29fd571 to
5d3d0c9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 18 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This is a proof-of-concept attempt to create a common pattern for initializing and starting the services in embedded-services. It demonstrates this pattern on the time-alarm and espi service.
Currently, starting the various services in embedded-services is pretty complicated. Each service has a different set of 'worker tasks' that users have to declare and start, each of which has varying numbers of parameters of varying types. It's very easy to miss one, and whenever a change comes in that changes what state a given task needs to function, that's a breaking change that, in the best case, causes a build failure, and in the worst case causes the service to silently misbehave.
This attempts to solve this disparity with a few tricks:
select!()over a list of worker tasks. This works because worker 'green threads' were already intended to never return, so the select never returns either. This lets us execute them all against a single embassy task.Service,ServiceRunner) tuple. TheServiceRunnerimplementation which has a single method that takes no arguments and never returns. This has a few benefits:ServiceRunnerobject back when the instantiate the service, and if they don't use it the compiler will complain about unused variables. They can bypass the warning with a leading_, but that's an explicit action the programmer has to take. I haven't found a way to outright prevent this class of error, unfortunately.&'hw mut Resourcesas an argument at service creation time. This Resources object is the 'real' service, and by taking it as a mutable borrow rather than a reference-to-oncelock or similar, we avoid prescribing any particular way of storing the memory required to run the service.spawn_service!()macro that does all the boilerplate of declaring storage/tasks and spawning the tasks on Embassy for a given service. In product use cases, this makes it outright impossible to forget to start the runner, and reduces the ~10 lines of somewhat verbose boilerplate required for each service down to zero.This approach aligns with a what a few other embassy services are doing - see embassy_net::new for a good example.
Some notes on testing -
The
spawn_service!()macro isn't useful in test scenarios where you need to run on tokio, but in those situations you can manually callnew()andrun()with stack variables thanks to (3). In order to have the long-running task future terminate at the end of the test, you can simplyselect()overrunner.run()and some test code; when the test code returns, the worker 'green thread' stops and the test exits. This technique is demonstrated in tad_test.rs.