Conversation
| // Note that when a signature of a hash of a larger message is needed, | ||
| // the caller is responsible for hashing the larger message and passing | ||
| // the hash (as digest) and the hash function (as opts) to Sign. | ||
| Sign(rand io.Reader, digest []byte, opts SignerOpts) (signature []byte, err error) |
There was a problem hiding this comment.
there's an obscure usecase for supporting MessageSigner here for restricted TPM keys
There was a problem hiding this comment.
Thanks for the heads up - I took a look at the change and I think someone implementing crypto.MessageSigner, and I think it should "just work" since crypto/tls supports a crypto.MessageSigner as a PrivateKey in the tls.Certificate.
So I think gRPC will pretty blindly pass this through while configuring tls.Config. If the user implements a crypto.MessageSigner with a SignMessage it'll use that
There was a problem hiding this comment.
yeah, if anything implements messagesigner, it'd get automatically pickedup and used.
as mentioned internally, that the advancedtls pakage which AFAIK, is only in go, you can offload TLS now
| The most complex piece of this is the implementation \- C-Core/Cython/Python must handle the calling of the user provided Python signing function from C which must invoke a callback that is passed to it. This will involve creating bridge types between the Python user sign function and the expected `absl::AnyInvocable` as well as bridging the callback that is passed to the user sign function while managing the GIL and asynchronous nature of the signing. This is technically feasible with Cython. A proof-of-concept of this structure [is written here.](https://source.corp.google.com/piper///depot/google3/experimental/users/gregorycooke/python_cpp_wrapping/) | ||
|
|
||
| ```py | ||
| # Example Usage |
There was a problem hiding this comment.
I'd like to restart the conversation we had elsewhere.
Would it make sense to abstract this call away? Make the signer just return the bytes, and the callback will be handled by our custom function? With this approach, we'll avoid the problem where users forgot to call the callback (f.e. by not wrapping their method into try ... finally).
The main point is to abstract away internal implementation details as much as possible.
# Internal wrapper
def _grpc_invoke_custom_private_key_sign(
private_key_sign_fn: CustomPrivateKeySign,
unsigned_data: bytes,
algorithm: SignatureAlgorithm,
on_done: PrivateKeySignDoneCallback,
) -> None:
try:
signed_bytes = private_key_sign_fn(unsigned_data, algorithm)
if signed_bytes:
on_done(signed_bytes, True)
else:
on_done(None, False)
except Exception as e:
logging.warning("Exception during custom private key sign: %s", e)
on_done(None, False)
# User's implementation example
def example_signer(
unsigned_data: bytes, algorithm: SignatureAlgorithm
) -> Optional[bytes]:
# Manually sign the bytes.
signed_bytes = ...
return signed_bytesThere was a problem hiding this comment.
Spoke with @matthewstevenson88 - he explained some of the async patterns our users might apply when CustomPrivateKeySign. One of them is creating a thread that will do the computation, passing on_done callback to the thread, while immediately returning from the CustomPrivateKeySign. This is a good argument for not introducing an internal wrapper.
I need think about this some more, and consider a few nuances:
- Whether a thread will actually be useful for offloading CPU-bound tasks - because if GIL is stuck on this thread waiting for the signing to complete, there won't be any parallelization/concurrency (unless it calls into a Python C extension that disables GIL)
- Whether we still can use the wrapper approach, but with a different interface, like using the Future interface.
Futureshould make it easier for the user to work with variousPoolExecutors. - Whether we should also allow an
asyncmethod signature (makeprivate_key_sign_fnaUnion[CustomPrivateKeySign, CustomPrivateKeySignAsync]) - which may be required if it's common for the HW offloading libraries the do the signing to use async/await pattern (asyncio python).
There was a problem hiding this comment.
For the reference,
GIL and performance considerations
Unlike the multiprocessing module, which uses separate processes to bypass the global interpreter lock (GIL), the threading module operates within a single process, meaning that all threads share the same memory space. However, the GIL limits the performance gains of threading when it comes to CPU-bound tasks, as only one thread can execute Python bytecode at a time. Despite this, threads remain a useful tool for achieving concurrency in many scenarios.
— https://docs.python.org/3/library/threading.html#gil-and-performance-considerations
There was a problem hiding this comment.
Have you had the chance to consider these options? We need to move relatively quickly on this, and that prototype that I made earlier in the year seemed viable? Can we keep this and disable the GIL?
Can we allow for Future or asyncio with the cython integration?
I'll lean on you for these decisions since you are the Python expert
There was a problem hiding this comment.
I've been toying around with how we wrap this more
The C++ layers knows it is wrapping Python. What if we just make the C++ wrapper layer responsible for the async.
i.e.
... implementing the interface ...
class PrivateKeySignerPyWrapper : PrivateKeySigner {
... details, assume self_->sign() is calling a cython method that's holding a python callable and is going to call into Python, here's where we _actually_ call it ...
thread::Detach({}, [s = std::string(input), self = shared_from_this(),
context] {
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
// This is calling into Python
self->sign_(s, CompletionCallbackForPy, context, self->user_data_);
PyGILState_Release(gstate);
});
}
We could go even further in this case, the python sign function doesn't need to know about a callback, calling the callback, or handling async stuff - it just simply returns signed data sync.
thread::Detach({}, [s = std::string(input), self = shared_from_this(),
context] {
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
// This is calling into Python
std::string signed_data = self->sign_(s, context, self->user_data_);
PyGILState_Release(gstate);
// C++ wrapping layer is responsible for the async and calling the callback
CompletionCallback(signed_data);
});
| // To use, set the provider on the TlsCredentialsOptions | ||
| ``` | ||
|
|
||
| ### Python |
There was a problem hiding this comment.
We should mention that gevent is not supported, both in the gRFC and the ssl_channel_credentials_with_custom_signer docstring.
A107-tls-private-key-offloading.md
Outdated
|
|
||
| ``` | ||
|
|
||
| We won't significantly refactor the Python API surface \- instead we will allow the `private_key` input to be a signing function. |
There was a problem hiding this comment.
I don't think this in relevant in the last iteration - we provided a new method instead of modifying the existing one to support private_key
A107-tls-private-key-offloading.md
Outdated
|
|
||
| # Now the user is in their application configuring gRPC | ||
| # Create creds with the custom signer | ||
| creds = ssl_channel_credentials_with_custom_signer(<some_root>, example_signer, <some_chain>) |
There was a problem hiding this comment.
note - we require arguments to be passed as keyword arguments now.
| creds = ssl_channel_credentials_with_custom_signer(<some_root>, example_signer, <some_chain>) | |
| creds = ssl_channel_credentials_with_custom_signer( | |
| private_key_sign_fn=example_signer, | |
| root_certificates=b"<some_root>", | |
| certificate_chain=b"<some_chain>", | |
| ) |
markdroth
left a comment
There was a problem hiding this comment.
Thanks for writing this up!
Please let me know if you have any questions. Thanks!
| to configure this feature will rely on a new certificate provider | ||
| implementation. | ||
|
|
||
| We will create an `InMemoryCertificateProvider` that takes a set of root certs |
There was a problem hiding this comment.
There are really 3 distinct API changes here, and I think it would be helpful to discuss each of them separately:
- Adding
InMemoryCertificateProvider, and deprecatingStaticDataCertificateProvider. - Allowing a separate cert provider for root and identity certs.
- Adding the new
PrivateKeySignerAPI.
I think we should first list the changes, then talk about each one in detail in its own subsection. For each one, we should describe why we're making the change and then document the API we're introducing.
There was a problem hiding this comment.
Before the subsections describing each of the three changes in detail, please add a bulletted list of the three changes, with just a few words about each to say why we're doing it. That way, the reader has a bit of context about why we're making these changes before we dive into the details.
A107-tls-private-key-offloading.md
Outdated
|
|
||
| | Metric | Type | Unit | Labels | Description | | ||
| | :---- | :---- | :---- | :---- | :---- | | ||
| | `grpc.security.handshaker.offloaded_private_key_sign_duration` | Histogram | float64/double s | `grpc.target, grpc.security.handshaker.offloaded_private_key_sign.algorithm, grpc.security.handshaker.offloaded_private_key_sign.status` | How long the offloaded private key signing took | |
There was a problem hiding this comment.
I don't think we actually implemented this metric in C-core. Do we still need to do that?
Implementing this in C-core raises a layering question. The TLS TSI handshaker is the thing that has this data to export, but the non-per-call metrics APIs are gRPC-specific APIs, which means that the TSI implementation would need to use gRPC-specific APIs. Won't that make it harder for this TSI implementation to be used outside of gRPC? Are we okay with that?
There was a problem hiding this comment.
We have no security metrics yet (that's happening ~now, we are working on the concept review) - this will be included in that work.
The layering question is one of the prime questions for our concept and design.
There was a problem hiding this comment.
If we haven't implemented this as part of this feature, how about we leave the metric out of this gRFC and include it in the forthcoming gRFC for all of the security-related metrics?
markdroth
left a comment
There was a problem hiding this comment.
This is moving in the right direction!
Please let me know if you have any questions. Thanks!
| * [A66] | ||
| * [A79] |
There was a problem hiding this comment.
Please change these lines to:
* [A66: OpenTelemetry Metrics][A66]
* [A79: Non-per-call Metrics Architecture][A79]
| The updates will return an `absl::Status` to the caller, and further a | ||
| `ValidateCredentials` API is provided. | ||
|
|
||
| ```c |
There was a problem hiding this comment.
This should say c++ instead of just c.
| the user. This provides a strict super-set of functionality of the current | ||
| `StaticDataCertificateProvider` - calling `UpdateRoot` and `UpdateIdentity` on | ||
| an `InMemoryCertificateProvider` at setup is the same as a | ||
| `StaticDataCertificateProvider`. |
There was a problem hiding this comment.
This should explicitly state that we are going to deprecate StaticDataCertificateProvider.
| void set_identity_certificate_provider(grpc_core::RefCountedPtr<grpc_tls_certificate_provider> certificate_provider); | ||
| #### Splitting Certificate Providers into Identity and Root Providers | ||
| We will also decouple the root provider and the identity provider in [the | ||
| tls\_credentials\_options](http://google3/third_party/grpc/src/core/credentials/transport/tls/grpc_tls_credentials_options.h;l=77;rcl=731487339) |
There was a problem hiding this comment.
Links to codesearch don't work in OSS. Please use github links instead.
| ```c++ | ||
| class TlsCredentialsOptions { | ||
| public: | ||
| [[deprecated( |
There was a problem hiding this comment.
Please put a comment here saying that this is the existing method, which is now marked as deprecated.
Also, please add a comment before the last two methods indicating that those are the new methods being added in this design.
| | Name | Disposition | Description | | ||
| | Label Name | Disposition | Description | | ||
| | :---- | :---- | :---- | | ||
| | `grpc.target` | required | Indicates the target of the gRPC channel for which this handshake is occurring | |
There was a problem hiding this comment.
The description for this label should note that it's the existing label defined in A66.
| debugging failures. | ||
|
|
||
| | Name | Disposition | Description | | ||
| | Label Name | Disposition | Description | |
There was a problem hiding this comment.
Suggest adding a sentence in its own paragraph before this table:
"""
The new metric will use the following labels:
"""
| length. For example, `RsaPkcs1Sha256`. For more, see the `SignatureAlgorithm` | ||
| enum in the C section above. | ||
|
|
||
| | Metric | Type | Unit | Labels | Description | |
There was a problem hiding this comment.
Suggest adding a sentence before this table:
"""
Here is the new metric we will add:
"""
| | `grpc.status` | optional | The status code return for the private key sign function. From [A66] | | ||
|
|
||
| The `grpc.tls.private_key_sign_algorithm` label will contain both a name and key | ||
| length. For example, `RsaPkcs1Sha256`. For more, see the `SignatureAlgorithm` |
There was a problem hiding this comment.
We will want the values for this label to be the same in all languages. Is that going to be a problem if we define the values via a C++ enum?
A107-tls-private-key-offloading.md
Outdated
|
|
||
| | Metric | Type | Unit | Labels | Description | | ||
| | :---- | :---- | :---- | :---- | :---- | | ||
| | `grpc.security.handshaker.offloaded_private_key_sign_duration` | Histogram | float64/double s | `grpc.target, grpc.security.handshaker.offloaded_private_key_sign.algorithm, grpc.security.handshaker.offloaded_private_key_sign.status` | How long the offloaded private key signing took | |
There was a problem hiding this comment.
If we haven't implemented this as part of this feature, how about we leave the metric out of this gRFC and include it in the forthcoming gRFC for all of the security-related metrics?
gRFC for TLS Private Key Offloading