Implement partial success logging in OTLP exporters#4805
Implement partial success logging in OTLP exporters#4805michaelsafyan wants to merge 12 commits intoopen-telemetry:mainfrom
Conversation
…levels to avoid changing default behaviors. Apply test and formatting fixes.
This appears to cause an issue with "tox -e docs".
DylanRussell
left a comment
There was a problem hiding this comment.
Why put this log behind an env var ? Why not just always log when the field is set -- I'm assuming it's only set when there was a partial success -- it doesn't sound too noisy to me.
I'm uncertain as to the level of noisiness, and so I wanted to minimize disruption to existing users. I could imagine it being somewhat annoying if every batch included one persistently malformed span that just ended up creating log spam. A more sophisticated approach might be to opt-in to recording every partial success, while elevating the verbosity and logging the partial success unconditionally if every record was a failure (if |
|
We already have the I think it's a bit weird to put just 1 log message behind this env var |
|
Thanks for the review, @DylanRussell . I updated the logic to no longer make the logging conditioned on |
| ) -> ExportServiceRequestT: | ||
| pass | ||
|
|
||
| def _log_partial_success(self, partial_success): |
There was a problem hiding this comment.
nit: can we just inline this code instead of having a function (same thing below)
There was a problem hiding this comment.
We add a DuplicateFilter to the logger on line 116 (as an aside I don't like the github code review tool, which doesn't let me add a comment to a line that wasn't modified...)
I'm thinking we should update that to reference record.lineno instead of record.msg which could be unique.... and that is sufficient to suppress noisy logging, and also avoid the endless logging issues...
If you want to take a stab at that change here that'd be great, if not I think we can proceed without that and I can make it separately..
I don't think we should special case the logging exporter because we have this duplicate filter thing.
There was a problem hiding this comment.
Thanks, @DylanRussell ! Updated accordingly. Please take another look.
Also, any idea what's going on with the failed tests? They look to be unrelated to this change:

There was a problem hiding this comment.
That's some sort of flake we get all the time (can't find a commit hash), IDK why we get that. It's OK to ignore
| ) -> ExportServiceRequestT: | ||
| pass | ||
|
|
||
| def _process_response(self, response): |
There was a problem hiding this comment.
nit: can we inline this code instead ? IMO it's more readable
| @@ -46,17 +46,36 @@ class DuplicateFilter(logging.Filter): | |||
| recursion depth exceeded issue in cases where logging itself results in an exception.""" | |||
|
|
|||
| def filter(self, record): | |||
There was a problem hiding this comment.
IMO I would simplify all this and just have the logic as:
current_log_by_line = (
record.module,
record.pathname,
record.lineno,
time_boundary,
)
previous_log_by_line = getattr(self, "last_source", None)
keep_processing = current_log_by_line != previous_log_by_line
self.last_source = current_log_by_line
return keep_processing
Although I'm realizing now that we should store the previous 10 previous_log_by_line in like a deque or OrderedDict instead.. Right now if duplicate log messages are interleaved we will not filter anything out
Also we have tests for this thing here:
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
|
|
| @@ -11,6 +11,7 @@ | |||
| # See the License for the specific language governing permissions and | |||
There was a problem hiding this comment.
You could also update the HTTP exporters the same way I think ? https://github.com/open-telemetry/opentelemetry-python/tree/main/exporter/opentelemetry-exporter-otlp-proto-http/src/opentelemetry/exporter/otlp/proto/http
Log partial success
Partial successes during OTLP export is now logged.
Fixes #4803
Type of change
How Has This Been Tested?
Added new unit tests covering the relevant cases.
Does This PR Require a Contrib Repo Change?
Checklist: