impl(o11y): introduce error attributes#12189
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refines the tracing telemetry by introducing dedicated attributes for error types, exception types, and status messages. These additions provide a more detailed and standardized way to categorize and understand failures occurring within client-side operations, improving observability and debugging capabilities. The changes ensure that critical error information is consistently captured in OpenTelemetry spans, offering clearer insights into the root causes of issues. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new ErrorTypeUtil class to provide a standardized way of classifying exceptions into specific error types (e.g., client timeout, connection error, authentication error) for OpenTelemetry tracing. It adds new observability attributes (error.type, exception.type, status.message) and integrates this error classification into the SpanTracer to enrich span data upon failed attempts. The changes also include comprehensive unit and integration tests to validate the new error type extraction and tracing functionality. Feedback from the review includes correcting incorrect copyright years, removing redundant semicolons and toString() overrides, and updating Javadoc for accuracy.
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ErrorTypeUtil.java
Show resolved
Hide resolved
sdk-platform-java/gax-java/gax/src/test/java/com/google/api/gax/tracing/ErrorTypeUtilTest.java
Show resolved
Hide resolved
...va-showcase/gapic-showcase/src/test/java/com/google/showcase/v1beta1/it/ITOtelErrorType.java
Outdated
Show resolved
Hide resolved
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ApiTracer.java
Show resolved
Hide resolved
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ErrorTypeUtil.java
Outdated
Show resolved
Hide resolved
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ErrorTypeUtil.java
Outdated
Show resolved
Hide resolved
This reverts commit 0ce71dd.
…-attr-error-type-transfer
| import javax.annotation.Nullable; | ||
| import javax.net.ssl.SSLHandshakeException; | ||
|
|
||
| public class ErrorTypeUtil { |
There was a problem hiding this comment.
Ideally other libraries can rely on this logic
There was a problem hiding this comment.
Yes, exactly.
GraalVM failures seem unrelated |
| * The specific error type. Value will be google.rpc.ErrorInfo.reason, a specific Server Error | ||
| * Code, Client-Side Network/Operational Error (e.g., CLIENT_TIMEOUT) or internal fallback. | ||
| */ | ||
| public static final String ERROR_TYPE_ATTRIBUTE = "error.type"; |
There was a problem hiding this comment.
FYI, Wes is adding the same attribute in #12202. Depending on which PR gets merged in first, you may need to rebase this PR based on his changes.
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ErrorTypeUtil.java
Outdated
Show resolved
Hide resolved
| private static boolean hasErrorClassInCauseChain( | ||
| Throwable t, Set<Class<? extends Throwable>> errorClasses) { | ||
| Throwable current = t; | ||
| while (current != null) { |
There was a problem hiding this comment.
While this is nice to check all possible causes, I'm not sure it is the correct way. Do you have a concrete example that a cause is nested? It also introduces more questions like what if there is a SocketTimeoutException caused by a ConnectException?
Is this what other languages are doing? If not, I'm leaning towards only checking the top level.
There was a problem hiding this comment.
To double check wrapper exceptions and specify them directly in here. The point of ambiguity is valid.
There was a problem hiding this comment.
I removed the recursive check for now and I'm confirming the need of it internally.
sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/ErrorTypeUtil.java
Show resolved
Hide resolved
...va-showcase/gapic-showcase/src/test/java/com/google/showcase/v1beta1/it/ITOtelErrorType.java
Outdated
Show resolved
Hide resolved
|
|
||
| private String extractErrorMessage(Throwable error) { | ||
| Throwable cause = error; | ||
| while (cause != null) { |
There was a problem hiding this comment.
I don't think we need to recursively get the error message as well.
There was a problem hiding this comment.
Thanks for the catch. Done.
| import javax.annotation.Nullable; | ||
| import javax.net.ssl.SSLHandshakeException; | ||
|
|
||
| public class ErrorTypeUtil { |
|
|
||
| public class ErrorTypeUtil { | ||
|
|
||
| public enum ErrorType { |
There was a problem hiding this comment.
This enum can be package private. I don't think other libraries would use this enum directly?
diegomarquezp
left a comment
There was a problem hiding this comment.
Comments addressed
| import javax.annotation.Nullable; | ||
| import javax.net.ssl.SSLHandshakeException; | ||
|
|
||
| public class ErrorTypeUtil { |
There was a problem hiding this comment.
Yes, exactly.
|
|
||
| public class ErrorTypeUtil { | ||
|
|
||
| public enum ErrorType { |
|
|
||
| private String extractErrorMessage(Throwable error) { | ||
| Throwable cause = error; | ||
| while (cause != null) { |
There was a problem hiding this comment.
Thanks for the catch. Done.
Ports the error.type and status.message telemetry features from sdk-platform-java.