refactor(jdbc): connection properties and data source properties#4102
refactor(jdbc): connection properties and data source properties#4102keshavdandeva wants to merge 2 commits intomainfrom
Conversation
Summary of ChangesHello @keshavdandeva, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request undertakes a substantial refactoring of how connection properties are handled within the BigQuery JDBC driver. The primary goal is to consolidate and simplify the parsing and management of these properties by introducing a DataSource object as the central repository for all connection-related configurations. This change moves away from a fragmented parsing approach to a more cohesive and maintainable design, where the BigQueryConnection directly consumes a pre-populated DataSource instance. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the handling of JDBC connection properties by centralizing them into a DataSource object. While this aims to improve code structure, it has introduced critical security regressions, including the failure to apply proxy and SSL configurations to the BigQuery connection and a risk of CRLF injection due to unsanitized properties in HTTP headers. Additionally, the implementation contains a potential NullPointerException, incorrect OAuthType logic, broken service account impersonation, and incorrect serialization of map-like properties like labels. The removal of a significant number of tests also leaves the new property parsing logic largely untested.
| if (ds.getOAuthType() != 0) { | ||
| this.authProperties.put( | ||
| "OAuthType", BigQueryJdbcOAuthUtility.AuthType.values()[ds.getOAuthType()].name()); | ||
| } |
There was a problem hiding this comment.
This block has a potential NullPointerException and incorrect logic for handling OAuthType.
ds.getOAuthType()can throw an NPE ifoAuthTypeis not set in the connection string. This is becauseDataSource.getOAuthType()returnsintbut the underlying field is a nullableInteger. To fix this, you should change the return type ofDataSource.getOAuthType()toInteger, which would be consistent with other getters changed in this PR (e.g.,getEnableHighThroughputAPI).- The condition
if (ds.getOAuthType() != 0)incorrectly skipsOAuthType=0(GOOGLE_SERVICE_ACCOUNT), which will causegetCredentials()to fail later as it requires theOAuthTypeto be present inauthProperties. - Based on the previous implementation,
OAuthTypeis a required property. The logic should enforce this.
After changing DataSource.getOAuthType() to return Integer, I suggest replacing this block with the following logic:
Integer oAuthType = ds.getOAuthType();
if (oAuthType == null) {
throw new BigQueryJdbcRuntimeException("Connection property 'OAuthType' is required.");
}
this.authProperties.put(
"OAuthType", BigQueryJdbcOAuthUtility.AuthType.values()[oAuthType].name());| boolean found = | ||
| BYOID_PROPERTIES.stream().anyMatch(key::equalsIgnoreCase) | ||
| || OVERRIDE_PROPERTIES.stream().anyMatch(key::equalsIgnoreCase); | ||
| if (!found | ||
| && !key.equalsIgnoreCase("Location") | ||
| && !key.equalsIgnoreCase("OAuthType") | ||
| && !key.equalsIgnoreCase("ProjectId") | ||
| && !key.equalsIgnoreCase("ServiceAccountImpersonationEmail") | ||
| && !key.equalsIgnoreCase("ServiceAccountImpersonationScopes") | ||
| && !key.equalsIgnoreCase("EndpointOverrides") | ||
| && !key.equalsIgnoreCase("ServiceAccountImpersonationTokenLifetime") | ||
| && !key.equalsIgnoreCase("universeDomain")) { | ||
| throw new IllegalArgumentException("Unknown connection property: " + key); | ||
| } |
There was a problem hiding this comment.
This logic for handling unknown properties has two critical issues:
-
Broken Service Account Impersonation: Properties required for service account impersonation (e.g.,
ServiceAccountImpersonationEmail,ServiceAccountImpersonationScopes) are whitelisted here but are not actually processed. They are not added to theDataSourceobject and are not retrieved inBigQueryConnectionto be passed to the credential creation logic. This effectively breaks the service account impersonation feature. -
Confusing Whitelist: The whitelist contains properties like
Location,OAuthType, andProjectIdwhich are already handled by thePROPERTY_SETTERSmap. This is confusing and makes the code harder to maintain.
To fix the impersonation issue, you should add the impersonation properties to the DataSource class, create setters for them in the PROPERTY_SETTERS map, and then retrieve them in BigQueryConnection to populate the authProperties map correctly.
| Map<String, String> parsedProperties = BigQueryJdbcUrlUtility.parseUrlProperties(url); | ||
| DataSource ds = new DataSource(); | ||
| BigQueryJdbcUrlUtility.setDataSourceProperties(ds, parsedProperties); |
There was a problem hiding this comment.
The refactoring of the BigQueryConnection constructor has introduced a significant security regression. The new implementation fails to initialize critical security-related fields such as httpTransportOptions and transportChannelProvider. These fields are essential for configuring proxy settings (ProxyHost, ProxyPort, etc.) and SSL truststore settings (SSLTrustStore, SSLTrustStorePwd) for the BigQuery client.
Additionally, the new constructor fails to populate authProperties with BYOID_PROPERTIES and overrideProperties with EndpointOverrides.
As a result, any security or authentication configurations provided in the JDBC URL are silently ignored. The driver will connect directly to BigQuery using default JVM settings, bypassing mandatory proxies and custom truststores, and failing to use specified external account credentials or custom endpoints. This can lead to unauthorized data exposure or man-in-the-middle attacks if the application relies on these settings for security.
| } | ||
| if (this.labels != null) { | ||
| connectionProperties.setProperty( | ||
| BigQueryJdbcUrlUtility.LABELS_PROPERTY_NAME, this.labels.toString()); |
There was a problem hiding this comment.
Using this.labels.toString() is incorrect for serializing the labels map. It will produce a string like {key1=value1, key2=value2}, but the property parser expects a format like key1=value1,key2=value2. This will break the functionality of passing labels via the DataSource.
You should manually format the map into a comma-separated string of key-value pairs. For example:
String labelsString = this.labels.entrySet().stream()
.map(e -> e.getKey() + "=" + e.getValue())
.collect(java.util.stream.Collectors.joining(","));
connectionProperties.setProperty(
BigQueryJdbcUrlUtility.LABELS_PROPERTY_NAME, labelsString);Note that you will need to add an import for java.util.stream.Collectors.
| BigQueryJdbcUrlUtility.DEFAULT_METADATA_FETCH_THREAD_COUNT_VALUE; | ||
| } | ||
|
|
||
| this.requestReason = ds.getRequestReason(); |
There was a problem hiding this comment.
The RequestReason property from the JDBC URL is used to populate the x-goog-request-reason HTTP header without proper sanitization. Since the value is URL-decoded in BigQueryJdbcUrlUtility.parseUrlProperties, an attacker can inject CRLF sequences (%0D%0A) into the JDBC URL to perform HTTP header injection.
Similarly, the PartnerToken property (parsed in BigQueryJdbcUrlUtility.parsePartnerTokenProperty) is included in the user-agent header and can contain literal newlines if they are present in the JDBC URL.
Attacker-controlled headers can be used to bypass security controls or manipulate API behavior.
| if (eqIdx != -1) { | ||
| String key = part.substring(0, eqIdx).trim(); | ||
| String val = part.substring(eqIdx + 1).trim(); | ||
| properties.put(key, CharEscapers.decodeUriPath(val)); |
There was a problem hiding this comment.
The parseUrlProperties method decodes property values using CharEscapers.decodeUriPath. When these decoded values are subsequently used in HTTP headers (such as RequestReason being used for the x-goog-request-reason header), it enables CRLF injection if the values are not sanitized. Any property intended for use in an HTTP header must be validated to ensure it does not contain control characters like \r or \n.
| if (ds.getHttpConnectTimeout() != null) { | ||
| this.httpConnectTimeout = ds.getHttpConnectTimeout(); | ||
| } else { | ||
| this.httpConnectTimeout = null; | ||
| } |
| for (Map.Entry<String, String> entry : props.entrySet()) { | ||
| if (entry.getKey().equalsIgnoreCase(property)) { | ||
| return entry.getValue(); | ||
| } | ||
| } |
There was a problem hiding this comment.
The parseUrlProperties method returns a TreeMap that uses String.CASE_INSENSITIVE_ORDER. Therefore, you can directly use props.get(property) instead of iterating through the map to find the key in a case-insensitive manner. This will make the code more concise and efficient.
return props.get(property);
b/429272203