[SOLR-18130][WIP] Unified connection string CloudSolrClient.Builder implementation#4260
Conversation
dsmiley
left a comment
There was a problem hiding this comment.
Great start! Heck; this is even contributable as-is... say maybe with some additonal usages of this nifty constructor.
@epugh in the CLI (or anywhere you know of) can this be used? From my experience, this is useful for clients creating CloudSolrClient in a generic way with respect to where the state comes from.
| * <p>The parser automatically detects the mode based on the presence of a scheme: | ||
| * | ||
| * <ul> | ||
| * <li>If any part starts with {@code http://} or {@code https://}, the entire string is | ||
| * treated as a list of HTTP(S) URLs. | ||
| * <li>Otherwise, it treated as a ZooKeeper connection string (with optional chroot) | ||
| * </ul> | ||
| * | ||
| * <p><b>Important:</b> Mixed schemes (e.g. zk hosts + HTTP URLs) are not allowed and will | ||
| * result in an error when building the client. | ||
| * | ||
| * <p>Usage examples: | ||
| * | ||
| * <pre>{@code | ||
| * // ZooKeeper with chroot | ||
| * new CloudSolrClient.Builder("zk1:2181,zk2:2181,zk3:2181/solr"); | ||
| * | ||
| * // ZooKeeper without chroot | ||
| * new CloudSolrClient.Builder("zk1:2181,zk2:2181,zk3:2181"); | ||
| * | ||
| * // Direct HTTPS connections | ||
| * new CloudSolrClient.Builder("https://solr1:8983/solr,https://solr2:8983/solr"); | ||
| * }</pre> | ||
| * |
There was a problem hiding this comment.
I would drop all this honestly
| * @param connectionString a string specifying either ZooKeeper connection string or HTTP(S) | ||
| * Solr URLs | ||
| * @throws IllegalArgumentException if string is null, empty, or malformed | ||
| * @see CloudClientConnectionString#parse(String) for parsing logic |
There was a problem hiding this comment.
no; that class seems should be a private detail, not a public API
| public Builder(String connectionString) { | ||
| CloudClientConnectionString connStr = CloudClientConnectionString.parse(connectionString); | ||
| if (connStr.isZk()) { | ||
| this.zkHosts = List.copyOf(connStr.getQuorumItems()); |
There was a problem hiding this comment.
no point in copying... the result of parsing isn't shared / at-risk of shared use
| import java.util.Objects; | ||
|
|
||
| /** Universal connection string parser logic. */ | ||
| public final class CloudClientConnectionString { |
There was a problem hiding this comment.
Are you familiar with Java records?
There was a problem hiding this comment.
Indeed, the record is more suitable
| throw new IllegalArgumentException("Connection string must not be null or empty"); | ||
| } | ||
| connectionString = connectionString.trim(); | ||
| List<String> parts = |
There was a problem hiding this comment.
IMO the very first thing after trimming to do is see if "://" is found and bifercate 2 different parse methods based on that. Very simple. The ZK side would then, first thing, extract off the trailing chroot, if present, before doing comma splitting.
Just trying to suggest something simple.
There was a problem hiding this comment.
Yes, it really is easier this way
| Arrays.stream(connectionString.split(",")) | ||
| .map(String::trim) | ||
| .filter(s -> !s.isEmpty()) | ||
| .toList(); |
There was a problem hiding this comment.
instead use org.apache.solr.common.util.StrUtils#split
There was a problem hiding this comment.
I did it, but some tests failed because StrUtils#split doesn't trim spaces - I added trim() via Java stream api
| if (parts.isEmpty()) { | ||
| throw new IllegalArgumentException( | ||
| "No valid hosts/urls found in connection string: " + connectionString); | ||
| } |
There was a problem hiding this comment.
can instead be in the constructor of the connection string
solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudClientConnectionString.java
Show resolved
Hide resolved
| CloudClientConnectionString cloudClientConnectionString = | ||
| CloudClientConnectionString.parse(connectionString); | ||
| if (cloudClientConnectionString.isZk()) { | ||
| String zkHostNoChroot = connectionString.split("/")[0]; |
There was a problem hiding this comment.
the chroot should be a field of cloudClientConnectionString
There was a problem hiding this comment.
This is no chroot - is zkHosts joined via comma.
I can replace it by
String zkHostNoChroot = connectionString.substring(0,connectionString.length()-cloudClientConnectionString.zkChroot().length());
Or
String zkHostNoChroot = connectionString.substring(0, connectionString.length() - cloudClientConnectionString.zkChroot().length());
But IMO, currently implementation seems more pretty
There was a problem hiding this comment.
cloudClientConnectionString should contain the chroot if there is one. And you just parsed it.
FWIW the name cloudClientConnectionString is terrible... it is in fact not a string but a record
| } | ||
|
|
||
| /** | ||
| * Provide a universal connection string that can represent either: |
There was a problem hiding this comment.
This method we're documenting does not "provide a connection string". Javadocs should start with what the method does and secondarily say something about the arguments.
disclaimer: I'm a stickler for javadoc style. And plenty of javadocs here are non-compliant.
| * Provide a universal connection string that can represent either: | |
| * Creates a client builder based on a connection string of 2 possible formats: |
…nt.Builder SOLR-18130: add universal connection string support for CloudSolrClient.Builder
…on string in SolrClientCache
7d8cb9b to
1ed3f01
Compare
|
Thanks for the review! I’ve squashed the fixes addressing your comments into the original commit. |
|
Maybe out of scope, but what about passing authentication info too? |
Can you elaborate? Exactly what/where? |
| CloudClientConnectionString cloudClientConnectionString = | ||
| CloudClientConnectionString.parse(connectionString); | ||
| if (cloudClientConnectionString.isZk()) { | ||
| String zkHostNoChroot = connectionString.split("/")[0]; |
There was a problem hiding this comment.
cloudClientConnectionString should contain the chroot if there is one. And you just parsed it.
FWIW the name cloudClientConnectionString is terrible... it is in fact not a string but a record
| import org.apache.solr.common.util.StrUtils; | ||
|
|
||
| /** Universal connection string parser logic. */ | ||
| public record CloudClientConnectionString(boolean isZk, List<String> quorumItems, String zkChroot) { |
There was a problem hiding this comment.
Lets drop the suffix String here :-)
There was a problem hiding this comment.
And IMO it'd be more at home as an inner class of CloudSolrClient since it's only used for it.
| zkHosts = connectionString.substring(0, slashIndex); | ||
| zkChroot = connectionString.substring(slashIndex); | ||
| } | ||
| List<String> quorumItems = StrUtils.split(zkHosts, ',').stream().map(String::trim).toList(); |
There was a problem hiding this comment.
what test failed due to spaces within the string? IMO that'd be erroneous.
| } | ||
| } | ||
|
|
||
| public static boolean isValidHttpUrl(String s) { |
There was a problem hiding this comment.
Perhaps it's a matter of taste / preference but if I were coding this, I simply wouldn't bother with this validation. The user will get an error eventually anyway if they screw up so badly as to malform their URL.
| * http://solr1:8983/solr,http://solr2:8983/solr} | ||
| * </ul> | ||
| * | ||
| * <p>Usage examples: |
There was a problem hiding this comment.
from this point forward, it's superfluous. The examples ought to speak for themselves.
| * @param connectionString a string specifying either ZooKeeper connection string or HTTP(S) | ||
| * Solr URLs | ||
| * @throws IllegalArgumentException if string is null, empty, or malformed | ||
| * @since 11.0.0 |
| public void testBuildQuorumForZkWithSpaces() { | ||
| CloudClientConnectionString parsed = | ||
| CloudClientConnectionString.parse( | ||
| " zookeeper1:2181 , zookeeper2:2181 , zookeeper3:2181 /solr "); |
There was a problem hiding this comment.
I can't imagine someone doing this. IMO connection strings should already be free of whitespace.
There was a problem hiding this comment.
By adding a test like this, you signal to everyone who looks at it -- this can happen. But really, nobody has been doing that.
https://issues.apache.org/jira/browse/SOLR-18130
Description
This patch introduces universal connection string support for CloudSolrClient.Builder
and replaces the existing usage in SolrClientCache.
Current state:
Proposed next steps:
for backward compatibility.
This is a work-in-progress PR for early feedback and discussion.
Solution
CloudSolrClient.Builderconstructor that handles a connection string that can be either a comma-separated list of HTTP(s) URLs in Solr or a list of Zookeeper hosts ending with /chroot.CloudSolrClient.Builderconstructor that accepts zookeeper parameters in favor of a constructor with a generic connection string.Tests
Added unit tests in
CloudHttp2SolrClientTest,SolrClientCacheTestWritten new unit test
CloudClientConnectionStringTestChecklist
Please review the following and check all that apply:
mainbranch../gradlew check.