- Search can go to any instance, not just the leader (diagram + text)
- Faceted search marked as partial (facet-search endpoint unsupported)
- Conversational search marked as not working with useNetwork
- Failover section rewritten: no automatic failover currently exists
- Replace incorrect addRemotes/removeRemotes top-level API with correct remotes object (null to remove) and shard-level addRemotes/removeRemotes
- Fix setup guide: PATCH /network only to leader, not each instance
- Merge remotes + shards into single request (required by validation)
- Add missing writeApiKey to remote configurations
- Fix step numbering after merging steps 3 and 4
📝 Walkthrough
Docs reframe replication as read-scaling (no automatic search failover), replace failover guidance with remote-availability behavior, consolidate
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Client as Client
participant Any as Receiving Instance
participant RemoteA as Remote (shard A)
participant RemoteB as Remote (shard B)
participant Merge as Merge/Aggregator
Client->>Any: Search request (network topology present)
Any->>RemoteA: Query one replica for shard A
Any->>RemoteB: Query one replica for shard B
RemoteA-->>Any: Results A
RemoteB-->>Any: Results B
Any->>Merge: Merge & dedupe results
Merge-->>Client: Combined response
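The "Merge & dedupe results" step in the diagram can be illustrated with a small sketch. This is not Meilisearch's implementation: the `merge_hits` helper and the `id` and `score` field names are hypothetical, chosen only to show how per-shard hit lists might be combined into one ranked response without duplicates.

```python
from typing import Any, Dict, List

# Hypothetical hit shape: "id" and "score" are illustrative field names,
# not Meilisearch's actual response schema.
Hit = Dict[str, Any]


def merge_hits(per_shard_hits: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """Merge per-shard hit lists, keep one hit per id, rank by score."""
    best: Dict[Any, Hit] = {}
    for hits in per_shard_hits.values():
        for hit in hits:
            current = best.get(hit["id"])
            # Keep the first hit seen for an id unless a later one scores higher.
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    # Highest score first; ties are broken by id so the order is stable.
    ranked = sorted(best.values(), key=lambda h: (-h["score"], str(h["id"])))
    return ranked[:limit]


results = merge_hits(
    {
        "shard-a": [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.4}],
        "shard-b": [{"id": 3, "score": 0.7}, {"id": 1, "score": 0.9}],
    },
    limit=3,
)
merged_ids = [hit["id"] for hit in results]
```

When each shard is queried exactly once, as the docs describe, the deduplication branch should rarely trigger. It is included here only because the diagram names it.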
🧹 Nitpick comments (2)
resources/self_hosting/sharding/configure_replication.mdx (1)
15-15: Split Line 15 to reduce ambiguity and improve readability.
Line 15 is dense and can be read as health-based failover behavior. Splitting it and explicitly separating replica selection from failover helps consistency with Line 90.
Suggested rewrite
-When you configure shards, each shard can be assigned to one or more remotes. If a shard is assigned to multiple remotes, Meilisearch replicates the data to each of them. During a search with `useNetwork: true`, Meilisearch queries each shard exactly once, picking one of the available remotes for each shard. This avoids duplicate results.
+When you configure shards, each shard can be assigned to one or more remotes.
+If a shard is assigned to multiple remotes, Meilisearch replicates the data to each of them.
+During a search with `useNetwork: true`, Meilisearch queries each shard exactly once and picks one configured remote for that shard.
+This avoids duplicate results.
As per coding guidelines, "Prefer shorter sentences; if a sentence runs over approximately 40 words, consider splitting or simplifying."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/configure_replication.mdx` at line 15, The sentence about shard querying is too dense and ambiguous; split it into two clear sentences in the paragraph that mentions useNetwork: true so it first states that during a search Meilisearch queries each shard exactly once and picks one available remote per shard (preventing duplicate results), and then add a second sentence that separately explains failover behavior (that if the chosen remote is unavailable Meilisearch may try other remotes for that shard), mirroring the explicit phrasing used around Line 90 and improving readability around the reference to useNetwork: true.
resources/self_hosting/sharding/overview.mdx (1)
28-33: Use simpler diagram labels than “fan out.”
In Line 28-Line 33, replacing “fan out” with plain wording improves accessibility for non-native readers.
Suggested wording tweak
- Any -->|fan out| R1[Remote ms-00<br/>shard-a, shard-c]
- Any -->|fan out| R2[Remote ms-01<br/>shard-a, shard-b]
- Any -->|fan out| R3[Remote ms-02<br/>shard-b, shard-c]
+ Any -->|send query| R1[Remote ms-00<br/>shard-a, shard-c]
+ Any -->|send query| R2[Remote ms-01<br/>shard-a, shard-b]
+ Any -->|send query| R3[Remote ms-02<br/>shard-b, shard-c]
As per coding guidelines, "Prefer short, direct sentences and avoid jargon when a simpler term exists in documentation."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/overview.mdx` around lines 28 - 33, Replace the jargon label "fan out" on the edges between Any and R1/R2/R3 with a simpler phrase such as "send to" (e.g., change Any -->|fan out| R1 to Any -->|send to| R1) to improve readability and accessibility; update all three connectors (Any -->|fan out| R1, Any -->|fan out| R2, Any -->|fan out| R3) consistently so diagram labels are short and plain.
- No automatic search failover between replicas: retries same remote 3x
- "Search continues" when leader down is misleading: traffic must be routed away from downed instances via load balancer
- "All shards remain available" softened to "data still exists"
- Warning about remotes+shards together only applies to initial setup
- Removed "handle failover" from card description

- Fix error code: not_a_leader → not_leader (matches actual engine code)
- Add distinct attribute row to feature table (requires v1.40+)
- Add warning about search errors during topology changes (known bug)
- Note that removed instances retain local data after network removal
- Soften "all shards remain available" to "data still exists"
- Clarify remotes+shards requirement is for initial setup only
qdequele
left a comment
Review: cross-checked against PR #3525
I reviewed this PR against #3525 (the developer's own modernization of the sharding guide) to validate the technical accuracy of the fixes.
What's correct and well-aligned ✅
- Search goes to any instance: the mermaid diagram and prose in `overview.mdx` correctly replace the "leader fans out" model. Confirmed by #3525.
- `writeApiKey` added to all remote configurations. Confirmed by #3525.
- Single `PATCH /network` to the leader only: `setup_sharded_cluster.mdx` correctly consolidates the old per-instance config into one request. Confirmed by #3525.
- `not_leader` error code: replaces the old `not_a_leader`. Confirmed by #3525.
- Faceted search = Partial (no `useNetwork` on `/facet-search`). Confirmed by #3525 implicitly and the PR description cites a `// TODO` in the Rust code.
- Conversational search = No (chat completions have no `useNetwork` field). Confirmed by #3525 feature table omission.
- Removed automatic failover claims and replaced with "Remote availability" section. Consistent with #3525 which says the engine rebalances via `NetworkTopologyChange` but makes no mention of built-in replica failover.
- `addRemotes` inside shard definitions (not at the top level). Confirmed by #3525.
- `NetworkTopologyChange` warning on the overview. Consistent with #3525.
- Step renumbering in `setup_sharded_cluster.mdx` (steps 5→4, 6→5 after merging shard setup). Correct.
Two claims that need developer confirmation ⚠️
1. "Retries the same remote up to 3 times"
In configure_replication.mdx, the Remote availability section says:
If that replica is unreachable, Meilisearch retries the same remote up to 3 times but does not automatically try another replica holding the same shard.
This specific retry count is not mentioned anywhere in PR #3525 or the OpenAPI spec. It may have been found in the Rust source, but since it's a behavioral claim that affects how users design their infrastructure setup, it should be confirmed with the engine team before publishing. If the retry count changes or is wrong, it will actively mislead users configuring load balancers.
Ask: can the dev team confirm this retry behavior and count?
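To make the behavior under question concrete, here is a minimal sketch of "retry the same remote, never fall back to another replica". It assumes the quoted claim means three attempts in total, which is itself unconfirmed. `query_with_retries`, `RemoteUnavailable`, and the `send` callback are hypothetical names and do not reflect the engine's code.

```python
from typing import Callable, List

# Assumption: "up to 3 times" is read as three attempts in total.
# The count comes from the docs claim under review and is unconfirmed.
MAX_ATTEMPTS = 3


class RemoteUnavailable(Exception):
    """Raised by the hypothetical transport when a remote cannot be reached."""


def query_with_retries(remote: str, send: Callable[[str], List[dict]]) -> List[dict]:
    """Try the same remote up to MAX_ATTEMPTS times; never switch replicas."""
    last_error = None
    for _ in range(MAX_ATTEMPTS):
        try:
            return send(remote)
        except RemoteUnavailable as error:
            last_error = error
    raise RemoteUnavailable(
        f"{remote} unreachable after {MAX_ATTEMPTS} attempts"
    ) from last_error


attempted: List[str] = []


def always_down(remote: str) -> List[dict]:
    """Stand-in transport that records each attempt and always fails."""
    attempted.append(remote)
    raise RemoteUnavailable(remote)


try:
    query_with_retries("ms-01", always_down)
    failed = False
except RemoteUnavailable:
    failed = True
```

If the engine behaves this way, a search that selects a downed replica fails even when another replica holds the same shard, which is why the docs tell users to route traffic away from downed instances at the load balancer.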
2. Removing a remote with null
In manage_network.mdx, the "Remove a remote" section uses:
{
"remotes": {
"ms-03": null
}
}
PR #3525 does not show how to remove a remote; it only demonstrates adding remotes and using addRemotes/removeRemotes inside shard definitions. The null pattern is reasonable (it follows JSON Merge Patch semantics) but is unconfirmed by the developer's own guide or the OpenAPI spec.
Ask: is "ms-03": null the correct way to remove a remote from the network, and does it also automatically remove it from all shard assignments as stated?
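For reference, the JSON Merge Patch semantics mentioned above (RFC 7396) can be sketched as follows. This shows only what a merge patch with `null` would do to a plain JSON object. Whether `PATCH /network` actually follows these semantics is the open question, and the topology and URLs below are made up for illustration.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # JSON null in a patch deletes the member if it exists.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical topology; remote names follow the thread, URLs are invented.
network = {
    "remotes": {
        "ms-02": {"url": "http://ms-02.example.com:7700"},
        "ms-03": {"url": "http://ms-03.example.com:7700"},
    }
}
patched = merge_patch(network, {"remotes": {"ms-03": None}})
```

The sketch does not model shard assignments, so it says nothing about whether removing a remote also removes it from every shard.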
Minor
- `configure_replication.mdx` line 15 (CodeRabbit's nitpick): the sentence about shard querying is genuinely dense. Worth splitting for clarity regardless of the above issues.
- `overview.mdx` mermaid "fan out" labels (CodeRabbit's second nitpick): replacing with "send query" is a low-effort improvement worth doing.
Summary
The PR is technically sound and well-aligned with what the developer intended in #3525. The two flagged items are the only blockers — everything else can be merged as-is once those are verified.
…ide shard instead
…ute row, partial update semantics)
…sed search, rebalancing steps, validation errors
Sources for the changes in this PR
From PR #3525 (developer's sharding guide modernization)
From Linear SCA-21 (engine design spec by Louis Dureuil)
Kerollmops
left a comment
Nice documentation addition.
However, now that Meilisearch defaults to useNetwork: true, it would be great to ensure we don't include it in the examples and in the Features Compatibility table. It still makes sense to expose that this parameter exists, but only in the reference. What do you think?
Another point about the setup sharded cluster page is that I think it would be more valuable to show a full-sharded setup rather than an in-between one. You already showcased a full-replicated setup just after.
1. **Compute new shards**: each instance uses rendezvous hashing to determine which documents now belong to which shard under the new topology.
2. **Export and import**: documents are sent to remotes that now own them.
3. **Delete stale data**: once all remotes confirm their imports are complete, each instance deletes the documents it no longer owns. Search switches to the new shard definitions at this point.
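The "compute new shards" step can be illustrated with a generic rendezvous (highest-random-weight) hashing sketch. The hash function, the `shard:doc_id` key format, and the helper names are assumptions for illustration; the engine's actual hashing scheme is not described in this thread.

```python
import hashlib
from typing import List


def _weight(shard: str, doc_id: str) -> int:
    """Hypothetical weight: first 8 bytes of SHA-256 over 'shard:doc_id'."""
    digest = hashlib.sha256(f"{shard}:{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owning_shard(doc_id: str, shards: List[str]) -> str:
    """Rendezvous hashing: the shard with the highest weight owns the document."""
    return max(shards, key=lambda shard: _weight(shard, doc_id))


doc_ids = [f"doc-{n}" for n in range(1000)]
before = {d: owning_shard(d, ["shard-a", "shard-b"]) for d in doc_ids}
after = {d: owning_shard(d, ["shard-a", "shard-b", "shard-c"]) for d in doc_ids}

# Documents whose owner changed when shard-c joined the topology.
moved = [d for d in doc_ids if before[d] != after[d]]
```

The property that matters for rebalancing is that adding a shard only moves documents onto the new shard. Documents that stay on existing shards keep their owner, which keeps the export and import step small.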
I think that's actually how it behaves, but I prefer to summon @dureuill on that point.
Co-authored-by: Clément Renault <clement@meilisearch.com>
🧹 Nitpick comments (5)
resources/self_hosting/sharding/manage_network.mdx (2)
92-92: Consider incorporating reviewer feedback.
A past review comment suggested rewording this line to add clarity about document IDs:
"each instance uses rendezvous hashing on document IDs to determine which documents belong to which shard under the new topology"
The current wording omits "on document IDs" which provides useful context about how rendezvous hashing is applied.
📝 Suggested wording
-1. **Compute new shards**: each instance uses rendezvous hashing to determine which documents now belong to which shard under the new topology.
+1. **Compute new shards**: each instance uses rendezvous hashing on document IDs to determine which documents belong to which shard under the new topology.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/manage_network.mdx` at line 92, Update the sentence under "Compute new shards" to explicitly state that rendezvous hashing is applied to document IDs; change the line "each instance uses rendezvous hashing to determine which documents now belong to which shard under the new topology" to include "on document IDs" (e.g., "each instance uses rendezvous hashing on document IDs to determine which documents belong to which shard under the new topology") so readers understand the hashing input.
123-126: Consider simplifying the search example.
A past review comment suggested removing the redundant `"useNetwork": true` since it defaults to `true` when a network topology is defined. However, keeping it explicit may help users understand the feature is active.
If you prefer explicitness for documentation purposes, this can remain as-is. Otherwise:
📝 Simplified example
 --data-binary '{
-  "q": "batman",
-  "useNetwork": true,
+  "q": "batman",
   "filter": "_shard = \"shard-a\""
 }'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/manage_network.mdx` around lines 123 - 126, The JSON search example includes a redundant "useNetwork": true key. Since useNetwork defaults to true when a network topology is defined, remove the "useNetwork": true line from the example (the block containing "q": "batman", "useNetwork": true, "filter": "_shard = \"shard-a\"" ) to simplify the snippet; if you choose to keep explicitness for clarity, instead add a short inline comment explaining that it’s optional and defaults to true.
resources/self_hosting/sharding/overview.mdx (3)
63-63: Consider noting the default behavior.
A past review comment suggested simplifying since `useNetwork` defaults to `true` when a network topology is defined. Consider adding a note that `useNetwork: true` is optional in these examples.
📝 Suggested addition
-To search across all instances, add `useNetwork: true` to your search request:
+To search across all instances, use `useNetwork: true` in your search request. This is optional when a network topology is configured, as it defaults to `true`:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/overview.mdx` at line 63, Clarify that useNetwork is optional by noting its default: update the sentence that currently reads "To search across all instances, add `useNetwork: true` to your search request:" to mention that when a network topology is defined `useNetwork` defaults to true, so explicitly setting `useNetwork: true` is optional in these examples; reference the `useNetwork` flag in the text and add a short parenthetical or follow-up sentence explaining the default behavior.
39-39: Clarify that `useNetwork` defaults to `true`.
A past review comment noted that `useNetwork` defaults to `true` when a network topology is defined. The current wording "when `useNetwork: true` is set" may imply it's always required to be explicit.
📝 Suggested clarification
-3. **Search**: when `useNetwork: true` is set, the instance receiving the request fans out the search to all remotes, then merges and ranks the combined results
+3. **Search**: when `useNetwork: true` is set (the default when a network is configured), the instance receiving the request fans out the search to all remotes, then merges and ranks the combined results
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/overview.mdx` at line 39, The sentence about Search should clarify that useNetwork defaults to true when a network topology is defined; update the wording around the useNetwork mention (the literal token `useNetwork`) so it reads something like “when useNetwork: true (useNetwork defaults to true if a network topology is defined), the instance receiving the request fans out...”, ensuring the doc explicitly states the default behavior rather than implying it must be set explicitly.
21-21: Consider splitting this long sentence.
This sentence is approximately 55 words, exceeding the guideline of ~40 words. Consider splitting for readability.
📝 Suggested split
-Replication assigns the same shard to more than one remote. This ensures your data is stored redundantly across instances. During a network search, Meilisearch ensures each shard is queried exactly once, either from a remote shard or from the local one (chosen randomly, favoring the local one). This guarantees each shard is queried exactly once, so results are never duplicated regardless of how many replicas exist.
+Replication assigns the same shard to more than one remote, ensuring your data is stored redundantly across instances. During a network search, Meilisearch queries each shard exactly once, picking either the local shard or a remote (chosen randomly, favoring the local one). This guarantees results are never duplicated regardless of how many replicas exist.
As per coding guidelines: "Prefer shorter sentences; if a sentence runs over approximately 40 words, consider splitting or simplifying."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/overview.mdx` at line 21, Split the long sentence that begins "Replication assigns the same shard to more than one remote..." into two (or three) shorter sentences for readability: 1) describe replication and redundancy (e.g., "Replication assigns the same shard to more than one remote, ensuring data is stored redundantly across instances."), and 2) describe the search behavior (e.g., "During a network search, Meilisearch queries each shard exactly once, choosing either a remote replica or the local shard, favoring the local one, so results are never duplicated."). Update the sentence(s) that reference shard querying to remove repeated clauses like "This guarantees each shard is queried exactly once" so the wording is concise and non-redundant.
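The "each shard is queried exactly once" rule discussed in this comment can be sketched as a selection step. As a simplifying assumption, the sketch always picks the local instance when it holds the shard and otherwise picks a random replica; the docs only say the choice is random and favors the local one. `pick_remotes` is a hypothetical helper, and the topology mirrors the example diagram in `overview.mdx`.

```python
import random
from typing import Dict, List, Optional


def pick_remotes(
    shards: Dict[str, List[str]],
    local: str,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Choose exactly one remote per shard, preferring the local instance."""
    rng = rng or random.Random()
    chosen: Dict[str, str] = {}
    for shard, remotes in shards.items():
        # Assumption: local always wins when it holds the shard.
        chosen[shard] = local if local in remotes else rng.choice(remotes)
    return chosen


# Shard-to-remote assignments as drawn in the overview diagram.
topology = {
    "shard-a": ["ms-00", "ms-01"],
    "shard-b": ["ms-01", "ms-02"],
    "shard-c": ["ms-02", "ms-00"],
}
plan = pick_remotes(topology, local="ms-00", rng=random.Random(0))
```

Because the plan holds one entry per shard, no document can be returned by two replicas of the same shard, whatever the number of replicas.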
…plication description, full-sharding setup, no-fallback note, fix wording throughout
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
resources/self_hosting/sharding/overview.mdx (1)
94-94: ⚠️ Potential issue | 🟡 Minor
Contradictory guidance on `useNetwork` in multi-search.
This line says to "Add `useNetwork: true` to individual queries," but the note at lines 65-67 states that `useNetwork` defaults to `true` when a network topology is defined. Since the example at lines 105-106 correctly omits `useNetwork`, consider updating this text for consistency.
-Network search works with [multi-search](/capabilities/multi_search/getting_started/federated_search) and [federated search](/capabilities/multi_search/getting_started/federated_search). Add `useNetwork: true` to individual queries within a multi-search request:
+Network search works with [multi-search](/capabilities/multi_search/getting_started/federated_search) and [federated search](/capabilities/multi_search/getting_started/federated_search):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/overview.mdx` at line 94, The guidance is inconsistent about useNetwork: update the sentence that currently reads "Add `useNetwork: true` to individual queries within a multi-search request" to clarify that `useNetwork` defaults to true when a network topology is defined and only needs to be explicitly set when overriding that default; reference the `useNetwork` flag and the multi-search example that omits it (the example near the later multi-search block) so readers understand that omitting `useNetwork` is fine if the network topology is present, and show that adding `useNetwork: true` is only necessary to explicitly enable network search per-query.
🧹 Nitpick comments (1)
resources/self_hosting/sharding/setup_sharded_cluster.mdx (1)
4-4: Description mentions replication, but setup has none.
The description says "replication for redundancy," but the guide sets up a cluster without replication (as noted in lines 103-105). Consider updating the description to match the actual setup.
-description: Configure Meilisearch instances into a sharded cluster with replication for horizontal scaling and high availability.
+description: Configure Meilisearch instances into a sharded cluster for horizontal scaling.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@resources/self_hosting/sharding/setup_sharded_cluster.mdx` at line 4, The frontmatter description ("Configure Meilisearch instances into a sharded cluster with replication for horizontal scaling and high availability.") claims replication but the guide's sharding setup only creates primary shards with no replicas; either remove the replication wording from the description or add instructions to configure replicas (e.g., show how to start replica instances, set replica_count, and configure cluster routes) so the doc's description matches the actual steps; update the "description" text accordingly or add replica configuration steps in the sharding setup section.
Summary
Fixes inaccuracies in the sharding and replication documentation identified during review of PR #3525. Changes verified against the Rust codebase, OpenAPI spec, and Linear project context.
- `/facet-search` endpoint does not support `useNetwork` (confirmed by `// TODO` in Rust code)
- `useNetwork` field
- `addRemotes`/`removeRemotes` don't exist at top level: replaced with correct API (add via `remotes` object, remove by setting to `null`, use `addRemotes`/`removeRemotes` inside shard definitions)
- `writeApiKey` to remote configurations

Process
This PR comes from PR #3525 (the developer's sharding guide modernization) and the discussion in it, plus resources that were hard to find and are spread around Linear, mostly SCA-21 (engine design spec by Louis Dureuil). I don't know whether SCA-21 is a definitive spec. The content was then written by Claude Code and reviewed by me, unfortunately without knowing for certain whether it is correct.