IBX-9846: Describe Embeddings search API #3029

Open
dabrt wants to merge 5 commits into `5.0` from `IBX-9846`

dabrt commented Jan 30, 2026

| Question | Answer |
|---|---|
| JIRA Ticket | IBX-9846 |
| Versions | 4.6, 5.0 |
| Edition | All |

Describe Embeddings search API

Checklist

  • Text renders correctly
  • Text has been checked with vale
  • Description metadata is up to date
  • PHP code samples have been fixed with PHP CS fixer
  • Added link to this PR in relevant JIRA ticket or code PR

github-actions bot commented Jan 30, 2026

dabrt requested a review from mikadamczyk on February 26, 2026, 14:09

- [EmbeddingQueryBuilder](/api/php_api/php_api_reference/classes/Ibexa-Contracts-Core-Repository-Values-Content-EmbeddingQueryBuilder.html):
A fluent builder for constructing `EmbeddingQuery` instances.
It enforces required parameters and integrates embedding queries with the search query pipeline

Suggested change
- It enforces required parameters and integrates embedding queries with the search query pipeline
+ It helps construct queries consistently and integrates embedding queries with the search query pipeline, but you must still provide the required embedding value.

Comment on lines +393 to +394
Validates embedding queries before they are passed to the search engine.
Implementations ensure that the embedding model exists and that vector dimensions match the configured embedding field

Suggested change
- Validates embedding queries before they are passed to the search engine.
- Implementations ensure that the embedding model exists and that vector dimensions match the configured embedding field
+ Validates embedding query structure before execution.
+ Provider/model configuration and vector compatibility are resolved at runtime by the configured embedding and search engine components.
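
To make "query structure" concrete, here is a minimal sketch of what a purely structural check could look like. It is not the actual validator: the function name `findEmbeddingQueryProblems()` and the specific rules (non-empty vector, finite floats, non-negative limit and offset) are assumptions made for illustration, and nothing in this PR confirms which checks the real implementation performs.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical structural check, not part of the Ibexa API.
 *
 * @param array<mixed> $vector candidate embedding values
 *
 * @return string[] list of problems; empty when the structure looks valid
 */
function findEmbeddingQueryProblems(array $vector, int $limit, int $offset): array
{
    $problems = [];

    if ($vector === []) {
        $problems[] = 'Embedding vector must not be empty.';
    }

    foreach ($vector as $index => $value) {
        // is_finite() is only reached for floats, so no TypeError under strict_types.
        if (!is_float($value) || !is_finite($value)) {
            $problems[] = sprintf('Value at index %s is not a finite float.', $index);
        }
    }

    if ($limit < 0) {
        $problems[] = 'Limit must not be negative.';
    }

    if ($offset < 0) {
        $problems[] = 'Offset must not be negative.';
    }

    return $problems;
}
```

Note that the sketch deliberately checks nothing about models or dimensions, which matches the suggested wording: those are left to the embedding and search engine components at runtime.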

Comment on lines +19 to +20
- [`Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding`](/api/php_api/php_api_reference/classes/Ibexa-Contracts-Core-Repository-Values-Content-Query-Embedding.html): Represents the semantic input used for similarity search.
Depending on the embedding provider, it can encapsulate text or vector data

Suggested change
- - [`Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding`](/api/php_api/php_api_reference/classes/Ibexa-Contracts-Core-Repository-Values-Content-Query-Embedding.html): Represents the semantic input used for similarity search.
- Depending on the embedding provider, it can encapsulate text or vector data
+ - [`Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding`](/api/php_api/php_api_reference/classes/Ibexa-Contracts-Core-Repository-Values-Content-Query-Embedding.html): Represents the vector input used for similarity search.
+ It stores embedding values as float arrays, while providers generate those vectors from text input.
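
For readers unfamiliar with why the input is a float array, the sketch below compares two such arrays with cosine similarity, one common similarity measure. The function `cosineSimilarity()` is illustrative only; this PR does not state which measure the configured search engine applies, so treat the choice of cosine similarity as an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper, not part of the Ibexa API.
 *
 * @param float[] $a
 * @param float[] $b
 */
function cosineSimilarity(array $a, array $b): float
{
    // Re-index both arrays so positions line up regardless of original keys.
    $a = array_values($a);
    $b = array_values($b);

    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and have the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Cosine similarity is undefined for a zero vector.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1, which is why two vectors can only be compared when they have the same number of dimensions.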


## Embedding providers

Embedding providers generate vector representations for inputs.

Suggested change
- Embedding providers generate vector representations for inputs.
+ Embedding providers generate vector representations for inputs. Out of the box, embedding search integration is provided for TaxonomyEmbedding. If you use a custom embedding value type, implement matching embedding visitors for your search engine (Solr/Elasticsearch). Otherwise, query execution may fail with "No visitor available".
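
The failure mode mentioned in this suggestion can be sketched with a generic visitor dispatcher. Every class and interface below (`SketchEmbedding`, `SupportedEmbedding`, `CustomEmbedding`, `SketchEmbeddingVisitor`, `SketchVisitorDispatcher`) is hypothetical and only mirrors the general pattern; the real Solr and Elasticsearch visitor contracts are not shown in this PR and may differ.

```php
<?php

declare(strict_types=1);

// Every name in this file is hypothetical; it only illustrates visitor dispatch.

abstract class SketchEmbedding
{
    /** @var float[] */
    private array $vector;

    /** @param float[] $vector */
    public function __construct(array $vector)
    {
        $this->vector = $vector;
    }

    /** @return float[] */
    public function getVector(): array
    {
        return $this->vector;
    }
}

// Stands in for an embedding type that ships with a matching visitor.
final class SupportedEmbedding extends SketchEmbedding
{
}

// Stands in for a custom embedding type with no visitor registered.
final class CustomEmbedding extends SketchEmbedding
{
}

interface SketchEmbeddingVisitor
{
    public function canVisit(SketchEmbedding $embedding): bool;

    public function visit(SketchEmbedding $embedding): string;
}

final class SupportedEmbeddingVisitor implements SketchEmbeddingVisitor
{
    public function canVisit(SketchEmbedding $embedding): bool
    {
        return $embedding instanceof SupportedEmbedding;
    }

    public function visit(SketchEmbedding $embedding): string
    {
        // A real visitor would emit an engine-specific query fragment instead.
        return 'vector:' . implode(',', $embedding->getVector());
    }
}

final class SketchVisitorDispatcher
{
    /** @var SketchEmbeddingVisitor[] */
    private array $visitors;

    /** @param SketchEmbeddingVisitor[] $visitors */
    public function __construct(array $visitors)
    {
        $this->visitors = $visitors;
    }

    public function visit(SketchEmbedding $embedding): string
    {
        foreach ($this->visitors as $visitor) {
            if ($visitor->canVisit($embedding)) {
                return $visitor->visit($embedding);
            }
        }

        // No registered visitor recognises this embedding type.
        throw new RuntimeException('No visitor available for ' . get_class($embedding));
    }
}
```

In this sketch, passing a `CustomEmbedding` to a dispatcher that only knows `SupportedEmbeddingVisitor` throws, which is the situation the suggested text warns about. Registering a visitor whose `canVisit()` accepts the custom type is what resolves it.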

Comment on lines +403 to +423
use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
use Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding;

// Example embedding vector (float[])
$vector = [
    0.0123,
    -0.9876,
    0.4567,
    // ...
];

// Create an Embedding instance with a float[] vector
$embedding = new Embedding($vector);

// Build the embedding query with the fluent builder
$embeddingQuery = EmbeddingQueryBuilder::create()
    ->withEmbedding($embedding)
    ->setLimit(10) // maximum number of results
    ->setOffset(0) // result offset for pagination
    ->setPerformCount(true) // optionally count total matching items
    ->build();

Suggested change
- use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
- use Ibexa\Contracts\Core\Repository\Values\Content\Query\Embedding;
- // Example embedding vector (float[])
- $vector = [
-     0.0123,
-     -0.9876,
-     0.4567,
-     // ...
- ];
- // Create an Embedding instance with a float[] vector
- $embedding = new Embedding($vector);
- // Build the embedding query with the fluent builder
- $embeddingQuery = EmbeddingQueryBuilder::create()
-     ->withEmbedding($embedding)
-     ->setLimit(10) // maximum number of results
-     ->setOffset(0) // result offset for pagination
-     ->setPerformCount(true) // optionally count total matching items
-     ->build();
+ use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
+ use Ibexa\Contracts\Core\Repository\Values\Content\Query\Criterion\ContentTypeIdentifier;
+ use Ibexa\Contracts\Taxonomy\Search\Query\Value\TaxonomyEmbedding;
+ $vector = [
+     0.0123,
+     -0.9876,
+     0.4567,
+     // ...
+ ];
+ $embedding = new TaxonomyEmbedding($vector);
+ $embeddingQuery = EmbeddingQueryBuilder::create()
+     ->withEmbedding($embedding)
+     ->setFilter(new ContentTypeIdentifier('article'))
+     ->setLimit(10)
+     ->setOffset(0)
+     ->setPerformCount(true)
+     ->build();

->build();

// Execute the query via the repository
$results = $repository->findContent($embeddingQuery);

Suggested change
- $results = $repository->findContent($embeddingQuery);
+ use Ibexa\Contracts\Core\Repository\Repository;
+ use Ibexa\Contracts\Core\Repository\SearchService;
+ use Ibexa\Contracts\Core\Repository\Values\Content\EmbeddingQueryBuilder;
+ use Ibexa\Contracts\Taxonomy\Search\Query\Value\TaxonomyEmbedding;
+ final class ExampleService
+ {
+     private SearchService $searchService;
+     public function __construct(Repository $repository)
+     {
+         $this->searchService = $repository->getSearchService();
+     }
+     public function searchByEmbedding(array $vector): void
+     {
+         $query = EmbeddingQueryBuilder::create()
+             ->withEmbedding(new TaxonomyEmbedding($vector))
+             ->setLimit(10)
+             ->setOffset(0)
+             ->build();
+         $result = $this->searchService->findContent($query);
+         foreach ($result->searchHits as $hit) {
+             // ...
+         }
+     }
+ }

2 participants