[IMPROVE] Embedding model encoding time for search and file upload  #441

@sahilds1

Background

Search and file upload both call TransformerModel.get_instance() directly at request time, so the first request after startup pays the cost of loading the model

Current State

from sentence_transformers import SentenceTransformer
import logging

logger = logging.getLogger(__name__)


class TransformerModel:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            logger.info("Loading SentenceTransformer model")
            cls._instance = super(TransformerModel, cls).__new__(cls)
            cls._instance.model = SentenceTransformer(
                'paraphrase-MiniLM-L6-v2')
        return cls._instance

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

Acceptance Criteria

  • The model is loaded (warm) before the first search or upload request arrives

Approach

  • Preload the SentenceTransformer model at Django startup, before traffic is routed to the application instance
  • Add tests for the embeddings services, separating the core logic from the model loading so it is easier to test
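
The two approach items could fit together roughly as follows. This is a minimal sketch, not the project's implementation: `EmbeddingModelHolder`, `preload`, and `get` are hypothetical names, and the loader is passed in so that tests can substitute a cheap fake for the real SentenceTransformer.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


class EmbeddingModelHolder:
    """Holds one shared model. The loader is injected so tests can pass a fake.

    Hypothetical helper for illustration; in the real service the loader
    would presumably be something like
    lambda: SentenceTransformer('paraphrase-MiniLM-L6-v2').
    """

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Optional[Any] = None
        self.load_count = 0  # exposed so tests can check the model loads once

    def preload(self) -> None:
        """Load the model eagerly; meant to be called once at startup."""
        if self._model is None:
            logger.info("Preloading embedding model")
            self._model = self._loader()
            self.load_count += 1

    def get(self) -> Any:
        """Return the model, loading it lazily only if preload() never ran."""
        if self._model is None:
            self.preload()
        return self._model


# Assumed wiring, not taken from the issue: Django's AppConfig.ready() is the
# usual startup hook, so a hypothetical app config could call holder.preload()
# from there.
```

In Django, `AppConfig.ready()` also runs for management commands, so a startup hook like this would load the model during commands such as migrations unless it is guarded.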

References

Risks and Rollback

Fall back to lazy loading, using a try/except block, if preloading the embedding model at startup fails
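
The fallback described above might look like the following sketch. `preload_or_defer` is a hypothetical helper, not something taken from the codebase; it only assumes that the preload step is a callable that raises on failure.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def preload_or_defer(preload: Callable[[], object]) -> bool:
    """Try to warm the model at startup.

    Returns True if the preload succeeded, or False if it failed and loading
    is deferred to the first request (lazy load).
    """
    try:
        preload()
    except Exception:
        # Log the traceback but keep startup alive; the first request will
        # attempt the load again.
        logger.exception(
            "Preloading embedding model failed; falling back to lazy load")
        return False
    return True


# Assumed usage at startup: preload_or_defer(TransformerModel.get_instance)
```

One caveat with the class shown under Current State: `__new__` assigns `cls._instance` before constructing the SentenceTransformer, so a failed load leaves an instance without a `model` attribute, and later `get_instance()` calls return it without retrying. For the lazy fallback to work, the failed attempt would likely need to leave `_instance` as None.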

Screenshots / Recordings

Related PR

Balancer PR #461
