Common Python library for all VCC (Virtual Compliance Center) adapters connecting to ThemisDB.
This library provides shared functionality for VCC adapters:
- ThemisDB HTTP Client - Unified API client for all ThemisDB operations
- Data Processors - Text processing, chunking, and embedding generation
- Configuration Management - Environment-based configuration with Pydantic
- Utilities - Logging, connection validation, error handling
IMPORTANT: This library uses direct HTTP connections to ThemisDB. NO UDS3 framework dependency is required.
- ✅ Direct HTTP access to ThemisDB (no UDS3)
- ✅ Content import with automatic chunking
- ✅ Optional embedding generation (sentence-transformers)
- ✅ AQL query execution
- ✅ Vector similarity search
- ✅ Entity CRUD operations
- ✅ Batch operations support
- ✅ Namespace isolation
- ✅ JWT authentication support
cd adapters/vcc_base
pip install -r requirements.txtFor semantic embeddings (optional):
pip install sentence-transformersfrom vcc_base import ThemisVCCClient
import asyncio
async def main():
client = ThemisVCCClient(
base_url="http://localhost:8765",
namespace="vcc_data"
)
# Health check
health = await client.health_check()
print(health)
# Import content
payload = {
"content": {
"mime_type": "text/plain",
"tags": ["vcc", "test"]
},
"chunks": [
{
"seq_num": 0,
"chunk_type": "text",
"text": "Hello from VCC adapter"
}
],
"edges": []
}
result = await client.import_content(payload)
print(f"Imported content: {result}")
asyncio.run(main())from vcc_base import TextProcessor
processor = TextProcessor(
enable_embeddings=True,
chunk_size=800
)
# Process text into ThemisDB payload
text = "This is a sample document. It has multiple sentences."
payload = processor.process(
text=text,
source="document.txt",
tags=["sample", "vcc"]
)
# payload is ready for client.import_content()from vcc_base import VCCAdapterConfig
# Load from environment variables or .env file
config = VCCAdapterConfig()
print(config.themis_url) # http://127.0.0.1:8765
print(config.enable_embeddings) # True
print(config.chunk_size) # 800
# Override defaults
config = VCCAdapterConfig(
themis_url="http://prod-themis:8765",
themis_namespace="production",
enable_embeddings=False
)from vcc_base import ThemisVCCClient, TextProcessor, VCCAdapterConfig
import asyncio
async def ingest_document(filename: str):
# Load configuration
config = VCCAdapterConfig()
# Initialize client and processor
client = ThemisVCCClient(
base_url=config.themis_url,
namespace=config.themis_namespace
)
processor = TextProcessor(
enable_embeddings=config.enable_embeddings,
chunk_size=config.chunk_size
)
# Read and process document
with open(filename, 'r', encoding='utf-8') as f:
text = f.read()
payload = processor.process(
text=text,
source=filename,
tags=["vcc", "document"]
)
# Import to ThemisDB
result = await client.import_content(payload)
print(f"Imported {filename}: {result}")
asyncio.run(ingest_document("sample.txt"))THEMIS_URL- ThemisDB base URL (default: http://127.0.0.1:8765)THEMIS_AUTH_TOKEN- Optional JWT authentication tokenENABLE_EMBEDDINGS- Enable embedding generation (default: true)
Methods:
health_check()- Check ThemisDB healthimport_content(payload)- Import content with chunksquery_aql(query, bind_vars)- Execute AQL queryvector_search(embedding, k, filter_expr)- Vector similarity searchget_entity(key)- Get entity by keyput_entity(key, data)- Create/update entitydelete_entity(key)- Delete entitybatch_import(entities)- Batch import entities
Methods:
process(text, mime_type, source, tags)- Process text to payloadchunk_text(text)- Split text into semantic chunksembed_text(text)- Generate embedding for text
Fields:
themis_url- ThemisDB base URLthemis_namespace- Namespace for data isolationthemis_auth_token- Optional JWT tokenenable_embeddings- Enable embedding generationembedding_model- Sentence transformer model namechunk_size- Maximum chunk size (chars)batch_size- Batch operation size
┌─────────────────────────────┐
│ VCC Adapter (Clara, │
│ Covina, Veritas, etc.) │
└─────────────┬───────────────┘
│
▼
┌─────────────────────────────┐
│ VCC Base Library │
│ ┌───────────────────────┐ │
│ │ ThemisVCCClient │ │
│ │ TextProcessor │ │
│ │ VCCAdapterConfig │ │
│ └───────────────────────┘ │
└─────────────┬───────────────┘
│ HTTP (no UDS3)
▼
┌─────────────────────────────┐
│ ThemisDB Backend │
│ (http://host:8765) │
└─────────────────────────────┘
This is a shared library used by all VCC adapters. When making changes:
- Ensure backward compatibility with existing adapters
- Add tests for new functionality
- Update this README with new features
- Version appropriately (semver)
vcc_covina_ingestion- Covina system adapter (production)vcc_clara_ingestion- Clara system adaptervcc_veritas- Veritas system adapter
See main ThemisDB project license.