Vector DB
The Vector Database (Vector DB) feature provides semantic search capabilities across your organization's knowledge base. It enables AI agents and chat to retrieve relevant information from documents, enhancing their responses with contextual knowledge.
Overview
Vector DB allows you to:
- Index text documents and attachments for semantic search
- Define search scopes with custom parameters and filters
- Integrate knowledge retrieval into AI agents and chat systems
- Tag and organize documents for structured retrieval
- Control search behavior with advanced preprocessing and postprocessing options
The system uses vector embeddings to understand document meaning, enabling searches based on conceptual similarity rather than just keyword matching.
VectorDB Settings
Global configuration for the Vector Database system.
Configuration Fields
| Field | Description |
|---|---|
| Enabled | Master toggle to activate Vector DB functionality system-wide |
| Include Help Article | When enabled, automatically indexes Help Articles into the knowledge base |
| Status | System field showing the current build status (Built successfully/Updating) |
| Tags | Define custom tags for document categorization and filtering |
Tag Configuration
Tags allow structured metadata on documents for filtered searches:
| Field | Description |
|---|---|
| Name | Unique tag identifier (e.g., "User", "Application") |
| Type | Data type for tag values: string, bool, int64, or float32 |
| System Generated | Indicates if the tag was created automatically by the system |
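The four tag types above can be checked on the client before a document is saved. The following is a minimal sketch, not the system's own validation: `tag_value_matches` is a hypothetical helper, and the numeric range checks are an assumption about what "int64" and "float32" imply.

```python
# Hypothetical helper: the type names come from the Tag Configuration table;
# the checks themselves are an assumption about reasonable validation.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
FLOAT32_MAX = 3.4028234663852886e38  # largest finite float32 value


def tag_value_matches(tag_type: str, value: object) -> bool:
    """Return True if value is acceptable for the declared tag type."""
    if tag_type == "string":
        return isinstance(value, str)
    if tag_type == "bool":
        return isinstance(value, bool)
    if tag_type == "int64":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(value, int) and not isinstance(value, bool)
                and INT64_MIN <= value <= INT64_MAX)
    if tag_type == "float32":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        # The comparison is False for NaN and infinity, so both are rejected.
        return abs(value) <= FLOAT32_MAX
    raise ValueError(f"unknown tag type: {tag_type!r}")
```

For example, `tag_value_matches("int64", 2024)` is accepted, while `tag_value_matches("int64", True)` is rejected because a boolean is not treated as an integer tag.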
VectorDB Document
Documents are the core content units indexed in the Vector Database.
Document Fields
| Field | Description |
|---|---|
| Title | Display name for the document, used in search results |
| Type | Content type: either "Text" for direct content or "Attachment" for files |
| Status | Synchronization status (Pending/Syncing/Synced/Error) |
| Text Content | For Text type: Rich text editor content to be indexed |
| Attachment | For Attachment type: File to be processed and indexed |
| Tags | Custom metadata tags for filtering and categorization |
| Reference DocType | System field linking to source DocType |
| Reference Document | System field linking to source document |
| System Generated | Indicates automatic creation by system processes |
| Last Synced | Timestamp of last successful synchronization |
| Error Message | Details of any synchronization failures |
Document Types
Text Documents
- Direct text content entered through the editor
- Supports rich formatting and structured content
- Ideal for FAQs, policies, and knowledge articles
Attachment Documents
- Supported formats: PDF, DOCX, TXT, MD, and more
- Automatically extracts and indexes text content
- Maintains reference to original file
Document Status Flow
- Pending: Newly created, awaiting initial sync
- Syncing: Currently being processed and indexed
- Synced: Successfully indexed and searchable
- Error: Sync failed; check the Error Message field for details
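The status flow can be modelled as a small state machine. This sketch encodes only the transitions implied by the descriptions above; whether the system also allows retries (for example Error back to Pending) is not stated here, so those paths are left out. `SyncStatus`, `can_transition`, and `is_searchable` are illustrative names, not part of the product.

```python
from enum import Enum


class SyncStatus(Enum):
    PENDING = "Pending"
    SYNCING = "Syncing"
    SYNCED = "Synced"
    ERROR = "Error"


# Assumption: only the transitions implied by the status descriptions.
# Retry or re-sync paths are not documented, so they are not modelled.
ALLOWED = {
    SyncStatus.PENDING: {SyncStatus.SYNCING},
    SyncStatus.SYNCING: {SyncStatus.SYNCED, SyncStatus.ERROR},
    SyncStatus.SYNCED: set(),
    SyncStatus.ERROR: set(),
}


def can_transition(current: SyncStatus, new: SyncStatus) -> bool:
    """Return True if moving from current to new is one of the modelled steps."""
    return new in ALLOWED[current]


def is_searchable(status: SyncStatus) -> bool:
    # Only Synced documents are described as indexed and searchable.
    return status is SyncStatus.SYNCED
```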
Managing Documents
Creating Documents
- Navigate to the VectorDB Document list
- Click "Add VectorDB Document"
- Enter a title and select a type
- Add content or attach a file
- Optionally add tags for categorization
- Save the document to queue it for synchronization
Bulk Operations
- Use data import for batch document creation
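A batch of Text documents could be prepared as a CSV file before running the data import. This is a sketch only: it assumes the import template uses the field labels from the Document Fields table as column headers, which may not match the actual template, and the rows are placeholder content.

```python
import csv
import io

# Assumption: column headers mirror the Document Fields labels; the real
# import template may use different column names.
COLUMNS = ["Title", "Type", "Text Content"]

# Placeholder rows for illustration only.
rows = [
    {"Title": "Refund policy", "Type": "Text", "Text Content": "Placeholder refund policy text."},
    {"Title": "Onboarding FAQ", "Type": "Text", "Text Content": "Placeholder onboarding answers."},
]


def build_import_csv(rows):
    """Serialize document rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_import_csv(rows)
```

Downloading the import template from the data import screen first, and matching its headers, is the safer route if the column names differ.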
Document Tags
Tags enable precise filtering in searches:
{
  "Department": "sales",
  "Region": "north",
  "Year": 2024,
  "Confidential": false
}
VectorDB Search Scope
Search scopes define reusable search configurations with specific parameters and filters.
Scope Configuration
| Field | Description |
|---|---|
| Name | Unique identifier for the search scope |
| Application | Application context for organizational grouping |
| Active | Enable/disable the scope without deletion |
| Filter (JSON) | JSON filter criteria for document selection |
Search Parameters
| Field | Description | Default |
|---|---|---|
| Limit | Maximum number of results to return | 10 |
| Dense Weight | Balance between semantic and keyword (literal) matching: 1.0 is purely semantic, and lower values give keyword matching more weight. Range: 0.2-1.0 | 0.5 |
Pre-Processing Options
| Field | Description |
|---|---|
| Rewrite Conversation | Reformulates queries in multi-turn conversations for better context understanding |
Post-Processing Options
| Field | Description | Default |
|---|---|---|
| Rerank Switch | Enable result reranking for improved relevance | Off |
| Rerank Model | Model for reranking: "base-multilingual-rerank" or "m3-v2-rerank" | base-multilingual-rerank |
| Retrieve Count | Number of candidates for reranking (must be ≥ limit) | 25 |
| Rerank Only Chunk | Rerank based on content only, excluding titles | Off |
| Chunk Diffusion Count | Include surrounding text chunks (0-5) | 0 |
| Chunk Group | Aggregate and sort chunks by document order | Off |
| Get Image Link | Generate temporary download links for embedded images | Off |
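The numeric constraints in the tables above (Dense Weight between 0.2 and 1.0, Retrieve Count at least Limit, Chunk Diffusion Count between 0 and 5) can be checked before a scope is saved. The sketch below is illustrative: `ScopeParams` and its snake_case field names are hypothetical, the defaults are copied from the tables, and the lower bound on `limit` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScopeParams:
    # Hypothetical field names; defaults are taken from the tables above.
    limit: int = 10
    dense_weight: float = 0.5
    retrieve_count: int = 25
    chunk_diffusion_count: int = 0

    def problems(self) -> list[str]:
        """Return a list of human-readable constraint violations."""
        issues = []
        if self.limit < 1:  # assumption: at least one result is requested
            issues.append("limit must be at least 1")
        if not 0.2 <= self.dense_weight <= 1.0:
            issues.append("dense_weight must be between 0.2 and 1.0")
        if self.retrieve_count < self.limit:
            issues.append("retrieve_count must be >= limit")
        if not 0 <= self.chunk_diffusion_count <= 5:
            issues.append("chunk_diffusion_count must be between 0 and 5")
        return issues
```

With the defaults, `ScopeParams().problems()` returns an empty list; raising `limit` to 50 without also raising `retrieve_count` reports the Retrieve Count constraint.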
Filter Examples
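The filter JSON uses operators to define conditions on scalar fields (tags). In the examples below, "op" is either "in" or "range"; a "range" filter sets its bounds with the gt, gte, lt, and lte keys. Supported operators include: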
- in: Value must be in the specified array
- gt: Greater than
- gte: Greater than or equal to
- lt: Less than
- lte: Less than or equal to
Single Field Filter (String)
{
  "op": "in",
  "field": "Department",
  "value": ["hr", "finance"]
}
Range Filter (Numeric)
{
  "op": "range",
  "field": "Year",
  "gt": 2020,
  "lte": 2024
}
Multiple Conditions (AND Logic)
A range filter with more than one bound matches only when every bound holds:
{
  "op": "range",
  "field": "Priority",
  "gte": 1,
  "lte": 5
}
Boolean Filter
{
  "op": "in",
  "field": "Confidential",
  "value": [false]
}
Complex Filter Example
For combining multiple filters, structure them appropriately based on your tag schema:
{
  "op": "in",
  "field": "Status",
  "value": ["approved", "published"]
}
Note: The exact filter structure depends on how your tags are configured in VectorDB Settings. Ensure tag names and types match your defined schema.
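To reason about which documents a filter would select, the "in" and "range" shapes shown in the examples can be evaluated locally against a document's tags. This is a client-side sketch of how those two shapes read, not the search engine's implementation; `matches` is a hypothetical helper, and treating a missing tag as a non-match is an assumption.

```python
import operator

# Bound keys accepted by a "range" filter, mapped to comparison functions.
_BOUNDS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}


def matches(filter_spec: dict, tags: dict) -> bool:
    """Evaluate one filter against a document's tags."""
    field = filter_spec["field"]
    if field not in tags:
        return False  # assumption: a missing tag never matches
    value = tags[field]
    op = filter_spec["op"]
    if op == "in":
        # Compare types as well, so that False does not match 0.
        return any(type(v) is type(value) and v == value for v in filter_spec["value"])
    if op == "range":
        # Every bound that is present must hold (AND logic).
        return all(compare(value, filter_spec[key])
                   for key, compare in _BOUNDS.items() if key in filter_spec)
    raise ValueError(f"unsupported op: {op!r}")


tags = {"Department": "sales", "Region": "north", "Year": 2024, "Confidential": False}
year_filter = {"op": "range", "field": "Year", "gt": 2020, "lte": 2024}
in_range = matches(year_filter, tags)  # True: 2024 > 2020 and 2024 <= 2024
```

Running candidate filters through a helper like this against a few sample tag sets is a quick way to catch a tag name or type mismatch before saving the scope.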
Integration with AI Systems
AI Agent Integration
When Vector DB is enabled, AI Agents can:
- Search the knowledge base using prompts as queries
- Apply search scopes for filtered results
- Include search results as context in responses
- Chain knowledge retrieval with TM AI Chain
To enable for an agent:
- Edit the TM AI Agent
- Check "Use Vector DB" in the Search Knowledge section
- Optionally select a Search Scope
- Save and test the agent
AI Chat Integration
Chat agents with Vector DB can:
- Search for relevant information based on user messages
- Maintain conversation context for multi-turn searches
- Apply search scopes for domain-specific knowledge
- Combine document context with search results
Configuration in TM AI Chat Agent:
- Enable the "Use Vector DB" option
- Select an appropriate Search Scope
- Configure query rewriting for multi-turn conversations
Search Behavior
Semantic Search
- Understands meaning and context, not just keywords
- Finds conceptually related content
- Works across languages with multilingual models
- Handles synonyms and related concepts automatically
Hybrid Search
The Dense Weight parameter controls search behavior:
- 1.0: Pure semantic search (meaning-based)
- 0.5: Balanced semantic and keyword matching
- 0.2: Primarily keyword-based with semantic influence
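One common way to read a weight like this is as a linear blend of a semantic score and a keyword score. The sketch below illustrates that reading only; the actual scoring formula used by Vector DB is not documented here, and `hybrid_score` and its inputs are hypothetical.

```python
def hybrid_score(dense: float, keyword: float, dense_weight: float = 0.5) -> float:
    """Blend a semantic score and a keyword score, both assumed to lie in [0, 1].

    Assumption: a simple linear blend; the real system may combine scores differently.
    """
    if not 0.2 <= dense_weight <= 1.0:
        raise ValueError("dense_weight must be between 0.2 and 1.0")
    return dense_weight * dense + (1.0 - dense_weight) * keyword
```

Under this reading, a weight of 1.0 ignores the keyword score entirely, while 0.2 still lets the semantic score contribute a fifth of the total.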
Result Ranking
Initial results are scored by relevance. With reranking enabled, the system:
- Retrieves more candidates (retrieve_count)
- Re-scores using advanced models
- Returns top results (limit)
- Provides more accurate relevance ordering
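The two-stage flow above can be sketched as follows. The rerank model is represented by a caller-supplied scoring function, since the real rerank models are not callable from this example; `retrieve_and_rerank` and its argument names are hypothetical.

```python
from typing import Callable, Sequence


def retrieve_and_rerank(
    candidates: Sequence[tuple[str, float]],
    rerank_score: Callable[[str], float],
    limit: int = 10,
    retrieve_count: int = 25,
) -> list[str]:
    """Keep the best retrieve_count candidates by first-stage score,
    re-score them, and return the top limit texts."""
    if retrieve_count < limit:
        raise ValueError("retrieve_count must be >= limit")
    # Stage 1: take more candidates than will finally be returned.
    first_stage = sorted(candidates, key=lambda c: c[1], reverse=True)[:retrieve_count]
    # Stage 2: re-score with the (stand-in) rerank function and cut to limit.
    reranked = sorted(first_stage, key=lambda c: rerank_score(c[0]), reverse=True)
    return [text for text, _ in reranked[:limit]]
```

A candidate that does not make the first-stage cut is never seen by the reranker, which is why Retrieve Count has to be at least as large as Limit.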
Permissions
VectorDB Admin Role
- Full control over documents and search scopes
- Manage global VectorDB settings
- View all documents regardless of tags
- Configure system-wide parameters
VectorDB User Role
- Create and manage own documents
- Read access to search scopes
- Cannot modify global settings
- Can use search in permitted contexts
Best Practices
Document Management
- Clear Titles: Use descriptive titles for easy identification
- Consistent Tagging: Establish naming conventions for tags
- Regular Updates: Keep documents current with scheduled syncs
- Error Monitoring: Review and resolve sync errors promptly
Search Optimization
- Dense Weight Tuning: Adjust based on content type (technical=lower, conceptual=higher)
- Scope Definition: Create specific scopes for different use cases
- Reranking Strategy: Enable for critical searches, disable for speed
- Chunk Settings: Use diffusion for context-heavy content
Performance Considerations
- Document Size: Break large documents into logical sections
- Batch Processing: Use bulk operations for large imports
- Search Limits: Balance result quality with response time
- Scope Filters: Narrow searches to relevant document sets
Troubleshooting
Common Issues
Documents stuck in "Pending" status
- Check that Vector DB is enabled in VectorDB Settings
- Review the system logs and error logs for sync errors
Poor search results
- Adjust Dense Weight parameter
- Enable reranking for better relevance
- Review document tagging accuracy
Search scope not working
- Verify scope is marked as Active
- Validate JSON filter syntax
- Ensure filtered documents exist
- Check tag values match filter criteria
Integration not finding knowledge
- Confirm Vector DB is enabled
- Check agent/chat has "Use Vector DB" enabled
- Verify search scope permissions
- Test search directly with same query
API Usage
For programmatic access via server scripts:
# Basic search
results = mantera.ai_search_knowledge("pricing policy")
# With search scope
results = mantera.ai_search_knowledge(
    query="vacation policy",
    search_scope="HR"
)
# With conversation context
results = mantera.ai_search_knowledge(
    query="follow-up question",
    messages=[
        {"role": "user", "content": "previous question"},
        {"role": "assistant", "content": "previous answer"}
    ]
)
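When the scope and conversation history are optional, the call arguments can be assembled in one place. The sketch below uses only the parameter names shown in the calls above (`query`, `search_scope`, `messages`); `build_search_kwargs` and its `max_messages` cap are hypothetical additions, and the shape of the returned results is not assumed.

```python
def build_search_kwargs(query, search_scope=None, history=None, max_messages=6):
    """Assemble keyword arguments for mantera.ai_search_knowledge.

    max_messages is a hypothetical cap on how much history is forwarded.
    """
    kwargs = {"query": query}
    if search_scope:
        kwargs["search_scope"] = search_scope
    if history:
        # Forward only the most recent turns of the conversation.
        kwargs["messages"] = list(history)[-max_messages:]
    return kwargs


# Inside a server script, where `mantera` is available:
# results = mantera.ai_search_knowledge(**build_search_kwargs("vacation policy", "HR"))
```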
Advanced Configuration
Multi-turn Conversations
Enable query rewriting in search scopes for chat applications:
- Edit the Search Scope
- Check "Rewrite Conversation"
- The system then reformulates queries based on chat history
- This improves relevance for follow-up questions
Chunk Management
For document-heavy searches:
- Chunk Diffusion: Includes surrounding paragraphs for context
- Chunk Group: Maintains document structure in results
- Rerank Only Chunk: Focuses scoring on content, not metadata
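The effect of Chunk Diffusion and Chunk Group can be illustrated on chunk indices alone. This sketch assumes that diffusion expands each hit symmetrically by the configured count and that grouping sorts chunks by their position in the document; the product's exact behavior may differ, and `diffuse_and_group` is a hypothetical name.

```python
from collections import defaultdict


def diffuse_and_group(hits, chunk_counts, diffusion=1):
    """Expand each (doc_id, chunk_index) hit by `diffusion` neighbours on each
    side, then group the chunk indices per document in document order."""
    if not 0 <= diffusion <= 5:
        raise ValueError("diffusion must be between 0 and 5")
    grouped = defaultdict(set)
    for doc_id, index in hits:
        # Clamp the window to the first and last chunk of the document.
        low = max(0, index - diffusion)
        high = min(chunk_counts[doc_id] - 1, index + diffusion)
        grouped[doc_id].update(range(low, high + 1))
    return {doc_id: sorted(indices) for doc_id, indices in grouped.items()}


hits = [("policy", 4), ("policy", 0), ("faq", 2)]
chunk_counts = {"policy": 6, "faq": 3}
expanded = diffuse_and_group(hits, chunk_counts, diffusion=1)
# expanded == {"policy": [0, 1, 3, 4, 5], "faq": [1, 2]}
```

Overlapping windows from nearby hits are merged by the set, so a chunk is never returned twice for the same document.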
Custom Filtering
Create complex filters for precise document selection:
- Combine multiple tag conditions
- Use comparison operators for numeric values
- Apply boolean logic for complex criteria
- Reference system fields for dynamic filtering