Vector DB

The Vector Database (Vector DB) feature provides semantic search capabilities across your organization's knowledge base. It enables AI agents and chat to retrieve relevant information from documents, enhancing their responses with contextual knowledge.

Overview

Vector DB allows you to:

Index text documents and attachments for semantic search
Define search scopes with custom parameters and filters
Integrate knowledge retrieval into AI agents and chat systems
Tag and organize documents for structured retrieval
Control search behavior with advanced preprocessing and postprocessing options

The system uses vector embeddings to understand document meaning, enabling searches based on conceptual similarity rather than just keyword matching.

VectorDB Settings

Global configuration for the Vector Database system.

Configuration Fields

Field	Description
Enabled	Master toggle to activate Vector DB functionality system-wide
Include Help Article	When enabled, automatically indexes Help Articles into the knowledge base
Status	System field showing the current build status (Built successfully/Updating)
Tags	Define custom tags for document categorization and filtering

Tag Configuration

Tags allow structured metadata on documents for filtered searches:

Field	Description
Name	Unique tag identifier (e.g., "User", "Application")
Type	Data type for tag values: string, bool, int64, or float32
System Generated	Indicates if the tag was created automatically by the system
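
As an illustration of how the four tag types constrain values, here is a minimal Python sketch that checks a document's tags against a tag schema before saving. The helper names (matches_tag_type, validate_tags) and the schema dictionary are hypothetical and not part of the product. The rule that float32 tags also accept integers is an assumption.

```python
# Hypothetical helpers; the product may validate tags differently.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def matches_tag_type(value, tag_type):
    """Return True if value is acceptable for the declared tag type."""
    if tag_type == "string":
        return isinstance(value, str)
    if tag_type == "bool":
        return isinstance(value, bool)
    if tag_type == "int64":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(value, int) and not isinstance(value, bool)
                and INT64_MIN <= value <= INT64_MAX)
    if tag_type == "float32":
        # Assumption: integers are accepted where a float is expected.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    raise ValueError(f"unknown tag type: {tag_type}")


def validate_tags(tags, schema):
    """Collect problems; an empty list means the tags fit the schema."""
    problems = []
    for name, value in tags.items():
        if name not in schema:
            problems.append(f"{name}: tag is not defined in settings")
        elif not matches_tag_type(value, schema[name]):
            problems.append(
                f"{name}: expected {schema[name]}, got {type(value).__name__}")
    return problems


schema = {"Department": "string", "Year": "int64", "Confidential": "bool"}
print(validate_tags({"Department": "sales", "Year": "2024"}, schema))
# prints ['Year: expected int64, got str']
```

A check like this catches the common mistake of storing a numeric tag as a string, which would prevent range filters on that tag from matching.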

VectorDB Document

Documents are the core content units indexed in the Vector Database.

Document Fields

Field	Description
Title	Display name for the document, used in search results
Type	Content type - either "Text" for direct content or "Attachment" for files
Status	Synchronization status (Pending/Syncing/Synced/Error)
Text Content	For Text type: Rich text editor content to be indexed
Attachment	For Attachment type: File to be processed and indexed
Tags	Custom metadata tags for filtering and categorization
Reference DocType	System field linking to source DocType
Reference Document	System field linking to source document
System Generated	Indicates automatic creation by system processes
Last Synced	Timestamp of last successful synchronization
Error Message	Details of any synchronization failures

Document Types

Text Documents

Direct text content entered through the editor
Supports rich formatting and structured content
Ideal for FAQs, policies, and knowledge articles

Attachment Documents

Supported formats: PDF, DOCX, TXT, MD, and more
Automatically extracts and indexes text content
Maintains reference to original file

Document Status Flow

Pending: Newly created, awaiting initial sync
Syncing: Currently being processed and indexed
Synced: Successfully indexed and searchable
Error: Failed to sync; check the Error Message field for details
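
The status flow can be modeled as a small state machine. The sketch below is illustrative only: the four status names come from this page, but the transition table is an assumption, in particular the idea that a Synced or Error document returns to Syncing when it is re-processed.

```python
from enum import Enum


class SyncStatus(Enum):
    PENDING = "Pending"
    SYNCING = "Syncing"
    SYNCED = "Synced"
    ERROR = "Error"


# Assumed transitions; re-syncing after an edit or an error is a guess.
ALLOWED = {
    SyncStatus.PENDING: {SyncStatus.SYNCING},
    SyncStatus.SYNCING: {SyncStatus.SYNCED, SyncStatus.ERROR},
    SyncStatus.SYNCED: {SyncStatus.SYNCING},
    SyncStatus.ERROR: {SyncStatus.SYNCING},
}


def can_transition(current, new):
    """Return True if moving from current to new is allowed by the table."""
    return new in ALLOWED[current]


def is_searchable(status):
    """Only fully synced documents appear in search results."""
    return status is SyncStatus.SYNCED
```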

Managing Documents

Creating Documents

Navigate to the VectorDB Document list
Click "Add VectorDB Document"
Enter a title and select a type
Add content or attach a file
Optionally add tags for categorization
Save to queue the document for synchronization

Bulk Operations

Use data import for batch document creation
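
A batch of text documents can be prepared as a CSV file before import. The sketch below is a rough illustration: it assumes the import accepts a CSV whose column headers match the field labels listed under Document Fields (Title, Type, Text Content), which should be verified against the import template of your installation. The function name documents_to_csv is hypothetical.

```python
import csv
import io


def documents_to_csv(docs):
    """Serialize text documents to CSV, using field labels as column names."""
    columns = ["Title", "Type", "Text Content"]  # assumed import headers
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for doc in docs:
        writer.writerow({
            "Title": doc["title"],
            "Type": "Text",
            "Text Content": doc["content"],
        })
    return buffer.getvalue()


csv_text = documents_to_csv([
    {"title": "Refund policy", "content": "Refunds are issued within 14 days."},
    {"title": "Office hours", "content": "Support is open, Monday to Friday."},
])
print(csv_text)
```

The csv module quotes values that contain commas or line breaks, so document text survives the round trip unchanged.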

Document Tags

Tags enable precise filtering in searches:

{
  "Department": "sales",
  "Region": "north",
  "Year": 2024,
  "Confidential": false
}

VectorDB Search Scope

Search scopes define reusable search configurations with specific parameters and filters.

Scope Configuration

Field	Description
Name	Unique identifier for the search scope
Application	Application context for organizational grouping
Active	Enable/disable the scope without deletion
Filter (JSON)	JSON filter criteria for document selection

Search Parameters

Field	Description	Default
Limit	Maximum number of results to return	10
Dense Weight	Balance between semantic and literal (keyword) search. Higher values favor semantic matching, lower values favor keyword matching. Accepted range: 0.2-1.0	0.5

Pre-Processing Options

Field	Description
Rewrite Conversation	Reformulates queries in multi-turn conversations for better context understanding

Post-Processing Options

Field	Description	Default
Rerank Switch	Enable result reranking for improved relevance	Off
Rerank Model	Model for reranking: "base-multilingual-rerank" or "m3-v2-rerank"	base-multilingual-rerank
Retrieve Count	Number of candidates for reranking (must be ≥ limit)	25
Rerank Only Chunk	Rerank based on content only, excluding titles	Off
Chunk Diffusion Count	Include surrounding text chunks (0-5)	0
Chunk Group	Aggregate and sort chunks by document order	Off
Get Image Link	Generate temporary download links for embedded images	Off
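
The constraints in the tables above (Dense Weight between 0.2 and 1.0, Retrieve Count at least as large as Limit, Chunk Diffusion Count between 0 and 5) can be checked before a scope is saved. This is a minimal sketch; the ScopeParams class and its attribute names are hypothetical stand-ins for the scope fields, and the rule that Limit must be at least 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScopeParams:
    # Defaults mirror the documented defaults.
    limit: int = 10
    dense_weight: float = 0.5
    rerank: bool = False
    retrieve_count: int = 25
    chunk_diffusion_count: int = 0


def check_scope_params(p):
    """Return a list of violated constraints; empty means the scope is valid."""
    problems = []
    if p.limit < 1:  # assumption: a scope must return at least one result
        problems.append("limit must be at least 1")
    if not 0.2 <= p.dense_weight <= 1.0:
        problems.append("dense_weight must be between 0.2 and 1.0")
    if p.rerank and p.retrieve_count < p.limit:
        problems.append("retrieve_count must be >= limit when reranking")
    if not 0 <= p.chunk_diffusion_count <= 5:
        problems.append("chunk_diffusion_count must be between 0 and 5")
    return problems


print(check_scope_params(ScopeParams(limit=50, rerank=True)))
# prints ['retrieve_count must be >= limit when reranking']
```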

Filter Examples

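The filter JSON uses operators to define conditions on scalar fields (tags). As the examples in this section show, in is given as the op itself, while the comparison operators are given as bounds on a filter whose op is range. Supported operators include: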

in: Value must be in the specified array
gt: Greater than
gte: Greater than or equal to
lt: Less than
lte: Less than or equal to

Single Field Filter (String)

{
  "op": "in",
  "field": "Department",
  "value": ["hr", "finance"]
}

Range Filter (Numeric)

{
  "op": "range",
  "field": "Year",
  "gt": 2020,
  "lte": 2024
}

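Multiple Conditions (AND Logic)

When a range filter carries more than one bound, all bounds must hold. The filter below matches Priority values from 1 through 5:
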
{
  "op": "range",
  "field": "Priority",
  "gte": 1,
  "lte": 5
}

Boolean Filter

{
  "op": "in",
  "field": "Confidential",
  "value": [false]
}

Complex Filter Example

To combine multiple filters, structure them according to your tag schema. The example below applies a single condition to a string tag named Status:

{
  "op": "in",
  "field": "Status",
  "value": ["approved", "published"]
}

Note: The exact filter structure depends on how your tags are configured in VectorDB Settings. Ensure tag names and types match your defined schema.
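
To make the filter semantics concrete, the following Python sketch evaluates the two filter shapes shown above (op "in" and op "range") against a document's tag dictionary. It is a client-side approximation for testing filters, not the engine's implementation. The function name matches_filter is hypothetical, and treating a document that lacks the tag as a non-match is an assumption.

```python
def matches_filter(tags, flt):
    """Return True if the tag dictionary satisfies a single filter."""
    field = flt["field"]
    if field not in tags:
        return False  # assumption: a missing tag never matches
    value = tags[field]
    if flt["op"] == "in":
        return value in flt["value"]
    if flt["op"] == "range":
        # Every bound that is present must hold (AND logic).
        if "gt" in flt and not value > flt["gt"]:
            return False
        if "gte" in flt and not value >= flt["gte"]:
            return False
        if "lt" in flt and not value < flt["lt"]:
            return False
        if "lte" in flt and not value <= flt["lte"]:
            return False
        return True
    raise ValueError(f"unsupported op: {flt['op']}")


doc_tags = {"Department": "sales", "Region": "north",
            "Year": 2024, "Confidential": False}

print(matches_filter(doc_tags, {"op": "range", "field": "Year",
                                "gt": 2020, "lte": 2024}))
# prints True
print(matches_filter(doc_tags, {"op": "in", "field": "Department",
                                "value": ["hr", "finance"]}))
# prints False
```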

Integration with AI Systems

AI Agent Integration

When Vector DB is enabled, AI Agents can:

Search the knowledge base using prompts as queries
Apply search scopes for filtered results
Include search results as context in responses
Chain knowledge retrieval with TM AI Chain

To enable for an agent:

Edit the TM AI Agent
Check "Use Vector DB" in the Search Knowledge section
Optionally select a Search Scope
Save and test the agent

AI Chat Integration

Chat agents with Vector DB can:

Search for relevant information based on user messages
Maintain conversation context for multi-turn searches
Apply search scopes for domain-specific knowledge
Combine document context with search results

Configuration in TM AI Chat Agent:

Enable "Use Vector DB" option
Select appropriate Search Scope
Configure rewrite for multi-turn conversations

Search Behavior

Semantic Search

Understands meaning and context, not just keywords
Finds conceptually related content
Works across languages with multilingual models
Handles synonyms and related concepts automatically

Hybrid Search

The Dense Weight parameter controls search behavior:

1.0: Pure semantic search (meaning-based)
0.5: Balanced semantic and keyword matching
0.2: Primarily keyword-based with semantic influence
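
One common way to realize such a weight is linear interpolation between a semantic (dense) score and a keyword (sparse) score. The sketch below assumes that formula purely for illustration; this page does not state how the two signals are actually fused, and the function name hybrid_score is hypothetical.

```python
def hybrid_score(dense, sparse, dense_weight=0.5):
    """Blend two relevance scores; assumes both are on the same 0-1 scale."""
    if not 0.2 <= dense_weight <= 1.0:
        raise ValueError("dense_weight must be between 0.2 and 1.0")
    # Assumed fusion rule: weighted average of the two scores.
    return dense_weight * dense + (1.0 - dense_weight) * sparse


# A chunk that matches the meaning well but shares few exact keywords.
for weight in (1.0, 0.5, 0.2):
    print(weight, round(hybrid_score(0.8, 0.4, weight), 3))
```

Under this assumption, lowering the weight pulls the combined score toward the keyword score, which is why lower values suit content where exact terms matter.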

Result Ranking

Initial results are scored by relevance. With reranking enabled, the system:

Retrieves more candidates (retrieve_count)
Re-scores them using the selected Rerank Model
Returns the top results (limit)

This provides more accurate relevance ordering.
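
The two-stage flow can be sketched in a few lines of Python. Everything here is illustrative: search_with_rerank is a hypothetical name, the candidates are plain (text, score) pairs, and rerank_fn stands in for the rerank model, whose real interface is not described on this page.

```python
def search_with_rerank(candidates, rerank_fn, limit=10, retrieve_count=25):
    """Keep the best retrieve_count candidates, re-score them, return the top limit."""
    if retrieve_count < limit:
        raise ValueError("retrieve_count must be >= limit")
    # Stage 1: keep the strongest candidates by their initial score.
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)[:retrieve_count]
    # Stage 2: re-score the pool with the (stand-in) rerank model.
    reranked = sorted(pool, key=lambda c: rerank_fn(c[0]), reverse=True)
    return [text for text, _ in reranked[:limit]]


# Toy reranker: prefers longer texts. A real model would score query relevance.
hits = [("a", 0.9), ("bbb", 0.8), ("cc", 0.7), ("dddd", 0.1)]
print(search_with_rerank(hits, rerank_fn=len, limit=2, retrieve_count=3))
# prints ['bbb', 'cc']
```

Note that "dddd" never reaches the reranker because it falls outside the candidate pool. This is why a larger Retrieve Count can improve results at the cost of more rerank work.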

Permissions

VectorDB Admin Role

Full control over documents and search scopes
Manage global VectorDB settings
View all documents regardless of tags
Configure system-wide parameters

VectorDB User Role

Create and manage own documents
Read access to search scopes
Cannot modify global settings
Can use search in permitted contexts

Best Practices

Document Management

Clear Titles: Use descriptive titles for easy identification
Consistent Tagging: Establish naming conventions for tags
Regular Updates: Keep documents current with scheduled syncs
Error Monitoring: Review and resolve sync errors promptly

Search Optimization

Dense Weight Tuning: Adjust based on content type (lower for technical content that relies on exact terms, higher for conceptual content)
Scope Definition: Create specific scopes for different use cases
Reranking Strategy: Enable for critical searches, disable for speed
Chunk Settings: Use diffusion for context-heavy content

Performance Considerations

Document Size: Break large documents into logical sections
Batch Processing: Use bulk operations for large imports
Search Limits: Balance result quality with response time
Scope Filters: Narrow searches to relevant document sets

Troubleshooting

Common Issues

Documents stuck in "Pending" status

Check if VectorDB is enabled in settings
Review the system logs and error logs for sync errors

Poor search results

Adjust Dense Weight parameter
Enable reranking for better relevance
Review document tagging accuracy

Search scope not working

Verify scope is marked as Active
Validate JSON filter syntax
Ensure filtered documents exist
Check tag values match filter criteria
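
A quick way to rule out syntax problems is to parse the filter locally before pasting it into the scope. The sketch below uses Python's json module and checks only the two op values that appear in the examples on this page ("in" and "range"). The function name check_filter_json is hypothetical, and the engine may accept shapes this check rejects.

```python
import json

RANGE_BOUNDS = ("gt", "gte", "lt", "lte")


def check_filter_json(text):
    """Return a list of problems found in a filter string; empty means it looks valid."""
    try:
        flt = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]
    if not isinstance(flt, dict):
        return ["filter must be a JSON object"]
    problems = []
    if not isinstance(flt.get("field"), str) or not flt.get("field"):
        problems.append("'field' must be a non-empty string")
    op = flt.get("op")
    if op == "in":
        if not isinstance(flt.get("value"), list) or not flt["value"]:
            problems.append("'in' needs a non-empty 'value' array")
    elif op == "range":
        if not any(bound in flt for bound in RANGE_BOUNDS):
            problems.append("'range' needs at least one of gt, gte, lt, lte")
    else:
        problems.append(f"unsupported op: {op!r}")
    return problems


print(check_filter_json('{"op": "range", "field": "Year"}'))
# prints ["'range' needs at least one of gt, gte, lt, lte"]
```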

Integration not finding knowledge

Confirm Vector DB is enabled
Check agent/chat has "Use Vector DB" enabled
Verify search scope permissions
Test search directly with same query

API Usage

For programmatic access via server scripts:

# Basic search
results = mantera.ai_search_knowledge("pricing policy")

# With search scope
results = mantera.ai_search_knowledge(
    query="vacation policy",
    search_scope="HR"
)

# With conversation context
results = mantera.ai_search_knowledge(
    query="follow-up question",
    messages=[
        {"role": "user", "content": "previous question"},
        {"role": "assistant", "content": "previous answer"}
    ]
)
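
When the three call styles above are used from several scripts, it can help to assemble the arguments in one place. The helper below is a hypothetical convenience, not part of the mantera API: it only builds the keyword arguments already shown above (query, search_scope, messages). The cap of six history messages is an arbitrary assumption. It avoids imports in case the server script environment restricts them.

```python
def build_search_kwargs(query, search_scope=None, history=None, max_messages=6):
    """Build keyword arguments for mantera.ai_search_knowledge."""
    if not query or not query.strip():
        raise ValueError("query must not be empty")
    kwargs = {"query": query.strip()}
    if search_scope:
        kwargs["search_scope"] = search_scope
    if history:
        # Keep only chat turns, and only the most recent ones.
        usable = [m for m in history if m.get("role") in ("user", "assistant")]
        if usable:
            kwargs["messages"] = usable[-max_messages:]
    return kwargs


kwargs = build_search_kwargs(
    "  and for part-time staff?  ",
    search_scope="HR",
    history=[
        {"role": "user", "content": "What is the vacation policy?"},
        {"role": "assistant", "content": "Full-time staff get 25 days."},
    ],
)
print(kwargs["query"], len(kwargs["messages"]))
# prints: and for part-time staff? 2
```

Inside a server script, the result would then be passed on as mantera.ai_search_knowledge(**kwargs).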

Advanced Configuration

Multi-turn Conversations

Enable query rewriting in search scopes for chat applications:

Edit the Search Scope
Check "Rewrite Conversation"
The system then reformulates queries based on chat history
This improves relevance for follow-up questions

Chunk Management

For document-heavy searches:

Chunk Diffusion: Includes surrounding paragraphs for context
Chunk Group: Maintains document structure in results
Rerank Only Chunk: Focuses scoring on content, not metadata
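
The effect of Chunk Diffusion and Chunk Group can be pictured with chunk positions inside a document. The sketch below is a conceptual model only: it assumes each chunk has an integer position within its document and that diffusion adds the same number of neighbors on each side. The function names diffuse and group_by_document are hypothetical.

```python
def diffuse(hit_indices, diffusion, total_chunks):
    """Expand each hit to include up to `diffusion` neighbors on each side."""
    expanded = set()
    for i in hit_indices:
        lo = max(0, i - diffusion)
        hi = min(total_chunks - 1, i + diffusion)
        expanded.update(range(lo, hi + 1))
    return sorted(expanded)


def group_by_document(hits):
    """Group (doc_id, chunk_index) hits by document and restore reading order."""
    grouped = {}
    for doc_id, index in hits:
        grouped.setdefault(doc_id, []).append(index)
    return {doc_id: sorted(indices) for doc_id, indices in grouped.items()}


print(diffuse([3], diffusion=1, total_chunks=10))
# prints [2, 3, 4]
print(group_by_document([("policy", 5), ("faq", 1), ("policy", 2)]))
# prints {'policy': [2, 5], 'faq': [1]}
```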

Custom Filtering

Create complex filters for precise document selection:

Combine multiple tag conditions
Use comparison operators for numeric values
Apply boolean logic for complex criteria
Reference system fields for dynamic filtering