Vector DB

The Vector Database (Vector DB) feature provides semantic search capabilities across your organization's knowledge base. It enables AI agents and chat to retrieve relevant information from documents, enhancing their responses with contextual knowledge.

Overview

Vector DB allows you to:

  • Index text documents and attachments for semantic search
  • Define search scopes with custom parameters and filters
  • Integrate knowledge retrieval into AI agents and chat systems
  • Tag and organize documents for structured retrieval
  • Control search behavior with advanced preprocessing and postprocessing options

The system uses vector embeddings to understand document meaning, enabling searches based on conceptual similarity rather than just keyword matching.

VectorDB Settings

Global configuration for the Vector Database system.

Configuration Fields

Field | Description
Enabled | Master toggle to activate Vector DB functionality system-wide
Include Help Article | When enabled, automatically indexes Help Articles into the knowledge base
Status | System field showing the current build status (Built successfully/Updating)
Tags | Define custom tags for document categorization and filtering

Tag Configuration

Tags allow structured metadata on documents for filtered searches:

Field | Description
Name | Unique tag identifier (e.g., "User", "Application")
Type | Data type for tag values: string, bool, int64, or float32
System Generated | Indicates if the tag was created automatically by the system
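
Because each tag has a declared type, it can help to check tag values before saving a document. The following is a minimal sketch, not part of the product: the function names `validate_tag_value` and `validate_tags` are hypothetical, and the mapping from the four documented types to Python checks is an assumption.

```python
# Hypothetical client-side check of tag values against declared tag types.
# The type names come from the Tag Configuration table; the Python-side
# rules below are assumptions made for illustration.

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
FLOAT32_MAX = 3.4028234663852886e38  # largest finite float32 value


def validate_tag_value(tag_type: str, value) -> bool:
    """Return True if value is acceptable for the given tag type."""
    if tag_type == "string":
        return isinstance(value, str)
    if tag_type == "bool":
        return isinstance(value, bool)
    if tag_type == "int64":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (
            isinstance(value, int)
            and not isinstance(value, bool)
            and INT64_MIN <= value <= INT64_MAX
        )
    if tag_type == "float32":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        # NaN and infinity fail this comparison and are rejected.
        return abs(value) <= FLOAT32_MAX
    raise ValueError(f"unknown tag type: {tag_type!r}")


def validate_tags(schema: dict, tags: dict) -> list:
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    for name, value in tags.items():
        if name not in schema:
            problems.append(f"tag {name!r} is not defined in the schema")
        elif not validate_tag_value(schema[name], value):
            problems.append(f"tag {name!r} expects {schema[name]}, got {value!r}")
    return problems
```

Here `schema` maps tag names to one of the four type names, for example `{"Year": "int64"}`.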

VectorDB Document

Documents are the core content units indexed in the Vector Database.

Document Fields

Field | Description
Title | Display name for the document, used in search results
Type | Content type - either "Text" for direct content or "Attachment" for files
Status | Synchronization status (Pending/Syncing/Synced/Error)
Text Content | For Text type: Rich text editor content to be indexed
Attachment | For Attachment type: File to be processed and indexed
Tags | Custom metadata tags for filtering and categorization
Reference DocType | System field linking to source DocType
Reference Document | System field linking to source document
System Generated | Indicates automatic creation by system processes
Last Synced | Timestamp of last successful synchronization
Error Message | Details of any synchronization failures

Document Types

Text Documents

  • Direct text content entered through the editor
  • Supports rich formatting and structured content
  • Ideal for FAQs, policies, and knowledge articles

Attachment Documents

  • Supported formats: PDF, DOCX, TXT, MD, and more
  • Automatically extracts and indexes text content
  • Maintains reference to original file

Document Status Flow

  1. Pending: Newly created, awaiting initial sync
  2. Syncing: Currently being processed and indexed
  3. Synced: Successfully indexed and searchable
  4. Error: Failed to sync, check error message
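
The status flow above can be modeled as a small state machine, which is handy when writing monitoring scripts. This is only a sketch: the `SyncStatus` enum and helper names are hypothetical, and the transition table is an assumption inferred from the list (it does not model retries or re-syncs after edits, which the system may well support).

```python
from enum import Enum


class SyncStatus(Enum):
    PENDING = "Pending"
    SYNCING = "Syncing"
    SYNCED = "Synced"
    ERROR = "Error"


# Assumed forward transitions, inferred from the documented status flow.
ALLOWED_TRANSITIONS = {
    SyncStatus.PENDING: {SyncStatus.SYNCING},
    SyncStatus.SYNCING: {SyncStatus.SYNCED, SyncStatus.ERROR},
    SyncStatus.SYNCED: set(),
    SyncStatus.ERROR: set(),
}


def can_transition(current: SyncStatus, target: SyncStatus) -> bool:
    """True if the assumed flow allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def is_searchable(status: SyncStatus) -> bool:
    """Only synced documents are indexed and searchable."""
    return status is SyncStatus.SYNCED


def needs_attention(status: SyncStatus) -> bool:
    """Documents in Error should have their Error Message reviewed."""
    return status is SyncStatus.ERROR
```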

Managing Documents

Creating Documents

  1. Navigate to VectorDB Document list
  2. Click "Add VectorDB Document"
  3. Enter title and select type
  4. Add content or attach file
  5. Optionally add tags for categorization
  6. Save the document to queue it for synchronization

Bulk Operations

  • Use data import for batch document creation
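
A CSV file for data import can be generated with a short script. The sketch below assumes the import columns are named after the document fields ("Title", "Type", "Text Content"); the actual column headers should be taken from the template that the data import tool provides. The function name `build_import_csv` is hypothetical.

```python
import csv
import io

# Assumed column names, taken from the Document Fields table.
COLUMNS = ["Title", "Type", "Text Content"]


def build_import_csv(rows: list) -> str:
    """Build CSV text for a batch of Text-type documents.

    Each row is a dict with "title" and "text" keys.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {"Title": row["title"], "Type": "Text", "Text Content": row["text"]}
        )
    return buffer.getvalue()
```

The `csv` module quotes commas and line breaks inside the text content, so multi-paragraph articles survive the round trip.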

Document Tags

Tags enable precise filtering in searches:

{
  "Department": "sales",
  "Region": "north",
  "Year": 2024,
  "Confidential": false
}

VectorDB Search Scope

Search scopes define reusable search configurations with specific parameters and filters.

Scope Configuration

Field | Description
Name | Unique identifier for the search scope
Application | Application context for organizational grouping
Active | Enable/disable the scope without deletion
Filter (JSON) | JSON filter criteria for document selection

Search Parameters

Field | Description | Default
Limit | Maximum number of results to return | 10
Dense Weight | Balance between semantic (1.0) and literal (0.0) search. Allowed range: 0.2-1.0 | 0.5

Pre-Processing Options

Field | Description
Rewrite Conversation | Reformulates queries in multi-turn conversations for better context understanding

Post-Processing Options

Field | Description | Default
Rerank Switch | Enable result reranking for improved relevance | Off
Rerank Model | Model for reranking: "base-multilingual-rerank" or "m3-v2-rerank" | base-multilingual-rerank
Retrieve Count | Number of candidates for reranking (must be ≥ limit) | 25
Rerank Only Chunk | Rerank based on content only, excluding titles | Off
Chunk Diffusion Count | Include surrounding text chunks (0-5) | 0
Chunk Group | Aggregate and sort chunks by document order | Off
Get Image Link | Generate temporary download links for embedded images | Off
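
The constraints listed for these parameters (Dense Weight range, Retrieve Count ≥ Limit, Chunk Diffusion Count 0-5, the two rerank models) can be checked before a scope is saved. The following is a minimal sketch under stated assumptions: the `ScopeParams` class, its attribute names, and `validate_scope` are hypothetical and do not reflect the system's internal field names. The check that Limit is at least 1 is also an assumption.

```python
from dataclasses import dataclass

RERANK_MODELS = ("base-multilingual-rerank", "m3-v2-rerank")


@dataclass
class ScopeParams:
    # Defaults mirror the documented defaults.
    limit: int = 10
    dense_weight: float = 0.5
    rerank: bool = False
    rerank_model: str = "base-multilingual-rerank"
    retrieve_count: int = 25
    chunk_diffusion_count: int = 0


def validate_scope(params: ScopeParams) -> list:
    """Return a list of constraint violations; empty means valid."""
    problems = []
    if params.limit < 1:  # assumption: a limit below 1 is meaningless
        problems.append("limit must be at least 1")
    if not 0.2 <= params.dense_weight <= 1.0:
        problems.append("dense_weight must be between 0.2 and 1.0")
    if params.rerank_model not in RERANK_MODELS:
        problems.append(f"rerank_model must be one of {RERANK_MODELS}")
    if params.retrieve_count < params.limit:
        problems.append("retrieve_count must be >= limit")
    if not 0 <= params.chunk_diffusion_count <= 5:
        problems.append("chunk_diffusion_count must be between 0 and 5")
    return problems
```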

Filter Examples

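The filter JSON uses operators to define conditions on scalar fields (tags). The examples in this section use two values for "op":

  • in: Value must be in the specified array
  • range: Value must fall within the bounds given by one or more of the keys below

A range filter accepts these bound keys:

  • gt: Greater than
  • gte: Greater than or equal to
  • lt: Less than
  • lte: Less than or equal to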

Single Field Filter (String)

{
  "op": "in",
  "field": "Department",
  "value": ["hr", "finance"]
}

Range Filter (Numeric)

{
  "op": "range",
  "field": "Year",
  "gt": 2020,
  "lte": 2024
}

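Multiple Conditions (AND Logic)

A range filter applies all of its bounds together, so this example matches Priority values from 1 to 5 inclusive: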

{
  "op": "range",
  "field": "Priority",
  "gte": 1,
  "lte": 5
}

Boolean Filter

{
  "op": "in",
  "field": "Confidential",
  "value": [false]
}

Complex Filter Example

To combine multiple filters, structure them according to your tag schema:

{
  "op": "in",
  "field": "Status",
  "value": ["approved", "published"]
}

Note: The exact filter structure depends on how your tags are configured in VectorDB Settings. Ensure tag names and types match your defined schema.
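
To test a filter against sample tags before attaching it to a scope, the semantics of the examples above can be reproduced locally. This is an illustrative sketch, not the server's implementation: the function name `matches` is hypothetical, and treating a document that lacks the filtered tag as a non-match is an assumption.

```python
import operator

# Bound keys accepted by a "range" filter, mapped to comparisons.
BOUND_CHECKS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches(filter_spec: dict, tags: dict) -> bool:
    """Evaluate a single filter against one document's tags."""
    field = filter_spec["field"]
    if field not in tags:
        return False  # assumption: a missing tag never matches
    value = tags[field]
    op = filter_spec["op"]
    if op == "in":
        return value in filter_spec["value"]
    if op == "range":
        bounds = [(key, filter_spec[key]) for key in BOUND_CHECKS if key in filter_spec]
        if not bounds:
            raise ValueError("range filter needs at least one bound")
        # Every supplied bound must hold (AND logic).
        return all(BOUND_CHECKS[key](value, bound) for key, bound in bounds)
    raise ValueError(f"unsupported op: {op!r}")
```

For example, `matches({"op": "range", "field": "Year", "gt": 2020, "lte": 2024}, {"Year": 2024})` evaluates both bounds against the value 2024.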

Integration with AI Systems

AI Agent Integration

When Vector DB is enabled, AI Agents can:

  1. Search knowledge base using prompts as queries
  2. Apply search scopes for filtered results
  3. Include search results as context in responses
  4. Chain knowledge retrieval with TM AI Chain

To enable for an agent:

  1. Edit the TM AI Agent
  2. Check "Use Vector DB" in the Search Knowledge section
  3. Optionally select a Search Scope
  4. Save and test the agent

AI Chat Integration

Chat agents with Vector DB can:

  1. Search for relevant information based on user messages
  2. Maintain conversation context for multi-turn searches
  3. Apply search scopes for domain-specific knowledge
  4. Combine document context with search results

Configuration in TM AI Chat Agent:

  1. Enable "Use Vector DB" option
  2. Select appropriate Search Scope
  3. Configure rewrite for multi-turn conversations

Search Behavior

Semantic Search

  • Understands meaning and context, not just keywords
  • Finds conceptually related content
  • Works across languages with multilingual models
  • Handles synonyms and related concepts automatically

Hybrid Search

The Dense Weight parameter controls search behavior:

  • 1.0: Pure semantic search (meaning-based)
  • 0.5: Balanced semantic and keyword matching
  • 0.2: Primarily keyword-based with semantic influence
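
The effect of Dense Weight can be pictured as a weighted blend of two scores. This documentation does not state the exact formula, so the linear combination below is an assumption (it is a common way to build hybrid search), and `hybrid_score` is a hypothetical name.

```python
def hybrid_score(dense_score: float, keyword_score: float, dense_weight: float = 0.5) -> float:
    """Blend a semantic score and a keyword score.

    Assumes a linear combination; the real system may combine scores differently.
    """
    if not 0.2 <= dense_weight <= 1.0:
        raise ValueError("dense_weight must be between 0.2 and 1.0")
    return dense_weight * dense_score + (1.0 - dense_weight) * keyword_score
```

Under this assumption, a weight of 1.0 ignores the keyword score entirely, while 0.2 lets the keyword score contribute 80% of the total.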

Result Ranking

Initial results are scored by relevance. With reranking enabled, the system:

  1. Retrieves more candidates (retrieve_count)
  2. Re-scores them using the selected rerank model
  3. Returns the top results (limit)

This provides a more accurate relevance ordering.
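
The retrieve-then-rerank flow can be sketched in a few lines. This is a simplified illustration, not the system's implementation: `search_with_rerank` is a hypothetical name, and the `rescore` callable stands in for the rerank model.

```python
from typing import Callable


def search_with_rerank(
    candidates: list,
    rescore: Callable[[str], float],
    limit: int = 10,
    retrieve_count: int = 25,
) -> list:
    """Rerank (text, initial_score) pairs and return the top `limit`.

    `rescore` is a stand-in for the rerank model.
    """
    if retrieve_count < limit:
        raise ValueError("retrieve_count must be >= limit")
    # Step 1: keep the best retrieve_count candidates by initial score.
    pool = sorted(candidates, key=lambda c: c[1], reverse=True)[:retrieve_count]
    # Step 2: re-score each candidate with the (stand-in) rerank model.
    rescored = [(text, rescore(text)) for text, _ in pool]
    # Step 3: order by the new score and return the top results.
    rescored.sort(key=lambda c: c[1], reverse=True)
    return rescored[:limit]
```

A larger retrieve_count gives the reranker more candidates to promote, at the cost of more scoring work per query.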

Permissions

VectorDB Admin Role

  • Full control over documents and search scopes
  • Manage global VectorDB settings
  • View all documents regardless of tags
  • Configure system-wide parameters

VectorDB User Role

  • Create and manage own documents
  • Read access to search scopes
  • Cannot modify global settings
  • Can use search in permitted contexts

Best Practices

Document Management

  • Clear Titles: Use descriptive titles for easy identification
  • Consistent Tagging: Establish naming conventions for tags
  • Regular Updates: Keep documents current with scheduled syncs
  • Error Monitoring: Review and resolve sync errors promptly

Search Optimization

  • Dense Weight Tuning: Adjust based on content type (technical=lower, conceptual=higher)
  • Scope Definition: Create specific scopes for different use cases
  • Reranking Strategy: Enable for critical searches, disable for speed
  • Chunk Settings: Use diffusion for context-heavy content

Performance Considerations

  • Document Size: Break large documents into logical sections
  • Batch Processing: Use bulk operations for large imports
  • Search Limits: Balance result quality with response time
  • Scope Filters: Narrow searches to relevant document sets

Troubleshooting

Common Issues

Documents stuck in "Pending" status

  • Check if VectorDB is enabled in settings
  • Review system logs and error logs for sync errors

Poor search results

  • Adjust Dense Weight parameter
  • Enable reranking for better relevance
  • Review document tagging accuracy

Search scope not working

  • Verify scope is marked as Active
  • Validate JSON filter syntax
  • Ensure filtered documents exist
  • Check tag values match filter criteria

Integration not finding knowledge

  • Confirm Vector DB is enabled
  • Check agent/chat has "Use Vector DB" enabled
  • Verify search scope permissions
  • Test search directly with same query

API Usage

For programmatic access via server scripts:

# Basic search
results = mantera.ai_search_knowledge("pricing policy")

# With search scope
results = mantera.ai_search_knowledge(
    query="vacation policy",
    search_scope="HR"
)

# With conversation context
results = mantera.ai_search_knowledge(
    query="follow-up question",
    messages=[
        {"role": "user", "content": "previous question"},
        {"role": "assistant", "content": "previous answer"}
    ]
)
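
When the three call styles above are used from the same script, the arguments can be assembled by a small helper. This is a sketch only: `build_search_kwargs` and its `max_turns` parameter are hypothetical, and trimming the history to recent turns is an assumption about what is useful, not a documented requirement. The keyword names `query`, `search_scope`, and `messages` are the ones shown in the examples above.

```python
from typing import Optional


def build_search_kwargs(
    query: str,
    search_scope: Optional[str] = None,
    history: Optional[list] = None,
    max_turns: int = 3,
) -> dict:
    """Assemble keyword arguments for a knowledge search call."""
    kwargs = {"query": query}
    if search_scope:
        kwargs["search_scope"] = search_scope
    if history:
        # Keep only the most recent turns (one user + one assistant message each).
        recent = history[-2 * max_turns:]
        kwargs["messages"] = [
            {"role": m["role"], "content": m["content"]} for m in recent
        ]
    return kwargs
```

Inside a server script, the result could then be passed on as `mantera.ai_search_knowledge(**build_search_kwargs(...))`.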

Advanced Configuration

Multi-turn Conversations

Enable query rewriting in search scopes for chat applications:

  1. Edit the Search Scope
  2. Check "Rewrite Conversation"

The system then reformulates queries based on chat history, which improves relevance for follow-up questions.

Chunk Management

For document-heavy searches:

  • Chunk Diffusion: Includes surrounding paragraphs for context
  • Chunk Group: Maintains document structure in results
  • Rerank Only Chunk: Focuses scoring on content, not metadata
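
Chunk Diffusion and Chunk Group can be illustrated on chunk indices. The sketch below assumes that a diffusion count of N adds up to N neighbouring chunks on each side of a hit, and that grouping collects hits per document in document order. Both interpretations are assumptions, and the function names are hypothetical.

```python
def diffuse(hit_indices: list, diffusion_count: int, total_chunks: int) -> list:
    """Expand hit chunk indices with their neighbours on both sides."""
    if not 0 <= diffusion_count <= 5:
        raise ValueError("diffusion_count must be between 0 and 5")
    selected = set()
    for index in hit_indices:
        first = max(0, index - diffusion_count)
        last = min(total_chunks - 1, index + diffusion_count)
        selected.update(range(first, last + 1))
    return sorted(selected)


def group_chunks(hits: list) -> dict:
    """Group (document, chunk_index) hits by document, sorted by chunk order."""
    grouped = {}
    for document, index in hits:
        grouped.setdefault(document, []).append(index)
    return {document: sorted(indices) for document, indices in grouped.items()}
```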

Custom Filtering

Create complex filters for precise document selection:

  • Combine multiple tag conditions
  • Use comparison operators for numeric values
  • Apply boolean logic for complex criteria
  • Reference system fields for dynamic filtering
