
Syntera Marketplace Documentation

Introduction


Syntera Marketplace is an enterprise-grade platform for deploying, managing, and accessing vector databases, large language models (LLMs), and AI agents on your infrastructure. This documentation provides comprehensive guidance for efficiently utilizing our vector database offerings to power semantic search, recommendations, and other AI-enabled applications.

Table of Contents


  1. Platform Overview
  2. Available Vector Databases
  3. Getting Started
  4. Database Selection Guide
  5. Database Configuration
  6. API Reference
  7. Security & Compliance
  8. Performance Optimization
  9. Troubleshooting
  10. Use Case Examples
  11. Best Practices
  12. FAQs

1. Platform Overview


Syntera Marketplace enables organizations to deploy and manage high-performance vector databases through a streamlined interface. Our vector database solutions are designed around the capabilities below.

1.1 Key Features

  • On-Premises Deployment: Run vector databases on your own infrastructure for maximum data security and compliance
  • Multi-tenant Capability: Support multiple applications and teams with isolated resources
  • Configurable Indices: Optimize for search performance, accuracy, or memory efficiency
  • Performance Monitoring: Track query latency, throughput, and resource utilization
  • Enterprise Support: Dedicated assistance for deployment and optimization

1.2 Vector Database Architecture

The Syntera Vector Database service consists of:

  • Model Repository: Curated collection of open-source vector database solutions
  • Deployment Engine: Automated system for provisioning databases on your infrastructure
  • Management Console: Web interface for configuration and monitoring
  • API Gateway: Secure access point for application integration

1.3 System Architecture


[Figure: Vector Database System Architecture]

2. Available Vector Databases


2.1 Vector Database Options

| Database | Description | Best For |
|---|---|---|
| Chroma DB | Open-source, fast, and scalable vector database | General-purpose vector search, Python applications |
| Milvus | Distributed vector database built on Kubernetes | High-scale production deployments |
| PG Vector | Vector extensions for PostgreSQL | Applications already using PostgreSQL |
| Pinecone DB | Fully-managed vector database service | Real-time, high-performance vector search |
| Qdrant | Vector database with extended filtering capabilities | Complex filtering and hybrid search |
| Weaviate | Semantic vector search engine | Knowledge graph applications |

2.2 Database Comparison

| Feature | Chroma DB | Milvus | PG Vector | Pinecone DB | Qdrant | Weaviate |
|---|---|---|---|---|---|---|
| ANN Algorithms | HNSW, Flat | HNSW, IVF, GPU | HNSW, IVF | HNSW | HNSW | HNSW |
| Metadata Filtering | ✓ | ✓ | ✓ (via SQL) | ✓ | ✓ | ✓ |
| Storage Options | In-memory, Disk | Distributed | PostgreSQL | Managed | File, Disk | Disk |
| Scaling Model | Vertical | Horizontal | Vertical/Horizontal | Managed | Horizontal | Horizontal |
| License | Apache 2.0 | Apache 2.0 | PostgreSQL | Commercial | Apache 2.0 | BSD |
| Best Use Case | Development, Prototyping | Enterprise, High Scale | SQL Integration | Managed Service | Filtering-Heavy | Knowledge Graphs |

2.3 Indexing Algorithms

  • HNSW (Hierarchical Navigable Small World): Fast approximate search with controllable accuracy
  • Flat: Exact search with highest accuracy but slower performance
  • IVF (Inverted File Index): Balance between memory usage and search speed
  • PQ (Product Quantization): Compression technique for memory efficiency
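To make the speed/accuracy trade-off concrete, here is a minimal sketch (using the open-source faiss library, not the Syntera API) that builds a Flat index and an HNSW index over the same random vectors:

import numpy as np
import faiss  # pip install faiss-cpu

dim = 128
vectors = np.random.rand(10_000, dim).astype("float32")
query = np.random.rand(1, dim).astype("float32")

# Flat index: exhaustive, exact search
flat = faiss.IndexFlatL2(dim)
flat.add(vectors)
exact_dist, exact_ids = flat.search(query, 5)

# HNSW index: approximate, much faster at scale
# M controls graph connectivity; higher = better recall, more memory
hnsw = faiss.IndexHNSWFlat(dim, 32)
hnsw.hnsw.efConstruction = 200  # build-time quality
hnsw.hnsw.efSearch = 64         # query-time recall knob
hnsw.add(vectors)
approx_dist, approx_ids = hnsw.search(query, 5)

print("exact:", exact_ids[0], "approx:", approx_ids[0])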

3. Getting Started


3.1 Account Setup

  1. Registration: Create your organization account at marketplace.syntera.ai
  2. Environment Setup: Configure your infrastructure details
  3. Authentication: Generate API keys for secure access

3.2 Deploying Your First Vector Database

Follow these steps to deploy your first vector database:

  1. Navigate to the Vector Databases section in the Marketplace dashboard
  2. Browse and select a database suitable for your use case
  3. Click "Configure" to set database parameters
  4. Deploy the database to your infrastructure
  5. Retrieve connection details (endpoint URL, API key, credentials)
  6. Test connectivity using the built-in playground
  7. Integrate the API endpoint into your application

3.3 Basic Operations Workflow

A typical vector database workflow includes:

  1. Convert Data to Embeddings: Transform text, images, or other data into vector embeddings
  2. Store Vectors: Add vectors and associated metadata to your database
  3. Create Indices: Build optimized indices for fast similarity search
  4. Query for Similarity: Search for similar vectors based on distance metrics
  5. Filter Results: Apply metadata filters to narrow down search results
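A minimal sketch of steps 2-5 using an in-memory Chroma client (Chroma builds its index automatically, so step 3 is implicit; a Syntera deployment would use the HTTP endpoint from section 3.2):

import chromadb

client = chromadb.Client()
collection = client.create_collection("quickstart")

# Store vectors with metadata (embeddings would normally come from a model)
collection.add(
    ids=["a", "b"],
    embeddings=[[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]],
    documents=["first document", "second document"],
    metadatas=[{"lang": "en"}, {"lang": "de"}],
)

# Query for similar vectors, filtered by metadata
results = collection.query(
    query_embeddings=[[0.1, 0.8, 0.3]],
    n_results=1,
    where={"lang": "en"},
)
print(results["documents"])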

4. Database Selection Guide


4.1 Evaluation Criteria

Consider these factors when selecting a vector database:

  • Scale Requirements: Expected vector count and query volume
  • Query Latency Needs: Importance of low-latency responses
  • Accuracy Requirements: Tolerance for approximate results
  • Filtering Capabilities: Need for complex metadata filtering
  • Integration Needs: Compatibility with existing tech stack
  • Resource Constraints: Available infrastructure capacity

4.2 Use Case Matching

| Use Case | Recommended Databases | Important Features |
|---|---|---|
| Semantic Search | Chroma DB, Weaviate | Hybrid search, text preprocessing |
| Recommendation Systems | Milvus, Pinecone DB | High throughput, real-time updates |
| Image Similarity | Qdrant, Milvus | Multi-modal support, filtering |
| Document Retrieval | Chroma DB, PG Vector | Text chunking, metadata storage |
| Anomaly Detection | Qdrant, Pinecone DB | Fast indexing, distance metrics |
| Multi-tenant Applications | Milvus, Weaviate | Collection isolation, access control |

4.3 Hardware Considerations

Different databases have varying hardware requirements:

  • Memory-Optimized: For in-memory databases like Chroma (in-memory mode)
  • Storage-Optimized: For disk-based databases with large vector collections
  • CPU-Intensive: For high query throughput requirements
  • GPU-Accelerated: For databases supporting GPU acceleration (e.g., Milvus)

5. Database Configuration


5.1 Basic Configuration

5.1.1 Database Information

  • Instance Name: Identifier for your deployed database
  • Database Type: Selected vector database solution (e.g., Chroma DB, Milvus)

5.1.2 Embedding Settings

| Parameter | Type | Description | Considerations |
|---|---|---|---|
| Embedding Model | Selection | Model used to generate vectors | Dimensionality, domain compatibility |
| Embedding Dimension | Integer | Size of vector embeddings | Higher dimensions = better accuracy, more resources |
| Batch Size | Integer | Records processed together | Higher = faster ingestion, more memory |
| Max Sequence Length | Integer | Token limit for text inputs | Higher = more context, slower processing |
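As an illustration of how these settings surface in client code, the sketch below sets a sequence limit and batch size when generating embeddings with sentence-transformers; the parameter names here belong to that library, not to Syntera configuration keys:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
model.max_seq_length = 256  # inputs longer than this are truncated

texts = ["short product blurb"] * 1000
# Larger batches speed up ingestion but need more memory
embeddings = model.encode(texts, batch_size=64, show_progress_bar=False)
print(embeddings.shape)  # (1000, 384) -- this model produces 384-dim vectors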

5.2 Indexing Configuration

5.2.1 Indexing Strategy

| Parameter | Type | Description | Trade-offs |
|---|---|---|---|
| Index Type | Selection | Algorithm for similarity search | Speed vs. accuracy vs. memory |
| HNSW M Parameter | Integer | Graph connectivity in HNSW | Higher = better recall, more memory |
| EF Construction | Integer | Search quality during build | Higher = better quality, slower builds |
| Vector Space | Selection | Distance metric (cosine, L2, dot) | Depends on embedding properties |
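For intuition, these knobs map directly onto the open-source hnswlib library (an illustration of the parameters themselves, not of the Syntera console):

import hnswlib
import numpy as np

dim = 768
data = np.random.rand(5000, dim).astype("float32")

# space = distance metric; M and ef_construction mirror the table above
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=5000, ef_construction=200, M=16)
index.add_items(data, np.arange(5000))

index.set_ef(100)  # query-time search depth (should be >= k)
labels, distances = index.knn_query(data[:1], k=5)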

5.2.2 Collection Settings

  • Default Collection: Name for the primary vector collection
  • Embedding Function: Method for converting data to vectors
  • Metadata Configuration: Schema for additional attributes

5.3 Infrastructure Configuration

  • CPU Allocation: Processing power for query handling
  • Memory Allocation: RAM for index and cache storage
  • Storage Settings: Persistence configuration for vectors
  • Scaling Options: Configuration for handling growth

6. API Reference


6.1 Authentication

All API requests require authentication:

curl -X POST "https://your-db-instance.chroma.ai/api/v1/collections/my_collection/query" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query_embeddings": [[0.1, 0.2, ...]]
  }'

6.2 Common Endpoints

6.2.1 Collections Management

POST /api/v1/collections

Create a new collection:

{
  "name": "my_collection",
  "metadata": {
    "description": "Product catalog embeddings"
  }
}

List collections:

GET /api/v1/collections

Delete a collection:

DELETE /api/v1/collections/my_collection

6.2.2 Vector Operations

Add vectors:

POST /api/v1/collections/my_collection/add
{
  "ids": ["id1", "id2"],
  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
  "documents": ["text document 1", "text document 2"],
  "metadatas": [
    {"category": "electronics", "price": 99.99},
    {"category": "books", "price": 12.99}
  ]
}

Query vectors:

POST /api/v1/collections/my_collection/query
{
  "query_embeddings": [[0.1, 0.2, ...]],
  "n_results": 5,
  "where": {"category": "electronics"},
  "include": ["documents", "metadatas", "distances"]
}
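The same add-and-query flow from Python, using only the endpoints documented above (instance URL and API key are placeholders):

import requests

BASE = "https://your-db-instance.chroma.ai/api/v1/collections/my_collection"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"}

# Add two vectors with metadata
requests.post(f"{BASE}/add", headers=HEADERS, json={
    "ids": ["id1", "id2"],
    "embeddings": [[0.1, 0.2, 0.3], [0.3, 0.4, 0.5]],
    "metadatas": [{"category": "electronics"}, {"category": "books"}],
}).raise_for_status()

# Query the five nearest neighbours, filtered by category
resp = requests.post(f"{BASE}/query", headers=HEADERS, json={
    "query_embeddings": [[0.1, 0.2, 0.3]],
    "n_results": 5,
    "where": {"category": "electronics"},
})
print(resp.json())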

6.3 Database-Specific APIs

Different vector databases may have additional specialized endpoints:

Chroma DB

POST /api/v1/collections/my_collection/get
{
  "ids": ["id1", "id2"],
  "include": ["embeddings", "documents", "metadatas"]
}

Qdrant

POST /collections/my_collection/points/search
{
  "vector": [0.1, 0.2, ...],
  "limit": 10,
  "filter": {
    "must": [
      {"key": "category", "match": {"value": "electronics"}}
    ]
  }
}

6.4 Response Format

6.4.1 Query Response

{
  "ids": [["id5", "id1", "id8"]],
  "distances": [[0.12, 0.28, 0.35]],
  "documents": [["document text 5", "document text 1", "document text 8"]],
  "metadatas": [[
    {"category": "electronics", "price": 199.99},
    {"category": "electronics", "price": 99.99},
    {"category": "electronics", "price": 149.99}
  ]]
}

6.4.2 Error Response

{
  "error": {
    "code": 400,
    "message": "Invalid request: missing required parameter 'embeddings'",
    "type": "invalid_request_error",
    "param": "embeddings"
  }
}

7. Security & Compliance


7.1 Data Protection

  • Data Isolation: All vector data remains within your infrastructure
  • Encryption at Rest: Configure storage encryption for sensitive vector data
  • Encryption in Transit: TLS for all API communications
  • Data Backup: Scheduled backups of vector collections and metadata

7.2 Access Control

  • API Key Management: Create, rotate, and revoke access credentials
  • Role-Based Access: Configure permissions for different users/services
  • Collection-Level Permissions: Restrict access to specific vector collections
  • IP Restrictions: Limit API access to trusted networks

7.3 Compliance Features

  • Audit Logging: Track all operations for compliance requirements
  • GDPR Support: Features to help meet data protection regulations
  • Data Deletion: Purge vectors and associated metadata on request
  • PII Handling: Best practices for personally identifiable information

8. Performance Optimization


8.1 Index Optimization

| Algorithm | Parameter | Effect | Recommendation |
|---|---|---|---|
| HNSW | M | Controls graph connectivity | 12-48 (higher for better accuracy) |
| HNSW | ef_construction | Search quality during build | 100-500 (higher for better index) |
| HNSW | ef_search | Search quality at query time | 50-500 (higher for better recall) |
| IVF | nlist | Number of clusters | sqrt(vector_count) to 4*sqrt(vector_count) |
| IVF | nprobe | Clusters to search | 1-10% of nlist value |
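A small helper that applies this table's heuristics (illustrative sizing rules of thumb, not an official tool):

import math

def suggest_index_params(vector_count: int) -> dict:
    """Apply the rules of thumb from the table above."""
    root = max(1, math.isqrt(vector_count))
    nlist = 2 * root                      # within the sqrt(N)..4*sqrt(N) band
    nprobe = max(1, nlist // 20)          # roughly 5% of nlist
    m = 16 if vector_count < 1_000_000 else 32  # HNSW connectivity (see 11.2)
    return {"ivf": {"nlist": nlist, "nprobe": nprobe},
            "hnsw": {"M": m, "ef_construction": 200, "ef_search": 100}}

print(suggest_index_params(5_000_000))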

8.2 Resource Allocation Guidelines

| Vector Count | Dimensions | Recommended RAM | CPU | Storage |
|---|---|---|---|---|
| < 100K | 768 | 4 GB | 2 cores | 10 GB SSD |
| 100K - 1M | 768 | 8-16 GB | 4 cores | 50 GB SSD |
| 1M - 10M | 768 | 32-64 GB | 8 cores | 200 GB SSD |
| 10M - 100M | 768 | 128+ GB | 16+ cores | 1+ TB SSD |

8.3 Query Optimization

  • Vector Caching: Enable caching for frequently accessed vectors
  • Batch Processing: Group similar requests to reduce overhead
  • Asynchronous Operations: Use non-blocking queries for high-throughput scenarios
  • Metadata Indexing: Create indices on frequently filtered metadata fields
  • Query Planning: Structure complex queries for optimal execution
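For example, batching turns several searches into a single request against the query endpoint from section 6.2.2 (the embeddings below are placeholders):

import requests

# Several searches in one round-trip instead of N separate requests
queries = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

resp = requests.post(
    "https://your-db-instance.chroma.ai/api/v1/collections/my_collection/query",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"query_embeddings": queries, "n_results": 5},
)
# One result list per query embedding, in the same order (see section 6.4.1)
for ids in resp.json()["ids"]:
    print(ids)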

8.4 Scaling Strategies

  • Vertical Scaling: Increase resources (memory, CPU) for single-node deployments
  • Horizontal Scaling: Add nodes for distributed databases (Milvus, Qdrant, Weaviate)
  • Sharding: Distribute collections across multiple shards
  • Replication: Create read replicas for higher throughput
  • Load Balancing: Distribute queries across available resources

9. Troubleshooting


9.1 Common Issues

| Issue | Possible Causes | Resolution |
|---|---|---|
| High Query Latency | Insufficient resources, unoptimized index | Increase resources, optimize index parameters |
| Poor Search Quality | Inappropriate algorithm, low ef_search | Adjust search parameters, switch index type |
| Out of Memory Errors | Index too large for RAM | Increase memory, switch to disk-based storage |
| Slow Indexing | High dimensionality, strict parameters | Adjust batch size, relax index quality parameters |
| Connection Failures | Network issues, authentication problems | Check firewall rules, verify credentials |

9.2 Diagnostic Tools

  • Health Check API: Verify service status and component health
  • Metrics Dashboard: Monitor performance and resource utilization
  • Query Profiler: Analyze slow queries to identify bottlenecks
  • Log Analysis: Review logs for errors and warning patterns
  • Connection Tester: Validate network connectivity and authentication

9.3 Common Error Codes

| Error Code | Description | Troubleshooting |
|---|---|---|
| 400 | Bad Request | Check request format and parameters |
| 401 | Unauthorized | Verify API credentials |
| 404 | Not Found | Confirm collection/endpoint exists |
| 413 | Payload Too Large | Reduce batch size or vector dimension |
| 429 | Too Many Requests | Implement rate limiting or increase quota |
| 503 | Service Unavailable | Check resource utilization and service status |
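A common client-side pattern is to retry the transient codes (429, 503) with exponential backoff; a minimal sketch:

import time
import requests

def post_with_retry(url, headers, payload, max_retries=5):
    """Retry transient failures (429/503) with exponential backoff."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code not in (429, 503):
            resp.raise_for_status()  # surface 4xx errors immediately
            return resp.json()
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"Gave up after {max_retries} retries: {resp.status_code}")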

10. Use Case Examples


10.1 Semantic Product Search

import chromadb
from sentence_transformers import SentenceTransformer

# Connect to your Syntera-deployed Chroma instance over HTTP
# (chromadb.Client takes no endpoint; the HTTP client is chromadb.HttpClient,
# with the platform's Bearer token passed as a header)
client = chromadb.HttpClient(
    host="your-db-instance.chroma.ai",
    ssl=True,
    headers={"Authorization": "Bearer your-api-key"}
)

# Initialize or get collection
collection = client.get_or_create_collection("product_search")

# Load embedding model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Add product descriptions
product_texts = [
    "Wireless ergonomic keyboard with RGB backlight",
    "Ultra-thin laptop with 16-hour battery life",
    "Noise-cancelling bluetooth headphones"
]
product_ids = ["product-1", "product-2", "product-3"]
product_embeddings = model.encode(product_texts).tolist()
product_metadata = [
    {"category": "accessories", "price": 79.99},
    {"category": "computers", "price": 1299.99},
    {"category": "audio", "price": 249.99}
]

collection.add(
    embeddings=product_embeddings,
    documents=product_texts,
    metadatas=product_metadata,
    ids=product_ids
)

# Search for products
query = "wireless keyboard for gaming"
query_embedding = model.encode([query]).tolist()

results = collection.query(
    query_embeddings=query_embedding,
    n_results=2,
    include=["documents", "metadatas", "distances"]
)

print("Search results for:", query)
for i, (doc, metadata, distance) in enumerate(zip(
    results['documents'][0], 
    results['metadatas'][0], 
    results['distances'][0]
)):
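    # 1 - distance gives a rough similarity score (assumes cosine distance)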
    print(f"{i+1}. {doc} (Score: {1-distance:.2f})")
    print(f"   Category: {metadata['category']}, Price: ${metadata['price']}")

10.2 Recommendation System

import numpy as np
import requests
import json

API_URL = "https://your-db-instance.qdrant.ai"
API_KEY = "your-api-key"
COLLECTION = "product_recommendations"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

# Function to get user embeddings based on behavior
def get_user_embedding(user_id):
    # In a real system, this would be generated from user behavior
    # For this example, we create a random embedding
    return np.random.rand(768).tolist()

# Get similar products based on user preferences
def get_recommendations(user_embedding, category=None, limit=5):
    query_data = {
        "vector": user_embedding,
        "limit": limit,
        # Qdrant omits stored payloads by default; request them so results
        # include the product attributes printed below
        "with_payload": True
    }

    # Add category filter if specified
    if category:
        query_data["filter"] = {
            "must": [
                {"key": "category", "match": {"value": category}}
            ]
        }

    response = requests.post(
        f"{API_URL}/collections/{COLLECTION}/points/search",
        headers=headers,
        data=json.dumps(query_data)
    )

    return response.json()

# Get recommendations for user
user_id = "user-123"
user_embedding = get_user_embedding(user_id)
recommendations = get_recommendations(user_embedding, category="electronics")

print(f"Recommended products for user {user_id}:")
for i, result in enumerate(recommendations.get('result', [])):
    product = result['payload']
    score = result['score']
    print(f"{i+1}. {product['name']} (Score: {score:.2f})")
    print(f"   Price: ${product['price']}, Category: {product['category']}")

10.3 Document Retrieval for LLM Context

from langchain.vectorstores import PGVector
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# Connect to your PG Vector instance
CONNECTION_STRING = "postgresql+psycopg2://user:password@your-db-instance.pgvector.ai:5432/vectordb"
COLLECTION_NAME = "document_store"

# Load and split document
loader = TextLoader("company_handbook.txt")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
texts = text_splitter.split_documents(documents)

# Initialize embedding model
embeddings = OpenAIEmbeddings()

# Store documents in vector database
db = PGVector.from_documents(
    documents=texts,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_string=CONNECTION_STRING
)

# Create retrieval-based QA chain
retriever = db.as_retriever(search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=retriever
)

# Ask questions using retrieved context
query = "What is our company's work from home policy?"
result = qa_chain.run(query)
print(f"Question: {query}")
print(f"Answer: {result}")

11. Best Practices


11.1 Data Preparation

  • Normalize Input Data: Clean and standardize text before embedding
  • Chunking Strategy: Find optimal document chunk size for your use case
  • Embedding Selection: Choose domain-appropriate embedding models
  • Quality Control: Filter out low-quality or duplicate vectors
  • Metadata Design: Plan metadata schema for efficient filtering

11.2 Index Configuration

| Database | Parameter | Recommendation | Notes |
|---|---|---|---|
| Chroma DB | hnsw:M | 16 for < 1M vectors, 32 for > 1M | Higher values improve recall at cost of memory |
| Milvus | nlist | sqrt(N) where N is vector count | Balance between search speed and accuracy |
| Qdrant | ef_construction | 128 for quality, 64 for speed | Affects index build time and quality |
| All | Batch Size | 10,000 for bulk operations | Adjust based on memory constraints |

11.3 Production Deployment

  • Staging Environment: Test configuration in staging before production
  • Resource Monitoring: Set up alerts for high utilization
  • Backup Strategy: Regular backups with versioning
  • Scaling Plan: Define triggers and process for scaling resources
  • High Availability: Configure replication for critical applications

11.4 Query Optimization

  • Use Filtering: Apply metadata filters to reduce search space
  • Right-Size Results: Request only needed number of results (n_results)
  • Batch Similar Queries: Combine related searches when possible
  • Include Only Needed Data: Specify fields to return with "include" parameter
  • Monitor Slow Queries: Create logs for queries exceeding latency thresholds

12. FAQs


12.1 General Questions

Q: What's the difference between exact and approximate nearest neighbors search?
A: Exact search guarantees finding the closest vectors but can be slow for large datasets. Approximate methods like HNSW trade perfect accuracy for dramatically improved speed, making them practical for production use with millions of vectors.

Q: How many vectors can a single vector database instance handle?
A: This depends on the specific database, available resources, and configuration. With proper resources, Milvus and Qdrant can handle hundreds of millions of vectors, while Chroma DB and PG Vector typically perform best with tens of millions of vectors per instance.

Q: Can I use different embedding models with the same vector database?
A: Yes, but vectors from different models should be stored in separate collections as their dimensionality and semantic spaces may differ, making cross-model similarity search meaningless.

12.2 Technical Questions

Q: What distance metric should I use for my embeddings?
A: Common choices are cosine similarity for normalized embeddings (most text embeddings), Euclidean (L2) for spatial data, and dot product for some specialized cases. Match the metric to your embedding model's recommendations.
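A quick numpy illustration: after L2 normalization, cosine similarity reduces to a dot product, which is why the two metrics are interchangeable for normalized embeddings:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 5.0])

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
a_n, b_n = a / np.linalg.norm(a), b / np.linalg.norm(b)
print(np.isclose(cosine, a_n @ b_n))  # True: dot product == cosine after normalizing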

Q: How can I improve search accuracy?
A: Increase the ef_search parameter for HNSW indices, use higher quality embedding models, implement hybrid search combining vector and keyword search, and ensure proper data preprocessing.

Q: How do I handle document updates?
A: Most vector databases support upsert operations that update vectors and metadata under the same ID; Chroma, for example, exposes an upsert method directly. Some engines (or older client versions) may still require deleting and re-adding a vector when the embedding itself changes.
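A minimal upsert sketch with the Chroma client (re-embedding the changed text and overwriting by ID):

import chromadb
from sentence_transformers import SentenceTransformer

client = chromadb.Client()
collection = client.get_or_create_collection("docs")
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Upsert: inserts if the ID is new, overwrites the vector/metadata if it exists
text = "updated document text"
collection.upsert(
    ids=["doc-42"],
    embeddings=model.encode([text]).tolist(),
    documents=[text],
    metadatas=[{"version": 2}],
)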

Q: What's the best way to handle very large documents?
A: Split documents into semantically meaningful chunks (paragraphs, sections) of 512-1024 tokens, create embeddings for each chunk, and store with metadata linking back to the source document. This improves retrieval precision.
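A naive word-based chunker with overlap, for illustration only; token-aware splitters such as the RecursiveCharacterTextSplitter used in section 10.3 handle document structure better:

def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows; overlap preserves
    context that would otherwise be cut at chunk boundaries."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]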

12.3 Operational Questions

Q: How often should I rebuild indices?
A: For most vector databases, indices are updated incrementally as you add data. Complete rebuilds are rarely needed unless you change fundamental index parameters or experience fragmentation issues after many updates/deletes.

Q: What's the expected query latency?
A: With proper configuration, most vector databases can deliver sub-100ms query times for collections of millions of vectors. Latency increases with collection size, dimensionality, and desired recall accuracy.

Q: How do I monitor vector database health?
A: Monitor system metrics (CPU, memory, disk I/O), database-specific metrics (query latency, index size), and application-level metrics (recall accuracy on known datasets).


Last updated: May 11, 2025