Embeddings API

Create, manage, and search text embeddings using VoyageAI's voyage-3-large model

Introduction

This API lets you create, retrieve, update, delete, and search text embeddings generated with VoyageAI's voyage-3-large model. A text embedding is a high-dimensional vector that captures the semantic meaning of a piece of text, so texts with similar meaning can be found by comparing their vectors.

Authentication

All API endpoints require authentication using a personal access token. To use the API:

  1. Sign in to the Filament admin at /admin/login, open Personal Access Tokens, and create a token.
  2. Ensure the token has the appropriate permissions (embeddings:create, embeddings:read, and/or embeddings:delete).
  3. Include the token in your requests using the Authorization header:
Authorization: Bearer YOUR_TOKEN
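
As a minimal sketch of step 3, the helper below builds (but does not send) an authenticated request using only the Python standard library. The helper name build_request and the API_BASE placeholder host are assumptions made for illustration; substitute your own domain and token.

```python
import json
import urllib.request

# Placeholder host taken from the curl examples in this document.
API_BASE = "https://your-domain.com/api/v1"


def build_request(method, path, token, payload=None):
    """Build an authenticated request object without sending it."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        # JSON bodies are sent as UTF-8 bytes with a matching Content-Type.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        API_BASE + path, data=data, headers=headers, method=method
    )


req = build_request("POST", "/embeddings", "YOUR_TOKEN", {"content": "Hello"})
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call; it is left out here so the sketch runs without network access.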

Base URL

All endpoints are prefixed with /api/v1.

Endpoints

POST

/embeddings

Generate and store an embedding for a piece of text.

Permissions: create or embeddings:create

Request Body

{
  "content": "This is the text to be embedded",
  "source": "optional-source-information",
  "tag": "optional-tag",
  "metadata": {
    "category": "example-category",
    "custom-field": "custom-value"
  }
}
  • content (required): The text to generate embeddings for
  • source (optional): Information about where the content originated from
  • tag (optional): A tag for categorizing embeddings
  • metadata (optional): JSON object with additional metadata
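
The field rules above can be mirrored on the client before a request is sent. The following is a sketch only: build_embedding_payload is a hypothetical helper, and the server's actual validation (for example length limits) is not described here and may be stricter.

```python
def build_embedding_payload(content, source=None, tag=None, metadata=None):
    """Assemble a POST /embeddings body, omitting optional fields that are unset."""
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content is required and must be non-empty text")
    payload = {"content": content}
    if source is not None:
        payload["source"] = source
    if tag is not None:
        payload["tag"] = tag
    if metadata is not None:
        # metadata is documented as a JSON object, so require a dict here.
        if not isinstance(metadata, dict):
            raise TypeError("metadata must be a JSON object (dict)")
        payload["metadata"] = metadata
    return payload


payload = build_embedding_payload(
    "This is the text to be embedded",
    tag="optional-tag",
    metadata={"category": "example-category"},
)
print(payload)
```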

Response

{
  "message": "1 new embedding created from request content.",
  "original_content_hash": "a1b2c3d4e5f6...",
  "total_chunks_processed": 1,
  "embeddings": [
    {
      "id": 1,
      "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
      "content_preview": "This is the text to be embedded...",
      "model": "voyage-3-large",
      "dimension": 1024,
      "source": "optional-source-information",
      "tag": "optional-tag",
      "metadata": {
        "category": "example-category",
        "custom-field": "custom-value",
        "original_document_hash": "a1b2c3d4e5f6..."
      },
      "created_at": "2025-05-20T12:34:56.000000Z",
      "status": "created"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

Note: Direct API calls store content as a single embedding without chunk metadata. Embeddings produced by the document ingestion pipeline may include `chunk_index` / `total_chunks` values.
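
When embeddings from the ingestion pipeline are mixed with direct API embeddings, it can help to group them by source document. The sketch below assumes that chunk_index is stored inside each embedding's metadata object next to original_document_hash; that location is an assumption, and group_by_document is a hypothetical helper.

```python
from collections import defaultdict


def group_by_document(embeddings):
    """Group embeddings by original_document_hash, ordered by chunk_index."""
    groups = defaultdict(list)
    for item in embeddings:
        meta = item.get("metadata") or {}
        groups[meta.get("original_document_hash")].append(item)
    for items in groups.values():
        # Embeddings without chunk metadata (direct API calls) sort as index 0.
        items.sort(key=lambda e: (e.get("metadata") or {}).get("chunk_index", 0))
    return dict(groups)


sample = [
    {"id": 3, "metadata": {"original_document_hash": "doc-a", "chunk_index": 1}},
    {"id": 2, "metadata": {"original_document_hash": "doc-a", "chunk_index": 0}},
    {"id": 1, "metadata": {"original_document_hash": "doc-b"}},
]
grouped = group_by_document(sample)
print({doc: [e["id"] for e in items] for doc, items in grouped.items()})
```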

POST

/embeddings/batch

Create multiple embeddings in a single request.

Permissions: create or embeddings:create

Request Body

{
  "items": [
    {
      "content": "First text to embed",
      "source": "source-for-first-text",
      "tag": "tag1",
      "metadata": { "type": "example" }
    },
    {
      "content": "Second text to embed",
      "source": "source-for-second-text",
      "tag": "tag2",
      "metadata": { "type": "example" }
    }
  ]
}

Response

{
  "message": "2 new embedding(s) created.",
  "results": [
    {
      "original_request_index": 0,
      "original_document_hash": "a1b2c3d4e5f6...",
      "total_chunks": 1,
      "embeddings": [
        {
          "id": 1,
          "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
          "content_preview": "First text to embed...",
          "status": "created",
          "metadata": {
            "type": "example",
            "original_document_hash": "a1b2c3d4e5f6..."
          },
          "tag": "tag1",
          "source": "source-for-first-text",
          "model": "voyage-3-large",
          "dimension": 1024
        }
      ]
    },
    {
      "original_request_index": 1,
      "original_document_hash": "g7h8i9j0k1l2...",
      "total_chunks": 1,
      "embeddings": [
        {
          "id": 2,
          "uuid": "fbe07bc0-1234-5678-9abc-1234567890cd",
          "content_preview": "Second text to embed...",
          "status": "created",
          "metadata": {
            "type": "example",
            "original_document_hash": "g7h8i9j0k1l2..."
          },
          "tag": "tag2",
          "source": "source-for-second-text",
          "model": "voyage-3-large",
          "dimension": 1024
        }
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}

Note: Batch requests create one embedding per item; document-ingested content may appear as multiple segments with additional metadata.
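
For larger workloads, a client can split its items into several batch requests and map each result back to its input through original_request_index. This is a sketch under stated assumptions: the batch_size default of 50 is an arbitrary client-side choice (no server limit is documented here), and both helper names are hypothetical.

```python
def build_batch_payloads(items, batch_size=50):
    """Split items into POST /embeddings/batch bodies of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        {"items": items[start:start + batch_size]}
        for start in range(0, len(items), batch_size)
    ]


def uuids_by_request_index(response):
    """Map each original_request_index to the UUIDs created for that item."""
    return {
        result["original_request_index"]: [e["uuid"] for e in result["embeddings"]]
        for result in response["results"]
    }


texts = ["First text to embed", "Second text to embed", "Third text to embed"]
payloads = build_batch_payloads([{"content": t} for t in texts], batch_size=2)
print([len(p["items"]) for p in payloads])
```

Note that original_request_index is relative to a single request, so when several batches are sent the client has to add each batch's starting offset itself.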

GET

/embeddings

Retrieve a paginated list of embeddings for the authenticated user.

Permissions: read or embeddings:read

Query Parameters

  • tag (optional): Filter embeddings by tag
  • limit (optional): Number of results per page (default: 50, max: 100)
  • page (optional): Page number for pagination
  • include_vectors (optional): Set to true to include the embedding vectors in the response
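
The query parameters combine into an ordinary query string, and the meta.last_page value in the response tells a client when to stop paging. The sketch below shows both steps. The helper names are hypothetical, fetch_page stands in for whatever function performs the HTTP call and returns the decoded JSON body, and the spelling "true" for include_vectors follows the parameter description above but has not been checked against the server.

```python
from urllib.parse import urlencode


def build_list_url(base, tag=None, limit=50, page=1, include_vectors=False):
    """Build the GET /embeddings URL, enforcing the documented limit range."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"limit": limit, "page": page}
    if tag:
        params["tag"] = tag
    if include_vectors:
        params["include_vectors"] = "true"  # assumed spelling of the flag
    return f"{base}/embeddings?{urlencode(params)}"


def iter_embeddings(fetch_page):
    """Yield every embedding by requesting pages until meta.last_page is reached."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]
        if page >= body["meta"]["last_page"]:
            break
        page += 1


url = build_list_url("https://your-domain.com/api/v1", tag="documentation", limit=10)
print(url)
```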

Response

{
  "data": [
    {
      "id": 1,
      "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
      "content": "This is the text to be embedded",
      "source": "optional-source-information",
      "model": "voyage-3-large",
      "dimension": 1024,
      "tag": "optional-tag",
      "metadata": {
        "category": "example-category",
        "custom-field": "custom-value"
      },
      "created_at": "2025-05-20T12:34:56.000000Z",
      "updated_at": "2025-05-20T12:34:56.000000Z"
    }
    // Additional embeddings...
  ],
  "links": {
    "first": "http://localhost/api/v1/embeddings?page=1",
    "last": "http://localhost/api/v1/embeddings?page=1",
    "prev": null,
    "next": null
  },
  "meta": {
    "current_page": 1,
    "from": 1,
    "last_page": 1,
    "path": "http://localhost/api/v1/embeddings",
    "per_page": 50,
    "to": 1,
    "total": 1
  }
}

GET

/embeddings/{uuid}

Retrieve a specific embedding by UUID.

Permissions: read or embeddings:read

Response

{
  "embedding": {
    "id": 1,
    "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
    "content": "This is the text to be embedded",
    "source": "optional-source-information",
    "embedding": [0.123, 0.456, ...],
    "model": "voyage-3-large",
    "dimension": 1024,
    "tag": "optional-tag",
    "metadata": {
      "category": "example-category",
      "custom-field": "custom-value"
    },
    "created_at": "2025-05-20T12:34:56.000000Z",
    "updated_at": "2025-05-20T12:34:56.000000Z"
  }
}

PUT

/embeddings/{uuid}

Update a specific embedding by UUID. You can update content, source, tag, metadata, or any combination of these fields.

Permissions: create or embeddings:create

Request Body

{
  "content": "Updated text content",  // Optional
  "source": "updated-source-info",    // Optional
  "tag": "new-tag",                   // Optional
  "metadata": {                       // Optional
    "category": "updated-category", 
    "custom-field": "updated-value"
  }
}
  • At least one of content, source, tag, or metadata must be provided
  • If content is updated, a new embedding vector will be generated
  • If only source, tag, and/or metadata are updated, the embedding vector remains unchanged
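
These three rules can be checked on the client before sending a PUT request. The following is a sketch only; plan_update is a hypothetical helper that returns the request body together with a flag saying whether, according to the rules above, the server would regenerate the vector.

```python
UPDATABLE_FIELDS = ("content", "source", "tag", "metadata")


def plan_update(changes):
    """Return (payload, regenerates_vector) for a PUT /embeddings/{uuid} call."""
    # Keep only the fields the endpoint documents as updatable.
    payload = {key: value for key, value in changes.items() if key in UPDATABLE_FIELDS}
    if not payload:
        raise ValueError("provide at least one of: content, source, tag, metadata")
    # Only a content change leads to a new embedding vector.
    return payload, "content" in payload


payload, regenerates = plan_update({"tag": "new-tag", "metadata": {"category": "updated-category"}})
print(payload, regenerates)
```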

Response

{
  "message": "Embedding updated successfully",
  "embedding": {
    "id": 1,
    "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
    "content": "Updated text content",
    "source": "updated-source-info",
    "model": "voyage-3-large",
    "dimension": 1024,
    "tag": "new-tag",
    "metadata": {
      "category": "updated-category",
      "custom-field": "updated-value"
    },
    "updated_at": "2025-05-20T13:45:30.000000Z"
  }
}

Error Cases

  • 404 Not Found: Embedding with the specified UUID does not exist
  • 409 Conflict: When updating content to something that already exists for this user
  • 422 Unprocessable Entity: No valid fields provided for update

DELETE

/embeddings/{uuid}

Delete a specific embedding by UUID.

Permissions: delete or embeddings:delete

Response

{
  "message": "Embedding deleted successfully"
}

POST

/embeddings/search

Search for embeddings using a text query that will be converted to a vector for similarity search.

Permissions: read or embeddings:read

Request Body

{
  "query": "The text to search for",
  "tag": "optional-tag-filter",          // Optional
  "threshold": 0.7,                      // Optional (default: 0.7)
  "limit": 20,                           // Optional (default: 20, max: 100)
  "include_content": true                // Optional (default: true)
}
  • query (required): The text query to search for similar embeddings
  • tag (optional): Filter results to embeddings with this tag
  • threshold (optional): Cosine similarity threshold (0-1, higher is more similar)
  • limit (optional): Maximum number of results to return
  • include_content (optional): Whether to include the content text in the response

Note: Embedding vectors are not returned in search results for security and bandwidth reasons.
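
To make the threshold parameter concrete, the sketch below computes cosine similarity in plain Python and applies a threshold and limit to a handful of toy vectors. It illustrates the metric only. How the server ranks results internally, and whether it treats the threshold as inclusive, is not documented here, so the >= comparison and the helper names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as no match
    return dot / (norm_a * norm_b)


def filter_matches(query, candidates, threshold=0.7, limit=20):
    """Return (similarity, key) pairs at or above threshold, best first."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in candidates.items()]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(reverse=True)
    return kept[:limit]


candidates = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
matches = filter_matches([1.0, 0.0], candidates, threshold=0.7)
print([key for _, key in matches])
```

With the default threshold of 0.7, the vector at a 45 degree angle to the query (similarity of about 0.707) is kept, while the orthogonal vector (similarity 0) is dropped.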

Response

{
  "results": [
    {
      "id": 1,
      "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
      "content": "This is similar to the query text",  // Omitted if include_content=false
      "source": "example-source-information",
      "tag": "example-tag",
      "similarity": 0.89,
      "metadata": {
        "original_document_hash": "a1b2c3d4e5f6...",
        "total_chunks": 1,
        "category": "example-category"
      },
      "created_at": "2025-05-20T12:34:56.000000Z"
    },
    // Additional results...
  ],
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

POST

/embeddings/search/vector

Search for embeddings using a raw vector for similarity search.

Permissions: read or embeddings:read

Request Body

{
  "vector": [0.1, 0.2, 0.3, ...],       // Must match dimension of stored embeddings
  "tag": "optional-tag-filter",          // Optional
  "threshold": 0.7,                      // Optional (default: 0.7)
  "limit": 20,                           // Optional (default: 20, max: 100)
  "include_content": true                // Optional (default: true)
}
  • vector (required): The vector to use for similarity search
  • tag (optional): Filter results to embeddings with this tag
  • threshold (optional): Cosine similarity threshold (0-1, higher is more similar)
  • limit (optional): Maximum number of results to return
  • include_content (optional): Whether to include the content text in the response

Note: Embedding vectors are not returned in search results for security and bandwidth reasons.
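
Because the query vector must match the dimension of the stored embeddings, it is worth validating it before sending. The sketch below assumes a dimension of 1024, the value reported for voyage-3-large in the example responses; build_vector_search_payload is a hypothetical helper and the server may apply further checks.

```python
import math

# Dimension shown for voyage-3-large in the example responses (assumed fixed here).
EXPECTED_DIMENSION = 1024


def build_vector_search_payload(vector, tag=None, threshold=0.7, limit=20,
                                include_content=True):
    """Validate inputs and assemble a POST /embeddings/search/vector body."""
    if len(vector) != EXPECTED_DIMENSION:
        raise ValueError(f"vector must have {EXPECTED_DIMENSION} values, got {len(vector)}")
    for value in vector:
        # Reject booleans, non-numbers, NaN, and infinities.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("vector values must be numbers")
        if not math.isfinite(value):
            raise ValueError("vector values must be finite")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    payload = {
        "vector": list(vector),
        "threshold": threshold,
        "limit": limit,
        "include_content": include_content,
    }
    if tag is not None:
        payload["tag"] = tag
    return payload


body = build_vector_search_payload([0.1] * EXPECTED_DIMENSION, threshold=0.8, limit=10)
print(len(body["vector"]), body["threshold"], body["limit"])
```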

Response

{
  "results": [
    {
      "id": 1,
      "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
      "content": "This is similar to the query vector",  // Omitted if include_content=false
      "source": "example-source-information",
      "tag": "example-tag",
      "similarity": 0.85,
      "metadata": {
        "original_document_hash": "a1b2c3d4e5f6...",
        "total_chunks": 1,
        "category": "example-category"
      },
      "created_at": "2025-05-20T12:34:56.000000Z"
    },
    // Additional results...
  ]
}

POST

/ask

Ask a question and receive an answer generated by gpt-4o-mini, using the highest-scoring semantic search matches as context.

Permissions: read or embeddings:read

Request Body

{
  "question": "How does the queue worker reconnect to the database?",
  "tag": "docs",                     // Optional filter for context snippets
  "threshold": 0.65,                  // Optional (default: 0.7)
  "limit": 3,                         // Optional (default: 3, max: 20)
  "temperature": 1.0,                 // Optional (default: 1.0)
  "max_completion_tokens": 600,       // Optional (default: 600)
  "project_uuid": "f4138f8e-4f0d-4d9d-9860-1af9e8f5139a"
}
  • question (required): Natural language question to answer
  • tag (optional): Restrict candidate snippets to a specific tag
  • threshold (optional): Minimum similarity score for context entries
  • limit (optional): Maximum number of snippets to include (higher increases prompt size)
  • temperature (optional): Sampling temperature passed to OpenAI
  • max_completion_tokens (optional): Maximum tokens the model may generate (default: 600)
  • project_uuid (required): Project scope for search and auditing

Note: The assistant cites context items using bracketed numbers that correspond to the order in the response payload. Configure each project's System Prompt to steer tone or formatting. The default gpt-4o-mini model balances quality and speed; adjust the temperature to trade determinism for creativity.

Response

{
  "answer": "Queue workers reconnect by listening for the JobProcessing event and restarting the database connection if it has been lost.\n\nSources: [1]",
  "context": [
    {
      "rank": 1,
      "uuid": "eae07bc0-1234-5678-9abc-1234567890ab",
      "project_uuid": "f4138f8e-4f0d-4d9d-9860-1af9e8f5139a",
      "source": "docs/queues.md",
      "tag": "docs",
      "similarity": 0.912,
      "content": "Queue workers reconnect by...",
      "metadata": {
        "category": "queues"
      }
    }
  ],
  "metadata": {
    "question": "How does the queue worker reconnect to the database?",
    "model": "gpt-4o-mini",
    "finish_reason": "stop",
    "usage": {
      "prompt_tokens": 542,
      "completion_tokens": 74,
      "total_tokens": 616
    },
    "search": {
      "query": "How does the queue worker reconnect to the database?",
      "threshold": 0.65,
      "count": 1,
      "model": "voyage-3-large"
    },
    "options": {
      "threshold": 0.65,
      "limit": 3,
      "tag": "docs"
    }
  }
}

When no snippets exceed the threshold, the assistant responds that the answer could not be determined from the available context.
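
Since the answer cites context items with bracketed numbers, a client can resolve those citations back to the context entries. The sketch below assumes that a citation [n] refers to the context item whose rank is n, which matches the example response; cited_sources is a hypothetical helper.

```python
import re


def cited_sources(response):
    """Return the context items cited in the answer, in order of first mention."""
    by_rank = {item["rank"]: item for item in response.get("context", [])}
    cited = []
    for match in re.finditer(r"\[(\d+)\]", response.get("answer", "")):
        rank = int(match.group(1))
        # Skip numbers with no matching context item and repeated citations.
        if rank in by_rank and rank not in cited:
            cited.append(rank)
    return [by_rank[rank] for rank in cited]


response = {
    "answer": "Queue workers reconnect by listening for the JobProcessing event.\n\nSources: [1]",
    "context": [
        {"rank": 1, "source": "docs/queues.md", "similarity": 0.912},
    ],
}
for item in cited_sources(response):
    print(item["rank"], item["source"])
```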

Examples

Create an Embedding

curl -X POST https://your-domain.com/api/v1/embeddings \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "content": "This is a sample text to embed",
    "source": "API documentation example",
    "tag": "documentation",
    "metadata": {
      "category": "examples",
      "version": "1.0"
    }
  }'

Batch Create Embeddings

curl -X POST https://your-domain.com/api/v1/embeddings/batch \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "items": [
      {
        "content": "First sample text",
        "source": "Sample source 1",
        "tag": "sample"
      },
      {
        "content": "Second sample text",
        "source": "Sample source 2",
        "tag": "sample"
      }
    ]
  }'

List All Embeddings

curl -X GET https://your-domain.com/api/v1/embeddings \
  -H 'Authorization: Bearer YOUR_API_TOKEN'

List Embeddings with Specific Tag

curl -X GET 'https://your-domain.com/api/v1/embeddings?tag=documentation' \
  -H 'Authorization: Bearer YOUR_API_TOKEN'

Get Specific Embedding

curl -X GET https://your-domain.com/api/v1/embeddings/eae07bc0-1234-5678-9abc-1234567890ab \
  -H 'Authorization: Bearer YOUR_API_TOKEN'

Update an Embedding

curl -X PUT https://your-domain.com/api/v1/embeddings/eae07bc0-1234-5678-9abc-1234567890ab \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "content": "Updated sample text",
    "source": "Updated source information",
    "tag": "updated-tag"
  }'

Delete an Embedding

curl -X DELETE https://your-domain.com/api/v1/embeddings/eae07bc0-1234-5678-9abc-1234567890ab \
  -H 'Authorization: Bearer YOUR_API_TOKEN'

Search Embeddings by Text

curl -X POST https://your-domain.com/api/v1/embeddings/search \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "semantic search example",
    "threshold": 0.75,
    "limit": 3
  }'

Search Embeddings by Vector

curl -X POST https://your-domain.com/api/v1/embeddings/search/vector \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "vector": [0.1, 0.2, 0.3, ...],
    "threshold": 0.8,
    "limit": 10,
    "include_content": true
  }'

Error Responses

The API reports errors using standard HTTP status codes:

  • 400 Bad Request: Malformed request
  • 401 Unauthorized: Missing or invalid authentication
  • 403 Forbidden: Valid authentication but insufficient permissions
  • 404 Not Found: Requested resource not found
  • 409 Conflict: Updated content already exists for this user
  • 422 Unprocessable Entity: Validation errors
  • 500 Internal Server Error: Server-side error

Example Error Response

{
  "error": "Validation failed",
  "details": {
    "content": ["The content field is required."]
  }
}
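
A client can turn this error body into readable messages. The sketch below assumes that every error response follows the error/details shape shown above, which may not hold for all status codes; describe_error is a hypothetical helper.

```python
STATUS_LABELS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    409: "Conflict",
    422: "Unprocessable Entity",
    500: "Internal Server Error",
}


def describe_error(status, body):
    """Return (summary, field_errors) for an error response body."""
    summary = f"{status} {STATUS_LABELS.get(status, 'Unexpected status')}"
    message = body.get("error")
    if message:
        summary += f": {message}"
    details = body.get("details") or {}
    # Flatten {"field": ["msg", ...]} into "field: msg" strings.
    field_errors = [
        f"{field}: {text}" for field, texts in details.items() for text in texts
    ]
    return summary, field_errors


summary, field_errors = describe_error(
    422,
    {"error": "Validation failed", "details": {"content": ["The content field is required."]}},
)
print(summary)
print(field_errors)
```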

Setup Requirements

  1. Set your VoyageAI API key in the .env file:
    VOYAGE_AI_API_KEY=your_api_key_here
  2. Run database migrations:
    php artisan migrate
  3. Create an API token with appropriate permissions in the web UI.