Sources API

Complete API reference for adding, listing, retrieving, deleting, and refreshing knowledge sources that power your Chatsby agents.

The Sources API lets you manage the knowledge base that powers your agents. Sources are the content your agent draws from when answering questions — websites, documents, raw text, and structured Q&A pairs.

The Source Object

| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier (e.g., src_abc123defg). Read-only. |
| type | string | Source type: website, text, file, or qa. |
| content | string | The source data. Contents vary by type (see Source Types). |
| status | string | Processing state: pending, processing, trained, or failed. Read-only. |
| character_count | integer | Characters extracted after processing. null until trained. Read-only. |
| agent_id | string | The agent this source belongs to. Read-only. |
| error_message | string | Error description if status is failed. null otherwise. Read-only. |
| created_at | string | ISO 8601 creation timestamp. Read-only. |

Example source object:
{
  "id": "src_abc123defg",
  "type": "website",
  "content": "https://acme.com/help",
  "status": "trained",
  "character_count": 24580,
  "agent_id": "agent_1a2b3c4d5e",
  "error_message": null,
  "created_at": "2025-03-10T08:15:00Z"
}

Source Types

Each type accepts different values in the content field and has different processing behavior.

| Type | Content Field | Processing | Notes |
|---|---|---|---|
| website | Full URL (including https://) | Chatsby crawls the page and extracts text. | Set crawl_subpages: true to follow links on the same domain. |
| text | Raw text string | Indexed immediately. | Max 100,000 characters per source. |
| file | File reference (from upload) | PDF, DOCX, and TXT supported. Content is extracted and indexed. | Max file size: 10 MB. |
| qa | JSON array string | Each Q&A pair is indexed individually for precise matching. | Format: [{"question": "...", "answer": "..."}]. Max 500 pairs. |
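
The qa type expects the pairs as a JSON array serialized into a string, and escaping the quotes by hand is easy to get wrong. The following is a minimal sketch of one way to build that string. The helper name buildQaContent is hypothetical (it is not part of the Chatsby API), and rejecting an empty list is an assumption rather than a documented rule.

```javascript
// Hypothetical helper, not part of the Chatsby API: builds the string
// expected in the content field of a qa source.
const MAX_QA_PAIRS = 500; // limit stated in the Source Types table

function buildQaContent(pairs) {
  // Assumption: an empty list is treated as a caller mistake.
  if (!Array.isArray(pairs) || pairs.length === 0) {
    throw new Error('pairs must be a non-empty array');
  }
  if (pairs.length > MAX_QA_PAIRS) {
    throw new Error(`Too many Q&A pairs: ${pairs.length} (max ${MAX_QA_PAIRS})`);
  }
  const normalized = pairs.map(({ question, answer }, i) => {
    if (typeof question !== 'string' || typeof answer !== 'string') {
      throw new Error(`Pair ${i} needs string question and answer fields`);
    }
    return { question, answer };
  });
  // The API takes the array as a JSON string, not as a nested array.
  return JSON.stringify(normalized);
}

const content = buildQaContent([
  { question: 'What are your business hours?', answer: 'Monday to Friday, 9 AM to 6 PM EST.' },
]);

// The outer JSON.stringify escapes the inner quotes automatically.
const requestBody = JSON.stringify({ type: 'qa', content });
```

Serializing twice (once for the pairs, once for the request body) produces the same escaped form as a hand-written \" sequence, without the manual escaping.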

Processing States

Sources go through a processing pipeline after creation:

pending → processing → trained
                     → failed
  • pending — Source has been created but processing has not started yet.
  • processing — Content is being crawled, extracted, or indexed. This typically takes 10-60 seconds for text and Q&A, and 1-5 minutes for websites and files.
  • trained — Processing is complete. The content is available to the agent for answering questions.
  • failed — Processing encountered an error. Check the error_message field for details.

When a source reaches trained or failed status, Chatsby fires the corresponding source.trained or source.failed webhook event, provided you have webhooks configured. Use webhooks to trigger downstream actions like notifying your team or updating your UI.
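
If webhooks are not configured, the same pipeline can be observed by polling the Retrieve a Source endpoint until the status is terminal. This is a minimal sketch under stated assumptions: the helper names isTerminalStatus and waitForSource are hypothetical, and the 5-second interval and 10-minute timeout are illustrative values, not documented recommendations.

```javascript
// Terminal states from the pipeline above: a source stays in one of these
// until it is refreshed or deleted.
const TERMINAL_STATUSES = new Set(['trained', 'failed']);

function isTerminalStatus(status) {
  return TERMINAL_STATUSES.has(status);
}

// Hypothetical helper: polls GET /v1/sources/{source_id} until the source is
// trained or failed. intervalMs and timeoutMs are illustrative defaults.
async function waitForSource(sourceId, apiKey, { intervalMs = 5000, timeoutMs = 10 * 60 * 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const response = await fetch(`https://api.chatsby.co/v1/sources/${sourceId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) {
      throw new Error(`Retrieve failed with HTTP ${response.status}`);
    }
    const source = await response.json();
    if (isTerminalStatus(source.status)) {
      return source; // check source.error_message when status is "failed"
    }
    // Wait before the next attempt to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Source ${sourceId} did not finish within ${timeoutMs} ms`);
}
```

Polling is a fallback. Where webhooks are available, they avoid the repeated requests and report completion as soon as it happens.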

Add a Source

Adds a new knowledge source to an agent. Processing begins immediately and runs asynchronously.

POST /v1/agents/{agent_id}/sources

Path Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | The agent to add the source to. |

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Source type: website, text, file, or qa. |
| content | string | Yes | Source data (varies by type; see Source Types). |
| crawl_subpages | boolean | No | For website type only. Crawl links on the same domain. Default: false. |

cURL — Website Source

curl -X POST https://api.chatsby.co/v1/agents/agent_1a2b3c4d5e/sources \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "website",
    "content": "https://acme.com/help",
    "crawl_subpages": true
  }'

cURL — Text Source

curl -X POST https://api.chatsby.co/v1/agents/agent_1a2b3c4d5e/sources \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "text",
    "content": "Acme Corp offers three plans: Starter ($9/mo), Pro ($29/mo), and Enterprise (custom pricing). All plans include unlimited agents."
  }'

cURL — Q&A Source

curl -X POST https://api.chatsby.co/v1/agents/agent_1a2b3c4d5e/sources \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "qa",
    "content": "[{\"question\": \"What are your business hours?\", \"answer\": \"Monday to Friday, 9 AM to 6 PM EST.\"}, {\"question\": \"How do I reset my password?\", \"answer\": \"Click Forgot Password on the login page and follow the email instructions.\"}]"
  }'

JavaScript

const response = await fetch(
  'https://api.chatsby.co/v1/agents/agent_1a2b3c4d5e/sources',
  {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      type: 'website',
      content: 'https://acme.com/help',
      crawl_subpages: true,
    }),
  }
);
 
const source = await response.json();
console.log(source.id);     // "src_xyz456abcd"
console.log(source.status); // "pending"

Response 201 Created

{
  "id": "src_xyz456abcd",
  "type": "website",
  "content": "https://acme.com/help",
  "status": "pending",
  "character_count": null,
  "agent_id": "agent_1a2b3c4d5e",
  "error_message": null,
  "created_at": "2025-03-15T12:00:00Z"
}

Error Responses

| Status | Code | Description |
|---|---|---|
| 400 | missing_required_field | The type or content field is missing. |
| 400 | invalid_parameter | Invalid source type or malformed URL. |
| 404 | resource_not_found | The specified agent does not exist. |
| 422 | validation_error | Content exceeds size limits or Q&A JSON is malformed. |
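
Several of these errors can be caught before the request is sent. The sketch below mirrors the documented rules on the client. The function name checkSourceRequest is hypothetical, accepting http:// as well as https:// URLs is an assumption, and the length check counts UTF-16 code units, which may differ from how the server counts characters. The server remains the authority on what is valid.

```javascript
const SOURCE_TYPES = ['website', 'text', 'file', 'qa'];
const MAX_TEXT_CHARS = 100000; // documented limit for text sources

// Hypothetical pre-flight check. Returns the documented error code the request
// would most likely trigger, or null if no problem is detected locally.
function checkSourceRequest({ type, content } = {}) {
  if (!type || !content) return 'missing_required_field';
  if (!SOURCE_TYPES.includes(type)) return 'invalid_parameter';

  if (type === 'website') {
    try {
      const url = new URL(content); // throws on malformed or scheme-less input
      // Assumption: only http(s) URLs are crawlable.
      if (url.protocol !== 'https:' && url.protocol !== 'http:') return 'invalid_parameter';
    } catch {
      return 'invalid_parameter';
    }
  }

  if (type === 'text' && content.length > MAX_TEXT_CHARS) return 'validation_error';

  if (type === 'qa') {
    try {
      if (!Array.isArray(JSON.parse(content))) return 'validation_error';
    } catch {
      return 'validation_error'; // Q&A JSON is malformed
    }
  }
  return null;
}

console.log(checkSourceRequest({ type: 'website', content: 'acme.com/help' }));
// "invalid_parameter" (the URL has no scheme)
```

A 404 for a missing agent cannot be detected locally, so the response status still needs to be checked after the request.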

List Sources

Retrieves a paginated list of the sources that belong to a given agent.

GET /v1/agents/{agent_id}/sources

Path Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | The agent whose sources to list. |

Query Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| page_size | integer | 20 | Results per page (1 to 100). |
| cursor | string | none | Pagination cursor from a previous response. |
| status | string | none | Filter by status: pending, processing, trained, failed. |
| type | string | none | Filter by source type: website, text, file, qa. |

cURL

curl "https://api.chatsby.co/v1/agents/agent_1a2b3c4d5e/sources?status=trained" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "data": [
    {
      "id": "src_abc123defg",
      "type": "website",
      "content": "https://acme.com/help",
      "status": "trained",
      "character_count": 24580,
      "agent_id": "agent_1a2b3c4d5e",
      "error_message": null,
      "created_at": "2025-03-10T08:15:00Z"
    },
    {
      "id": "src_def456ghij",
      "type": "text",
      "content": "Acme Corp offers three plans...",
      "status": "trained",
      "character_count": 156,
      "agent_id": "agent_1a2b3c4d5e",
      "error_message": null,
      "created_at": "2025-03-10T08:20:00Z"
    }
  ],
  "has_more": false,
  "next_cursor": null
}
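
To collect every source for an agent, a client can keep requesting pages while has_more is true, passing each next_cursor back as the cursor parameter. The sketch below assumes exactly that behavior. The helper names buildListUrl and listAllSources are hypothetical.

```javascript
// Builds the List Sources URL. URLSearchParams handles encoding, and unset
// filters are skipped so they are not sent as the string "undefined".
function buildListUrl(agentId, { pageSize, cursor, status, type } = {}) {
  const params = new URLSearchParams();
  if (pageSize !== undefined) params.set('page_size', String(pageSize));
  if (cursor) params.set('cursor', cursor);
  if (status) params.set('status', status);
  if (type) params.set('type', type);
  const query = params.toString();
  const base = `https://api.chatsby.co/v1/agents/${encodeURIComponent(agentId)}/sources`;
  return query ? `${base}?${query}` : base;
}

// Hypothetical helper: follows next_cursor until has_more is false.
async function listAllSources(agentId, apiKey, filters = {}) {
  const sources = [];
  let cursor;
  do {
    const response = await fetch(buildListUrl(agentId, { ...filters, cursor }), {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) throw new Error(`List failed with HTTP ${response.status}`);
    const page = await response.json();
    sources.push(...page.data);
    // Stop when the API reports no further pages.
    cursor = page.has_more ? page.next_cursor : undefined;
  } while (cursor);
  return sources;
}
```

For example, listAllSources('agent_1a2b3c4d5e', 'YOUR_API_KEY', { status: 'failed', pageSize: 100 }) would gather all failed sources in as few requests as the page size limit allows.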

Retrieve a Source

Retrieves the details of a single source by its ID.

GET /v1/sources/{source_id}

Path Parameters

| Parameter | Type | Description |
|---|---|---|
| source_id | string | The unique identifier of the source. |

cURL

curl https://api.chatsby.co/v1/sources/src_abc123defg \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

Returns the full source object.

Error Responses

| Status | Code | Description |
|---|---|---|
| 404 | resource_not_found | No source exists with the given ID. |

Delete a Source

Permanently removes a source from the agent's knowledge base. The agent will no longer use this content when answering questions. This action is irreversible.

DELETE /v1/sources/{source_id}

Path Parameters

| Parameter | Type | Description |
|---|---|---|
| source_id | string | The unique identifier of the source to delete. |

cURL

curl -X DELETE https://api.chatsby.co/v1/sources/src_abc123defg \
  -H "Authorization: Bearer YOUR_API_KEY"

JavaScript

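const response = await fetch('https://api.chatsby.co/v1/sources/src_abc123defg', {
  method: 'DELETE',
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
});

// A successful delete returns 204 with no body, so do not call response.json().
console.log(response.status); // 204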

Response 204 No Content

No response body. The source has been permanently deleted and removed from the agent's knowledge base.

Error Responses

| Status | Code | Description |
|---|---|---|
| 404 | resource_not_found | No source exists with the given ID. |

Refresh a Source

Re-crawls a website source to pick up content changes. This is useful when the underlying webpage has been updated and you want the agent's knowledge to reflect the latest version.

Refreshing creates a new processing job. The source status transitions back through processing before returning to trained.

POST /v1/sources/{source_id}/refresh

Path Parameters

| Parameter | Type | Description |
|---|---|---|
| source_id | string | The unique identifier of the website source to refresh. |

cURL

curl -X POST https://api.chatsby.co/v1/sources/src_abc123defg/refresh \
  -H "Authorization: Bearer YOUR_API_KEY"

JavaScript

const response = await fetch(
  'https://api.chatsby.co/v1/sources/src_abc123defg/refresh',
  {
    method: 'POST',
    headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
  }
);
 
const source = await response.json();
console.log(source.status); // "processing"

Response 200 OK

{
  "id": "src_abc123defg",
  "type": "website",
  "content": "https://acme.com/help",
  "status": "processing",
  "character_count": 24580,
  "agent_id": "agent_1a2b3c4d5e",
  "error_message": null,
  "created_at": "2025-03-10T08:15:00Z"
}

The character_count retains its previous value during re-processing and updates when the refresh completes. During processing, the agent continues to use the previously trained content, so there is no downtime.

Error Responses

| Status | Code | Description |
|---|---|---|
| 400 | invalid_operation | The source is not a website type. Only website sources can be refreshed. |
| 404 | resource_not_found | No source exists with the given ID. |
| 409 | conflict | The source is already being processed. Wait for the current job to complete. |
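
The 400 and 409 cases can often be anticipated from a source object that has already been fetched. The sketch below is one way to do that. The helper names refreshBlocker and refreshIfPossible are hypothetical, and treating a pending source like a processing one is an assumption. Because another client may start a job in the meantime, the 409 response is still handled after the request.

```javascript
// Hypothetical pre-check mirroring the documented refresh errors.
// Returns null when a refresh looks safe, otherwise the expected error code.
function refreshBlocker(source) {
  if (source.type !== 'website') return 'invalid_operation';
  // Assumption: a pending source counts as having a job in flight.
  if (source.status === 'pending' || source.status === 'processing') return 'conflict';
  return null;
}

async function refreshIfPossible(source, apiKey) {
  const blocker = refreshBlocker(source);
  if (blocker) return { refreshed: false, reason: blocker };

  const response = await fetch(`https://api.chatsby.co/v1/sources/${source.id}/refresh`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  // A job may have started since the source object was fetched.
  if (response.status === 409) return { refreshed: false, reason: 'conflict' };
  if (!response.ok) throw new Error(`Refresh failed with HTTP ${response.status}`);
  return { refreshed: true, source: await response.json() };
}
```

Returning a reason instead of throwing for the two expected cases lets a scheduled refresh job skip ineligible sources and carry on with the rest.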

Webhook Event: source.trained

When a source finishes processing successfully, Chatsby sends a source.trained webhook event to your configured endpoints.

{
  "event": "source.trained",
  "created_at": "2025-03-15T12:05:00Z",
  "data": {
    "id": "src_xyz456abcd",
    "type": "website",
    "content": "https://acme.com/help",
    "status": "trained",
    "character_count": 31200,
    "agent_id": "agent_1a2b3c4d5e",
    "created_at": "2025-03-15T12:00:00Z"
  }
}

If processing fails, a source.failed event is sent instead, with the error_message field populated. See the Webhooks guide for configuration details and security verification.
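
Once a webhook body has been received and parsed, a handler can branch on the event field. The sketch below shows that dispatch only. The function name summarizeSourceEvent is hypothetical, and signature verification is deliberately left out because its details are covered in the Webhooks guide rather than here.

```javascript
// Hypothetical dispatcher for an already-parsed webhook body.
// Verify the request as described in the Webhooks guide before calling this.
function summarizeSourceEvent(payload) {
  const { event, data } = payload;
  switch (event) {
    case 'source.trained':
      return `Source ${data.id} trained (${data.character_count} characters)`;
    case 'source.failed':
      return `Source ${data.id} failed: ${data.error_message ?? 'no error message'}`;
    default:
      return null; // not a source event; leave it to other handlers
  }
}

const message = summarizeSourceEvent({
  event: 'source.trained',
  created_at: '2025-03-15T12:05:00Z',
  data: { id: 'src_xyz456abcd', type: 'website', status: 'trained', character_count: 31200 },
});
console.log(message); // "Source src_xyz456abcd trained (31200 characters)"
```

Returning null for unrecognized events keeps the handler safe to reuse on an endpoint that also receives other event types.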