Skip to content

API Reference

MegaBrain exposes a REST API for search, ingestion, and health checking.


Base URL

  • Development: http://localhost:8080
  • Production: https://your-domain.com

Authentication

Currently, authentication is not enforced. Future versions will support API keys, OAuth 2.0, and JWT tokens. Source control tokens (GitHub, GitLab, Bitbucket) are configured server-side via application.properties or environment variables.


Health Check

GET /q/health

Response (200 OK):

{
  "status": "UP",
  "message": "MegaBrain is running"
}


Search Code

GET /api/v1/search

Query Parameters:

Parameter Type Required Default Description
query string Yes - Search query string. Supports Lucene syntax: AND/OR/NOT, "phrase queries", wildcards (*, ?), field:value. Also supports structural queries: implements:InterfaceName, extends:ClassName, usages:TypeName.
limit integer No 10 Maximum number of results (1-100)
mode string No HYBRID Search mode: HYBRID, KEYWORD, or VECTOR
transitive boolean No false Enable transitive relationship traversal for structural queries
depth integer No 5 Maximum traversal depth for transitive queries (1-10)
language string No - Filter by programming language (e.g., java, python). Multiple values supported.
repository string No - Filter by repository name. Multiple values supported.
file_path string No - Filter by file path prefix
entity_type string No - Filter by entity type: class, method, field, function
include_field_match boolean No false Include per-field match scores in results (uses Lucene Explanation API)

Example Request:

curl "http://localhost:8080/api/v1/search?query=getUserName&limit=5&language=java&mode=HYBRID"

Response (200 OK):

{
  "results": [
    {
      "content": "public String getUserName() { return this.userName; }",
      "entity_name": "UserService.getUserName",
      "entity_type": "method",
      "source_file": "src/main/java/com/example/UserService.java",
      "language": "java",
      "repository": "my-app",
      "score": 0.95,
      "line_range": {
        "start": 42,
        "end": 44
      },
      "doc_summary": "Returns the user name for the current session",
      "is_transitive": false,
      "relationship_path": null,
      "field_match": null
    }
  ],
  "total": 1,
  "page": 1,
  "size": 5,
  "facets": {
    "language": [
      { "value": "java", "count": 150 },
      { "value": "typescript", "count": 42 }
    ],
    "repository": [
      { "value": "my-app", "count": 120 }
    ],
    "entity_type": [
      { "value": "method", "count": 85 },
      { "value": "class", "count": 45 }
    ]
  }
}

Search Response Format

SearchResponse:

Field Type Description
results SearchResult[] Array of search results
total integer Total number of matching results
page integer Current page number
size integer Page size
facets Map Available filter values with counts

SearchResult:

Field Type Description
content string Code content snippet
entity_name string Fully qualified entity name
entity_type string Entity type (class, method, field, function)
source_file string Source file path
language string Programming language
repository string Repository name
score float Relevance score (0.0 - 1.0)
line_range LineRange Start and end line numbers
doc_summary string Documentation summary (if available)
is_transitive boolean Whether found via transitive traversal
relationship_path string[] Traversal path for transitive results (e.g., ["Interface", "AbstractClass", "ConcreteClass"])
field_match FieldMatchInfo Per-field match details (when include_field_match=true)

FieldMatchInfo:

Field Type Description
matched_fields string[] Fields that matched the query
scores Map Per-field relevance scores

Search Modes

Mode Description
HYBRID Combines keyword search (Lucene) and vector search (pgvector) with weighted scoring. Default mode.
KEYWORD Uses only Lucene keyword search. Faster, no vector database required.
VECTOR Uses only vector similarity search. Requires pgvector configured.

Structural Queries

Structural queries leverage the dependency graph for relationship-aware search:

Syntax Description Example
implements:InterfaceName Find all classes implementing an interface implements:Repository
extends:ClassName Find all subclasses of a class extends:AbstractService
usages:TypeName Find all usages including polymorphic call sites usages:UserService

When transitive=true, these queries traverse the full inheritance hierarchy up to the configured depth.


Ingestion

Trigger Ingestion

POST /api/v1/ingestion
Content-Type: application/json

Request Body:

{
  "url": "https://github.com/user/repo",
  "branch": "main",
  "provider": "github"
}

Field Type Required Description
url string Yes Repository URL
branch string No Branch to ingest (defaults to default branch)
provider string Yes Source control provider: github, gitlab, bitbucket

Response (200 OK): Streams progress events via Server-Sent Events (SSE) with stage, message, and percentage fields. The CLI ingest command consumes the same progress stream and displays it in the terminal.


CLI

The MegaBrain CLI is available when running the application in CLI mode (e.g. java -jar megabrain-runner.jar or the megabrain native executable). The ingest command supports --source, --repo, --branch, --token, and --incremental. Run megabrain ingest --help to see full usage and options.

# Show top-level help
megabrain --help

# Show ingest command usage and options
megabrain ingest --help

# Ingest a repository (required: --source, --repo)
megabrain ingest --source github --repo olexmal/MegaBrain
megabrain ingest --source github --repo owner/repo --branch develop --token YOUR_TOKEN --incremental

# Search code (when implemented)
megabrain search --query "dependency graph builder" --limit 5

MCP Server

MegaBrain supports the Model Context Protocol (MCP) for LLM tool integration.

  • Transport: stdio (primary), SSE (secondary)
  • Purpose: Expose search, ingestion, and dependency tools to LLM clients
  • See EPIC-08 documentation for detailed protocol and tool schemas

RAG

POST /api/v1/rag and POST /api/v1/rag/stream provide RAG (retrieval-augmented generation) Q&A. AC6 (first token within 2s) is validated by an integration test with a mocked RAG service (time-to-first-token < 2000 ms); production compliance with a real LLM is validated by demo or APM. The first-token test is tagged as a performance test and runs only with the performance Maven profile.


Error Responses

All errors follow this format:

{
  "error": "Error Type",
  "message": "Human-readable error message",
  "timestamp": "2026-02-19T10:30:00Z",
  "path": "/api/v1/endpoint"
}

HTTP Status Codes:

Code Meaning
200 OK Success
201 Created Resource created
400 Bad Request Invalid request (e.g., missing query, depth out of range)
404 Not Found Resource not found
429 Too Many Requests Rate limit exceeded
500 Internal Server Error Server error