Skip to content

Supported Endpoints

Summary

Nexus supports eight public endpoints:

  • GET /v1/models
  • GET /v1/balance
  • POST /v1/chat/completions
  • POST /v1/completions
  • POST /v1/messages
  • POST /v1/messages/count_tokens
  • POST /v1/responses
  • POST /v1/responses/compact

GET /v1/models

Returns the models available to your key.

Use this endpoint before configuring a client or when switching models.

GET /v1/balance

Returns the current balance for the API key used in the request.

POST /v1/messages

Primary endpoint for Claude Code and Claude-compatible clients.

Supports streaming.

Claude models are available through this endpoint when they appear for your key in GET /v1/models.

POST /v1/messages/count_tokens

Endpoint for pre-counting tokens in Claude-compatible clients.

POST /v1/chat/completions

Endpoint for OpenAI-compatible clients that use the chat.completions format.

Use the regular JSON response mode for this endpoint.

Claude models with openai support can also use this endpoint.

POST /v1/completions

Endpoint for OpenAI-compatible clients that use the completions format.

Use the regular JSON response mode for this endpoint.

POST /v1/responses

Primary endpoint for Codex and response-compatible clients.

Supports streaming.

POST /v1/responses/compact

Endpoint for compact response workflows.

Supports streaming.

Output Token Limit

It is better to pass the response limit explicitly:

  • max_tokens for messages;
  • max_completion_tokens or max_tokens for chat.completions;
  • max_tokens for completions;
  • max_output_tokens for responses and responses/compact.

If the limit is not provided, Nexus applies a safe default automatically.