OpenAI Compatible Endpoints

Responses

Create responses with support for streaming, reasoning, prior response state, and optional Remote MCP tools.

cURL (non-streaming)
curl http://localhost:1234/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "input": "Provide a prime number less than 50",
    "reasoning": { "effort": "low" }
  }'
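
The same non-streaming call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the base URL `http://localhost:1234/v1` is an assumption about where the local server listens, and `build_request` and `create_response` are hypothetical helper names introduced here for illustration.

```python
import json
import urllib.request

# Assumption: the server listens locally on port 1234.
BASE_URL = "http://localhost:1234/v1"


def build_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /responses endpoint."""
    return urllib.request.Request(
        f"{base_url}/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_response(payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON body of the reply."""
    with urllib.request.urlopen(build_request(payload, base_url)) as resp:
        return json.load(resp)


# Same body as the cURL example above.
payload = {
    "model": "openai/gpt-oss-20b",
    "input": "Provide a prime number less than 50",
    "reasoning": {"effort": "low"},
}
```

With a server running, `create_response(payload)` would return the decoded response object; nothing is sent until that function is called.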
Stateful follow-up

Pass the id from a previous response as previous_response_id to continue from that response's state.

curl http://localhost:1234/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "input": "Multiply it by 2",
    "previous_response_id": "resp_123"
  }'
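
In a script, the follow-up body is usually derived from the earlier response rather than typed by hand. The sketch below assumes the earlier response has already been decoded into a dict with an `id` field; `follow_up_payload` is a hypothetical helper name, and the `resp_123` value is the placeholder id from the example above.

```python
def follow_up_payload(previous_response: dict, model: str, text: str) -> dict:
    """Build a request body that continues from an earlier response."""
    return {
        "model": model,
        "input": text,
        # Link this request to the earlier one via its id.
        "previous_response_id": previous_response["id"],
    }


# Stand-in for a decoded reply from the first request (placeholder id).
first_response = {"id": "resp_123"}

payload = follow_up_payload(first_response, "openai/gpt-oss-20b", "Multiply it by 2")
```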
Streaming
curl http://localhost:1234/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "input": "Hello",
    "stream": true
  }'

You will receive SSE events such as response.created, response.output_text.delta, and response.completed.
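
A client has to split the stream into events before it can use them. The sketch below is a simplified Server-Sent Events parser, run against a hand-written sample rather than real server output. It assumes each event carries an `event:` line and a JSON `data:` line, and that text deltas arrive in a `delta` field; those payload shapes are assumptions, not something this page specifies. `iter_sse_events` and `collect_output_text` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs from decoded SSE lines."""
    event, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                try:
                    yield event or "message", json.loads("\n".join(data_lines))
                except json.JSONDecodeError:
                    pass  # skip non-JSON payloads
            event, data_lines = None, []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())


def collect_output_text(events: Iterable[Tuple[str, dict]]) -> str:
    """Concatenate the text carried by output_text delta events."""
    return "".join(
        data.get("delta", "")
        for name, data in events
        if name == "response.output_text.delta"
    )


# Hand-written sample stream; real payloads will carry more fields.
sample = [
    "event: response.created\n",
    'data: {"type": "response.created"}\n',
    "\n",
    "event: response.output_text.delta\n",
    'data: {"type": "response.output_text.delta", "delta": "Hel"}\n',
    "\n",
    "event: response.output_text.delta\n",
    'data: {"type": "response.output_text.delta", "delta": "lo"}\n',
    "\n",
    "event: response.completed\n",
    'data: {"type": "response.completed"}\n',
    "\n",
]

events = list(iter_sse_events(sample))
text = collect_output_text(events)
```

Against a live server, the same generator could be fed the decoded lines of the HTTP response body instead of `sample`.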

Tools and Remote MCP (opt-in)

Enable Remote MCP in the app (Developer → Settings). Example payload using an MCP server tool:

curl http://localhost:1234/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-4-micro",
    "input": "What is the top trending model on hugging face?",
    "tools": [
      {
        "type": "mcp",
        "server_label": "huggingface",
        "server_url": "https://hugging-face.cn/mcp",
        "allowed_tools": [
          "model_search"
        ]
      }
    ]
  }'
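
When several MCP servers are involved, it can help to build the tool entries programmatically. The sketch below reproduces the payload above in Python; `mcp_tool` is a hypothetical helper name, and treating `allowed_tools` as omittable is an assumption rather than documented behavior.

```python
from typing import Iterable, Optional


def mcp_tool(
    server_label: str,
    server_url: str,
    allowed_tools: Optional[Iterable[str]] = None,
) -> dict:
    """Build one entry for the "tools" array describing a remote MCP server."""
    tool = {"type": "mcp", "server_label": server_label, "server_url": server_url}
    if allowed_tools is not None:
        # Restrict the model to the named tools on that server.
        tool["allowed_tools"] = list(allowed_tools)
    return tool


payload = {
    "model": "ibm/granite-4-micro",
    "input": "What is the top trending model on hugging face?",
    "tools": [
        mcp_tool("huggingface", "https://hugging-face.cn/mcp", ["model_search"]),
    ],
}
```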
