Reasoning

This API is in beta and may introduce breaking changes.

The Responses API Beta supports advanced reasoning capabilities, allowing models to show their internal reasoning process with configurable effort levels.

Reasoning Configuration

Configure reasoning behavior using the reasoning parameter:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: 'What is the meaning of life?',
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);
python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': 'What is the meaning of life?',
        'reasoning': {
            'effort': 'high'
        },
        'max_output_tokens': 9000,
    }
)

result = response.json()
print(result)
bash
curl -X POST https://modelgates.ai/api/v1/responses \
  -H "Authorization: Bearer YOUR_MODELGATES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/o4-mini",
    "input": "What is the meaning of life?",
    "reasoning": {
      "effort": "high"
    },
    "max_output_tokens": 9000
  }'

Reasoning Effort Levels

The effort parameter controls how much computational effort the model puts into reasoning:

| Effort Level | Description |
| --- | --- |
| minimal | Basic reasoning with minimal computational effort |
| low | Light reasoning for simple problems |
| medium | Balanced reasoning for moderate complexity |
| high | Deep reasoning for complex problems |
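
As a minimal sketch, the effort value can be checked on the client before a request is sent. The helper name `build_reasoning_payload` and its default values are illustrative assumptions, not part of the API; only the four effort levels and the field names are taken from this page.

```python
# Hypothetical helper: builds a request body and validates the effort level locally.
VALID_EFFORTS = ("minimal", "low", "medium", "high")


def build_reasoning_payload(model, prompt, effort="medium", max_output_tokens=9000):
    """Return a JSON-serializable body for POST /api/v1/responses."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},
        "max_output_tokens": max_output_tokens,
    }


# The defaults ("medium", 9000) are assumptions chosen for this sketch.
payload = build_reasoning_payload("openai/o4-mini", "What is 17 * 23?", effort="low")
print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, as in the examples above.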

Complex Reasoning Example

For complex mathematical or logical problems:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'Was 1995 30 years ago? Please show your reasoning.',
          },
        ],
      },
    ],
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);
python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'Was 1995 30 years ago? Please show your reasoning.',
                    },
                ],
            },
        ],
        'reasoning': {
            'effort': 'high'
        },
        'max_output_tokens': 9000,
    }
)

result = response.json()
print(result)

Reasoning in Conversation Context

Include reasoning in multi-turn conversations:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'What is your favorite color?',
          },
        ],
      },
      {
        type: 'message',
        role: 'assistant',
        id: 'msg_abc123',
        status: 'completed',
        content: [
          {
            type: 'output_text',
            text: "I don't have a favorite color.",
            annotations: []
          }
        ]
      },
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'How many Earths can fit on Mars?',
          },
        ],
      },
    ],
    reasoning: {
      effort: 'high'
    },
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);
python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'What is your favorite color?',
                    },
                ],
            },
            {
                'type': 'message',
                'role': 'assistant',
                'id': 'msg_abc123',
                'status': 'completed',
                'content': [
                    {
                        'type': 'output_text',
                        'text': "I don't have a favorite color.",
                        'annotations': []
                    }
                ]
            },
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'How many Earths can fit on Mars?',
                    },
                ],
            },
        ],
        'reasoning': {
            'effort': 'high'
        },
        'max_output_tokens': 9000,
    }
)

result = response.json()
print(result)
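
Writing the message list by hand gets verbose as a conversation grows. The sketch below builds the `input` array from a simple list of turns. The function name `to_input_items` and the turn format are hypothetical, and it assumes that each assistant turn carries the message `id` from an earlier response, mirroring the example above.

```python
# Hypothetical helper: converts simple turns into the message items shown above.
def to_input_items(turns):
    """turns: list of dicts with 'role' and 'text'; assistant turns also need 'id'."""
    items = []
    for turn in turns:
        if turn["role"] == "assistant":
            items.append({
                "type": "message",
                "role": "assistant",
                "id": turn["id"],  # assumed: the message id from an earlier response
                "status": "completed",
                "content": [
                    {"type": "output_text", "text": turn["text"], "annotations": []}
                ],
            })
        else:
            items.append({
                "type": "message",
                "role": turn["role"],
                "content": [{"type": "input_text", "text": turn["text"]}],
            })
    return items


history = [
    {"role": "user", "text": "What is your favorite color?"},
    {"role": "assistant", "id": "msg_abc123", "text": "I don't have a favorite color."},
    {"role": "user", "text": "How many Earths can fit on Mars?"},
]
input_items = to_input_items(history)
```

Here `input_items` has the same shape as the `input` array in the request above and can be sent alongside the `reasoning` parameter.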

Streaming Reasoning

Enable streaming to see reasoning develop in real time:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: 'Solve this step by step: If a train travels 60 mph for 2.5 hours, how far does it go?',
    reasoning: {
      effort: 'medium'
    },
    stream: true,
    max_output_tokens: 9000,
  }),
});

if (!response.body) {
  throw new Error('Response has no body to stream');
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let finished = false;

while (!finished) {
  const { done, value } = await reader.read();
  if (done) break;

  // A chunk can end in the middle of a line, so keep the unfinished tail in the buffer
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() ?? '';

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;

    const data = line.slice(6).trim();
    if (data === '[DONE]') {
      // Stop reading; a bare return is not valid outside a function
      finished = true;
      break;
    }

    try {
      const parsed = JSON.parse(data);
      if (parsed.type === 'response.reasoning.delta') {
        console.log('Reasoning:', parsed.delta);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
python
import requests
import json

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': 'Solve this step by step: If a train travels 60 mph for 2.5 hours, how far does it go?',
        'reasoning': {
            'effort': 'medium'
        },
        'stream': True,
        'max_output_tokens': 9000,
    },
    stream=True
)

for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: '):
            data = line_str[6:]
            if data == '[DONE]':
                break
            try:
                parsed = json.loads(data)
                if parsed.get('type') == 'response.reasoning.delta':
                    print(f"Reasoning: {parsed.get('delta', '')}")
            except json.JSONDecodeError:
                continue
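
To try the parsing logic without a network call, the loop can be pulled into a function that accepts any iterable of raw lines. This is only a sketch: `collect_reasoning` is a hypothetical name, and the sample payloads are made up for illustration, including the placeholder event type `other.event`.

```python
import json


def collect_reasoning(lines):
    """Concatenate reasoning deltas from an iterable of raw SSE lines (bytes)."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # blank separator line between events
        text = raw.decode("utf-8")
        if not text.startswith("data: "):
            continue
        data = text[len("data: "):]
        if data == "[DONE]":
            break
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue  # skip malformed payloads
        if event.get("type") == "response.reasoning.delta":
            parts.append(event.get("delta", ""))
    return "".join(parts)


# Canned lines standing in for response.iter_lines(); the payloads are invented.
sample = [
    b'data: {"type": "response.reasoning.delta", "delta": "60 mph * "}',
    b"",
    b'data: {"type": "response.reasoning.delta", "delta": "2.5 h = 150 miles"}',
    b'data: {"type": "other.event", "delta": "ignored"}',
    b"data: [DONE]",
]
reasoning = collect_reasoning(sample)
print(reasoning)
```

Against a live request, `collect_reasoning(response.iter_lines())` would take the place of the inline loop above.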

Response with Reasoning

When reasoning is enabled, the response includes reasoning information:

json
{
  "id": "resp_1234567890",
  "object": "response",
  "created_at": 1234567890,
  "model": "openai/o4-mini",
  "output": [
    {
      "type": "reasoning",
      "id": "rs_abc123",
      "encrypted_content": "gAAAAABotI9-FK1PbhZhaZk4yMrZw3XDI1AWFaKb9T0NQq7LndK6zaRB...",
      "summary": [
        "First, I need to determine the current year",
        "Then calculate the difference from 1995",
        "Finally, compare that to 30 years"
      ]
    },
    {
      "type": "message",
      "id": "msg_xyz789",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Yes. In 2025, 1995 was 30 years ago. In fact, as of today (Aug 31, 2025), it's exactly 30 years since Aug 31, 1995.",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 15,
    "output_tokens": 85,
    "output_tokens_details": {
      "reasoning_tokens": 45
    },
    "total_tokens": 100
  },
  "status": "completed"
}
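
The `output` array mixes reasoning items and message items, so client code usually needs to separate them. The following is a minimal sketch that assumes the response shape shown above; `split_output` is a hypothetical helper name.

```python
def split_output(response):
    """Return (reasoning_summary, answer_text) from a parsed response dict."""
    summary, texts = [], []
    for item in response.get("output", []):
        if item.get("type") == "reasoning":
            summary.extend(item.get("summary", []))
        elif item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part.get("text", ""))
    return summary, "".join(texts)


# Trimmed copy of the example response, shortened for illustration.
example = {
    "output": [
        {
            "type": "reasoning",
            "id": "rs_abc123",
            "summary": ["First, I need to determine the current year"],
        },
        {
            "type": "message",
            "id": "msg_xyz789",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Yes.", "annotations": []}],
        },
    ]
}
summary, answer = split_output(example)
```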

Best Practices

  1. Choose appropriate effort levels: Use high for complex problems and low or minimal for simple tasks
  2. Consider token usage: Reasoning tokens add to the output token count, so budget max_output_tokens accordingly
  3. Use streaming: For long reasoning chains, streaming provides a better user experience
  4. Include context: Provide sufficient context for the model to reason effectively
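
One way to keep an eye on token usage is to measure how much of the output went to reasoning. The sketch below reads the `usage` block from the example response; the helper name `reasoning_share` is hypothetical, and it assumes that `reasoning_tokens` is counted as part of `output_tokens`, as the nesting under `output_tokens_details` suggests.

```python
def reasoning_share(usage):
    """Fraction of output tokens spent on reasoning (0.0 if no output tokens)."""
    output_tokens = usage.get("output_tokens", 0)
    details = usage.get("output_tokens_details", {})
    reasoning_tokens = details.get("reasoning_tokens", 0)
    if output_tokens == 0:
        return 0.0
    return reasoning_tokens / output_tokens


# Usage block copied from the example response in this section.
usage = {
    "input_tokens": 15,
    "output_tokens": 85,
    "output_tokens_details": {"reasoning_tokens": 45},
    "total_tokens": 100,
}
share = reasoning_share(usage)
print(f"{share:.0%} of output tokens were spent on reasoning")
```

If this share stays high for simple prompts, a lower effort level may be a reasonable thing to try.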

Next Steps