Web Search

This API is in beta and may introduce breaking changes.

The Responses API Beta supports web search, which lets models retrieve real-time information from the internet and return responses with citation annotations.

The web search plugin (plugins: [{ id: "web" }]) shown below is deprecated. Use the modelgates:web_search server tool instead, which works with both the Chat Completions and Responses APIs via the tools array.

Web Search Plugin

Enable web search using the plugins parameter:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: 'What is ModelGates?',
    plugins: [{ id: 'web', max_results: 3 }],
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': 'What is ModelGates?',
        'plugins': [{'id': 'web', 'max_results': 3}],
        'max_output_tokens': 9000,
    },
)

result = response.json()
print(result)

bash
curl -X POST https://modelgates.ai/api/v1/responses \
  -H "Authorization: Bearer YOUR_MODELGATES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/o4-mini",
    "input": "What is ModelGates?",
    "plugins": [{"id": "web", "max_results": 3}],
    "max_output_tokens": 9000
  }'

Plugin Configuration

Configure web search behavior:

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | Required. Must be "web" |
| engine | string | Search engine: "native", "exa", "firecrawl", "parallel", or omit for auto |
| max_results | integer | Maximum search results to retrieve (1-25, default 5) |
| include_domains | string[] | Restrict results to these domains (supports wildcards like *.substack.com) |
| exclude_domains | string[] | Exclude results from these domains |

See the Web Search plugin docs for full details on engine selection, domain filter compatibility, and pricing.
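
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any ModelGates SDK: `build_web_plugin` and `ALLOWED_ENGINES` are hypothetical names introduced here, and the checks only mirror the limits listed in the table. The API remains the authority on what it accepts.

```python
from typing import Any, Optional, Sequence

# Engine names listed in the configuration table.
ALLOWED_ENGINES = {"native", "exa", "firecrawl", "parallel"}


def build_web_plugin(
    engine: Optional[str] = None,
    max_results: Optional[int] = None,
    include_domains: Optional[Sequence[str]] = None,
    exclude_domains: Optional[Sequence[str]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build one entry for the `plugins` array."""
    plugin: dict[str, Any] = {"id": "web"}  # id is required and must be "web"

    if engine is not None:
        if engine not in ALLOWED_ENGINES:
            raise ValueError(f"unknown engine: {engine!r}")
        plugin["engine"] = engine  # leaving engine out means automatic selection

    if max_results is not None:
        if not 1 <= max_results <= 25:
            raise ValueError("max_results must be between 1 and 25")
        plugin["max_results"] = max_results

    # Empty or missing domain lists are simply left out of the entry.
    if include_domains:
        plugin["include_domains"] = list(include_domains)
    if exclude_domains:
        plugin["exclude_domains"] = list(exclude_domains)

    return plugin


plugin = build_web_plugin(
    engine="exa",
    max_results=3,
    include_domains=["*.substack.com"],
)
print(plugin)
```

The returned dictionary can be passed as `'plugins': [plugin]` in the request body used by the examples above.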

X Search Filters (xAI only)

When using xAI models (e.g. x-ai/grok-4.1-fast), you can pass x_search_filter as a top-level request parameter to filter X/Twitter search results:

json
{
  "model": "x-ai/grok-4.1-fast",
  "input": "What are people saying about AI?",
  "plugins": [{ "id": "web" }],
  "x_search_filter": {
    "allowed_x_handles": ["ModelGatesAI"],
    "from_date": "2025-01-01",
    "enable_image_understanding": true
  }
}

| Parameter | Type | Description |
| --- | --- | --- |
| allowed_x_handles | string[] | Only include posts from these handles (max 10) |
| excluded_x_handles | string[] | Exclude posts from these handles (max 10) |
| from_date | string | Start date (ISO 8601, e.g. "2025-01-01") |
| to_date | string | End date (ISO 8601, e.g. "2025-12-31") |
| enable_image_understanding | boolean | Analyze images in posts |
| enable_video_understanding | boolean | Analyze videos in posts |

allowed_x_handles and excluded_x_handles are mutually exclusive. See the Web Search plugin docs for full details.
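
A client-side check can catch an invalid filter before the request leaves your code. This is a sketch under stated assumptions: `validate_x_search_filter` is a hypothetical helper, the handle limit and mutual-exclusion rule come from the table and note above, and the check that `from_date` does not fall after `to_date` is an assumption added here rather than a documented rule.

```python
from datetime import date
from typing import Any

MAX_HANDLES = 10  # per-list limit stated in the parameter table


def validate_x_search_filter(flt: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical pre-flight check; returns the filter unchanged if it passes."""
    allowed = flt.get("allowed_x_handles") or []
    excluded = flt.get("excluded_x_handles") or []

    if allowed and excluded:
        raise ValueError(
            "allowed_x_handles and excluded_x_handles are mutually exclusive"
        )
    for name, handles in (
        ("allowed_x_handles", allowed),
        ("excluded_x_handles", excluded),
    ):
        if len(handles) > MAX_HANDLES:
            raise ValueError(f"{name} accepts at most {MAX_HANDLES} handles")

    # date.fromisoformat raises ValueError for strings it cannot parse as a date.
    parsed = {
        key: date.fromisoformat(flt[key])
        for key in ("from_date", "to_date")
        if key in flt
    }
    # Assumption: a start date later than the end date is treated as an error.
    if len(parsed) == 2 and parsed["from_date"] > parsed["to_date"]:
        raise ValueError("from_date must not be after to_date")

    return flt


x_filter = validate_x_search_filter({
    "allowed_x_handles": ["ModelGatesAI"],
    "from_date": "2025-01-01",
    "enable_image_understanding": True,
})
print(x_filter)
```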

Use structured messages for more complex queries:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'What was a positive news story from today?',
          },
        ],
      },
    ],
    plugins: [{ id: 'web', max_results: 2 }],
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'What was a positive news story from today?',
                    },
                ],
            },
        ],
        'plugins': [{'id': 'web', 'max_results': 2}],
        'max_output_tokens': 9000,
    },
)

result = response.json()
print(result)

Online Model Variants

The :online variant is deprecated. Use the modelgates:web_search server tool instead.

Some models have built-in web search capabilities using the :online variant:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini:online',
    input: 'What was a positive news story from today?',
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini:online',
        'input': 'What was a positive news story from today?',
        'max_output_tokens': 9000,
    },
)

result = response.json()
print(result)

Response with Annotations

Web search responses include citation annotations:

json
{
  "id": "resp_1234567890",
  "object": "response",
  "created_at": 1234567890,
  "model": "openai/o4-mini",
  "output": [
    {
      "type": "message",
      "id": "msg_abc123",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "ModelGates is a unified API for accessing multiple Large Language Model providers through a single interface. It allows developers to access 100+ AI models from providers like OpenAI, Anthropic, Google, and others with intelligent routing and automatic failover.",
          "annotations": [
            {
              "type": "url_citation",
              "url": "https://modelgates.ai/docs",
              "start_index": 0,
              "end_index": 85
            },
            {
              "type": "url_citation",
              "url": "https://modelgates.ai/models",
              "start_index": 120,
              "end_index": 180
            }
          ]
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 15,
    "output_tokens": 95,
    "total_tokens": 110
  },
  "status": "completed"
}

Annotation Types

Web search responses can include different annotation types:

URL Citation

json
{
  "type": "url_citation",
  "url": "https://example.com/article",
  "start_index": 0,
  "end_index": 50
}

Complex Search Queries

Handle multi-part search queries:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'Compare OpenAI and Anthropic latest models',
          },
        ],
      },
    ],
    plugins: [{ id: 'web', max_results: 5 }],
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'Compare OpenAI and Anthropic latest models',
                    },
                ],
            },
        ],
        'plugins': [{'id': 'web', 'max_results': 5}],
        'max_output_tokens': 9000,
    },
)

result = response.json()
print(result)

Web Search in Conversation

Include web search in multi-turn conversations:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'What is the latest version of React?',
          },
        ],
      },
      {
        type: 'message',
        id: 'msg_1',
        status: 'in_progress',
        role: 'assistant',
        content: [
          {
            type: 'output_text',
            text: 'Let me search for the latest React version.',
            annotations: [],
          },
        ],
      },
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'Yes, please find the most recent information',
          },
        ],
      },
    ],
    plugins: [{ id: 'web', max_results: 2 }],
    max_output_tokens: 9000,
  }),
});

const result = await response.json();
console.log(result);

python
import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'What is the latest version of React?',
                    },
                ],
            },
            {
                'type': 'message',
                'id': 'msg_1',
                'status': 'in_progress',
                'role': 'assistant',
                'content': [
                    {
                        'type': 'output_text',
                        'text': 'Let me search for the latest React version.',
                        'annotations': [],
                    },
                ],
            },
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'Yes, please find the most recent information',
                    },
                ],
            },
        ],
        'plugins': [{'id': 'web', 'max_results': 2}],
        'max_output_tokens': 9000,
    },
)

result = response.json()
print(result)
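
Multi-turn inputs get verbose quickly, so it can help to build the message objects with small helpers. The sketch below assumes nothing beyond the message shapes shown in the example above. `user_message` and `assistant_message` are hypothetical names, and `status` is passed through unchanged because this section does not say which values the API accepts for prior assistant turns.

```python
from typing import Any


def user_message(text: str) -> dict[str, Any]:
    """Build a user turn in the structured input format."""
    return {
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": text}],
    }


def assistant_message(text: str, message_id: str, status: str) -> dict[str, Any]:
    """Build a prior assistant turn; status is forwarded as given."""
    return {
        "type": "message",
        "id": message_id,
        "status": status,
        "role": "assistant",
        "content": [{"type": "output_text", "text": text, "annotations": []}],
    }


conversation = [
    user_message("What is the latest version of React?"),
    assistant_message(
        "Let me search for the latest React version.", "msg_1", "in_progress"
    ),
    user_message("Yes, please find the most recent information"),
]

# Same request body as the example above, built from the helpers.
payload = {
    "model": "openai/o4-mini",
    "input": conversation,
    "plugins": [{"id": "web", "max_results": 2}],
    "max_output_tokens": 9000,
}
print(len(payload["input"]))
```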

Monitor web search progress with streaming:

typescript
const response = await fetch('https://modelgates.ai/api/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/o4-mini',
    input: [
      {
        type: 'message',
        role: 'user',
        content: [
          {
            type: 'input_text',
            text: 'What is the latest news about AI?',
          },
        ],
      },
    ],
    plugins: [{ id: 'web', max_results: 2 }],
    stream: true,
    max_output_tokens: 9000,
  }),
});

const reader = response.body?.getReader();
if (!reader) throw new Error('Response has no body to stream');

const decoder = new TextDecoder();
let buffer = '';

// Labeled loop so the [DONE] sentinel can exit both loops at once.
readLoop: while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Keep any trailing partial line in the buffer until the next chunk arrives.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() ?? '';

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;

    const data = line.slice(6);
    if (data === '[DONE]') break readLoop;

    try {
      const parsed = JSON.parse(data);
      if (parsed.type === 'response.output_item.added' &&
          parsed.item?.type === 'message') {
        console.log('Message added');
      }
      if (parsed.type === 'response.completed') {
        const annotations = parsed.response?.output
          ?.find((o: any) => o.type === 'message')
          ?.content?.find((c: any) => c.type === 'output_text')
          ?.annotations || [];
        console.log('Citations:', annotations.length);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}

python
import json

import requests

response = requests.post(
    'https://modelgates.ai/api/v1/responses',
    headers={
        'Authorization': 'Bearer YOUR_MODELGATES_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'model': 'openai/o4-mini',
        'input': [
            {
                'type': 'message',
                'role': 'user',
                'content': [
                    {
                        'type': 'input_text',
                        'text': 'What is the latest news about AI?',
                    },
                ],
            },
        ],
        'plugins': [{'id': 'web', 'max_results': 2}],
        'stream': True,
        'max_output_tokens': 9000,
    },
    stream=True,
)

for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: '):
            data = line_str[6:]
            if data == '[DONE]':
                break
            try:
                parsed = json.loads(data)
                if (parsed.get('type') == 'response.output_item.added' and
                        parsed.get('item', {}).get('type') == 'message'):
                    print('Message added')
                if parsed.get('type') == 'response.completed':
                    output = parsed.get('response', {}).get('output', [])
                    message = next((o for o in output if o.get('type') == 'message'), {})
                    content = message.get('content', [])
                    text_content = next((c for c in content if c.get('type') == 'output_text'), {})
                    annotations = text_content.get('annotations', [])
                    print(f'Citations: {len(annotations)}')
            except json.JSONDecodeError:
                continue

Annotation Processing

Extract and process citation information:

typescript
function extractCitations(response: any) {
  const messageOutput = response.output?.find((o: any) => o.type === 'message');
  const textContent = messageOutput?.content?.find((c: any) => c.type === 'output_text');
  const annotations = textContent?.annotations || [];

  return annotations
    .filter((annotation: any) => annotation.type === 'url_citation')
    .map((annotation: any) => ({
      url: annotation.url,
      text: textContent.text.slice(annotation.start_index, annotation.end_index),
      startIndex: annotation.start_index,
      endIndex: annotation.end_index,
    }));
}

const result = await response.json();
const citations = extractCitations(result);
console.log('Found citations:', citations);

python
def extract_citations(response_data):
    output = response_data.get('output', [])
    message_output = next((o for o in output if o.get('type') == 'message'), {})
    content = message_output.get('content', [])
    text_content = next((c for c in content if c.get('type') == 'output_text'), {})
    annotations = text_content.get('annotations', [])
    text = text_content.get('text', '')

    citations = []
    for annotation in annotations:
        if annotation.get('type') == 'url_citation':
            citations.append({
                'url': annotation.get('url'),
                'text': text[annotation.get('start_index', 0):annotation.get('end_index', 0)],
                'start_index': annotation.get('start_index'),
                'end_index': annotation.get('end_index'),
            })

    return citations

result = response.json()
citations = extract_citations(result)
print(f'Found citations: {citations}')
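
Once citations are extracted, a common next step is to render them as numbered markers in the text. The sketch below is one possible approach, not a documented feature: `add_citation_markers` is a hypothetical helper, and it assumes `start_index` and `end_index` are character offsets into the `output_text` string, as the slicing in the examples above implies.

```python
from typing import Any


def add_citation_markers(
    text: str, annotations: list[dict[str, Any]]
) -> tuple[str, list[str]]:
    """Insert [n] after each cited span and return (marked_text, sources)."""
    citations = [a for a in annotations if a.get("type") == "url_citation"]

    # Number sources in order of appearance; repeated URLs share one number.
    sources: list[str] = []
    for a in sorted(citations, key=lambda a: a["end_index"]):
        if a["url"] not in sources:
            sources.append(a["url"])

    # Insert from the end of the text backwards so earlier offsets stay valid.
    marked = text
    for a in sorted(citations, key=lambda a: a["end_index"], reverse=True):
        marker = f"[{sources.index(a['url']) + 1}]"
        marked = marked[: a["end_index"]] + marker + marked[a["end_index"]:]

    return marked, sources


sample_text = "Alpha beta. Gamma delta."
sample_annotations = [
    {"type": "url_citation", "url": "https://example.com/a",
     "start_index": 0, "end_index": 11},
    {"type": "url_citation", "url": "https://example.com/b",
     "start_index": 12, "end_index": 24},
]

marked_text, source_list = add_citation_markers(sample_text, sample_annotations)
print(marked_text)  # Alpha beta.[1] Gamma delta.[2]
for number, url in enumerate(source_list, start=1):
    print(f"[{number}] {url}")
```

Working backwards through the text avoids recomputing offsets after each insertion, which keeps the helper short.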

Best Practices

  1. Limit results: Set max_results no higher than the query needs to balance quality and speed
  2. Handle annotations: Process citation annotations for proper attribution
  3. Query specificity: Make search queries specific for better results
  4. Error handling: Handle cases where web search might fail
  5. Rate limits: Be mindful of search rate limits
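
For the error-handling and rate-limit points above, one common pattern is to retry with exponential backoff. The following is a sketch under stated assumptions. `send_with_retry` is a hypothetical helper, and the set of retryable status codes is an assumption, since this section does not document which codes the API returns when a search fails or a rate limit is hit.

```python
import time
from typing import Any, Callable

# Assumed retryable HTTP statuses; not taken from the ModelGates docs.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def send_with_retry(
    send: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` must return an object with a `status_code` attribute, such as a
    `requests.Response`.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        response = send()
        is_last_attempt = attempt == max_attempts - 1
        if response.status_code not in RETRYABLE_STATUSES or is_last_attempt:
            return response
        # Wait base_delay, then 2x, 4x, ... before the next attempt.
        sleep(base_delay * 2 ** attempt)


# Usage with requests (not executed here):
# response = send_with_retry(
#     lambda: requests.post(url, headers=headers, json=payload, timeout=30)
# )
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. The caller still needs to inspect the final response, because the last attempt is returned even if it failed.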

Next Steps