AI SEO crawler check API Documentation - VebAPI

AI SEO crawler check

Checks whether a website allows AI bots (e.g., GPTBot, Google-Extended, PerplexityBot) to crawl and use its content.

GET /api/seo/aiseochecker

Parameters

website
Required query
string

The website URL or domain you want to check.

X-API-KEY
Required header
string

Your VebAPI API key.

Example Request


curl -X GET "https://vebapi.com/api/seo/aiseochecker?website=codeconia.com" \
  -H "X-API-KEY: YOUR_API_KEY" \
  -H "Content-Type: application/json"

Response

{
    "url": "codeconia.com",
    "robots_found": true,
    "ai_access": {
        "GPTBot": true,
        "ChatGPT-User": true,
        "Google-Extended": true,
        "AnthropicBot": true,
        "ClaudeBot": true,
        "PerplexityBot": true,
        "CCBot": true,
        "Amazonbot": true,
        "Bytespider": true,
        "facebookexternalhit": true,
        "cohere-ai": true,
        "YouBot": true,
        "NeevaBot": true,
        "ai-crawler": true,
        "Applebot": true,
        "Baiduspider": true,
        "Sogou": true,
        "YandexBot": true,
        "PhindBot": true,
        "DuckDuckBot": true,
        "Yeti": true,
        "360Spider": true,
        "ias-va": true
    },
    "ai_bots_allowed": true,
    "suggestions": [
        "Your site is currently open for AI bots. You're AI-friendly!",
        "You can still improve by providing structured data (schema.org) for better AI comprehension.",
        "Add clear content usage terms if you want to allow or limit AI training use."
    ]
}

What it does 

Checks whether a website allows AI bots (e.g., GPTBot, Google-Extended, PerplexityBot) to crawl and use its content. It reads the site’s robots.txt, evaluates access rules for major AI/LLM crawlers, and returns an allow/deny matrix plus practical suggestions (e.g., what to add/change in robots.txt) if access is blocked.

Why it’s useful (benefits)

  • Know your AI exposure: Quickly see if LLMs can crawl or train on your content.

  • Compliance & policy control: Verify that your robots.txt reflects your intended AI policy.

  • Easy fixes: Get specific suggestions to enable/limit AI crawling.

  • Bulk auditing: Integrate into your CI/SEO pipelines to monitor many domains.
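
For bulk auditing, a minimal sketch in Python using the requests package (the domain list, variable names, and output format are illustrative, not part of the API):

import requests

API_URL = "https://vebapi.com/api/seo/aiseochecker"
API_KEY = "YOUR_API_KEY"
domains = ["codeconia.com", "example.com"]  # domains you want to monitor

for domain in domains:
    resp = requests.get(
        API_URL,
        params={"website": domain},
        headers={"X-API-KEY": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    # Collect every tracked bot that the site's robots.txt blocks.
    blocked = [bot for bot, allowed in data["ai_access"].items() if not allowed]
    print(f"{domain}: {'AI-friendly' if not blocked else 'blocks ' + ', '.join(blocked)}")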


Base URL

https://vebapi.com

Endpoint

GET /api/seo/aiseochecker

Authentication

Send your API key in the header: X-API-KEY: YOUR_API_KEY


Input Parameters

website (query, string, required)
  Domain or URL to check. Subdomains are treated as provided (robots.txt is fetched from that host). Example: codeconia.com or https://codeconia.com

Notes
  • The service fetches https://<host>/robots.txt (or http if https is unavailable).
  • If website is a full URL, the host component is used for robots.txt.
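
For reference, this is one way a client could compute the same robots.txt location (a sketch of the documented behavior, not the service's actual code):

from urllib.parse import urlparse

def robots_url(website: str) -> str:
    # Accept both "codeconia.com" and full URLs like "https://codeconia.com/page"
    if "://" not in website:
        website = "https://" + website
    host = urlparse(website).netloc
    return f"https://{host}/robots.txt"

print(robots_url("https://codeconia.com/blog"))  # https://codeconia.com/robots.txt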


Output (Response Body)

url (string)
  The normalized host or site you asked to check. Example: "codeconia.com"

robots_found (boolean)
  Whether a robots.txt file was found and parsed successfully. Example: true

ai_access (object)
  Key/value map of major AI/LLM crawler user-agents to boolean access: true = allowed, false = blocked. See the example response above.

ai_bots_allowed (boolean)
  Overall flag: true if all tracked AI bots are allowed, false if one or more are blocked. Example: true

suggestions (string[])
  Human-readable suggestions based on the analysis (how to enable or improve AI crawling/compliance). See the example response above.

ai_access keys (typical)

GPTBot, ChatGPT-User, Google-Extended, AnthropicBot, ClaudeBot, PerplexityBot, CCBot, Amazonbot, Bytespider, facebookexternalhit, cohere-ai, YouBot, NeevaBot, ai-crawler, Applebot, Baiduspider, Sogou, YandexBot, PhindBot, DuckDuckBot, Yeti, 360Spider, ias-va

Meaning: For each listed bot, true = allowed by current robots.txt rules; false = blocked (via Disallow for that UA or wildcard rules that match it).
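
In client code, this map reduces to a simple summary; a sketch where the data dict stands in for a parsed JSON response:

# `data` stands in for resp.json() from the bulk-audit sketch above.
data = {
    "ai_bots_allowed": False,
    "ai_access": {"GPTBot": True, "CCBot": False, "PerplexityBot": True},
}

blocked = [bot for bot, allowed in data["ai_access"].items() if not allowed]
print("Blocked AI bots:", ", ".join(blocked) or "none")

# Per the field descriptions above, ai_bots_allowed is the aggregate of ai_access:
assert data["ai_bots_allowed"] == all(data["ai_access"].values())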


Status Codes (typical)

  • 200 OK – Analysis succeeded.

  • 400 Bad Request – Missing/invalid website parameter.

  • 401 Unauthorized – Missing/invalid X-API-KEY.

  • 404 Not Found – Host reachable but robots.txt could not be retrieved over https or http (when the host exists, you may instead receive 200 with robots_found: false).

  • 500 Server Error – Unexpected error while fetching/parsing.
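
In a client, these codes can be branched on directly; a sketch using requests (the error messages are illustrative):

import requests

resp = requests.get(
    "https://vebapi.com/api/seo/aiseochecker",
    params={"website": "codeconia.com"},
    headers={"X-API-KEY": "YOUR_API_KEY"},
    timeout=30,
)

if resp.status_code == 200:
    print(resp.json()["ai_bots_allowed"])
elif resp.status_code == 400:
    print("Missing or invalid website parameter.")
elif resp.status_code == 401:
    print("Missing or invalid X-API-KEY.")
elif resp.status_code == 404:
    print("robots.txt could not be retrieved for this host.")
else:
    print("Unexpected error:", resp.status_code)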


Implementation Notes & Best Practices

  • To allow GPTBot explicitly, add, for example:

    User-agent: GPTBot
    Allow: /
    
  • To opt out of certain AI bots, specify:

    User-agent: GPTBot
    Disallow: /
    
  • Prefer host-specific rules if you have multiple subdomains.

  • Re-check after changes; CDN caches can delay robots.txt propagation.
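
For that re-check, Python's standard urllib.robotparser can evaluate the live robots.txt the same way a crawler would; a minimal sketch (the bot names and URL come from the examples above):

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://codeconia.com/robots.txt")
rp.read()  # fetches and parses the live file

for bot in ("GPTBot", "ChatGPT-User", "PerplexityBot"):
    print(bot, "allowed:", rp.can_fetch(bot, "https://codeconia.com/"))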