AI visibility for small businesses depends on 20 verifiable signals across three categories: technical access (can AI bots reach and parse your site?), content quality (does your content directly answer queries?), and authority signals (do external sources reference your business?). This checklist covers all 20, ordered by impact-to-effort ratio.
Category 1 — Technical access (high impact, low effort)
These are binary checks. Either the signal is present or it is not. Fix missing items first — they block all downstream optimization.
1. GPTBot allowed in robots.txt
Check yourdomain.com/robots.txt and confirm that no rule blocks User-agent: GPTBot. If there is no GPTBot entry, add these two lines:
User-agent: GPTBot
Allow: /
2. PerplexityBot allowed in robots.txt
Same check for User-agent: PerplexityBot. Both GPTBot and PerplexityBot should be explicitly allowed.
3. Google-Extended allowed in robots.txt
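Google-Extended is a robots.txt control token, not a separate crawler. It governs whether content Google has already crawled may be used for Gemini model training and grounding. It does not affect inclusion in Google Search or AI Overviews (AIO), which depend on regular Googlebot access. Leave Google-Extended allowed if you want your content available to Gemini.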
4. ClaudeBot allowed in robots.txt
ClaudeBot is the user agent of Anthropic's web crawler. Allow it if you want your content available to Claude.
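Checks 1 to 4 can be run in one pass. The following is a minimal sketch using Python's standard urllib.robotparser; the SAMPLE robots.txt and the check_ai_bots helper are illustrative assumptions, not part of any official tool.

```python
from urllib.robotparser import RobotFileParser

# The four crawler tokens named in items 1-4.
AI_BOTS = ["GPTBot", "PerplexityBot", "Google-Extended", "ClaudeBot"]


def check_ai_bots(robots_txt: str, url: str = "https://yourdomain.com/") -> dict[str, bool]:
    """Return {bot: allowed} for each AI crawler token, given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}


# Hypothetical robots.txt: one bot explicitly allowed, one blocked,
# the rest falling through to the wildcard group.
SAMPLE = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

result = check_ai_bots(SAMPLE)
for bot, allowed in result.items():
    print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
# ClaudeBot is blocked by its own group; the other three are allowed.
```

To test a live site, download the file from yourdomain.com/robots.txt and pass its text to check_ai_bots. A bot with no entry of its own inherits the User-agent: * rules, so "allowed" here does not always mean "explicitly allowed".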
5. llms.txt present and valid
Serve a llms.txt at yourdomain.com/llms.txt with your business description and key pages. Use the format described in our llms.txt guide.
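As a rough illustration, the sketch below renders a llms.txt file from a business description and a list of key pages. It assumes the layout proposed at llmstxt.org (a title line, a quoted summary, then sections of links); the guide referenced above may prescribe something different. The business name, URLs, and the render_llms_txt helper are hypothetical.

```python
def render_llms_txt(name: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render llms.txt text: title, quoted summary, then link sections.

    Each section maps a heading to (link title, URL, short note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical business and pages, for illustration only.
llms_txt = render_llms_txt(
    name="Example Bakery",
    summary="Family bakery offering daily bread and custom cakes.",
    sections={
        "Key pages": [
            ("Services", "https://yourdomain.com/services", "What we offer"),
            ("Opening hours", "https://yourdomain.com/hours", "When to visit"),
        ]
    },
)
print(llms_txt)
```

Save the output as llms.txt in your web root so it is served at yourdomain.com/llms.txt.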
6. Sitemap.xml submitted and current
Submit your sitemap to Google Search Console and Bing Webmaster Tools. Most CMS platforms generate the sitemap automatically; confirm that it reflects your current page structure.
7. Pages render without JavaScript
Most AI crawlers do not execute JavaScript. If your key pages require JS to display content (common in React/Next.js SPAs without SSR), crawlers see empty pages. Use server-side rendering or static generation for all content pages.
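One way to approximate what a non-JavaScript crawler sees is to look for a key phrase in the raw HTML, ignoring anything inside script and style tags. The sketch below does this with Python's standard html.parser; the two sample pages and the visible_without_js helper are assumptions made for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_without_js(html: str, phrase: str) -> bool:
    """True if the phrase appears in the page text before any JS runs."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return phrase.lower() in " ".join(parser.chunks).lower()


# Hypothetical pages: a client-rendered shell and a server-rendered page.
SPA_SHELL = '<html><body><div id="root"></div><script>render("Emergency plumbing in Basel")</script></body></html>'
SSR_PAGE = "<html><body><main><h1>Emergency plumbing in Basel</h1></main></body></html>"

print(visible_without_js(SPA_SHELL, "Emergency plumbing"))  # False
print(visible_without_js(SSR_PAGE, "Emergency plumbing"))   # True
```

In practice, fetch the page source with a plain HTTP client (no headless browser) and pass it in together with a sentence that should appear on the page.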
Category 2 — Content quality (high impact, medium effort)
These changes require editing existing content but have the highest direct impact on citation frequency.
8. Each key page opens with a direct answer (first 40 words)
Rewrite the opening paragraph of your five most important pages. The first sentence should directly answer the page's title question. Do not open with history, context, or preamble.
9. H2/H3 headings match search queries
Use your target question phrases as exact H2 headings ("How to register for Swiss VAT", not "Registration process"). AI systems extract content by heading, so a heading that matches the wording of the query signals that the section answers it.
10. FAQPage schema on key pages
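Add JSON-LD FAQPage markup to your top 5 pages. Each FAQ should be a real question with a precise answer under 100 words. Validate the markup with the Schema Markup Validator (validator.schema.org) or Google's Rich Results Test; schema.org/FAQPage itself only documents the type.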
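A FAQPage block can be generated from question and answer pairs. The sketch below follows the schema.org FAQPage structure (Question items with an acceptedAnswer) and enforces the under-100-words rule from this checklist. The faq_jsonld helper and the placeholder answer are illustrative assumptions.

```python
import json


def faq_jsonld(faqs: list[tuple[str, str]], max_words: int = 100) -> str:
    """Build FAQPage JSON-LD; reject answers that are not under max_words."""
    for question, answer in faqs:
        words = len(answer.split())
        if words >= max_words:
            raise ValueError(
                f"Answer to {question!r} has {words} words; keep it under {max_words}"
            )
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return json.dumps(data, indent=2)


# Placeholder answer text: replace with your own precise answer.
markup = faq_jsonld([
    ("How to register for Swiss VAT", "Placeholder answer for this question."),
])
print(markup)
```

Embed the output in the page inside a script element of type application/ld+json.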
11. Article schema with author and dateModified
All blog posts and resource articles should have Article schema including author (with Person schema), datePublished, and dateModified. Update dateModified when you refresh content.
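The Article fields listed above map directly onto schema.org properties. This is a minimal sketch, assuming a single named author; the article_jsonld helper, headline, author name, and dates are hypothetical.

```python
import json
from datetime import date


def article_jsonld(headline: str, author_name: str,
                   published: date, modified: date) -> str:
    """Build Article JSON-LD with a Person author and both date fields."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    return json.dumps(data, indent=2)


# Hypothetical article metadata, for illustration only.
article = article_jsonld(
    headline="How to register for Swiss VAT",
    author_name="Jane Example",
    published=date(2024, 1, 15),
    modified=date(2024, 6, 1),
)
print(article)
```

When you refresh a post, pass the new date as modified and leave published unchanged.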
12. LocalBusiness or Organization schema
Your homepage should include Organization schema (all businesses) or LocalBusiness schema (businesses serving a geographic area) with your name, address, phone, and opening hours if applicable.
13. Content minimum 600 words per key page
Pages under 300 words are routinely excluded by AI re-rankers as thin content. Aim for 600–1,500 words on priority service and resource pages. Length is a proxy for depth; depth is what AI engines select.
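The thresholds above are easy to check in bulk. The following sketch counts whitespace-separated words and sorts a page into the bands this item describes; the band labels and the classify_length helper are assumptions, and the input should be the page's visible text rather than its HTML.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in the page's visible text."""
    return len(text.split())


def classify_length(words: int) -> str:
    """Map a word count onto the bands used in this checklist."""
    if words < 300:
        return "thin"            # under 300 words
    if words < 600:
        return "below target"    # 300-599 words
    if words <= 1500:
        return "in range"        # 600-1,500 words
    return "above range"         # more than 1,500 words


# Synthetic texts of known length, for illustration only.
for n in (250, 450, 900, 2000):
    text = "word " * n
    print(n, classify_length(word_count(text)))
```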
14. At least one table or numbered list per page
Tables and numbered lists are extracted by AI systems as self-contained factual units. Pages with structured data formats (not just prose) are more likely to have their content cited verbatim.
Category 3 — Authority signals (high impact, high effort)
These take time but create persistent citation probability across all AI systems, including those without live web search.
15. Business listed in Google Business Profile
Google Business Profile data feeds into Google AIO. A complete, verified profile (with description, categories, services, and photos) makes your business more likely to be cited for local queries.
16. Listed in 3+ industry-specific directories
For your sector, identify the three most authoritative directories and ensure your business is listed with consistent NAP (Name, Address, Phone) information. These generate crawlable external references.
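NAP consistency can be checked mechanically once you have copied each listing into a spreadsheet or script. The sketch below normalizes case, punctuation, and spacing before comparing, so that only real differences are flagged. The directory names, business details, and both helper functions are hypothetical.

```python
import re

Nap = tuple[str, str, str]  # (name, address, phone)


def normalize_nap(name: str, address: str, phone: str) -> Nap:
    """Lowercase, drop punctuation, collapse spaces; keep phone digits only."""
    def squash(value: str) -> str:
        no_punct = re.sub(r"[^\w\s]", "", value)
        return re.sub(r"\s+", " ", no_punct).strip().lower()

    return squash(name), squash(address), re.sub(r"\D", "", phone)


def inconsistent_listings(reference: Nap, listings: dict[str, Nap]) -> list[str]:
    """Return the listing sources whose NAP differs from the reference."""
    ref = normalize_nap(*reference)
    return [src for src, nap in listings.items() if normalize_nap(*nap) != ref]


# Hypothetical reference record and directory listings.
REFERENCE = ("Example Bakery GmbH", "Hauptstrasse 1, 4001 Basel", "+41 61 000 00 00")
LISTINGS = {
    "directory-a": ("Example Bakery GmbH", "Hauptstrasse 1 4001 Basel", "+41610000000"),
    "directory-b": ("Example Bakery", "Hauptstr. 1, 4001 Basel", "+41 61 000 00 00"),
}

print(inconsistent_listings(REFERENCE, LISTINGS))  # ['directory-b']
```

Formatting differences such as spacing in the phone number pass the check; a shortened business name or abbreviated street does not, and should be corrected in the directory.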
17. Mentioned on at least one government or .edu domain
Government and educational domain mentions are high-trust signals for all AI systems. Chamber of commerce listings, local government contractor directories, and university partnership pages count.
18. At least one press mention with dofollow link
A single editorial mention in a regional newspaper or industry publication creates a citation signal visible to AI crawlers. Local press is often more accessible than national publications.
19. Wikipedia mention (if genuinely notable)
If your business has objective notability (founded more than 5 years ago, has been covered in independent press, has a measurable market presence), consider creating or improving a Wikipedia entry. Wikipedia is cited by all major AI systems.
20. Active citation monitoring
Query ChatGPT, Perplexity, and Google AIO with your 10 most important target queries monthly. Record whether your domain appears as a source. This is the only direct measure of AEO (answer engine optimization) performance, and the baseline for improvement.
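The monthly log can be kept as simple records and summarized as a citation rate. The sketch below assumes you paste in the source URLs each AI system showed for a query; it does not call any AI service. The record layout, the sample observations, and both helper functions are hypothetical.

```python
from urllib.parse import urlparse


def cites_domain(source_urls: list[str], domain: str) -> bool:
    """True if any source URL is on the domain or one of its subdomains."""
    domain = domain.lower().removeprefix("www.")
    for url in source_urls:
        host = (urlparse(url).hostname or "").lower().removeprefix("www.")
        if host == domain or host.endswith("." + domain):
            return True
    return False


def citation_rate(observations: list[dict], domain: str) -> float:
    """Share of recorded queries where the domain appeared as a source."""
    if not observations:
        return 0.0
    hits = sum(cites_domain(obs["sources"], domain) for obs in observations)
    return hits / len(observations)


# Hypothetical manual log for one month: engine, query, and sources shown.
LOG = [
    {"engine": "ChatGPT", "query": "query 1",
     "sources": ["https://www.yourdomain.com/services", "https://example.org/a"]},
    {"engine": "Perplexity", "query": "query 1",
     "sources": ["https://example.org/b"]},
    {"engine": "Google AIO", "query": "query 1", "sources": []},
]

rate = citation_rate(LOG, "yourdomain.com")
print(f"Cited in {rate:.0%} of recorded queries")
```

Keeping one log per month makes the trend visible: compare the rate before and after each round of changes.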
Score your current state
Use this checklist to score your business: 1 point per item present. A score below 8/20 means foundational technical work is needed first. A score of 8–14/20 means content and schema improvements will drive the most immediate gains. Above 14/20, authority signal building and active monitoring are the leverage points.
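The scoring bands above can be written down as a small function, which is handy if you track several sites. This is a sketch of the thresholds as stated in this section; the recommend helper and the wording of its return values are assumptions.

```python
def recommend(score: int) -> str:
    """Map a checklist score (0-20, one point per item) to the next focus area."""
    if not 0 <= score <= 20:
        raise ValueError("score must be between 0 and 20")
    if score < 8:
        return "foundational technical work"
    if score <= 14:
        return "content and schema improvements"
    return "authority signals and active monitoring"


# Example: a site with 11 of the 20 signals present.
print(recommend(11))  # content and schema improvements
```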
Citura's free audit covers all 20 signals and produces a prioritized action list specific to your domain.