# My Spice N Rice - robots.txt
# https://www.myspicenrice.ca

# ── General crawlers ──────────────────────────────────────────
User-agent: *
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# ── AI / LLM training crawlers ────────────────────────────────
# Allow full crawl; block legal pages from training data
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: CCBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: PerplexityBot
User-agent: YouBot
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# ── Google Ads bots ───────────────────────────────────────────
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: AdsBot-Google-Mobile-Apps
Allow: /

# ── Crawl delay for bandwidth-heavy bots ─────────────────────
User-agent: Baiduspider
Crawl-delay: 10

# ── Sitemap ───────────────────────────────────────────────────
Sitemap: https://www.myspicenrice.ca/sitemap.xml