Robots.txt per-store
Magento core lets you edit robots.txt content via Stores → Configuration → Design → Search Engine Robots, but only at website scope. If you have multiple stores under one website (different locales, different B2B/B2C divisions), they all share the same robots.txt, which is wrong when each storefront has different paths to disallow.
The SEO Suite adds a per-store override layer on top of the core's Magento\Robots\Model\Robots.
How it works
A plugin (Byte8\SeoSuite\Plugin\Robots\AppendCustomRules) wraps Magento\Robots\Model\Robots::getData(). When the store has rules configured:
- Mode = Append → core website-level rules + your store-scoped rules concatenated
- Mode = Replace → only your store-scoped rules; core's website-level rules are ignored
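The two modes can be modelled in a few lines. The following is a minimal Python sketch of the resolution described above, not the module's PHP code: the function name, the `enabled` flag handling, and the newline used to join the two rule sets are assumptions made for illustration.

```python
def effective_robots(website_rules: str, store_rules: str,
                     enabled: bool, mode: str) -> str:
    """Model which robots.txt body a store would serve.

    Hypothetical helper: the newline separator and the fallback
    behaviour for empty store rules are assumptions.
    """
    website_rules = website_rules.strip()
    store_rules = store_rules.strip()

    # Feature disabled, or no store-scoped rules: core output is unchanged.
    if not enabled or not store_rules:
        return website_rules

    if mode == "replace":
        # Website-level rules are ignored entirely.
        return store_rules

    if mode == "append":
        # Website-level rules first, then the store-scoped rules.
        return "\n".join(part for part in (website_rules, store_rules) if part)

    raise ValueError(f"unknown mode: {mode!r}")
```

In this model, a store with no custom rules always falls back to the website-level content, regardless of the mode selected.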
Configuration
Stores → Configuration → SEO Suite → Robots.txt (per-store)
| Field | Default |
|---|---|
| Enable store-scoped rules | No |
| Mode | append |
| Custom rules | (empty) |
Per-store overrides are supported. Set the fields at store-view scope to apply different rules to each storefront under the same website.
Example: append mode
Website-level (Stores → Configuration → Design → Search Engine Robots → custom_instructions):
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Store 1 (UK) custom rules:
Disallow: /uk-only-marketing-page
Store 2 (DE) custom rules:
Disallow: /de-archived-categories/
Store 1's /robots.txt returns:
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /uk-only-marketing-page
Store 2's returns:
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /de-archived-categories/
Example: replace mode
Useful when the website-level rules are too generic and a store needs a totally different policy.
Website-level: empty
Store 1 (production):
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalog/product_compare/
Disallow: /wishlist/
Sitemap: https://store.example/sitemap.xml
Store 2 (staging — totally locked):
User-agent: *
Disallow: /
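With Mode = Replace on store 2, staging's /robots.txt returns only the User-agent: * and Disallow: / lines. Any website-level rules are ignored, and store 1's production rules never apply because they are scoped to store 1.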
Recommended baseline rules
For most Magento stores:
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalog/product_compare/
Disallow: /wishlist/
Disallow: /catalogsearch/
Disallow: /sendfriend/
Disallow: /review/product/
# Layered nav noindex pages — also block crawl entirely for huge catalogs
Disallow: /*?color=
Disallow: /*?size=
Disallow: /*?price=
# Tracking params
Disallow: /*?utm_
Disallow: /*?gclid=
Sitemap: https://store.example/sitemap.xml
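Adjust ?color= etc. to match your filter parameter names. Each /*?name= pattern matches only when that parameter comes first in the query string; add a /*&name= variant if it can also appear after other parameters.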
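Wildcard rules such as /*?color= are matched as patterns against the URL path plus query string. The sketch below approximates that matching (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a literal prefix) so you can check which URLs a rule would cover. `rule_matches` is a hypothetical helper written for this page, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a Disallow/Allow pattern covers the given path+query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"

    # re.match anchors at the start, which gives prefix semantics.
    return re.match(regex, url_path) is not None


# The filter rule matches when the parameter is first in the query string...
print(rule_matches("/*?color=", "/shoes.html?color=red"))
# ...but not when it follows another parameter.
print(rule_matches("/*?color=", "/shoes.html?size=9&color=red"))
```

If your layered navigation can emit filters in any order, a second rule of the form /*&color= covers the case the first one misses.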
Verification
Visit /robots.txt on each storefront URL:
curl https://uk.store.example/robots.txt
curl https://de.store.example/robots.txt
The two responses should differ wherever the stores' custom rules differ.
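For more than a couple of storefronts, comparing the responses by eye gets tedious. A small script along these lines can fetch two robots.txt files and print only the lines that differ. It is a sketch using the Python standard library; the function names are illustrative and the hostnames are the placeholder ones from the curl commands above.

```python
import difflib
import urllib.request
from typing import List


def fetch_robots(base_url: str) -> str:
    """Download /robots.txt from a storefront base URL."""
    url = base_url.rstrip("/") + "/robots.txt"
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def robots_diff(a: str, b: str, a_name: str = "a", b_name: str = "b") -> List[str]:
    """Return unified-diff lines between two robots.txt bodies."""
    return list(
        difflib.unified_diff(
            a.splitlines(),
            b.splitlines(),
            fromfile=a_name,
            tofile=b_name,
            lineterm="",
        )
    )
```

Calling `robots_diff(fetch_robots("https://uk.store.example"), fetch_robots("https://de.store.example"))` returns an empty list when both stores serve identical content, which usually means the store-scoped rules are not enabled or not saved at store-view scope.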
Plugin order
The plugin runs at sortOrder=10. If another module also plugs into Robots::getData(), the order may matter: give that module's plugin a sortOrder lower than 10 to run it before this one, or higher than 10 to run it after.
Limitations
- No syntax validation: typos in your rules silently produce a broken robots.txt. After editing, check the served file with a robots.txt validator such as the robots.txt report in Google Search Console, which replaced the retired robots.txt tester.
- No sitemap auto-injection: you write Sitemap: … lines manually. The auto XML sitemap module is roadmapped for v2.9.
- One file per store, one file per request: Magento serves a single /robots.txt per request, scoped to the store the URL matched. Multi-domain setups generally work; subfolder/multi-store-per-domain setups inherit the resolution from Magento's own routing.
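Because the module does not validate rules, a quick local check before saving can catch the most common typos. The following is a minimal sketch, not part of the SEO Suite: `lint_robots` and the directive list are assumptions, and it only flags unknown directive names, missing colons, rules that appear before any User-agent line, and paths that do not start with `/` or `*`. Run it against the final served output rather than the store-scoped fragment, since append-mode fragments legitimately omit the User-agent line.

```python
# Directive names accepted by this sketch (an assumption; extend as needed).
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def lint_robots(text: str) -> list:
    """Return a list of (line_number, message) tuples for suspicious lines."""
    problems = []
    seen_user_agent = False

    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop comments and surrounding whitespace; skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue

        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue

        name, value = (part.strip() for part in line.split(":", 1))
        key = name.lower()

        if key not in KNOWN_DIRECTIVES:
            problems.append((number, f"unknown directive {name!r}"))
        elif key == "user-agent":
            seen_user_agent = True
        elif key in ("allow", "disallow"):
            if not seen_user_agent:
                problems.append((number, f"{name} before any User-agent line"))
            if value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/' or '*'"))

    return problems
```

An empty result means none of these specific checks fired; it does not guarantee that crawlers will interpret the file the way you intend.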