Robots.txt per-store

Magento core lets you edit robots.txt content via Stores → Configuration → Design → Search Engine Robots, but only at website scope. If you have multiple stores under one website (different locales, different B2B/B2C divisions), they all share the same robots.txt, which is a problem when each storefront has different paths to disallow.

The SEO Suite adds a per-store override layer on top of the core's Magento\Robots\Model\Robots.

How it works

A plugin (Byte8\SeoSuite\Plugin\Robots\AppendCustomRules) wraps Magento\Robots\Model\Robots::getData(). When the store has rules configured:

  • Mode = Append → core website-level rules + your store-scoped rules concatenated
  • Mode = Replace → only your store-scoped rules; core's website-level rules are ignored
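
The two modes can be modelled in a few lines. The following is a minimal Python sketch of the merge step only, not the module's PHP code: the function name `merge_robots`, its parameters, and the blank-line separator between the two rule sets are assumptions (the separator is inferred from the append-mode example output on this page).

```python
def merge_robots(core_rules: str, store_rules: str,
                 mode: str = "append", enabled: bool = True) -> str:
    """Hypothetical model of the append/replace behaviour described above."""
    # With the feature off, or no store-scoped rules, core output is untouched.
    if not enabled or not store_rules.strip():
        return core_rules

    custom = store_rules.strip() + "\n"

    if mode == "replace":
        # Website-level rules are ignored entirely.
        return custom

    if mode == "append":
        if not core_rules.strip():
            return custom
        # Assumed separator: one blank line between core and store rules.
        return core_rules.rstrip() + "\n\n" + custom

    raise ValueError(f"unknown mode: {mode!r}")
```

Calling `merge_robots(core, "Disallow: /uk-only-marketing-page")` with the website-level rules from the append example reproduces the shape of the Store 1 output shown there.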

Configuration

Stores → Configuration → SEO Suite → Robots.txt (per-store)

Field                       Default
Enable store-scoped rules   No
Mode                        append
Custom rules                (empty)

Per-store overrides are supported: set the fields at store-view scope to apply different rules to each storefront under the same website.

Example: append mode

Website-level (Stores → Configuration → Design → Search Engine Robots → custom_instructions):

User-agent: *
Disallow: /checkout/
Disallow: /customer/

Store 1 (UK) custom rules:

Disallow: /uk-only-marketing-page

Store 2 (DE) custom rules:

Disallow: /de-archived-categories/

Store 1's /robots.txt returns:

User-agent: *
Disallow: /checkout/
Disallow: /customer/

Disallow: /uk-only-marketing-page

Store 2's returns:

User-agent: *
Disallow: /checkout/
Disallow: /customer/

Disallow: /de-archived-categories/
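
One thing worth checking in append mode is the blank line between the website-level rules and the appended rule. RFC 9309 permits blank lines inside a group, so parsers that follow it keep the appended Disallow in the User-agent: * group. Parsers that follow the older convention treat a blank line as the end of a group and may drop the rule. Python's standard-library urllib.robotparser is used below as an example of the stricter behaviour. The helper name `allowed` is hypothetical, and the result may differ between parser implementations and versions.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` according to `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Store 1's append-mode output, as shown above (blank line before the store rule).
WITH_BLANK = (
    "User-agent: *\n"
    "Disallow: /checkout/\n"
    "Disallow: /customer/\n"
    "\n"
    "Disallow: /uk-only-marketing-page\n"
)

# Same rules with the blank separator removed.
WITHOUT_BLANK = WITH_BLANK.replace("\n\n", "\n")

for label, text in (("with blank line", WITH_BLANK), ("without", WITHOUT_BLANK)):
    print(label, allowed(text, "/uk-only-marketing-page"))
```

If a crawler you care about behaves like the stricter parser, starting the store-scoped rules with their own User-agent: line sidesteps the ambiguity.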

Example: replace mode

Useful when the website-level rules are too generic and a store needs a totally different policy.

Website-level: empty

Store 1 (production):

User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalog/product_compare/
Disallow: /wishlist/

Sitemap: https://store.example/sitemap.xml

Store 2 (staging — totally locked):

User-agent: *
Disallow: /

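With Mode = Replace on store 2, staging's /robots.txt returns only the User-agent: * and Disallow: / lines. Store 1's rules are store-scoped and never apply to store 2, and in Replace mode no website-level rules are merged in either, so nothing from production leaks through.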

For most Magento stores, the following rules are a reasonable baseline:

User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalog/product_compare/
Disallow: /wishlist/
Disallow: /catalogsearch/
Disallow: /sendfriend/
Disallow: /review/product/

# Layered nav noindex pages — also block crawl entirely for huge catalogs
Disallow: /*?color=
Disallow: /*?size=
Disallow: /*?price=

# Tracking params
Disallow: /*?utm_
Disallow: /*?gclid=

Sitemap: https://store.example/sitemap.xml

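Adjust ?color= etc. to match your filter parameter names. A pattern such as /*?color= only matches when the parameter is the first one in the query string; add a /*&color= variant to cover URLs where it appears after another parameter.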
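
To see which URLs a wildcard rule covers before deploying it, the matching can be approximated locally. This is a simplified sketch of the `*` and `$` matching defined in RFC 9309, not any crawler's actual implementation: the function names are hypothetical, and it ignores percent-encoding and Allow/Disallow precedence.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' pins the end of the URL.
    Everything else is matched literally, as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_matched(pattern: str, path_and_query: str) -> bool:
    """True if the rule pattern applies to the given path plus query string."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    for url in ("/shoes?color=red", "/shoes?size=9&color=red"):
        print(url, is_matched("/*?color=", url), is_matched("/*&color=", url))
```

The second URL is only caught by the /*&color= pattern, which is why both variants are usually needed for layered navigation filters.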

Verification

Visit /robots.txt on each storefront URL:

curl https://uk.store.example/robots.txt
curl https://de.store.example/robots.txt

If the two stores have different custom rules configured, the two responses should differ.
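
For more than two storefronts, comparing the files by eye gets tedious. The sketch below fetches a store's robots.txt and reports which rule lines are unique to each side. The function names are hypothetical and the hostnames in the usage note are the placeholder ones from the curl example above.

```python
import urllib.request


def fetch_robots(base_url: str, timeout: float = 10.0) -> str:
    """Download /robots.txt from the given storefront base URL."""
    url = base_url.rstrip("/") + "/robots.txt"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def rule_lines(robots_txt: str) -> set:
    """Non-empty, non-comment lines, stripped of surrounding whitespace."""
    lines = (line.strip() for line in robots_txt.splitlines())
    return {line for line in lines if line and not line.startswith("#")}


def unique_rules(first: str, second: str) -> tuple:
    """Return (lines only in first, lines only in second), each sorted."""
    a, b = rule_lines(first), rule_lines(second)
    return sorted(a - b), sorted(b - a)
```

A typical call would be `unique_rules(fetch_robots("https://uk.store.example"), fetch_robots("https://de.store.example"))`. Two empty lists mean both storefronts serve the same rules, which usually indicates the store-scoped settings are not being applied.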

Plugin order

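The plugin is registered with sortOrder=10. If another module also plugs into Robots::getData(), plugin order may matter: Magento sorts plugins by ascending sortOrder, so give theirs a value below 10 to sort it ahead of this plugin, or above 10 to sort it after. How that translates into execution order also depends on the plugin types involved (before, around, after), so check the combined /robots.txt output whenever two plugins modify the same method.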

Limitations

  • No syntax validation: typos in your rules will silently produce a broken robots.txt. After editing, check the result in Google Search Console's robots.txt report, which replaced the retired robots.txt tester.
  • No sitemap auto-injection: you write Sitemap: … lines manually. The auto XML sitemap module is planned for v2.9.
  • One file per store, one file per request: Magento serves a single /robots.txt per request, scoped to the store the URL matched. Multi-domain setups generally work. Subfolder or multi-store-per-domain setups inherit the resolution from Magento's own routing; because crawlers only request /robots.txt from the root of a host, stores that share a host also share whichever robots.txt is served there.
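
Since the module does not validate syntax, a small local check can catch the most common typos before rules are saved. The following is a rough sketch, not a full RFC 9309 validator: the function name, the list of accepted directives (including the non-standard Crawl-delay), and the specific checks are all assumptions chosen for illustration.

```python
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def lint_robots(robots_txt: str) -> list:
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    seen_user_agent = False

    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue

        field, value = (part.strip() for part in line.split(":", 1))
        key = field.lower()

        if key not in KNOWN_DIRECTIVES:
            problems.append((number, f"unknown directive {field!r}"))
        elif key == "user-agent":
            seen_user_agent = True
        elif key in ("allow", "disallow"):
            if not seen_user_agent:
                problems.append((number, "rule appears before any User-agent line"))
            elif value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/' or '*'"))

    return problems
```

Run it against the final merged output (for example the text returned by curl) rather than the store-scoped fragment alone, because an append-mode fragment legitimately starts without its own User-agent line.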