Index Budget Audit
Magento has a long list of ways it'll waste a search engine's crawl budget by default. The Index Budget Audit detects the worst offenders, persists them to a grid, and offers per-row Apply-Fix actions on the auto-fixable ones.
Why crawl budget matters
Google assigns each site a finite crawl budget — how many URLs Googlebot will fetch from your domain per day. If you waste it on:
- Disabled products that still resolve at their old URL
- Filtered category URLs (?color=red&size=10) that produce thousands of permutations
- Out-of-stock products you'll never restock
- Layered-nav explosion from too many filterable attributes
…then Googlebot never gets around to your important pages. The Index Budget Audit is about reclaiming that budget.
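To get a feel for how quickly filtered URLs multiply, here is a rough back-of-the-envelope sketch. It assumes each filterable attribute is either unset or set to exactly one value per URL, and it ignores parameter ordering, pagination, and sorting. The attribute counts are invented for illustration.

```python
from math import prod


def filtered_url_count(option_counts):
    """Count distinct filter combinations for a single category page.

    Each attribute is either unset or set to one of its options, so it
    contributes (options + 1) states. Subtracting 1 removes the
    unfiltered category URL itself.
    """
    return prod(n + 1 for n in option_counts) - 1


# Hypothetical category: color (12 options), size (10), brand (25), material (6)
print(filtered_url_count([12, 10, 25, 6]))  # 26025
```

Four modest attributes already yield tens of thousands of crawlable URLs for one category, which is the kind of growth the layered-nav bullet above refers to.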
What ships
Five auditors out of the box, all independently togglable:
| Code | Severity | What it detects | Auto-fix? |
|---|---|---|---|
| out_of_stock_indexable | warning | Enabled products that are OOS but not noindex'd | ✅ Set meta_robots=noindex,follow |
| disabled_url_rewrite | warning | Disabled products that still have url_rewrite rows | ✅ Delete the orphaned rows |
| layered_nav_explosion | warning/error | Too many filterable attributes producing combinatorial URL explosion | No (config recommendation) |
| canonical_config | warning/error | Magento's catalog/seo/*_canonical_tag configs are off | No (config recommendation) |
| cms_missing_meta | info | Active CMS pages with empty meta_title or meta_description | No (links to AI Meta Generator) |
In addition, cms_missing_meta findings get a per-row Generate AI Meta action when AI is configured.
Running
Once:
bin/magento seosuite:index:audit
Or click Run Audit in the Marketing → SEO Suite → Index Budget Audit toolbar.
Nightly:
Stores → Configuration → SEO Suite → Admin Dashboard → Enable nightly Index Budget audit (cron) = Yes
The cron job byte8_seosuite_index_audit runs on the schedule 0 2 * * * (daily at 02:00). With notify_on_errors = Yes, errors post to the admin bell-icon inbox.
Output format
+------------+--------+-------------+-----------+--------+----------------+----------------------+
| Severity | Code | Auditor | Entity | Store | Target / URL | Message |
+------------+--------+-------------+-----------+--------+----------------+----------------------+
| warning | … | out_of_stock_indexable | product | 1 | 1234 / /awp.html | Product "AWP-001" is out of stock but still indexable. |
+ recommendation: Set Meta Robots to "noindex,follow" until restocked, or 301 to a category if the SKU is permanently gone. +
Every finding includes:
- severity (error / warning / info)
- code: programmatic identifier
- auditor: which auditor produced it
- entity_type + target_id + store_id for grid filtering
- url: the affected URL where applicable
- message: human description
- recommendation: one-line suggested fix
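The fields listed above map naturally onto a small record type. The sketch below is illustrative only: the class name Finding, the Python types, and the validation are assumptions, not the module's actual data model. The sample values come from the output row shown earlier, with the code left elided as it is there.

```python
from dataclasses import dataclass, asdict
from typing import Optional

SEVERITIES = ("error", "warning", "info")


@dataclass(frozen=True)
class Finding:
    """Hypothetical in-memory shape of one audit finding."""
    severity: str
    code: str
    auditor: str
    entity_type: str
    target_id: int
    store_id: int
    message: str
    recommendation: str
    url: Optional[str] = None  # only set where a URL applies

    def __post_init__(self):
        # Reject severities outside the three documented levels.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


finding = Finding(
    severity="warning",
    code="…",  # elided in the sample output, so left as-is here
    auditor="out_of_stock_indexable",
    entity_type="product",
    target_id=1234,
    store_id=1,
    url="/awp.html",
    message='Product "AWP-001" is out of stock but still indexable.',
    recommendation='Set Meta Robots to "noindex,follow" until restocked, '
                   "or 301 to a category if the SKU is permanently gone.",
)
print(asdict(finding)["auditor"])  # out_of_stock_indexable
```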
How findings are persisted
Each scan is tagged with a fresh scan_id (scan-YYYYMMDD-HHMMSS). Old scans are purged so the grid only ever shows the latest state — no historical drift, no growing tables.
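As a minimal sketch of the tag-and-purge behaviour described above: the function names, the use of local time, and the list-of-dicts stand-in for the findings table are all assumptions made for illustration. Only the scan-YYYYMMDD-HHMMSS format comes from the module itself.

```python
from datetime import datetime


def new_scan_id(now=None):
    """Build an ID in the documented scan-YYYYMMDD-HHMMSS format."""
    now = now or datetime.now()  # assumption: local server time
    return now.strftime("scan-%Y%m%d-%H%M%S")


def purge_old_scans(rows, current_scan_id):
    """Keep only rows written by the current scan; drop everything older."""
    return [row for row in rows if row["scan_id"] == current_scan_id]


scan_id = new_scan_id(datetime(2024, 1, 31, 2, 0, 5))
print(scan_id)  # scan-20240131-020005

rows = [
    {"scan_id": "scan-20240130-020004", "auditor": "canonical_config"},
    {"scan_id": scan_id, "auditor": "out_of_stock_indexable"},
]
print(len(purge_old_scans(rows, scan_id)))  # 1
```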
CLI options
bin/magento seosuite:index:audit
[-l <limit>] # Cap entities scanned per auditor
[-a <codes>] # Comma-separated auditor codes (default: all)
[-f json|table] # Output format
Run a single auditor:
bin/magento seosuite:index:audit -a out_of_stock_indexable
JSON for CI:
bin/magento seosuite:index:audit --format json
The exit code is 1 if any errors are found.
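The exit code covers the simple pass/fail case. If a pipeline needs a stricter gate (for example, failing on warnings too), the JSON output can be post-processed. The sketch below assumes the JSON is a flat array of finding objects, each with a severity key; that top-level shape is a guess based on the fields listed under Output format, so check it against real output before relying on it.

```python
import json
from collections import Counter


def count_by_severity(json_text):
    """Tally findings per severity from the (assumed) JSON array."""
    findings = json.loads(json_text)
    return Counter(f["severity"] for f in findings)


def should_fail(counts, fail_on=("error",)):
    """Return True if any finding has a severity listed in fail_on."""
    return any(counts.get(level, 0) > 0 for level in fail_on)


# Stand-in for captured stdout of the audit command (hypothetical data).
sample = json.dumps([
    {"severity": "warning", "auditor": "out_of_stock_indexable"},
    {"severity": "info", "auditor": "cms_missing_meta"},
])

counts = count_by_severity(sample)
print(should_fail(counts))                                # False
print(should_fail(counts, fail_on=("error", "warning")))  # True
```

In a real pipeline, the sample string would be replaced by the captured output of bin/magento seosuite:index:audit --format json.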
Resolving findings
Three resolution paths:
- Apply Fix (per row, on out_of_stock_indexable and disabled_url_rewrite only): runs the deterministic fixer
- Generate AI Meta (per row, on cms_missing_meta): calls Claude and queues a suggestion
- Mass Dismiss: bulk mark resolved without action (for findings that are intentional/known-acceptable)
A row marked resolved stays in the table but won't show per-row actions. The next scan re-evaluates and either re-creates the finding (if the underlying issue persists) or leaves it absent.
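The rules for which per-row actions a finding shows can be summarised in a few lines. This is a sketch of the behaviour as documented here, not the grid's actual implementation. The function name and its parameters are hypothetical.

```python
# Auditor codes documented as auto-fixable.
AUTO_FIXABLE = {"out_of_stock_indexable", "disabled_url_rewrite"}


def row_actions(code, resolved=False, ai_configured=False):
    """Return the per-row actions a finding would offer in the grid."""
    if resolved:
        # Resolved rows stay in the table but expose no per-row actions.
        return []
    actions = []
    if code in AUTO_FIXABLE:
        actions.append("Apply Fix")
    if code == "cms_missing_meta" and ai_configured:
        actions.append("Generate AI Meta")
    return actions


print(row_actions("disabled_url_rewrite"))                    # ['Apply Fix']
print(row_actions("cms_missing_meta", ai_configured=True))    # ['Generate AI Meta']
print(row_actions("out_of_stock_indexable", resolved=True))   # []
```

Mass Dismiss is left out on purpose, since it is a bulk action rather than a per-row one.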