We fixed a crawl budget issue that was blocking 60% of our product pages from being indexed
Posting this because it took us three months to diagnose and I want others to find this faster.
**Symptom:** ~4,000 product pages submitted in sitemap, only ~1,600 indexed. Coverage report showed 'Discovered — currently not indexed' for the missing pages.
**What we tried first (did not work):**
- Re-submitting sitemap
- Requesting indexing manually in GSC
- Improving page speed on unindexed pages
**Actual root cause:**
Googlebot was spending almost all its crawl budget on two things:
1. Infinite URL parameters from faceted navigation (filters were generating millions of unique URLs)
2. Duplicate paginated pages — we had both `?page=2` and `/page/2/` formats live
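To see why faceted navigation explodes the URL space, the math is just a product over facets. A toy sketch (the facet names and value counts here are made up, not our actual catalog):

```python
# Rough illustration of faceted-URL explosion.
# Facet names and value counts are hypothetical examples.
from math import prod

facets = {"color": 12, "size": 8, "brand": 40, "price_range": 6, "sort": 4}

# Each facet is either absent or set to one of its values, and facets
# combine freely, so the variant count is the product of (values + 1).
url_variants = prod(n + 1 for n in facets.values())
print(url_variants)  # 13 * 9 * 41 * 7 * 5 = 167,895 variants per category page
```

Multiply that by a few thousand category pages and you are well into the millions of crawlable URLs, almost none of them unique content.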
**The fix:**
1. Added `<meta name="robots" content="noindex,follow">` to all filtered/faceted URLs
2. Implemented `rel=canonical` on all paginated variants pointing to the first page
3. Used `Disallow` in robots.txt for parameter combinations that generated no unique content
4. Verified the cleanup in server logs (Screaming Frog Log File Analyser): crawl noise dropped ~85%
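Concretely, steps 1-3 looked roughly like this. The URL patterns below are illustrative placeholders, not our exact rules, so adapt them to your own parameter scheme:

```html
<!-- Step 1: on filtered/faceted URLs, e.g. /shoes?color=red&size=9 -->
<meta name="robots" content="noindex,follow">

<!-- Step 2: on paginated variants (?page=2 and /page/2/ alike),
     canonical pointing to the first page of the series -->
<link rel="canonical" href="https://example.com/shoes/">
</link>
```

```text
# Step 3: robots.txt rules for parameter combinations with no unique
# content (wildcards as supported by Googlebot; patterns are examples)
User-agent: *
Disallow: /*?*sort=
Disallow: /*?*color=*&size=
```

One caution: a URL blocked in robots.txt can't be crawled, so Googlebot will never see a `noindex` on it. Use robots.txt for pure crawl waste and the meta tag for pages you want crawled but not indexed.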
**Results:**
- Within 6 weeks, indexed pages grew from 1,600 to 3,800
- Organic impressions +51% over 90 days
Always pull your server logs before guessing. The answer is usually in there.
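The log check itself doesn't need a dedicated tool to get started. A minimal sketch of the breakdown we were after, i.e. how much of Googlebot's crawl hits parameterized URLs versus clean ones (assumes combined/common log format; adjust the regex to your server's format):

```python
# Count Googlebot requests to parameterized vs. clean URLs
# from web server access log lines.
import re
from collections import Counter

# Matches the request path of GET/HEAD lines whose user-agent
# string mentions Googlebot (combined log format assumed).
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+".*Googlebot')

def crawl_breakdown(lines):
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # not a Googlebot GET/HEAD request
        bucket = "parameterized" if "?" in m.group("path") else "clean"
        counts[bucket] += 1
    return counts

sample = [
    '66.249.66.1 - - [01/Jan/2024:00:00:01 +0000] "GET /shoes?color=red&page=3 HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [01/Jan/2024:00:00:02 +0000] "GET /shoes/red-runner HTTP/1.1" 200 9182 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.9 - - [01/Jan/2024:00:00:03 +0000] "GET /shoes HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]
print(crawl_breakdown(sample))  # 1 parameterized, 1 clean; non-Googlebot line ignored
```

Note that the user-agent string can be spoofed; for anything you act on, verify suspicious IPs with a reverse DNS lookup against googlebot.com/google.com as well.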