Panel: Technical SEO
Patrick Stox
Product Advisor,
Technical SEO, & Brand
Ambassador at Ahrefs
What I learned from
auditing over
1,000,000 websites
Who Is Patrick Stox?
Product Advisor & Technical SEO @ Ahrefs
Writing
•Ahrefs SEO book
•Ahrefs blog, SEL
•SEO Chapter of the Web Almanac 2021
•Technical Review Editor of the Art of SEO
4th edition
Community
•Founded a Technical SEO slack group
•/r/TechSEO moderator on Reddit
•Organize Raleigh SEO Meetup, Raleigh
SEO Conference, Beer & SEO
•Defined the role of SEO for the US
Department of Labor
Entire deck on Hreflang
issues with insights.
https://speakerdeck.com/patrickstox/hreflang-study-and-interesting-issues-brighton-seo-2023-patrick-
Full Data
http://bit.ly/ahrefs-site-audit-study
Broken Pages
● 4XX page 44.082%
● 404 page 40.555%
How many hits?
How many links?
Redirect if you can
Free full-text matching
redirect script
https://colab.research.google.com/drive/18lMkaRHK__eNM6m5FpoyhGDlDAYr3a6P?usp=sharing
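A minimal sketch of the full-text matching idea, not the Colab script itself: for each broken URL, pick the live page whose text is most similar and suggest it as the redirect target. The inputs (`broken_pages`, `live_pages`) and the `min_score` cutoff are assumptions for illustration.

```python
from difflib import SequenceMatcher


def best_redirect_targets(broken_pages, live_pages, min_score=0.5):
    """Map each broken URL to the live URL whose text is most similar.

    broken_pages / live_pages: dict of URL -> page text. These are
    hypothetical inputs, e.g. archived copies of the 404 pages and a
    crawl of the current site.
    """
    suggestions = {}
    for broken_url, broken_text in broken_pages.items():
        best_url, best_score = None, 0.0
        for live_url, live_text in live_pages.items():
            score = SequenceMatcher(
                None, broken_text.lower(), live_text.lower()
            ).ratio()
            if score > best_score:
                best_url, best_score = live_url, score
        # Only suggest a redirect when the match is reasonably close.
        if best_url is not None and best_score >= min_score:
            suggestions[broken_url] = (best_url, round(best_score, 3))
    return suggestions
```

Pages with no close match are left out, so they can be reviewed by hand or left as 404s.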
Server Issues
● 5XX page 8.482%
● 500 page 5.001%
● Timed out 12.579%
Did traffic drop?
Are pages still indexed?
Redirects
● 3XX redirect 95.183%
● 302 redirect 25.576%
● Meta refresh redirect 1.763%
Meta refresh redirects are more
common than I thought.
Permanent Redirects
● HTTP 301
● HTTP 308
● Meta refresh 0
● HTTP refresh 0
● JavaScript location
● Crypto redirect
Temporary Redirects (weak signal
forward)
● HTTP 302
● HTTP 303
● HTTP 307 (server side, not the
browser cached one)
● Meta refresh >0
● HTTP refresh >0
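The two lists above can be sketched as a small classifier. The function name and arguments are hypothetical, and crypto redirects are left out because they cannot be detected from a status code or a refresh delay.

```python
PERMANENT_STATUS = {301, 308}
TEMPORARY_STATUS = {302, 303, 307}


def classify_redirect(status=None, refresh_delay=None, js_location=False):
    """Return 'permanent', 'temporary', or None, following the lists above.

    status: HTTP status code, if the redirect is server side.
    refresh_delay: delay in seconds of a meta refresh or HTTP refresh, if any.
    js_location: True when the page redirects via JavaScript location.
    """
    if status in PERMANENT_STATUS:
        return "permanent"
    if status in TEMPORARY_STATUS:
        return "temporary"
    if refresh_delay is not None:
        # A 0-second refresh is treated as permanent, anything longer as temporary.
        return "permanent" if refresh_delay == 0 else "temporary"
    if js_location:
        return "permanent"
    return None
```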
Temporary Redirects (at some point)
Keep redirects at
least 1 year
Redirects
● HTTP to HTTPS redirect
88.001%
● HTTPS to HTTP redirect
6.153%
>6% is crazy high for people
redirecting HTTPS to HTTP
Redirects
● Broken redirect 12.662%
Redirects
● Redirect chain 45.193%
● Redirect chain too long 0.114%
Google will follow up to 10 hops.
I don’t worry until >5 hops.
Redirects
● Redirect loop 2.162%
Wastes a lot of resources.
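A rough sketch of how chains and loops could be flagged from crawl data. The `redirect_map` input is an assumption (URL to redirect target), and `max_hops=10` mirrors the 10-hop figure mentioned above.

```python
def trace_redirects(redirect_map, start, max_hops=10):
    """Follow redirects from start and return (path, status).

    redirect_map: dict of URL -> redirect target (hypothetical crawl output).
    status is 'ok', 'loop', or 'too_long'.
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirect_map:
        current = redirect_map[current]
        path.append(current)
        if current in seen:
            # We came back to a URL already visited: a redirect loop.
            return path, "loop"
        seen.add(current)
        if len(path) - 1 > max_hops:
            return path, "too_long"
    return path, "ok"
```

A path with more than two entries and status 'ok' is a redirect chain; setting `max_hops=5` would match the point where I start to worry.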
Mixed Content
● HTTPS/HTTP mixed
content 13.342%
Content Security Policy
(CSP)
upgrade-insecure-requests
In .htaccess:
<IfModule mod_headers.c>
Header always set Content-Security-Policy "upgrade-insecure-requests;"
</IfModule>
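To see which pages would be affected before adding the header, mixed content can be found with a short parser. This is a sketch under assumptions: it only looks at img, script, iframe, and stylesheet link tags, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect subresources loaded over http:// from an HTML document."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script", "iframe"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        else:
            # Plain <a href> links are navigation, not mixed content.
            return
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Return a list of (tag, url) pairs for insecure subresources."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```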
Indexability
● Noindex 49.719%
● Noindex in HTML and HTTP
header 5.776%
Intentional = fine
Unintentional = bad
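A simplified sketch for finding where a noindex comes from, so unintentional ones are easier to trace. It assumes a plain `robots` meta tag and an `X-Robots-Tag` header without user-agent prefixes, and it ignores the `none` directive; the function name is hypothetical.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def noindex_sources(html, headers):
    """Return where noindex is set: a subset of {'html', 'header'}."""
    sources = set()
    parser = _RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        sources.add("html")
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in {d.strip().lower() for d in value.split(",")}:
                sources.add("header")
    return sources
```

A result of `{'html', 'header'}` corresponds to the "Noindex in HTML and HTTP header" issue.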
Nofollow
● Nofollow page 26.34%
● Nofollow in HTML and
HTTP header 0.096%
Nofollows all links, including
internal links.
I’d rarely use this. Maybe on
UGC pages.
Canonicals
● Canonical from HTTP
to HTTPS 3.464%
● Canonical from HTTPS
to HTTP 2.079%
Canonicals
● Canonical points to
redirect 5.877%
These will likely get
followed.
Look at what version is
indexed.
Canonicals
● Canonical points to
4XX 2.598%
Likely ignored.
Canonicals
● Non-canonical page
specified as canonical
one 1.362%
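The canonical issues above could be checked roughly like this. The `crawl` lookup (URL to status and declared canonical) is a hypothetical data shape, not the Ahrefs implementation.

```python
def canonical_issues(page_url, canonical_url, crawl):
    """List canonical issues for one page.

    crawl: dict of URL -> {'status': int, 'canonical': str or None}
    (hypothetical crawl data for the canonical targets).
    """
    issues = []
    if page_url.startswith("http://") and canonical_url.startswith("https://"):
        issues.append("canonical from HTTP to HTTPS")
    if page_url.startswith("https://") and canonical_url.startswith("http://"):
        issues.append("canonical from HTTPS to HTTP")

    target = crawl.get(canonical_url)
    if target is None:
        return issues  # Target was not crawled, nothing more to check.
    if 300 <= target["status"] < 400:
        issues.append("canonical points to redirect")
    elif 400 <= target["status"] < 500:
        issues.append("canonical points to 4XX")
    elif target.get("canonical") not in (None, canonical_url):
        # The target itself declares a different canonical.
        issues.append("non-canonical page specified as canonical one")
    return issues
```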
Links (Indexable)
● Page has links to
redirect 62.719%
As long as they resolve, it’s
okay.
Clean up when you get a
chance, but not a big deal.
Links (Indexable)
● Orphan page (has no incoming
internal links) 49.905%
● Page has only one dofollow
incoming internal link
44.397%
These pages aren’t likely
important.
If you didn’t link to them,
how much do you really
care?
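If you do care about some of these pages, a quick pass over the internal link graph finds them. A sketch assuming a hypothetical list of `(source, target, nofollow)` link tuples from a crawl:

```python
from collections import defaultdict


def internal_link_report(pages, links):
    """Return (orphans, single_followed) for a set of pages.

    pages: iterable of URLs.
    links: iterable of (source, target, nofollow) internal links
    (hypothetical crawl output).
    """
    any_in = defaultdict(set)
    followed_in = defaultdict(set)
    for source, target, nofollow in links:
        if source == target:
            continue  # Self-links don't count as incoming links.
        any_in[target].add(source)
        if not nofollow:
            followed_in[target].add(source)

    orphans = sorted(p for p in pages if not any_in[p])
    single_followed = sorted(p for p in pages if len(followed_in[p]) == 1)
    return orphans, single_followed
```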
Links (Indexable)
● Page has links to
broken page
36.195%
Clean up for UX.
Links (Indexable)
● Page has nofollow outgoing
internal links 26.69%
● Page has nofollow incoming
internal links only 6.218%
Don’t nofollow internal links.
Links (Indexable)
● HTTPS page has internal
links to HTTP 20.536%
Okay-ish if they redirect.
Links (Indexable)
● HTTP page has internal
links to HTTPS 3.427%
Redirect to HTTPS.
Links Not Indexable
● Page has links to redirect
62.719%
● Orphan page (has no incoming
internal links) 49.905%
● +more
Impact is really more on UX.
Shouldn’t have any impact
on traffic or rankings.
Content Indexable
● Meta description tag missing
or empty 72.926%
● Title tag missing or empty
5.545%
Content Indexable
● Title too long 63.193%
● Meta description too short
59.242%
● Meta description too long
54.455%
● Title too short 32.758%
¯_(ツ)_/¯
Google will write titles and
descriptions for you.
Only worry if the pages get
lots of traffic or you think
they’re important.
Content Indexable
● Page and SERP titles
do not match
68.537%
Script to see which titles
were the most changed.
https://colab.research.google.com/drive/1mg3DTWVkgX0KHD3Hx2Y3WyMAUDjdm3cB?usp=sharing
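The idea can be sketched without the Colab notebook: score how similar each page title is to the title shown in the SERP and sort the least similar first. This is an illustration using a generic string similarity, not the linked script; the row format is an assumption.

```python
from difflib import SequenceMatcher


def most_changed_titles(rows, top_n=10):
    """Return the pages whose SERP title differs most from the page title.

    rows: iterable of (url, page_title, serp_title) tuples
    (hypothetical export). Least similar titles come first.
    """
    scored = []
    for url, page_title, serp_title in rows:
        similarity = SequenceMatcher(
            None, page_title.lower(), serp_title.lower()
        ).ratio()
        scored.append((round(similarity, 3), url))
    scored.sort()
    return [(url, similarity) for similarity, url in scored[:top_n]]
```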
Content Indexable
● H1 tag missing or empty 59.503%
Usually a quick fix so I’d do
it, but not that important.
Text size is likely more
important.
Content Indexable
● Multiple H1 tags 51.254%
Allowed in HTML5.
Content Indexable
● Low word count 18.208%
Who is the president of
Bulgaria?
You don’t need 1500 words
to answer that!
Content Indexable
● Multiple meta description tags
3.562%
● Multiple title tags 3.036%
Combined into 1 for
indexing in Google.
Content Not
Indexable
● Meta description tag missing
or empty 72.926%
● Title too long 63.193%
● +More
It’s not indexable, don’t
worry about it.
Duplicates
● Duplicate pages without
canonical 15.698%
Google is pretty good at
figuring this out.
Performance
● Pages with poor CLS
0.292%
● Pages with poor LCP
0.249%
● Pages with poor FID 0.018%
Add PageSpeed Insights to
your audits to get CWV (field
data) and Lighthouse (lab
data) at scale
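A sketch of pulling both data sets from the PageSpeed Insights API. The endpoint is the public v5 one; the response keys used here (`loadingExperience.metrics` for field data, `lighthouseResult.categories.performance.score` for lab data) are my reading of that API and should be checked against a live response. The helper names are hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page_url, strategy="mobile", api_key=None):
    """Build the request URL for one page; fetch it with any HTTP client."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def summarize_psi(response):
    """Extract field (CWV) categories and the lab performance score.

    response: the parsed JSON body of a PSI call (a dict).
    """
    field = response.get("loadingExperience", {}).get("metrics", {})
    cwv = {name: data.get("category") for name, data in field.items()}
    lab = (
        response.get("lighthouseResult", {})
        .get("categories", {})
        .get("performance", {})
        .get("score")
    )
    return {"field": cwv, "lab_performance_score": lab}
```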
Images
● Missing alt text 80.401%
Accessibility issue.
Potential legal issue.
Alt text counts as normal
text on the page.
Images
● Image file size too large
27.25%
Compress your images.
Images
● Image broken 15.42%
● Page has broken image
15.42%
Fix broken ones on
important pages.
Images
● Page has redirected image
9.142%
● Image redirects 8.961%
Low priority as long as they
work.
Images
● HTTPS page links to
HTTP image
12.328%
Content Security Policy
(CSP)
upgrade-insecure-requests
JavaScript
● JavaScript broken 4.823%
● Page has broken JavaScript
4.822%
Potential UX and
functionality issues.
It may even prevent your
content from being seen.
JavaScript
● Page has redirected JavaScript
2.644%
● JavaScript redirects 2.626%
Fine as long as it still works.
Clean up if you can, but I
wouldn’t prioritize this.
JavaScript
● HTTPS page links to HTTP
JavaScript 1.663%
Content Security Policy
(CSP)
upgrade-insecure-requests
CSS
● CSS broken 4.25%
● Page has broken CSS 4.25%
Potential UX issues.
CSS
● CSS redirects 1.276%
CSS
● HTTPS page links to
HTTP CSS 1.063%
Content Security Policy
(CSP)
upgrade-insecure-requests
Sitemaps
● 3XX redirect in sitemap
22.838%
● Non-canonical page in sitemap
12.107%
● Noindex page in sitemap
11.602%
Pages in sitemaps are a
canonicalization signal.
They should all be 200s and
the canonical.
Automate your sitemaps.
If manual, I don’t bother.
Sitemaps
● Sitemap larger than 50MB #N/A
● Sitemap with over 50K URLs #N/A
● Sitemap in the wrong format #N/A
These can keep your
sitemap from being
processed.
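The three checks above are easy to script. A minimal sketch using the sitemap protocol limits (50MB uncompressed, 50,000 URLs); the function name and problem labels are hypothetical.

```python
import xml.etree.ElementTree as ET

MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed
MAX_URLS = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_bytes):
    """Return a list of problems that could keep a sitemap from being processed."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("larger than 50MB")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return problems + ["wrong format: not well-formed XML"]

    valid_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in valid_roots:
        return problems + ["wrong format: unexpected root element"]

    # Count <url> entries (urlset) or <sitemap> entries (sitemapindex).
    entries = root.findall(f"{{{SITEMAP_NS}}}url") + root.findall(
        f"{{{SITEMAP_NS}}}sitemap"
    )
    if len(entries) > MAX_URLS:
        problems.append("over 50K URLs")
    return problems
```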
IndexNow is better imo.
● Faster
● More than just 200s
● Data protection
● <lastmod> is unreliable
● Broader distribution
Real-time monitoring
coming to Ahrefs, w/
IndexNow
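For reference, an IndexNow submission is a single JSON POST. The sketch below builds the request without sending it; the key value is a placeholder you would replace with your own, and the helper name is hypothetical.

```python
import json
from urllib.parse import urlsplit
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(urls, key, key_location=None):
    """Build (but do not send) a POST request announcing changed URLs.

    urls: list of changed URLs on one host.
    key: your IndexNow key (placeholder in the example below).
    key_location: optional URL of the hosted key file.
    """
    hosts = {urlsplit(u).netloc for u in urls}
    if len(hosts) != 1:
        raise ValueError("all URLs in one submission must share a host")
    payload = {"host": hosts.pop(), "key": key, "urlList": list(urls)}
    if key_location:
        payload["keyLocation"] = key_location
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Because the list can include URLs that now redirect or return 404, this covers the "more than just 200s" point.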
External Pages
● External 3XX redirect
17.252%
Helping another site. I
wouldn’t prioritize this.
External Pages
● External 4XX 9.2%
● External 5XX 4.988%
● External time out 3.688%
Potential UX issue.
Other
● Double slash in URL 11.078%
Lots more sites than
expected.
Ignored by browsers and
search engines.
Other
● More than three parameters
in URL 8.067%
May indicate bloat.
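Both URL checks in this section (double slash, more than three parameters) can be sketched with the standard library. The function name and the `max_params` argument are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def url_flags(url, max_params=3):
    """Flag a double slash in the path and too many query parameters."""
    parts = urlsplit(url)
    # keep_blank_values so that "?a=&b=" still counts as two parameters.
    params = parse_qsl(parts.query, keep_blank_values=True)
    return {
        # Check only the path, so the "//" after the scheme is ignored.
        "double_slash": "//" in parts.path,
        "too_many_params": len(params) > max_params,
    }
```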
Other
● Non-canonical page receives
organic traffic 6.55%
● 3XX page receives organic
traffic 6.139%
One page or the other will
rank.
Don’t worry about it unless
you have a specific
preference.
Other
● 4XX page receives
organic traffic 2.141%
● Noindex page receives
organic traffic 0.636%
Recent?
Blocked by robots.txt?
Do Care About
● Not indexed but should be
● Redirect 404 pages w/ links
● Internal linking
● Anything else that’s important
but not working
Thank You!
Try Ahrefs Site Auditor Free
ahrefs.com/awt

What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick Stox