Google’s Advice on Fixing Unwanted Indexed URLs

An SEO posted details about a site audit in which he critiqued the use of a rel=canonical for controlling what pages are indexed on a site. The SEO proposed using noindex to get the pages dropped from Google’s index and then adding the individual URLs to robots.txt. Google’s John Mueller suggested a solution that goes in a different direction.

Site Audit Reveals Indexed Add To Cart URLs

An SEO audit uncovered that over half of the client’s 1.43k indexed pages were paginated and “add to shopping cart” URLs (the kind with question marks and parameters appended to them). Google ignored the rel=canonical link attributes and indexed the pages, which illustrated the point that rel=canonical is just a hint and not a directive. Paginated in this case means the dynamically generated URLs created when a site visitor sorts or filters a page by brand, size, or some other attribute (this is usually referred to as faceted navigation).
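A hypothetical faceted navigation URL of that kind (not taken from the audit) might look like this:

example.com/product/?brand=acme&size=10&page=2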


The add to shopping cart URLs looked like this:

example.com/product/page-5/?add-to-cart=example

The client had implemented a rel=canonical link attribute to tell Google that another URL was the correct URL to index.
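An illustrative version of such a canonical link element, with a placeholder URL rather than the client’s actual one, sits in the page’s head like this:

<link rel="canonical" href="https://example.com/product/">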


The SEO’s solution:

“How I plan on fixing this is to no-index all these pages and once that’s done block them in the robots.txt”
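For reference, the noindex rule the SEO describes is typically a meta robots tag in the page’s head (a generic example, not the client’s markup):

<meta name="robots" content="noindex">

The order in that plan matters: Google can only see a noindex rule on pages it is still allowed to crawl, so the robots.txt block should only be added after the URLs have actually dropped out of the index.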

SEO Decisions Depend On Details

One of the most tired and boring SEO dad jokes is “it depends.” But saying “it depends” is no joke when it’s followed by what something actually depends on, and that’s the crucial detail that John Mueller added to a LinkedIn discussion that already had 83 responses.

The original discussion, by an SEO who’d just finished an audit, addresses the technical challenges of controlling what gets crawled and indexed by Google, and why rel=canonical is an unreliable solution because it is a suggestion and not a directive.

A directive is a command that Google is obligated to follow, like a meta noindex rule. A rel=canonical link attribute is not a directive; it’s treated as a hint that Google uses when deciding what to index.

The problem that the original post described was about managing a high number of dynamically generated URLs that were slipping into Google’s index.

John Mueller On Dealing With Unwanted Indexed URLs

Mueller’s take on the problem was to stress the importance of reviewing the URLs for patterns that may offer a clue as to why unwanted URLs are getting indexed, and then applying a more granular (specific) solution.


He advised:

“You seem to have a lot of comments here already, so my 2 cents are more as a random bystander…


I’d review the URLs for patterns and look at specifics, rather than to treat this as a random list of URLs that you want canonicalized. These are not random, using a generic solution won’t be optimal for any site - ideally you’d do something specific for this particular situation. Aka “it depends”.


In particular, you seem to have a lot of ‘add to cart’ URLs - you can just block these with the URL pattern via robots.txt. You don’t need to canonicalize them, they should ideally not be crawled during a normal crawl (it messes up your metrics too).


There’s some amount of pagination, filtering in URL parameters too - check out our documentation on options for that.


For more technical rabbit holes, check out https://search-off-the-record.libsyn.com/handling-dupes-same-same-or-different
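As a rough sketch of what Mueller’s robots.txt suggestion could look like for the add-to-cart pattern shown earlier (the exact rules depend on how the platform builds its URLs):

User-agent: *
# Block crawling of add-to-cart URLs wherever the parameter appears in the query string
Disallow: /*?add-to-cart=
Disallow: /*&add-to-cart=

A robots.txt block like this stops the URLs from being crawled at all, which is a different kind of control than rel=canonical or noindex, since both of those only work on pages Google is allowed to crawl.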

Why Was Google Indexing URLs With Query Parameters?

A topic raised by multiple people in the LinkedIn discussion is the problem of Google indexing shopping cart URLs (add to shopping cart URLs). No answers were provided, but it may be something particular to the shopping cart platform, and the fix may come down to the solutions described above.

Read John Mueller’s advice here.
