English Google Webmaster Central office-hours hangout


JOHN MUELLER: Welcome everyone
to today’s Google Webmaster Central Office Hours Hangouts. We have a bunch of questions
that were submitted already. A bunch of people already
here in the hangout ready to ask
questions, I’m sure. My name is John Mueller. I’m a webmaster trends
analyst at Google. That essentially means that
I talk to webmasters, people like you, and I talk
to our engineers and make sure that we’re doing
the right thing on both sides. Before we get started with the
questions that were submitted, do any of you want to
ask a first question? Don’t be shy. SPEAKER 1: I could ask. You probably won’t
know the answer. So [? Bankoff ?]
said Thursday night that they’re rolling out
the payday loan update, but it seems like nobody
really in that category noticed anything changing,
whereas people said it seems like those
impacted by Panda 4 saw a change in their rankings
Thursday night into Friday morning. Do you have any insight
into that at all that you could share? JOHN MUELLER: I don’t
have any details that I can share on that. Sometimes when we start
rolling out these algorithms, we do that incrementally. It’s not that we go from zero to
100% from one day to the next, so maybe this is
something that’s just slightly ramping up. And you’ll see more of
the changes over time. Sometimes they also notice
something last minute that prevents them from
switching it on completely, and they roll it back
for a couple of days to make sure that they can get
things working the way they really should be
working, and then they can reactivate it again. So that’s also a possibility. SPEAKER 1: Thank you. I also wanted to
commend you guys on the new note from
a reviewer feature that you added to
Google’s rejection notices on reconsideration requests. I’ve already seen a few. The question is,
people are asking me, on average, what
percentage of the rejection notices that you’re
going to send are going to have a
note from reviewer. I assume you have no
clue at this point. But I assume you’re
going to want to use it as often as possible. JOHN MUELLER: Yeah. I mean we don’t have any metrics
on something that we haven’t used before, so
that’s kind of tricky. I don’t think we have
any target that we’re aiming for where we’re saying
we want to send a certain percentage of the reconsideration
requests a personal notice. But we do try to
use that where we can to make sure that people are
going in the right direction, and that they’re not
just running around in circles trying to resolve an
issue that they’ve come into. SPEAKER 1: Cool. I appreciate it. The information that comes
out of that should be useful, so I appreciate you
guys adding that. JOHN MUELLER: Thanks. I’ll pass that on to the guys. All right. Let’s take a look at some of
the questions that we have here. If a site has 5,000 pages and
has a Panda quality issue, would it be a
terrible idea if you would no-index everything
but 20 pages of quality as a starting point, and then
start refreshing the old pages with good quality page-by-page? Would this
immediately fix Panda? So I think, depending
on your website, this could be a good idea. This could be a
catastrophic idea. It kind of depends on
the kind of content that you have there, the kind
of website that you have, whether or not those 5,000 pages
are really useful content that just aren’t quite up to par,
or if they’re really bad pages. Obviously if they’re
really bad pages that you don’t want
to be associated with, then removing them
and making sure that they’re clean before you
actually make them live again is a great idea. But if they’re kind
of reasonable pages, and they’re just not
perfect, then that’s something where I’d probably
try to keep those pages and work on improving them
rather than deleting them completely. With regards to fixing
issues from a quality algorithm like Panda, that’s
something where you probably wouldn’t see an
immediate effect. So it’s not from one day to the
next that this would happen. On the one hand, we have to
recrawl those pages to recognize that they’re actually removed. On the other hand,
these algorithms have to reprocess
the data over time to actually start using that for
the next data push that we see. SPEAKER 2: Hi, John. That was my question. I was just wondering that
one of the biggest problems is people trying to understand
where the Panda issue actually is. And so I did wonder
whether, if you did remove pretty
much everything– is my microphone
very low, by the way? Is my microphone low? JOHN MUELLER: No. It’s good now. SPEAKER 2: OK. Yeah, I wondered
whether or not if you did remove a majority
of the pages, that it’s one of
the only signals you can really find where you
can say to yourself, right, let’s just concentrate
on everything. And if you did just
remove all the bad stuff. I wasn’t necessarily
thinking for myself, but there’s so
many people trying to find out how to
discover that Panda issue. And so the question that’s
arisen so many times is, if I just got
rid of everything, and Google recrawled it,
and I do a site: check for my website,
and it now says 20 pages, have I essentially got
rid of my Panda issue? If there’s 20 quality
pages left now. JOHN MUELLER: Not
really like that, because we do have to
refresh the whole algorithm and data first. And the thing to
keep in mind is Panda is primarily a
site-level algorithm, so it’s not something where we’d
say, you removed 4,000 pages, therefore it doesn’t
apply to these anymore. It’s really still essentially
active for the rest of the site, so
for those 20 pages that you might leave there. With regards to
resolving this issue, it’s more something
where I’d recommend taking a step back and
looking at the website overall, and also
getting a lot of feedback from peers, from users
who are using your site, to make sure that you’re really
doing the right thing there. Because just
deleting those pages and trying to rewrite the
text isn’t necessarily going to change the whole
issue, because there’s more to a website than
just the text on the page. So the whole layout
kind of comes into play. The whole feeling
that a normal user would get when they come to your
pages kind of comes into play and is something that
we try to at least approximate from our algorithms. SPEAKER 2: Yeah. I’ve seen a lot of
people with websites over the last few years where
what they’ve tried to do is create as many pages as
they can to try and catch as many keywords as they can. And you see it all the time. And that’s where so many
people are questioning, is that where my weak
content is, and should I literally just ditch
all of those things. And when it comes down
to it, there really is only a hundred pages left
now of a couple of thousand. A lot of people have done
that, and so the question applies to a large
group of people. JOHN MUELLER: Yeah,
I mean there are sites that spent a lot
of time to just cover all these keywords
and essentially create low-quality pages that
match these keywords, and that’s definitely
something that we would pick up as a low-quality website
in a case like that. But there’s often
more to it than just that, so just
removing those pages where you have low-quality
text doesn’t necessarily make your page a
high-quality website. Because we still have
all those other factors that we’re picking up
to try to understand how the quality of this website is. If you go through
those 23 questions, there are a lot of things where
we’ve tried to approximate, at least algorithmically. And that’s more than just
the text on those pages, but obviously, if you’re
creating thousands of pages just to match those keywords,
and the text on there is bad, then that’s also
going to be bad. SPEAKER 2: Yeah. OK. Thanks, John. JOHN MUELLER: All right. How can Google know if
a site is being attacked with negative SEO, or
the webmaster of the site actually uses bad SEO practices? That’s always a tricky question. It’s something where we work
really hard to make sure our algorithms recognize
this difference, and that they can understand the
difference between these kinds of things. And we definitely have
the manual webspam team, who’s very aware of this kind
of situation, who can also make a kind of a manual
judgment in a case like that. So if they recognize
that there are patterns involved that look
like a competitor just went out and bought lots of links
on Fiverr, or wherever, then those are things
the manual webspam team has a lot of experience
with, and can pick up on. Whereas if it looks
like this is something that the webmaster has
been doing for a long time, then that’s often a sign that
when the webspam team takes a look at that,
they’ll say, well, these issues have been around for quite some time, and it would take a competitor who’s really playing the long game, willing to take into account
that maybe this website will actually be promoted
for a couple of years during this time. But that’s usually not the
kind of situation that we see. So we try to look at a
number of different things to recognize when
something’s actually negative SEO versus
when something is more likely something that
the webmaster did themselves, and overall I think we’re
really good at that, and we pick up on
a lot of signals that are kind of subtle. So that’s something
where I think we’re doing a pretty
good job with that. If you recognize
situations where you think we’re picking
up something incorrectly, you’re welcome to let
us know about that so that we can take a better look. And on the manual webspam side, if a webspam reviewer from our side looked at your site and placed a manual action on it, and you’d say, oh, this is definitely from negative SEO, then giving us that information in a reconsideration request is a great way to bring
that feedback back as well. SPEAKER 1: So John, I did
a poll of my SEO base, and we had about 300 responses
asking if they actually tried negative SEO
and if it worked. And only 8 and 1/2%
said it does not work. That they’ve tried
it, and it doesn’t work, whereas the other
ones said they tried it, and it works. 50% said it works all the
time whenever they tried it, and 30 or so percent
said it works sometimes. Does that concern you in terms
of the perception from the SEO community about negative
SEO working so well and people saying they
actually tried it? Does that concern you at all? JOHN MUELLER: It’s
tricky to say, because we’d have to look
at the individual cases. I mean, we see a
lot of situations where people come to the
forums, or they come to us, and say, hey, I’m being
attacked by negative SEO. And we take a look
at the details, and we realize, well,
either the things that are happening have no effect
at all on the website, or we can tell that the
things that are happening now are the same things
that happened over three or four years ago,
and it’s probably not a competitor
that’s doing that. The tricky part is,
of course, you never see the direct connection
between the negative SEO things and any change on a website. And often you’ll
see changes that are completely
unrelated to each other. SPEAKER 1: No,
what I’m asking is, even if it’s right or wrong,
the percentages of people saying they think it’s possible, or
they have said it’s possible, and they’ve tried it
themselves is pretty alarming. That perception
should concern you. JOHN MUELLER: You
have a scary audience. Very– SPEAKER 1: I have
a scary audience. JOHN MUELLER: That are
attacking their competitors. It’s like to some
extent, you want to have a fair business, right? If you’re active
online, so it’d be interesting to look into
a lot of those details. The hard part, of course,
is the things that we see, that we analyze,
they tend to show that a lot of these
negative SEO things don’t have any effect at all. Or don’t have a negative
effect on that site, so it’s hard to go from a
general survey like that to saying there’s
something specific that’s not working as it should. Because if– SPEAKER 1: No. I’m not asking
about– I’m not saying that anything is not working. I’m saying, it could be. It could not. I agree 100%. When you look into these
things, you see them, you see the detail. I agree with you on that. The question is the perception. I did another poll asking if
the people in the SEO community believe negative SEO is
easier, and like 75% said yeah, it’s a lot easier. So the question
is, the perception that SEOs or webmasters
have around negative SEO is that it’s getting easier,
and there’s concern for that, is that a concern for Google
and them figuring out a way to maybe build tools or
something that helps webmasters and SEOs maybe combat
that and stuff like that? That’s my concern. Not that it’s actually
possible, but do you guys realize it is a
concern in the community, and are you addressing
that in any way? JOHN MUELLER: Yeah. I mean, it’s been a topic since
I’ve been doing stuff online. And from that point of view,
it’s not something new. It’s not something that
we’re not aware of, something that we’re not
taking into account for the algorithms or on the manual side. I think tools like
the Disavow tool make this a lot easier for the
webmasters involved in that. If you notice that
there’s something crazy happening with
links to your site, for example, then you can
Disavow those and move on. It’s not that you’re completely
helpless against this kind of activity. And I think the stronger,
more problematic kind of negative SEO where people
will actually go to a website, and hack it, and take over
the whole server, all of that, that’s probably more something
that you’d want to handle on a legal basis, not directly
like in [INAUDIBLE] situations.
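
For reference, the Disavow tool John mentions takes a plain text file uploaded through Webmaster Tools. A minimal sketch of the format, using made-up placeholder domains:

    # links we could not get removed after contacting the site owners
    domain:spammy-directory.example
    domain:paid-links.example
    http://blog-network.example/post-with-a-bought-link.html

Disavowing asks Google to ignore those links when assessing a site; it does not remove them from the web.

SPEAKER 1: OK. Thank you. SPEAKER 2: John, I do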
wonder whether or not some things are
underestimated as well. One of the things that I
noticed years and years ago working for a web agency
and various others was I was working in the travel
industry and the car industry. And both industries
took it upon themselves as part of their
plans to actually use negative SEO as a tactic. And this was years ago. This is happening
today in a big way. A lot of the big industries
are using this tactic. JOHN MUELLER: I don’t know. I don’t think it has as
much of an effect as people actually think. And I think if you’re a
webmaster working on a website, and you have the choice between
investing time and money on improving your
website for the long run, or temporarily harming
a competitor’s site, then a lot of times it just
makes a lot more business sense to work on your website instead. Because that’s going to have
a long-term positive effect. So it’s– SPEAKER 2: This
isn’t a small scale. We’re talking huge
companies who– I’m talking very, very,
very large enterprises. They’ve got hundreds
of webmasters working on their sites. Some of them with teams
of 100 people in India, and very well-known
brands that we know today. I’m not mentioning any
names, but I know for a fact that this is what happens. You’re saying that
they need to spend more time on their own site. Having three people
in a team of 100 working on something like this
is not a big deal for them. The money being spent going to
Fiverr and buying $5 things, it’s not a lot of money
being spent, either. Now whether or not it works
or not is a different matter, but it’s being done. It’s 100% a fact that
it is being done. So it seems to be a cheaper
way to gain rankings on your competitors. JOHN MUELLER: I don’t
agree with that. I think I can see
that it’s definitely being done, but
whether or not it has a positive effect on
your website’s ranking, I totally think that we’re
catching a lot of those things. I definitely take your
concern, and we’ll talk about that with
the team as well. But I think, overall, we’re
really good at catching these kind of things appropriately, so
it’s– I think there are a lot of things that are being done
from an SEO point of view that probably don’t make
that much sense. And I imagine this is
probably one of them, especially if you’re a bigger
company investing lots of money in this, that’s
probably a lot of money that you’ve just wasted. But people sometimes waste
money for weird things. SPEAKER 2: In the car
industry specifically, what people used to do was
as soon as a website became available, and somebody
started to rank for it, they would immediately
start negative SEOing it. So if you ever looked at
the history of the website, right from day
one, it looked bad. And so I don’t know whether
all of these factors all get taken into
consideration. But I mean there’s been
cases over the years where I’ve known
people in the industry, they’ve opened up a site. Before they’ve even
gone live, the site has been absolutely
hammered by bad links. So when you look at history, and
you say, what have they done? There’s certain people that
I know for a fact have not done any of these things,
and all of a sudden links are appearing left,
right, and center. And we’re going back three
or four years now, but that– JOHN MUELLER: I think a lot
of that is just totally being overestimated, and
we’re picking up on pretty much all
of these things. I mean, there are
definitely things we could be doing better on. But I think for the
most part, the issues that we see in that regard
are handled appropriately. But let’s take a look
at some other questions. SPEAKER 2: Sure. Lots of other things happen. JOHN MUELLER: One of my sites
had an algorithmic penalty, so I decided to move it to a new domain through a 301 redirect. All my rankings came back, but after a few weeks it seems that the old domain’s penalty has been passed on to the new domain. Now what should I
do to recover it? Essentially what you
need to do to recover these kinds of issues
is clean up the issues. Just moving everything
to a new domain just essentially forwards
them to the new domain. So that’s not really a
resolution of the issues that you were having. So really taking a step
back, looking at the issues that you might be seeing there,
actually cleaning that up is probably what I’d
recommend doing there. John, we don’t
appear to be getting real-time stats for content
experiments anymore. Based on what we see
with internal tools, has this changed recently? I’m not really sure what you
mean with content experiments. Tom, are you here? Is this Tom? This is the wrong Tom. Maybe not the right Tom. TOM: Hi, John. JOHN MUELLER: Yeah. TOM: Yeah. I’m referring to A/B tests
through Google Analytics. JOHN MUELLER: OK. Yeah. That probably would be in
Analytics, though, then. That wouldn’t be in
Webmaster Tools, right? TOM: No, not Webmaster Tools. JOHN MUELLER: OK. So I can’t really help
with Analytics questions. You’d probably need to
check in the Analytics forum or with someone from the
Analytics team on that. So I don’t really have
any insight on that. Sorry. TOM: OK. Thanks. JOHN MUELLER: How long before
the next Panda refresh? We tend not to
pre-announce these things. I think Panda is an algorithm
that runs more regularly now, so that’s something where
I imagine we’ll probably see updates every couple of
weeks, every couple of months, that kind of timeframe. I’ve had a Panda
penalization for thin content in the rental car industry. It seems to have
affected several websites in this industry in Spain. I’ve just fixed
with great content, but how much time can it take
to get out of this penalty? So generally Panda is
a quality algorithm, so if you have a
lot of thin content on your pages, that’s
something cleaning up would be a good idea
to actually do there. How long it takes is hard to
say, because on the one hand, we have to recrawl
and re-index all of these pages, which
can take different amounts of time. So some pages we recrawl
every couple of days. Other pages we recrawl
every couple of weeks or even every couple of
months, so sometimes that takes a while for us
to actually update it from a technical point of view. But then we have to kind of
recompile all the signals that we used to run the
Panda algorithm for, and then we have to do an update
of the Panda algorithm data to actually show that
in the search results. So if you’ve just now
cleaned up this issue, and you’ve significantly cleaned
out all of this thin content, then I imagine you’re
looking at something like maybe a couple of
months for this data to be updated again. So it’s not something you’d
see from one day to the next. It really takes a while to
actually be reprocessed, recalculated, and updated in
the visible search results. SPEAKER 1: John, for
sites with penalties, algorithmic penalties, does
Google crawl the sites slower or less often because
they have a penalty? Like with specifically Panda,
like low-quality content? Would they crawl less often? JOHN MUELLER: Usually not. So usually we treat crawling
as a kind of technical thing where we’d say we need to
update this data regularly. And it doesn’t
really matter how we see the quality of the website. On the other hand, if the
website is such bad quality that it doesn’t make sense
for us to even crawl at all, then that’s one step that
you could be seeing there. But if we’re just seeing
lower quality content, we crawl it, essentially
just the same as we would normal content. SPEAKER 2: John, would
using robots.txt to block a specific large set
of pages be a problem, or would it speed it up? JOHN MUELLER: Would
it speed it up? Sometimes it could. So for the most part, we
are able to crawl websites as quickly as we
want to crawl them. So we’re not limited by the
website, by the bandwidth through the website,
or anything like that. Sometimes we are limited by
the bandwidth to the website, so in cases like that, if you
robot a part of your website out, so that we don’t have
to consider crawling it, then that can help us to
focus on the other URLs. But for the most part, we
don’t need to worry about that. We can pick up as much
as we need anyway. So it’s really something
where if you’re stuck in a situation where you
think that Google can’t crawl as much as it would like
to from your website, it probably makes more
sense to just make sure that your website is faster, so
that it can be crawled faster. SPEAKER 3: Took
it upon themselves as part of their plans. JOHN MUELLER: Whoops. SPEAKER 2: Yeah. So that being the case, if
you were to use robots.txt, how long would it take before
Google removed that section maybe, rather than using
the URL removal tool, if you were to use robots? Because sometimes
there’s wild cards. And you need to use wild
cards, and you can’t do that with the URL removal tool,
which is something else I think I mentioned in another question. Is there room for
that to come along? If we’ve got the power
to do it in robots.txt, why not give us the same power
to do it in Webmaster Tools? JOHN MUELLER: So with
the robots.txt file, we actually have to try to
recrawl those URLs before we put them into the roboted state. So if you switch on a robots.txt
that disallows everything today, then we’re not going
to remove all the content from your website today. We’re going to wait until we try
to recrawl those pages and see, oh, now this page is blocked,
so we’ll take out the content and just keep that
page indexed with the URL alone, maybe with
anchors to that page, those kind of things. So that’s not something that
would happen from one day to the next. That would take the normal
crawl cycle to be updated there. And since it takes a normal
crawl cycle, if you just want to remove it from
the search results, putting a no-index on there,
serving a 404, both of those would be just as quick. The URL removal tool
itself essentially just hides those pages from
the search results, so it’s not that it removes
them from the index. It just prevents them from being
shown in the search results. So it’s subtly different
in that regard. With regards to wild cards
for the URL removal tool, it’s something where
we notice that people tend to mess up a
lot with wild cards. I know if we give them wild
cards in the removal tools, then it’s very easy
to take out a big part of the site accidentally,
not actually realize that they’re doing this. So we try to focus primarily
on clear folders, domains, your main website
essentially, and make it so that whatever’s
removed is really obvious to the user what’s
actually hidden in the search results during that time.
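
To make the distinction concrete, here is a minimal sketch assuming a hypothetical /archive/ section you want out of the search results. A robots.txt rule (wildcards like * are supported for Googlebot) only stops crawling, while a noindex meta tag, a 404, or a 410 drops a page once it has been recrawled:

    # robots.txt: stops crawling; already-indexed URLs can still
    # appear in results with just the URL and anchor text
    User-agent: *
    Disallow: /archive/
    Disallow: /*?print=1

    <!-- per-page alternative: leave the page crawlable, but have it
         dropped from the results after the next recrawl -->
    <meta name="robots" content="noindex">

SPEAKER 2: If we have the power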
to do it via robots.txt though, it’s kind of the same thing. Could you not put
another level of security in that says, are
you really sure you know what you’re doing? JOHN MUELLER: Well, if
you ask the webmaster if they know what they’re
doing, they always know what they’re doing. I mean, you guys know
what you’re doing, and you probably make mistakes
from time to time as well. Same thing applies to me. So that’s something where– SPEAKER 2: [INAUDIBLE]
a suggestion. How about if you
actually had a– like you have the testing tool for
the robots.txt in Webmaster Tools, how about it
actually gives you a list of all of
the pages that would be removed if you use this tool? So it’d say, you’ve got
4,000 pages indexed. Here’s a list of them. If you use these wild cards,
this is what will be removed, and then people would actually
be able to identify what it is. They’d go through
the list and go, oh, no, I’m not happy with
that, and they won’t do it. It’s kind of a simple tool. It might be very powerful. JOHN MUELLER: I mean,
that’s a possibility. But I guess, primarily,
the removal tool was really meant for situations where
you have an urgent need to remove something
really quickly. And usually those
aren’t the situations where you say, I
need to take out this subtle section
of my website, but rather where you say, I
need to take out everything under this subdirectory, or
everything on my whole host, because of something really
critical and urgent that happened, I need to have
hidden in the search results. So that’s usually something
where the robots.txt is kind of different,
because it’s really more focusing on the
crawling side rather than on an actual
urgent removal. With the robots.txt,
the other advantage there is that if we see a
lot of links to those pages, we can still show it
in the search results. And sometimes these pages rank
first in the search results, even though they’re blocked
by the robots.txt file, just because we think they’re
really important, because we think that they’re really
useful for the user. We see a lot of
links to those pages. Sometimes we see that
with our own pages as well, where maybe we’ll
have a login page that’s blocked by the robots.txt
file, but they’re looking for this specific tool. And searching for it shows that
tool in the search results. It also shows the description
as blocked by the robots.txt. But it’s still
available for the user. Whereas if you used
the removal tool, then it will be gone completely. There’s no way that it would
be visible at all for the user. SPEAKER 2: So the robots.txt
won’t treat it like a no-follow as well? JOHN MUELLER: Yes, it’ll
be like a no-follow, because we can’t actually
see the content there. We can’t see the
[INAUDIBLE] on the page. SPEAKER 2: I didn’t mean that. I meant no-index. Sorry. JOHN MUELLER: It wouldn’t
treat it as a no-index, because we essentially
just can’t crawl it. We don’t know what’s there. It might be something
really important. It might be something
less important. So we have to focus
on other aspects that we can pick up on to show
it or not show it in the search results. SPEAKER 2: Gotcha. Thanks, John. JOHN MUELLER: All right. Hi, John. We recently lost review rich
snippets in the search results for our IT domain. Markup is OK in the
structured data testing tool, and identical to other domains
where the snippets show. What can we do? I took a quick look at your
site, Tom, and the main issues that we’re seeing there is that
our algorithms– kind of feel that the quality
of the site overall isn’t as great as it could be. So that’s something where maybe
the different CCTLD versions of your site are kind
of subtly different, or maybe there are other
aspects involved there that you might want
to take a look at. But for rich snippets,
we do take a look at the technical side of things. That’s basically the
foundation that we have to have for it first. And then, if there are
issues from a quality side or from a policy side,
those can play a role as well. So that’s something that’s
happening with your site there. So taking a step back
and looking at your site overall to see what you
can do to improve that would be a good idea. Mom and pop e-commerce
sites hit by Panda 4.0. Are features like responsive
design, product reviews, product swatches going to help? SEO professionals are
telling me to add text. So having text is
always a good idea. I didn’t actually look at your
website, so it’s hard for me to make any general
recommendations. But generally when
it comes to Panda, which is a quality
algorithm, we try to understand the
quality of the website and how it works together
with users, how the content is there, what we can pick up on
as general quality signals. So what I’d
recommend doing there is taking a look at the blog
posts from Amit Singhal, maybe two or three years back
now, with 23 questions you can ask yourself regarding
high-quality websites, and going through that
with a group of people who aren’t associated
with your website. And usually when you
do that, they come up with a lot of good ideas on
things where they think maybe your website isn’t as
good as it could be, or maybe there
are specific areas where you could improve on. So that’s essentially what
I’d recommend doing there. Really taking a
step back, looking at the quality of
your site overall, and making sure that you’re
doing all of the right things. With regards to text,
like I mentioned before, just having text on a
page doesn’t necessarily make it high quality. So really making sure that
everything around your site is the best it can
be is something I recommend doing there. Googlebot executes JavaScript,
but not the Google Analytics JavaScript, right? Because that would
inflate traffic numbers. Yes, I believe that’s
the case that we don’t execute the
Analytics JavaScript. We also try not
to execute things that we can recognize
as being tracking scripts from other sites. So if we can recognize that,
we’ll try to block that. If it’s blocked by the
robots.txt file, of course, we can’t pick it up
and process it either. So that’s something
where we’re working to try to make sure
that we’re doing the right thing with
JavaScript there. We have an issue with bad
links on one of our websites, but we don’t have
a manual action, so we can’t make a
reconsideration request. Is there any way to get
in touch with Google and let them remove our
link profile manually? You can post in the
help forum, which is something where we generally
go through and try to look at. The thing with algorithmic
changes on our side is that we don’t have
a way to take out these algorithmic actions. It’s essentially something where
if there’s a manual action, you can resolve that
with the webspam team. You can take action
to clean that up. With algorithmic changes, you have to let the
algorithm run through that. So if there’s anything
specific that you think we’re picking up on
incorrectly algorithmically, letting us know through the
help forums, or contacting us directly is
probably a good idea. But we can’t push a
button and free your site from any kind of
algorithmic change that’s happening on our side. SPEAKER 4: Where do we
find that help forum? JOHN MUELLER: The help
forum, I can send you a link. Let me just bring it up. SPEAKER 4: OK Thank you. JOHN MUELLER: It’s essentially
just the normal Google webmaster help forum. We’ll say chat. So that’s a possibility. You can also post on Google+. Maybe add us on Google+, and we
can take a look there as well. OK. We are having some issues
getting Google to index our site. Here’s a screenshot. As you can see,
three sitemaps are starting to kick
in for indexing, but one sitemap is
not starting to index. I’d have to take a look
at the specifics there to see what specifically
is happening there. In general, with sitemaps,
it’s important that you use URLs that are 100% the
same as the ones that you actually
have on your website, that you use on your website
that you want to have indexed. So for example, if you submit
a sitemap with all URLs with a trailing slash,
and within your website you went to the URLs
without the trailing slash, then we see the
mismatch there, and we wouldn’t think that these
URLs are actually indexed. The content itself
might be indexed, but since you’re
using different URLs, we’re not going to count that
as being indexed in the Sitemaps feature in Webmaster Tools.
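
As a sketch of the trailing-slash mismatch John describes, with example.com as a placeholder: the <loc> entries need to match, character for character, the URLs the site actually links to and wants indexed.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- submitted with a trailing slash ... -->
      <url><loc>http://www.example.com/widgets/</loc></url>
    </urlset>
    <!-- ... while the site itself links to http://www.example.com/widgets
         (no slash); Google treats those as two different URLs, so this
         sitemap entry is not counted as indexed -->

We’ve taken down many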
pages over the years giving 410 or 404 responses,
but external sites still have links. How can we permanently remove
these URLs from the index, so that we get beyond
the list of 1,000 in the crawl errors
download file? So you can’t remove
them completely from us trying to crawl them. Usually if we’ve
seen a page before, and that turns into 404 or 410,
we’ll still periodically try to recrawl those pages, and
we’ll periodically bring those back up in the Crawl Error
section of Webmaster Tools. The good part here
is that we try to prioritize the
list in the Crawl Error section in Webmaster Tools
so that those URLs that are actually relevant
to you usually should be on top
of the list there. And if you only see URLs
in the Crawl Error section that are completely random,
that are completely out of date, that you haven’t used
for quite some time, then that essentially means that we
haven’t found anything critical when we crawl your website. And it’s normal for
us to see crawl errors when crawling the website. That’s essentially
a sign that you’re doing it technically right. So I wouldn’t try to
suppress those in any way, but rather just
look at that list. If you see nothing
really critical on top, then you probably won’t find
anything really critical in the rest of the list
or below the first 1,000 that we show there. We use Trustpilot. Oh, wait. Wrong link. How can Google detect
a negative SEO attack? I think we looked at this one. We use Trustpilot
to gather reviews. I was wondering how
we get our five star rating for our services
in the search results. Any help on this would
be greatly appreciated. So when it comes to the
reviews in rich snippets, it’s important for
us that you use reviews that you’ve collected
directly on your site itself. So if you’re collecting these
reviews on some third party sites, then we probably wouldn’t
be showing that for your site. We’d be showing
that on their site when they review your products. So from that point
of view, depending on what you’re
trying to achieve, it might make sense to
collect these reviews directly on your website itself, instead
of using a third-party site. Or if you’re using these
third-party reviews to show up in the
search results as well, then maybe that’s fine
to just keep them there.
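
For reviews that are collected on your own pages, the usual route is schema.org markup. A minimal microdata sketch with placeholder values, and no guarantee that stars will actually be shown:

    <div itemscope itemtype="http://schema.org/Product">
      <span itemprop="name">Acme Widget</span>
      <div itemprop="aggregateRating" itemscope
           itemtype="http://schema.org/AggregateRating">
        rated <span itemprop="ratingValue">4.6</span>/5
        based on <span itemprop="reviewCount">87</span> reviews
      </div>
    </div>

Our client server doesn’t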
support htaccess (it’s a Java server). How can we handle redirections in a way that Google likes? Is there an alternative
for 404 errors to redirect without 301 or
htaccess that Google likes? It’s tricky. So I think most
Java servers should support some kind
of redirection. And that’s
essentially what we’re looking for when you’re moving
from one URL to another. We really want to
see a strong sign that when that URL is
accessed, it always sends us directly
to the new URL, so that we can follow that and
actually focus on the new URL. And if you can’t do any
kind of redirect there, then that’s really
hard for us, because we see both of these URLs. Maybe we see the same content
on both of these URLs, and it’s hard for us to
make a decision which one we should actually index. Some things that you could do
is look at perhaps using the rel canonical to let us know about
your choice of preferred URLs. If you’re moving the
content completely from one URL to another
one, and you don’t have it on the old URL,
then of course, you don’t have the rel
canonical there. So I’d really try to work
with your web developer to make sure that you
can actually set up some kind of redirect from
the old URLs to the new ones. Even with Java servers,
that’s something that should be possible. A lot of websites
have Java servers, Java backends, and they
do redirects as well, so that should be
possible somehow.
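
Since the question mentions a Java server, here is a minimal sketch of a server-side 301 without htaccess, using the standard servlet API; the class name and URLs are placeholders. Where no redirect is possible at all, the rel=canonical hint John mentions is the weaker fallback.

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical servlet mapped to an old URL that has moved.
    public class OldPageRedirectServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws IOException {
            // A 301 (moved permanently) is the clear signal Google can follow.
            response.setStatus(HttpServletResponse.SC_MOVED_PERMANENTLY);
            response.setHeader("Location", "http://www.example.com/new-page");
        }
    }

JOSHUA BERG: John? JOHN MUELLER: Yes. JOSHUA BERG: Hi. I’ve got a question related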
to moving sites as well. And this company
I was asked about, there’s two companies
merging together. They’ve got two older
fairly seasoned sites. So they are building
a new site that they want to merge
everything over to. So I was wondering if you had
any suggestions about best practices? Because their idea was
to maybe do it slowly over a period of time,
so that they didn’t lose the customers or
something like that. But then they said
they also wanted to keep all their SEO
benefits and stuff. So I was like, well, that
may not really work so well. Because if you do
it page by page, then it’s not really
best for the user, because the sites are
not going to work well. So do you think moving
one site to the other, or just both of them to the new
location all at the same time? Any suggestions on best practice
for how something like that might be done? I also know that
frequently on site moves, when a company does a
site rebuild, there is usually a
dip in rankings. I think a lot of that has to
do with Google just trying to understand the new
structure as well. JOHN MUELLER: No. So normal site moves from
one domain to another are getting a lot easier for
us, because we can recognize that kind of situation
a lot easier, and if you give us information
through Webmaster Tools saying, I’ve moved from this
domain to that domain, then that makes it a
lot easier for us. Especially if there’s this
real one-to-one mapping there, we can start to crawl
maybe a little bit faster, to pick up these changes,
and send all the signals over as quickly as possible. That’s essentially
the easier variation. And like you mentioned,
even there sometimes sites see significant changes in
rankings, at least for awhile. So if you’re going to a
more complex situation where you have two websites that
are essentially being merged into a third, then
that’s something where I imagine
we’ll definitely see fluctuations for
quite some time. Because we have to
process them individually on a per-URL basis, go
through, find those redirects, send those signals individually
over to the new domain, find a way to merge those
signals from those two domains. And that’s not going
to be something that’s going to be trivial on
our side, and probably not something where our algorithms
would recognize that as any kind of a
normal site move, so we’d have to really handle
that on a per-URL basis. With regards to doing it all
at once or kind of staging it, I think doing it
all at once makes it a little bit easier for
us, because we recognize a strong change
within these websites. And we can start working
on processing those signals as quickly as possible. Whereas if you’re doing
this in a staged way, then we’ll see some URLs redirecting,
some URLs not redirecting. We won’t really
know which signals we can keep on the
old domain, which ones we should send
over to the new domain. That makes it a
lot harder for us. So doing it, if possible,
all at the same time makes a lot of sense. But even there, if
you’re merging two sites, it’s probably going to be a
rocky transition from those two sites ranking separately
to one site just right. JOSHUA BERG: Do you
think maybe doing one at a time– one and then
another a few weeks later might be any improvement? JOHN MUELLER: I don’t know. It’s always going to
be a tricky situation, so doing it from one
domain to a new domain is probably easier
for us to process. But even there,
it’s not going to be a matter of a couple of
days or a couple of weeks. It’s going to take a bit
of time to really update all of those signals. So if you’d want to move like
from one domain to [INAUDIBLE] a year later, and if
prod send the rest over, that’s going to be
quite some time. I don’t know if
you have that time. Personally I just
bite the bullet and really do it at a time
where traditionally maybe the websites don’t
have that much traffic. If you can choose maybe a season
where there’s traditionally a downturn on the website, not
many people searching for it, not much traffic
happening on the website, that’s always a good
time to do that. Sometimes you can’t
choose when this happens, so I just find a
way to make sure that technically everything’s
handled correctly, and then just bite the
bullet and actually do that. Usually the hard work
is really in making sure that technically everything
is working correctly. Making sure you have
all the redirects there, making sure that the new
site structure, which is a combination of these both
websites actually works well. That’s already a lot of
work, and that’s probably going to take a little
bit of back and forth until you have it all finalized. All right. There was another
question in between. Who was that? SPEAKER 5: Yeah, I had a
question about iframe embeds. So just like
SlideShare and YouTube, we offer embedding of
some of our content. And I was wondering
how Google treats that. So if some other page
embeds my content, is that treated as a link? Can that be expected in Penguin? JOHN MUELLER: So
depending on how the embed is, we might
see a link there. A lot of sites, when
they offer an embed, they include a link there
to the resource directly. So that’s something,
depending on what you provide as an embed,
could play a role there. Depending on how we can
crawl that embedded content, it kind of depends
on whether or not we could theoretically include
it in that website or not. A lot of times the
embedded content is blocked by robots.txt,
so we wouldn’t be able to pick that up and
include it in that website. But depending on
the type of embed you have there, that’s
something we either pick up the link or not, if
you have a link there or not. If it’s just an iframe, then we
wouldn’t treat that as a link. SPEAKER 5: But if
it’s like a YouTube embed where it’s
an iframe that has a special URL for that
content, and if you go to that page, the
URL that is iframed, you will see that the
rel canonical points to the actual page. JOHN MUELLER: I don’t think
we’d treat that as a link. So if it’s just an iframe,
especially because you can’t add like a
rel nofollow there, that’s not something
where we’d say this is the same or
equivalent to the normal HTML link that we could
find otherwise. But like I mentioned,
a lot of sites, when they offer
their embed code, they also have a
link to the content directly in the embed
code, so that’s usually the place where we would
pick up those links.
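
In other words, what an embed passes along depends on what the embed code contains. A sketch of the two variants, with placeholder URLs:

    <!-- iframe only: not treated as a link to the source site -->
    <iframe src="http://widgets.example.com/embed/12345"></iframe>

    <!-- iframe plus a plain HTML link, the pattern many embed codes use;
         the <a> element is what can be picked up as a link -->
    <iframe src="http://widgets.example.com/embed/12345"></iframe>
    <a href="http://www.example.com/items/12345">View this item on example.com</a>

SPEAKER 5: OK. So that’s good. So then I cannot be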
impacted by Penguin then, on the bright side. JOHN MUELLER: Well, at
least from those iframes. Yeah. I think if you have
great content that people want to embed, then
that’s always a good sign. Right? SPEAKER 5: Right. SPEAKER 1: John? JOHN MUELLER: Yeah. SPEAKER 1: I have another
question, if I can. I don’t know if it’s a good
time [INAUDIBLE] another time. Should I ask? All right. So Maile referenced
at SMX Advanced last week that you
shouldn’t really use a noscript tag
or stuff like that, because that kind of is a flag,
in terms of hiding content. Maile didn’t really specify, you know, the noscript tag, or maybe using certain things that
would hide content. And it’s pretty much, she
said, either way it’s automatic, like a red flag or
something like that. I tried to get more information
out of her about that, but I don’t think
she’s really– I don’t know how involved
she is in the webmaster team versus other teams. So since you’re very
involved in that area, could you talk a
little bit about that? What’s like an
automatic red flag in terms of hiding
content, maybe? Like things webmasters
should maybe stay away from using no
script, no tags, or whatever you want to call them. JOHN MUELLER: Yeah. It’s something where
traditionally we’ve seen a lot of spammers
misuse those tags and use them to hide content,
use them to [INAUDIBLE] cloak to Googlebot. So traditionally we’ve been very
reluctant to treat that content as something that’s really
relevant for the page. Because if it’s not directly
visible in the browser, then it’s probably not
that important for us, so we might not actually
use it that way. So from that point
of view, I’d try to avoid putting
anything there that you find is important
for your website. And there are sometimes
technical reasons to use these tags, but if
you are putting anything there that you think is
important for your site, then that’s probably not
the best place for that. I’d probably just put that
into the content normally, so that users and
Googlebots see it normally. A lot of these
things have evolved over time, in that they’re
normal HTML technique that makes sense to use sometimes. But spammers pick them up and
try to use them in weird ways. So our algorithms tend
to be a little bit more reluctant in trusting
them completely. And if we can’t trust
them completely, and if they’re good
alternatives for you to use, then I’d recommend
staying away from them if you don’t actually need them
from a technical point of view. SPEAKER 1: So when she said
automatic red flag, what does that– that means you’re
not using any content in there, or that means it gets
put into a bucket where manual Google spam
action superheroes go and check it out? JOHN MUELLER: We try to
avoid doing something where some system
automatically flags content, and someone has to
manually review that, because there’s just so
much content on the internet that that would never scale
from our point of view. But like I mentioned,
if the algorithms are reluctant to
use this content, if the engineers that are
creating these algorithms see that they’ve been misused
for spamming reasons for quite some time, and a lot of
the spam is still online, then the algorithms are going
to be really picky about using that content for
anything useful. So that’s essentially
where we’d say, you can use this for
technical reasons if you absolutely
need to use it, but if there’s
something on your site that you want to have indexed,
and crawled, and visible in search results normally,
then just use the normal ways to actually show that. SPEAKER 1: So
specifically, we should try to avoid the noscript tag. What else would you
recommend us trying to avoid? JOHN MUELLER: I wouldn’t
say try to avoid using it, but try to avoid using
it for any content that you find is important
for your website. SPEAKER 1: OK. So that includes
the noscript tag. Anything else? JOHN MUELLER: I’m
sure there are things that I can’t think
of at a moment. SPEAKER 1: OK. JOHN MUELLER: But a
lot of these things are kind of obvious in
that if you’re a webmaster, and you’re working
on your website, and you have something important
that you want to have crawled and indexed and shown
in the search results, then you’re going to try
to put it on your pages so that users see it as well. So from that point of view,
something like the noscript tag probably isn’t the best place
to put your important content.
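
A small sketch of the point about noscript, with placeholder text: content that exists only inside the tag is exactly what the algorithms are reluctant to trust, so anything important belongs in the normally visible HTML as well.

    <!-- risky: important text available only inside noscript -->
    <script src="/js/app.js"></script>
    <noscript>
      <p>Handmade widgets, free shipping within the EU.</p>
    </noscript>

    <!-- safer: the same text in the visible page, for users and Googlebot alike -->
    <p>Handmade widgets, free shipping within the EU.</p>

SPEAKER 1: OK. Thank you. SPEAKER 2: John, I’ve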
got a quick question about quality spam we’ve seen,
or spam that we’ve seen over since Panda– the last Panda. I’m going to post a link and
not talk about the brand itself, but it’s an example
of what we’re seeing. There seems to be
a more sites using statistics that are
statistically made up, and ranking well
because of them. The link I posted
in the chat there is an example of a company
where we spoke to over 20 people in our industry, and asked them
whether or not any of the data is correct about
them, and it’s not. And this company has now done
very well out since Panda 4.0, and was doing so before. But the reality is that these
statistic sites, the ones that were gathering data based
on where’s your domain hosted and all these
kind of things, they’re getting smarter. And they’re working out how
to put better data in there at better quality. But at the end of the day,
the data itself is garbage. And so what they’re
doing is, they’re trying to rank well for lots
of keywords, in the hope that you will then
login and correct the information about
your own company, in order for them to
have correct statistics. Kind of almost like a
bullying sort of behavior. What’s your stance on
businesses like that? And are you guys aware that
companies are now kind of just making stuff up? JOHN MUELLER: Well, companies
have been making stuff up since forever, so we have to
assume that this is happening. A lot of these kinds of sites have been generating this kind of thing for a while now. From our point of
view, a lot of this is automatically-generated
content. We try to treat it
as such sometimes. We do a better job
at catching it. Sometimes we don’t do
that well at catching it. So feedback like this is
definitely useful for us to take a look at to see
what these sites are doing. But sometimes they’re
also doing something that’s kind of useful. But it really depends
a bit on what exactly is happening behind the
scenes, how much of this is just auto-generated or
just randomized or even random content. And based on that we have
to take a look at that, either manually or find ways
to algorithmically catch these kind of issues. But it’s definitely
good feedback to have, good example URL to look
at with the team as well. All right. We have one minute left. Who wants the last question? SPEAKER 5: John, if I may
squeeze in a last question. It’s about comment
spam that I’ve seen on my website for the
last several months now. And it’s a different
kind of comment spam, in the sense that it
doesn’t have any links. But the comment itself
is very generic. It’s like I’m so glad
I found your blog, and I’ll come back
again, and when I Google some of
those phrases, I find the exact same comment
left on 100 other websites. So the question is,
how much importance does Google’s algorithm attach
to comments on a web page? Because on the one hand, it’s
good to have good comments. On the other hand,
it’s common some people don’t have good grammar. So how do you tackle that? JOHN MUELLER: So from
our point of view, if it’s hosted on your
website, it’s your content. So we don’t differentiate
that much with regards to who it looks like might
have written this content. If it’s hosted on
your pages, if it’s low-quality content,
that’s something we associate with your website. So in an extreme case where you
have a short blog post that’s just two or three
sentences, and then you have five pages of
really low-quality comments, then when we look at
that page overall, we see all of this low-quality
content, and we’re saying, well, overall this page
doesn’t really look that great. We might recognize that it looks
like other people are leaving comments there, but essentially
it’s on your website. You’ve agreed to
publish this content. You’re promoting this
essentially on your website, so that’s something where if
you see this low-quality content being added to your site,
I’d try to take action to clean that up, even if there
are no links attached to it. SPEAKER 2: Sorry, I lost all
power there for some reason. I just have one other little
point to add to the question that I was asking was
that, in the last Hangout, we talked about another
poor quality site. And almost the instant
that we talked about it, it received a boost. And I was wondering, do
people with high profiles, high Google+ profiles
like yourself and others, if they view a site, does
that actually give it more of a boost for some reason? JOHN MUELLER: No. No. SPEAKER 2: OK. JOHN MUELLER: That would be
a little too easy to game. SPEAKER 2: Yeah. OK. It just seems strange
then that the site that we were talking about
that was specifically poor is ranking even better. JOHN MUELLER: That would
be unrelated to that. Yeah. All right. So with that, we’re out of time. Thank you all for joining. Thanks for all your
questions, and I hope to see you guys again in
one of the future Hangouts. SPEAKER 2: Thanks, John. Enjoy the World Cup. JOSHUA BERG: Thanks so much. JOHN MUELLER: Bye. JOSHUA BERG: See you next week.

3 comments on “English Google Webmaster Central office-hours hangout”

  1. Massimiliano Rubino says:

    Parallelize downloads across hostnames? For example, serving images from subdomains: does this practice affect our Google position, since the pictures will no longer be on the principal domain?

  2. Никита Распутин says:

    Is there any Russian speaker here at all?

  3. Randy Mc says:

    Will Google internet soon cover all of the USA?
