Keyword traffic & backlinks going to PDFs not webpages

The Solvita.com website has a bizarre problem: in the content keywords section of Google Webmaster Tools, all of the keywords in the top 20 list come from PDFs, and none from actual HTML pages. The section that documents links, both internal links and backlinks, also references only PDFs.

I have tried numerous backlink checkers and they don’t show backlinks to PDFs, but when I use the Google operator site:*solvita.com I see a list of only PDF files.

Is the Google Analytics code configured incorrectly? What would cause this problem?

First of all, this has got nothing to do with Google Analytics. Whatever you do in the way of configuring Analytics, it won’t have any effect on how Google ranks the pages in your site.

In general, the reason that you are seeing so many PDFs in the query results is that Google believes that the PDFs answer the searchers’ queries better than any other pages on the site. It doesn’t mean the other pages are not listed. On the contrary, I just searched for site:solvita.com (without the asterisk), and it found about 150 pages, of which only about 25 were PDFs. Furthermore, searching for “making respiration visible” (which I assume is one of your key expressions), your home page came first in the results.

If you really don’t want the PDFs to appear anywhere in the results, you can use robots.txt to block them. Something like this:

User-Agent: *
Disallow: /*.pdf$

(If you are unsure about how to use robots.txt, you’ll find plenty of threads here on Sitepoint that will help.)
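
If you want to check which URLs a rule like that would catch before you publish it, here is a minimal Python sketch. It assumes the wildcard behaviour that Google and most major crawlers follow, where * matches any run of characters and a trailing $ anchors the end of the URL. The function names and the sample paths are made up for illustration; this is not an official parser.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters,
    and a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str, pattern: str = "/*.pdf$") -> bool:
    # Rules are matched from the start of the path, hence re.match
    return robots_pattern_to_regex(pattern).match(path) is not None

if __name__ == "__main__":
    # Hypothetical paths, not taken from the real site
    for path in ["/docs/manual.pdf", "/index.html", "/docs/manual.pdf?download=1"]:
        print(path, is_disallowed(path))
```

Note that with the $ on the end, a PDF URL followed by a query string is not matched, so drop the $ if you need to cover those as well. Also bear in mind that robots.txt stops crawling rather than guaranteeing removal: a blocked URL can still be listed if other sites link to it.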

Mike


If you really don’t want the PDFs to appear anywhere in the results, you can use robots.txt to block them. Something like this:

That’s exactly right. One caveat: be sure that you want them blocked before you do so. If Google thinks that those are the most relevant content to refer searchers to, perhaps there’s a reason. If you block them, most search engines will honor that request, and then searchers won’t find them, which means potentially less traffic for you. Food for thought, but if you know what you want (stop search traffic to those documents), that’s how to get it.


Thanks for the fast response. My client, Solvita, was just concerned about why only PDFs, and no webpages at all, were showing up in Google Webmaster Tools reports. I agree with what you said, and that’s what I initially told my client. I guess the PDFs are so full of rich, relevant content that people find them very useful.

I also suggested they convert the PDFs to HTML which may enable more people to access them, since some users may not be able to read PDFs or may not want to go to the trouble of opening or downloading PDFs.

Good answer, friend. I don’t think Solvita wants to block the search engines from their PDFs, but is just shocked that in Google Webmaster Tools reports, no webpages, only PDFs were showing up.

It’s odd that various backlink checkers were finding plenty of links to their webpages, but Google Webmaster Tools was not. That’s why I thought maybe something weird was going on with the Google Analytics code, like a typo or incorrect placement in the HTML document.

I suggested that Solvita consider converting these PDFs into HTML as new subpages on their website, which would enable more people to view them, since there might be users who would not be able to view a PDF or not want to go to the effort of opening and downloading the PDFs.

Is that a good suggestion?

If the content lends itself to being either, I’d normally say that a page is always better: you can optimize the layout for various devices, optimize the pages for speed and for SEO, and so on. There are definitely more options with a page than with a PDF. The exception is if they need a quick, easy-to-download version and they really believe it’s something people will want a hard copy of.


That’s a fantastic answer. I was wondering about the pros and cons of providing information via a PDF file. I think the ONLY reason to use a PDF is to have a format that MIGHT look good as a print out. Often, though, they tend to eat up a lot of printer ink and have a lot of graphics that are not important.

I couldn’t agree more. Too often, companies publish documents as PDFs only because the documents were previously published on paper. They take the design and layout of the paper version, convert it to a PDF, and assume that everyone will print the document and read it on paper as before. It would make more sense to forget the old way of doing it, and focus instead on publishing the information as HTML pages.

Going back to your original question, it seems that the problem is not that visitors are finding the PDFs in preference to any other pages on the site, but rather that Google Webmaster Tools is showing the PDFs as the high-ranking pages. If that’s right, I wouldn’t worry too much. What’s important is the number of people who find the site, and the number that convert to paying customers.

Mike

