Why Does Google Index PDF Files as Images? An SEO Perspective

Having Google index your PDF files as images is not a good thing. It means the content in those files is not optimized for search engines. In this article we’ll review two PDF file scenarios and how search engines such as Google, Yahoo and MSN index them. If you’re reading this, you’re most likely concerned about SEO and how your site ranks. Before you reach out to a search engine optimization agency, read on for a few basic SEO tips to help your site rank better.

Scenario 1: PDF Files and Search Engines:

Put yourself in this situation. You’re a professional with a great idea for your company’s website. You want to reach your customer base by posting a PDF on your website. You picked the PDF format because you control how the document looks and you enjoy the peace of mind of knowing the document is difficult to edit. Before you post the PDF, consider these three search engine optimization questions.

1. Can you highlight the text?

For SEO purposes there are two types of PDF files: files in which the text can be highlighted and files in which it cannot. If you cannot highlight the text in your PDF, you’re abandoning potentially valuable SEO content, because search engines see your PDF as an image file and not as a searchable document (see Scenario 2).

If you can highlight the text within the PDF, then you’re on your way to potential SEO content gold. Now it is time to further optimize the PDF for the person running a search engine query.
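The highlight test can also be approximated in a script. The following is a minimal Python sketch, not a definitive check: it assumes that a PDF with selectable text yields characters when its text is extracted, and that an image-only PDF yields none. The function names and the 20-character threshold are hypothetical choices for illustration. The optional helper relies on the third-party pypdf package, which must be installed separately.

```python
def extract_page_texts(path):
    """Return the extracted text of each page (requires the pypdf package)."""
    # Imported here so the rest of the sketch works without pypdf installed.
    from pypdf import PdfReader

    return [page.extract_text() or "" for page in PdfReader(path).pages]


def has_text_layer(page_texts, min_chars=20):
    """Guess whether a PDF has selectable text.

    page_texts: one extracted string per page.
    min_chars: assumed threshold; stray characters below it are treated as noise.
    """
    # Count only non-whitespace characters across all pages.
    total = sum(len("".join(text.split())) for text in page_texts if text)
    return total >= min_chars


# Example with already-extracted text standing in for two different PDFs.
scanned_pages = ["", "  \n"]
text_pages = ["Atlanta SEO Agency whitepaper on optimizing PDF files"]
print(has_text_layer(scanned_pages))  # False
print(has_text_layer(text_pages))     # True
```

For a real file, has_text_layer(extract_page_texts("your-file.pdf")) would combine the two steps. A result of False suggests the PDF falls under Scenario 2.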

2. What is the name of the PDF?

Search engines pay attention to the name of your PDF. Before posting the PDF to your website, consider renaming it to something SEO-friendly. Unlike file names on your computer, URLs cannot contain spaces, so use hyphens (-) or underscores (_) to separate the words in the title of your document. For example, if the document on your computer is named “Atlanta SEO Agency.pdf,” consider naming the file “Atlanta-seo-agency.pdf” prior to uploading it to your website. (Note: the title of a PDF takes the place of the title tag on a traditional website page.)
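If you have many files to rename, the step above can be sketched in Python. This is one possible approach under stated assumptions: the function name is hypothetical, and lowercasing the whole name is a choice made here for consistency, not something search engines require.

```python
import re
from pathlib import PurePath


def seo_friendly_filename(name, separator="-"):
    """Turn a file name with spaces into a URL-friendly one."""
    path = PurePath(name)
    # Replace every run of characters that is not a letter or digit
    # with the separator, then trim separators from both ends.
    slug = re.sub(r"[^A-Za-z0-9]+", separator, path.stem).strip(separator)
    # Lowercasing is an assumption of this sketch.
    return slug.lower() + path.suffix.lower()


print(seo_friendly_filename("Atlanta SEO Agency.pdf"))
# atlanta-seo-agency.pdf
print(seo_friendly_filename("Atlanta SEO Agency.pdf", separator="_"))
# atlanta_seo_agency.pdf
```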

As a result, the PDF posted to your website will read “”. This allows you to gain further value once the PDF has been indexed by a search engine. The first readable line of text within the PDF will display as the title of a search engine result. Let’s continue with how search engines present PDF results in response to a query.

3. What text is featured on the first few lines of text?

The second part of a search engine result is the description. On traditional pages this could be the description tag, or a snippet of text that contains the keyword phrase used in the search engine query. A PDF file is treated differently in search engine results: in many cases the description is the first few lines of legible text.

For example, if your PDF is a whitepaper and the first few lines of text are copyright information, or other information that does not tell a search engine user what the document is about, consider revising the document so that the first few lines of text describe the document specifically.
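To see roughly what a searcher might read, you can preview the opening lines of your document before you publish it. The Python sketch below is only an approximation: the function name, the three-line limit and the 160-character cutoff are assumptions for illustration, not rules that any search engine publishes. It expects the text of the PDF as a plain string.

```python
def preview_description(text, max_lines=3, max_chars=160):
    """Build a rough preview from the first few non-empty lines of text."""
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    lines = [line for line in lines if line][:max_lines]
    snippet = " ".join(lines)
    # Truncate long previews; the cutoff is an assumed value.
    if len(snippet) > max_chars:
        snippet = snippet[:max_chars].rstrip() + "..."
    return snippet


sample = """Copyright 2010 Example Co.

All rights reserved.
A guide to optimizing PDF files.
Chapter 1
"""
print(preview_description(sample))
# Copyright 2010 Example Co. All rights reserved. A guide to optimizing PDF files.
```

In this made-up sample the preview opens with copyright boilerplate, which is the situation described above: moving the descriptive sentence to the top of the document would give the searcher a more useful description.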

Scenario 2: PDF is an image:

In the first scenario we described a text-based PDF that could be highlighted. In this scenario we review what happens when a PDF cannot be highlighted, but instead has the qualities of an image. An essential rule of search engine optimization is that search engines don’t have eyes. That means that if the text on your page is part of an image, the search engine will not be able to read those words.

When a PDF cannot be highlighted, the result is a file that has the same value to the search engine as a document that you placed on a scanner and transferred to your computer as a scanned image. In this scenario the search engine cannot differentiate between a PDF that contains the word “apple” and an actual photo of an apple. To the search engine, it’s all just an image.

Curious to know more about how PDF files can be leveraged with search engines? A final piece of advice is to consider incorporating PDF files on your website when they can improve user experience. Search engine algorithms have placed heightened importance on such files, and leveraging these files correctly can help your company rank higher for targeted keyword phrases.

As you can see, there is a science to optimizing PDF files for search engines. For a savvy individual, implementing some of the best practices described in this article can help improve the visibility of your PDF and your site overall. For more complicated situations you may want to consider reaching out to an SEO agency or an SEO web design agency for additional search engine marketing advice.

