Using Google Analytics to prove the SEO Long tail theory

I read an interesting post on Dave Naylor’s blog on Wednesday by David Whitehouse on using Google Analytics to segment short and long tail keywords using regular expressions. It turns out the technique was originally developed by Ben Gott over at Search Engine Land.

Background

I’m really bad with regular expressions, but I am good with Excel, and I love data. So I thought about all the posts I have read about search journeys, and the common conception that the longer the search query, the smaller the search volume but the higher the conversion rate.

The image below shows the common conception of keyword type by search volume.

Anyway, I wanted to know: “Does a longer keyword actually give a better conversion rate?”

It’s a big question, so I wondered how Google Analytics and some real data could help me answer it.

Setting up the test

The good thing about working for a large Internet marketing company is that I get access to hundreds of Analytics accounts. I took 13 accounts that have either e-commerce tracking or goals set up and set about exporting the data.

Creating a custom report in Analytics
The next part of the post is just a quick overview of how to create a custom Google Analytics report which can pull out the number of conversions per keyword.

Ecommerce Conversion Rate

  • Custom Reporting
  • Create a new custom report
  • Metrics –> Site Usage Visits
  • Metrics –> Ecommerce Transactions
  • Metrics –> Total Goal Completions
  • Dimensions –> Traffic Sources –> Source
  • Dimensions –> Traffic Sources –> Keyword
  • View Report Google –> Organic
  • Take 1 year’s worth of data
  • Add &limit=50000 to the end of the report URL in the address bar
  • Advanced Filter –> keyword Excluding –> Company name
  • Export –> Data as a CSV
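For anyone who would rather script the next steps than paste into Excel, here is a minimal Python sketch of loading the exported file. The column names (Keyword, Visits, Transactions) and the sample rows are assumptions made for illustration; a real export may use different headers or include extra header lines that need skipping first.

```python
import csv
import io

# Hypothetical sample of the exported data; the column names are assumptions.
SAMPLE_CSV = """Keyword,Visits,Transactions
red shoes,120,2
buy red leather shoes size 9,15,1
shoes,900,3
"""


def load_keyword_rows(csv_text):
    """Parse exported CSV text into a list of dicts with typed fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        rows.append({
            "keyword": record["Keyword"].strip(),
            # Strip thousands separators such as "1,234" before converting.
            "visits": int(record["Visits"].replace(",", "")),
            "conversions": int(record["Transactions"].replace(",", "")),
        })
    return rows


rows = load_keyword_rows(SAMPLE_CSV)
```

To read a real file, pass its contents to `load_keyword_rows` after adjusting the header names to match the export.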

Overview of Data

So we now have the data all pasted into an Excel sheet detailing:

  • Keyword
  • Visits
  • Transactions / Goals Complete

We then put a little Excel trickery into the mix, adding a column with the formula:

=IF(LEN(TRIM(A2))=0,0,LEN(TRIM(A2))-LEN(SUBSTITUTE(A2," ",""))+1)

This adds a Query Length column, i.e. how many words made up the search term, which identifies the short and long tail keywords.
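The same word count can be sketched in Python. This is a rough equivalent of the query length calculation rather than a line-for-line port: `str.split()` with no arguments splits on any whitespace (tabs included), whereas the Excel formula only counts spaces. The function name is my own.

```python
def query_length(keyword):
    """Return the number of whitespace-separated words in a keyword."""
    # split() with no arguments ignores leading, trailing and repeated
    # whitespace, so an empty or blank keyword gives a length of 0.
    return len(keyword.split())


print(query_length("buy red leather shoes"))
```

An empty cell gives 0, which matches the guard at the start of the Excel formula.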

Finally, I created another column dividing the number of transactions / goals completed by the number of visits, giving us a conversion rate.
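As a small sketch, the conversion rate column amounts to the following in Python. The guard for zero visits is my own addition, on the assumption that such rows should count as a rate of 0 rather than raise an error.

```python
def conversion_rate(conversions, visits):
    """Conversions divided by visits, as a fraction (0.02 means 2%)."""
    # Assumption: a keyword with no visits is treated as a 0.0 rate.
    if visits == 0:
        return 0.0
    return conversions / visits


print(conversion_rate(2, 100))
```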

Finalizing the Data

Now that I have all the information in an Excel table, the simple next step is to create a pivot table from it.
For my first example I have chosen Query Length as the rows and then No. of Visits and Conversion rate as the columns, displaying the average for each to give the overall picture.
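The pivot described above can be approximated in plain Python. This is a sketch under my own assumptions: rows are (keyword, visits, conversions) tuples, query length is a simple word count, and the sample rows are made up purely for illustration and are not data from the study.

```python
from collections import defaultdict


def pivot_by_query_length(rows):
    """Average visits and average per-keyword conversion rate per query length.

    Each row is a (keyword, visits, conversions) tuple.
    """
    groups = defaultdict(list)
    for keyword, visits, conversions in rows:
        length = len(keyword.split())
        rate = conversions / visits if visits else 0.0
        groups[length].append((visits, rate))

    summary = {}
    for length in sorted(groups):
        entries = groups[length]
        summary[length] = {
            "avg_visits": sum(v for v, _ in entries) / len(entries),
            "avg_conversion_rate": sum(r for _, r in entries) / len(entries),
        }
    return summary


# Made-up rows purely for illustration, not data from the study.
sample = [
    ("shoes", 1000, 10),
    ("boots", 500, 5),
    ("red shoes", 200, 4),
    ("buy red leather shoes", 20, 1),
]
summary = pivot_by_query_length(sample)
for length, stats in summary.items():
    print(length, stats)
```

Note that this averages the per-keyword rates, as an "average" pivot column does, so a keyword with 5 visits carries the same weight as one with 5,000. A visit-weighted rate (total conversions divided by total visits per query length) is a different calculation and may give a different picture.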

The Results

The graph below shows the average conversion rate versus the average number of visitors by query length.

I used data from 13 clients, covering a total of 166,699 keywords.

The picture is clear: from one-word queries up to five-word queries, the conversion rate more than doubles.

It’s not as uniform from five-word to ten-word queries, but I think this may show that people using five words and above are still unsure about finding the right product or service.

The overall trend, though, does show that conversion rate increases as the search query gets longer.

Extending the results

The beauty of having data in Excel is that it can be manipulated in any way, so I took the data above and filtered out the keywords which had zero conversions, just to see what the effect was.

Apart from the obvious massive increase in both the average conversion rate and the average number of visits, the filtered data correlates with the overall data but shows a more uniform conversion rate by query length.
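The zero-conversion filter is easy to sketch in Python as well. The helper names and sample rows below are hypothetical and only illustrate why the average rate rises once non-converting keywords are dropped.

```python
def drop_zero_conversions(rows):
    """Keep only (keyword, visits, conversions) rows with at least one conversion."""
    return [row for row in rows if row[2] > 0]


def average_rate(rows):
    """Unweighted mean of per-keyword conversion rates; 0.0 for an empty list."""
    # Rows with zero visits are skipped to avoid dividing by zero.
    rates = [conv / visits for _, visits, conv in rows if visits]
    return sum(rates) / len(rates) if rates else 0.0


# Made-up rows purely for illustration, not data from the study.
sample = [
    ("shoes", 1000, 10),
    ("cheap shoes", 100, 0),
    ("red shoes", 200, 4),
    ("blue shoes", 50, 0),
]
converting = drop_zero_conversions(sample)
print(average_rate(sample), average_rate(converting))
```

Removing the zero rows takes every 0% entry out of the average, so the filtered figure can only stay the same or go up.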

Conclusion

The data presented clearly supports the view that the long tail theory in SEO still holds, and that it is still right to assume niche keywords will drive a higher conversion rate although they have a lower search volume.

If you have any questions on the data please leave a comment.

28 thoughts on “Using Google Analytics to prove the SEO Long tail theory”

  1. What about testing organic keywords from all sources and not just Google? My custom report uses dimensions “medium” then “keyword”, giving a more balanced approach

    • Hi Alex; I agree, it was actually a mistake on my behalf only using Google data 🙂 but it gives the right indication. I could have pulled out a lot more data but it does take a bit of time. See you at the next Manchester SEO event 🙂

  2. Here is the Query Length formula for Open Office users out there:

    =IF(LEN(TRIM(A2))=0;0;LEN(TRIM(A2))-LEN(SUBSTITUTE(A2;" ";""))+1)

    Open Office uses a semicolon to separate parameters instead of a comma.

  3. Nice work, that trend line is hard to argue with. And I cant wait to present it to the next person who does not understand this concept.

    “Buying Phrases” that people use before they make a purchase, are more specific and complex now and as the public gets even more savvy “Buying Questions” are gonna have to be monitored better by campaign managers and small business owners. See Wordtrackers Keyword Questions tool

  4. Your Excel formula didn’t work for me. I used =LEN(A2)-LEN(SUBSTITUTE(A2," ",""))+1 instead.

    • Hi Alex, I think the formula didn’t work if you tried to cut and paste it into Excel because of the speech marks. Thanks for the heads up though.

  5. Great article Neil! You did an excellent job showing the power of the long-tail. I really like your graphs, but have one piece of constructive criticism. Always label all of your axes. It’s important for usability, and allows readers to instantly interpret your figures. Keep up the good work. 🙂

    • Hi Sean, you’re very right about the axes. Funny how you don’t always see it when you’re writing the post. Good call, cheers Neil

  6. Wow – thats some pretty extensive testing you have done there. Glad to see the results swing in favour of long tail keywords as I use them a lot in all sorts of ways.

    Really impressed you did all of this and put the data up for us all to see.

  7. Hi Neil, great post ! Thanks you for the job. I would like to translate it in french (with a link to this page) for the French’s SEOs. Are you ok ? thanks again.

  8. Hi Neil, great post ! Thanks you for the job. I would like to translate it in french (with a link to this page) for the French’s SEOs. Are you ok ? thanks again. (oups wrong email in the last comment, sorry)

  9. Hi Neil,

    Nice analysis and well presented. I recently did some similar analysis on paid search keywords of different lengths (1-18 words), and found that searches of 4 words or more had up to 200% higher conversion rate.

    http://www.alanmitchell.com.au/techniques/benefits-of-long-tail-keywords/

    Not only that, but since competition for these long-tail keywords was significantly lower than for their short-tail counterparts, click prices and therefore cost per action were significantly lower.

    Cheers,
    Alan

  10. great technical analysis of the longtail’s conversion, i can’t do the same from my little niche network, but basically i have seen this happen too, at a smaller scale. The long tail does convert well. Would like to see such a report for various search engines and which search engine traffic converts the best,

    • Good idea about segmenting by Search engine, I’ll use the same control test and update you with the answers, cheers Neil

  11. A very keen way of graphing these results. Definitely one I will be using going forward.

    @Eric C
    Thanks for posting the Query Length formula for Open Office users 😉
    I was getting errors until I spotted your comment that Open Office utilizes semicolons to separate parameters instead of commas.

  12. This is very cool indeed. Many thanks for sharing this. One thing I would add though is that this data is from the last year. And the conclusion is that the long tail theory still holds. What I’d say is how much will Caffeine influence things here? It’s pretty clear that the index has got much smaller to enable Google to do much of what it can do in real time. So will the long tail and its attraction drop off? It might still convert very well, as this post shows, but will there be anything like the same volume of it?

  13. Hi Mark, the data is from a full previous year of Google Analytics information. Now that I have the test data, in about 3 months’ time I will segment the info again to see if these test clients have seen an effect from the Caffeine update. I'll post it as another blog then. Good call!

  14. Pretty good article, blending disclosure of some data with the methodology you used to perform your analysis. Other people can follow your model to see if they realize similar results.

    I think if you decide to do an analysis of data from other search sources, you should break it out by search source as well as aggregate it. You may notice very different trends and the keyword selections may reveal some interesting demographics.
