The Ancient Geek History of Web Analytics

Posted in Analytics, SES New York

advertising finds a way [where's george?]

The types of marketing conversations that dominated #SESNY 2011 demonstrated a turning point in analytic maturity for the industry. From the complexities (to some, futility) of attribution modeling, to the advertising implications of Do Not Track legislation, to borderline existential discussions of user intent & behavior, it’s easy to forget just how far our interpretation of web analytics has evolved. Some analytics packages and logfile analyzers (such as IPRO) will be turning 17 along with Justin Bieber this year. Just because we’ve gone from summing “hits” from a single web page to cross-channel user attribution, are we beyond revisiting the basics for refreshed perspective?

In the session Introduction to Analytics at Search Engine Strategies New York, we looked at the differences between the core analytics technologies, their uses, and the advantages & drawbacks of each. Don’t let the “Introductory” angle of this session’s title fool you. Elementary essentials of analytics were addressed, but there were deep technical chestnuts that would pique the interest of even seasoned marketing vets. Read on for a recap of the insightful discussion.

Thom Craver, web and database specialist at the Saunders College of Business, was the solo presenter and our guide through the history of web analytics. He warned the crowd that things might be geeky. We were ready. To prime us, Thom showed a slide highlighting various analytics platforms, and asked the audience to raise hands if they use one or more of these. Several hands remained lowered. To the owners of those hands, Thom recommended they use the Hope & Pray method.

What Are Web Analytics?
Analytics are a lot of things to a lot of people. Everyone has to look for the particular metric that means something to them. The biggest takeaway from this session should be that you’re actually doing something actionable. You’ve probably already optimized your on-page stuff… things like title tags & ad copy. Then you test. Use analytics to take a deep look at what went wrong and what went right. Always test & analyze before you repeat!

Log files vs. JavaScript Analytics
Analytics data originated in log files and then moved on to JavaScript collectors and cookies. The log file is a server log: someone went to my site or my server, and it happened on this day and time, etc. It’s very geeky; it logs specific files & URLs.

JavaScript Analytics, on the other hand, comes along with your page, meaning it happens within the browser itself. JavaScript can allow you to track many other on-page actions that log files can’t, such as hitting pause on an embedded video.

Analyzing log files presents an issue because you have to define exactly what a page is. Does your page always end with .html, .htm, or .php? Every time you need to know what technology a page on your site was built with, you have to go back to your developer, and the log analyzer has to be configured to match.
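To make the "what is a page?" problem concrete, here is a minimal Python sketch, not anything shown in the session. It assumes log lines in the common Apache-style format, and the extension list, the trailing-slash rule and the function names are all hypothetical choices you would adjust for your own site.

```python
import re
from urllib.parse import urlsplit

# Assumption: these extensions (plus directory URLs ending in "/") count as pages.
PAGE_EXTENSIONS = (".html", ".htm", ".php")

# Pulls the requested path out of a log line, e.g. "GET /index.html HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def is_page(request_target: str) -> bool:
    """Return True if the requested path matches our definition of a page."""
    path = urlsplit(request_target).path.lower()  # drop any ?query=string
    return path.endswith("/") or path.endswith(PAGE_EXTENSIONS)


def count_hits_and_pages(log_lines):
    """Count every request as a hit, and only page requests as page views."""
    hits = pages = 0
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines we cannot parse
        hits += 1
        if is_page(match.group(1)):
            pages += 1
    return hits, pages


sample_log = [
    '10.0.0.1 - - [22/Mar/2011:10:00:00 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.1 - - [22/Mar/2011:10:00:01 -0400] "GET /logo.png HTTP/1.1" 200 2048',
    '10.0.0.1 - - [22/Mar/2011:10:00:01 -0400] "GET /style.css HTTP/1.1" 200 900',
    '10.0.0.1 - - [22/Mar/2011:10:00:09 -0400] "GET /contact.php?ref=home HTTP/1.1" 200 3100',
]
hits, pages = count_hits_and_pages(sample_log)
```

If a new section of the site starts serving pages with a different extension, nothing counts them until someone edits PAGE_EXTENSIONS, which is exactly the maintenance burden described above.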

Also, consider Ajax. One example of how Ajax affects log files is Twitter, where you may not click on anything to trigger a new page view. It’s all on-page interaction & what makes that appear is a .php script on the server somewhere. If you only look at log files, all these little interactions become additional page views, which inflates the true number. Conversely, JavaScript more accurately counts page views & events via on-page actions.

The downside of JavaScript analytics is that it’s often blocked. If you’re B2C & the average Joe is your consumer, it’s not a huge problem. The general public usually uses the least secure features and accepts every cookie. If your customers are Internet-savvy, socially conscious and active in organizations like the EFF, chances are many of them will block cookies & you won’t be able to learn a great deal about their behavior through analytics.

Another thing to take into account is who sends the cookie; is it you, or a third party? Some users’ browsers are configured to not accept third party cookies. Consider your solution carefully… you won’t be able to count visitors accurately if you’re not configured properly.

Sample Process: Logs vs. JavaScript
Let’s look at the log file process in practice. Say you have a user on your site… that user goes from homepage to another page, but then goes back to homepage. Now, the user going back to the homepage may not count as a page view, because the browser will often show a cached copy of the page if it hasn’t been updated, so no new request ever reaches the server log.

Let’s look at that example on the JavaScript side. The user is on your site. They go from the homepage to another page and then back. With JavaScript, it actually logs the additional homepage page view. What happens when that user decides to leave the site by going to www.yourcompetitor.com? Well, if you code the JavaScript properly, you can let it collect where the user left and the next site they went to. Now you have the opportunity to set an exit event. Familiar with those really annoying exit events… those awful popup dialog boxes that say “Are you really really sure you want to leave?” Don’t do that. Use exit events for good; if users are leaving your site, use the data to understand why, so you can better fulfill their needs.

Which method should you choose?
Often, you have marketing guys who need one set of data and IT guys who want another set of data. The IT guys are concerned with things such as, “How can we handle increased server load?” and “Do we need to upgrade our hosting?” when looking at log files. The Marketing guys are more like, “I don’t care, I want to know how people are moving around my site.” They’re focused on JavaScript-based analytics packages.

It’s important to understand that neither one reports your visitors with 100% accuracy, but the point is to visualize overall trends and patterns.

Thom Craver of Saunders College speaking on analytics - SES New York 2011

It’s Definition Time

  • Hit – This is a single request to a web server (code/images etc.). These are all separate log file entries. Loading multiple elements might mean 5, 10, or 15 hits for any given page.
  • Page View – Occurs when a web page and all of its parts are fully displayed.
  • Visitors – Someone who visits your site, but each software package tracks them differently. Consider that log file analysis of users is all IP-based. For example, when you have a site that gets accessed frequently from a computer lab at a college, one machine might see 50 students (visitors) a day. If you just use log file analysis, it looks like one continuous user. With cookies, each login and browser is unique & you know there are 50 unique visitors.
  • Returning Visitors – Anyone with a cookie from previously viewing your site.
  • Unique Visitors – Not necessarily new visitors; they’re defined within time periods. This can change depending on the stats you’re looking at.
  • Visit Session – The start to finish browsing period of someone surfing your site. If you use log files, it looks like the same user.
  • Bounces – Users that come to only 1 page and then leave. You can make it time-based or you can define it by a certain number of pages viewed.
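A few of these definitions can be sketched in Python to show how IP-based and cookie-based counting diverge. This is only an illustration under stated assumptions: the record layout (IP address, cookie ID, page) and the function names are hypothetical, each cookie is treated as a single visit, and a bounce is defined as exactly one page viewed.

```python
from collections import defaultdict

# Hypothetical records: (ip_address, cookie_id, page_viewed).
# Three students share one computer lab machine, so they share one IP.
records = [
    ("192.0.2.10", "cookie-a", "/index.html"),
    ("192.0.2.10", "cookie-a", "/courses.html"),
    ("192.0.2.10", "cookie-b", "/index.html"),
    ("192.0.2.10", "cookie-c", "/office-hours.html"),
]


def unique_visitors_by_ip(rows):
    """What log file analysis sees: one visitor per distinct IP address."""
    return len({ip for ip, _cookie, _page in rows})


def unique_visitors_by_cookie(rows):
    """What cookie-based tracking sees: one visitor per distinct cookie."""
    return len({cookie for _ip, cookie, _page in rows})


def bounce_rate(rows):
    """Share of visitors (by cookie) who viewed exactly one page."""
    pages_per_visitor = defaultdict(int)
    for _ip, cookie, _page in rows:
        pages_per_visitor[cookie] += 1
    if not pages_per_visitor:
        return 0.0
    bounces = sum(1 for count in pages_per_visitor.values() if count == 1)
    return bounces / len(pages_per_visitor)
```

With the sample records, the IP-based count reports one visitor while the cookie-based count reports three, and two of those three viewed a single page.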

There are many more things you can learn about users, geeky stuff, through analytics. Things like computer type, browser, screen resolutions, connection speed and if they’re on a mobile device.

You need to pay attention to all of these. What if your mobile user base grows rapidly, but your site isn’t currently equipped to best serve them? If these users come and can’t use all or part of your site because it doesn’t support mobile, they’re going to bounce.

Content Metrics

  • Landing page – This isn’t always the home page, but it’s the first page a visitor viewed during the visit. If you’re properly optimized, somewhere in the middle of your site should be a typical landing page, too.
  • Exit page – This is the last page a visitor viewed during their visit. This can be misleading if you look at pure log files.

Miscellaneous Content Stats
Look at the most popular pages on your site. Also, examine user navigation paths… are your users going to unpredictable areas? Don’t forget to examine events or on-page interaction.

Tracking Events
Examine users’ time on page & overall time on site. If you have a video player or embedded video on your site, you can look at load times, when people pause, stop or rewind the video, if they hit the mute button (is your audio annoying?) or if they turn the volume up/down (too wide a dynamic range?).

Map interaction
You can tell if people are interacting with the map on your page. This is especially useful if you’re a local business.

Traffic sources
How & from where are people coming to your site? There are 3 main possibilities here:

1. People can enter your site URL directly into a browser.

2. They can find it from search engine results.

3. They can be referred from other sites.

However, there is a 4th type of visit; Google Analytics calls it “Other.” (This can be a lot of things; oftentimes it’s visits from email.)

When you want to get deep when looking at traffic sources, start using Google tracking variables. They’re just a little extra tail at the end of your URL that looks something like: ?variable=somevalue&var2=somethingelse

This code does not affect the page delivered (unless you’re doing some dynamic text swapping based on URL variables) but allows you to see these custom sources in Analytics by themselves. It’s also recognized by server logs and scripts.

Thom recommends using the Google URL builder because it allows you to tag the URL with many different variables that you can use to track usage of specific user groups, channels, campaigns, etc.
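For a rough idea of what such tagging does, here is a small Python sketch. It is not the Google URL builder itself, just an approximation of its output: utm_source, utm_medium and utm_campaign are the campaign parameter names Google Analytics reads, while the function names and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_url(url, source, medium, campaign):
    """Append campaign tracking variables, keeping any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # where the visit came from
        ("utm_medium", medium),      # the channel, e.g. email
        ("utm_campaign", campaign),  # the campaign name
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_tags(url):
    """Recover the tracking variables from a tagged URL, as a script might."""
    return {
        key: value
        for key, value in parse_qsl(urlsplit(url).query)
        if key.startswith("utm_")
    }


tagged = tag_url(
    "http://www.mysite.com/landing?id=7", "newsletter", "email", "spring sale"
)
```

Because the tags live in the query string, the same values show up in server logs and are available to any server-side script, which matches the point above that the page itself does not have to change.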

Tracking Off-line Visitors
What if you still use traditional media? Posters, print news, & TV? You’re not going to put in a long Google URL for your contact info… no one’s going to visit it. Try fitting, “If you’d like to learn more, visit www.mysite.com?variable=radio&variable2=conservativetalkshow etc. now!” in a 20-second radio spot.

Instead of the long URL, consider using a short redirected URL that’s unique to the campaign. Something like mysite.com/promo (only listed in TV ads), then redirect to your longer landing page URL to track visitors from the TV ad.
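The short-URL idea boils down to a lookup table from campaign path to tagged landing URL. The sketch below assumes a hypothetical /promo path used only in TV ads and a hypothetical /landing page; in practice the redirect would normally live in your web server or CMS configuration rather than in a script like this.

```python
from urllib.parse import urlencode

# Hypothetical mapping: short campaign path -> (landing page, tracking variables).
CAMPAIGN_REDIRECTS = {
    "/promo": (
        "/landing",
        {"utm_source": "tv", "utm_medium": "offline", "utm_campaign": "promo"},
    ),
}


def resolve_redirect(path):
    """Return the tagged landing URL for a campaign path, or None if unknown."""
    target = CAMPAIGN_REDIRECTS.get(path)
    if target is None:
        return None
    landing_page, tags = target
    return f"{landing_page}?{urlencode(tags)}"
```

Here resolve_redirect("/promo") yields the long tagged URL, so every visit that arrives through the short address is attributed to the TV ad without anyone having to type the tracking variables.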

Another neat upcoming trend: QR codes. Pay attention to them. Get a smartphone with a barcode reader, go up to a QR code to scan it, and access all different information from the code. You can also put URLs in there. This is very important because a QR code is a medium that can hold a gigantic URL, and you can generate and print the codes yourself. This only works for mobile users, but it’s still an additional source to track.

Setting & Tracking Goals
Before you can make sense of Analytics, you have to have a goal first. If you don’t already have one (or multiple), sit down with your marketing team and have a serious talk about it. Look at each of your audiences differently. For example, if you’re a college, some of your audience consists of current students. A lot of them will be checking out the office hours of their professors & finding what they need right away (if it’s there). In this case, a high bounce rate is good; they found exactly what they needed and then left. On the other hand, some of your audience will be prospective students: a high bounce rate here could be very bad.

Like any good marketing campaign, you should create a funnel where you have a broad base of people you target initially and continuously narrow them down until they convert.

On your site, you have certain pages a user must hit before they ultimately convert. Each of these pages/points is a small goal towards a final conversion goal. Here’s a simple example funnel:

Add to Cart – Checkout Page – Confirm Information – Complete Purchase

Where in this funnel people drop out is key to optimizing conversions on your site. It may be a display, connection, privacy, or trust issue. Look at that data with granularity in the funnel.
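To see where people drop out, you can compare the visitor count at each step of the funnel with the next one. This is a minimal Python sketch using the example funnel above; the visitor counts are made up for illustration and the function name is hypothetical.

```python
FUNNEL = ["Add to Cart", "Checkout Page", "Confirm Information", "Complete Purchase"]


def funnel_dropoff(step_counts):
    """Return (from_step, to_step, drop_off_rate) for each consecutive pair."""
    report = []
    for current_step, next_step in zip(FUNNEL, FUNNEL[1:]):
        entered = step_counts.get(current_step, 0)
        continued = step_counts.get(next_step, 0)
        # Avoid dividing by zero when nobody reached the current step.
        rate = 0.0 if entered == 0 else 1 - continued / entered
        report.append((current_step, next_step, rate))
    return report


# Made-up visitor counts per funnel step, for illustration only.
example_counts = {
    "Add to Cart": 1000,
    "Checkout Page": 400,
    "Confirm Information": 300,
    "Complete Purchase": 240,
}

for from_step, to_step, rate in funnel_dropoff(example_counts):
    print(f"{from_step} -> {to_step}: {rate:.0%} dropped out")
```

In this made-up data the biggest leak is between Add to Cart and Checkout Page, so that is the step to investigate first.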

Thus concluded Thom’s excellent Introduction to Analytics session; an informative examination of analytic basics with some interesting tidbits & takeaways for the masses. For additional post-conference coverage of SES New York 2011, stay tuned to the aimClear blog.

Creative Commons License photo credit: woodleywonderworks