Don’t get burned by heatmaps

Eye tracking companies, including GazeHawk, frequently use heatmaps to present the results of a study to customers. The reason for this is simple: heatmaps instantly communicate where study participants looked and for how long. They tell you the hot spots of the site and the places that might be getting overlooked.

That said, heatmaps have some major drawbacks. In particular, heatmaps usually:

  • Eliminate the element of time from eye tracking. Heatmaps do not communicate when a user looked at something, only that he or she did so at some point. Moreover, heatmaps invite the reader to forget that this is a problem, since there is often no indication that this axis is getting left out.
  • Do not distinguish between a single person looking at a spot for a long time and a group of people each looking at it for a moment. This ambiguity can cause all sorts of problems when interpreting a heatmap (see the sketch after this list). We find that unless warned about this problem, readers tend to conflate the results of individual participants with those of the group as a whole, and assume that most people’s individual heatmaps look pretty much like the aggregate heatmap. This is usually not true.
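
To make the aggregation ambiguity concrete, here is a minimal sketch of how an aggregate heatmap is typically accumulated (the data layout and names are hypothetical, not our production code). Note that both the participant’s identity and the timing get thrown away at the accumulation step:

// Hypothetical fixation log: participant, x, y, dwell time in ms.
$fixations = array(
    array('p1', 120, 340, 2000),  // one person staring for two seconds...
    array('p2', 500,  80,  100),  // ...versus many people glancing briefly
    array('p3', 500,  80,  100),
);

$heat = array();
foreach ($fixations as $f) {
    list($who, $x, $y, $ms) = $f;
    $key = $x . ',' . $y;
    // Only total dwell time survives: who looked, and when, is lost here.
    $heat[$key] = isset($heat[$key]) ? $heat[$key] + $ms : $ms;
}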

When you combine these two problems, you can get large differences between what a study’s “first-glance” aggregate heatmap suggests and what the data actually support. Here’s an example: a few months back, we conducted a study tracking the eye movement of people on NBC.com. The heatmap below shows the combined views of all participants in the study:

Suppose that we’re interested in making sure users understand the layout of the site and what content is available to them. At first glance, everything appears to be working as intended. The screen-spanning Miss USA images up top get a lot of attention, as does the “Spotlight on NBC” spot and top-right ad. Users followed the center column of large pictures down, occasionally flicking to the right or left columns as particular bits of content caught their eye.

Almost everything on the page was viewed at some point, and we can see a few clear but unsurprising trends in the distribution of views, such as a preference for large images and human faces. Mission accomplished?

Not at all. Check out these heatmaps from a few individual participants in the study:

While all participants looked down the green central column of the site, there doesn’t appear to be any pattern beyond that (this is true of all participants, not just these four). Some looked to the left, some looked to the right, and others hardly looked anywhere else at all – quite unlike what the combined heatmap suggests. If we only looked at the combined heatmap, we might think the site is fine as-is. After reviewing the individual heatmaps, however, we get the impression that the site might have a bit too much going on: beyond the middle column, people did not know where to look.

While we’re aware of their flaws, at GazeHawk we think that heatmaps are not beyond redemption. Keeping their weaknesses in mind when using them can prevent misinterpretation and ensure that your conclusions are meaningful. And, if all else fails, you can always review individual participant data and watch the eye tracking as a video.

Next week we’ll look at ways to cluster eye tracking data so that the heatmaps don’t have these problems. Until then, don’t get burned.


PHP “require” Performance

We recently went through a round of performance improvements for our website and got a significant performance boost from a relatively small change.

Previously, our code had a section like this at the start of every page:

require_once('include/Foo.php');
require_once('include/Bar.php');
. . .
require_once('include/Baz.php');

Each file contained a single class of the same name as the file. While we’re well aware of PHP’s ability to autoload class files, we chose to list each file out because of a talk by Rasmus Lerdorf on file loading performance, in which he mentioned that __autoload causes issues with opcode caching and will therefore cause a drop in performance.

Speaking of which: if you haven’t heard of opcode caching for PHP, stop now and go read up. A simple sudo apt-get install php-apc will give you an order-of-magnitude speedup on your website. There’s no reason for any production server not to be using it.

Anyway, this may have been true when we only had a few includes, but by now we were including 30 files on every page load! It was time for some performance testing.

It’s also a fairly well-known fact that require_once is much slower than require (I wasn’t thinking when I used require_once), so I tested the difference between those two calls as well.

I tested two pages. The first was our homepage, which requires only 3 of these 30 files. The second was an inner page that requires 5. Both were tested with ab, and the numbers listed are the mean response times under high concurrency. Lower is faster.
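
For what it’s worth, an ab invocation along these lines reproduces this kind of test (the request count and concurrency level here are illustrative, not the exact values we used):

ab -n 1000 -c 100 http://www.example.com/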

For reference, the autoload code used is:

function __autoload($name) {
    require('include/' . $name . '.php');
}
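
As an aside: on PHP 5.3+, the preferred way to register an autoloader is spl_autoload_register, which also allows multiple loaders to coexist. A minimal equivalent of the function above would look something like:

spl_autoload_register(function ($name) {
    // Same lookup convention as our __autoload above.
    require('include/' . $name . '.php');
});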

Results

Homepage (3 required files)

require_once: 1579
require:      1318
__autoload:    578

Inner page (5 required files)

require_once: 1689
require:      1382
__autoload:    658

Wow! Over a 2x speedup…that’s pretty nice. This led me to wonder: what’s the difference in time when we’re only loading the files we need:

only autoload:          618
5 requires + autoload:  530
only 5 requires:        532

Actually invoking __autoload adds significant overhead to the page, but, as you’d expect, merely having it registered without ever invoking it adds no overhead.

Conclusion

The main takeaway: if your primary concern is performance, any file included on all of your pages should be loaded with a plain require. Files needed on only some pages can be loaded through __autoload, especially if they’re needed on just a few. Also: always use APC, and never use require_once unless you absolutely have to.

Also, your situation may be different from the ones you see in performance tests, so always run your own metrics. ab is your friend.


Come See GazeHawk at SXSW!

SXSW Accelerator

If you’re going to be at SXSW this weekend, keep an eye out for GazeHawk. We’re thrilled to be a finalist in the SXSW Accelerator competition for Most Innovative Startup. We’ll be presenting to the judges on Monday, March 14 at around 3:30pm, and hopefully we’ll move on to present again on Tuesday. Be there and help cheer us on!

We’ll also have a table at the TechCocktail event on Sunday, where you can meet us and a ton of other great startups in a smaller setting (with drinks!). There are still a few tickets left, so be sure to grab them while you can.

We hope everyone has a fun time, whether you’re at SXSW or staying home avoiding the chaos.


Our New Year’s Resolution For Hiring

Well, another year has come and gone, and it’s time for some reflection. GazeHawk has crossed a number of milestones: we’ve hired our first new team member, moved into our first office, and launched our first redesign. Looking back on the first part of GazeHawk’s life, there are a few things I would change. Most notable, and worth writing about, is how we handle hiring.

When we first started hiring, Joe told me we should reply to every application we got, even if a look at the resume resulted in an instant “no”. I, in my usual procrastinating way, put these replies off for so long that sending them would itself have been rude (imagine getting a rejection email months after applying for a job, when you’ve all but forgotten about it). I figured a lot of these people were blasting their resumes out to a bunch of companies anyway, so why waste time crafting a personalized response that would barely be read?

Thinking about it more, and as we ramp up hiring, I realize how incredibly wrong I was. Finding the right people is our biggest challenge right now. We reply to almost every inbound partnership, support, and sales request we get; why should hiring be any different? If a reply makes one person like us a little more, maybe they’ll recommend us to a friend who’s a better fit, or mention us to a potential client. The odds on any one of those are pretty low, but small edges like that often make the difference between success and failure in a startup. Plus, I hate waiting for a reply from someone, especially when I’m not sure one is even coming. To expect responses to everything and then not provide them myself is hypocritical.

So here’s GazeHawk’s New Year’s resolution: we will reply to every job inquiry we get from now on. We’re not going to write you an epic poem criticizing your resume, but we will at least tell you where you stand in the process. If you blast us a form email that you send to 10 other companies, we will reply with a form email. If you obviously took the time to look through our website and craft us a personalized email, we will grant you the same courtesy. If anything notable stands out on your resume (good or bad), we may even give you some feedback.

So there you have it. If you’re an engineer who wants to work at an awesome startup on crazy web technologies (things that make you ask “can you really do that?”), please take a look. We promise we’ll get back to you. :)


Switching from SVN to Git: A Startup’s Perspective

While you see a lot of posts about large teams switching from SVN to Git, I haven’t found many discussing the transition for a small team. Since we’re a team of two who just made the leap, I figured I’d comment on our experience.

Why We Originally Chose SVN

One of the many pieces of startup advice echoing throughout the internet is “don’t optimize prematurely.” This applies to everything from how you write your code to picking out your desk to choosing version control.

When we started GazeHawk and I had to set up our working environment, the obvious choices for us were SVN and Trac. I had used SVN extensively at both Mozilla and TripAdvisor. I had also installed Trac before, used it at TripAdvisor, and really appreciate its simplicity and integration (nothing like checking in code with a “fixes #1234” on the end and not having to mark anything in your bug tracker). I got the setup running in half an hour and we were good to go.

Since then we’ve launched, released updates, and generally worked out the kinks in our revision control. Our needs are relatively minimal: we’re two developers working mostly on separate codebases. While it’s unusual for a startup to have multiple repositories, our code is very clearly segmented into at least two parts (website and video processing), so the split works for us.

Why Switch to Git?

Fast-forward five months: we’re still just two developers (soon to be three), so our needs remain fairly minimal. Still, this seemed like a good time to make the switch, for a few reasons:

  • Fear: We’re self-hosting SVN on AWS…on our webdev server. Having all of our revision history and development code on one machine, especially one as potentially ephemeral as an AWS instance, is terrifying. Whether we moved to GitHub or a hosted SVN option, it was definitely time for a change.
  • GitHub: GitHub has a *ton* of social proof behind it. People absolutely love that place. It handles keys very well, and it’s easy to use while still offering a ton of features. None of the hosted SVN options seem to have that many home-grown evangelists behind them.
  • Branching: I had just created my first branch in the GazeHawk SVN repo. I had horrible flashbacks of late nights trying to merge an SVN branch back into trunk, coming to in the fetal position on the floor in a cold sweat. In less dramatic terms: SVN merging sucks.
  • Obligatory “it’s the cool thing to do” comment: We don’t use Ruby on Rails, or Node.js, or any other trendy language/tool. We use what works best for our needs. I figured we may as well be one of the cool kids with something: version control seemed like a good choice.

The Transition

A weekend of tinkering and everything was moved over to GitHub. svn2git made the import easy (and left all my old tags in place). github-trac, while a little simplistic, at least lets me reference bugs in commit messages, so we can still integrate with Trac (the GitHub issue tracker is a little lightweight, even for us). GitHub’s docs are excellent and made the move from SVN really comfortable.

The big question is: with neither of us having had much exposure to Git (I had used it very superficially for a school project), how was the transition? At its simplest – ignoring branching and tagging, and using GitHub as a central server – Git is merely SVN with one additional level. In SVN you check code in and out of a central server, whereas in Git you check code in and out of a local repository unique to your working copy, and push to and pull from a central server at your leisure. Once we both understood this concept, and had a list of Git equivalents to SVN commands (see the cheat sheet below), we were 95% as effective with our current two-person workflow.
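
For reference, the mental cheat sheet we worked from looked roughly like this (illustrative, not exhaustive):

svn checkout <url>   ->  git clone <url>
svn update           ->  git pull
svn commit           ->  git commit -a, then git push
svn add <file>       ->  git add <file>
svn revert <file>    ->  git checkout -- <file>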

I’ve just gotten into branching, and it already seems easier to use (at the very least, there’s less typing involved in creating and switching between branches). Merging remains to be seen, but I imagine it’s at least as good as SVN’s, especially given all I’ve read about it. Deploying code is also about the same: a few changes to my bash script and all was well.

Conclusion

Even ignoring the time Git will (theoretically) save us in the long run, I’m confident the switch was a good move now. It didn’t cause much downtime for the two of us, and it’s a lot easier to switch while you’re small and don’t have weird version-control rot. I’d definitely recommend at least looking into Git (creating a test repo, poking around, etc.). Don’t bother optimizing your version control too soon, but if you have some sore points with SVN, or are expanding your team and don’t want SVN to become another piece of technical debt in the back of your mind, it may be worth switching.


Eye Tracking vs. Mouse Tracking

When explaining GazeHawk to people, we often hear comparisons to “mouse tracking” services that track where the mouse moves on a page. It’s a reasonable comparison: both technologies use heatmaps as a main visualization. However, just because they share a visualization doesn’t mean they’re interchangeable. Given the number of articles touting the similarities between mouse tracking and eye tracking, it’s worth publishing an analysis of the differences.

One such article, recently posted by ClickTale, claims that mouse movement is highly correlated with eye movement and can be used to draw similar conclusions. A few claims in that article just aren’t supported by the evidence, though, and a few more may no longer be true thanks to recent advances in eye-tracking technology.

Eye Tracking/Mouse Tracking Correlation

The post argues that there is an 84%-88% correlation between eye and mouse movement, citing a Carnegie Mellon paper from 2001. This is a very questionable interpretation of the paper’s claims:

First, the paper measures areas on the screen where both the eye looked and the mouse moved at some point, not necessarily at the same time. The exact quote:

Of the regions that a mouse cursor visited, 84% of them were also visited by an eye gaze. Furthermore, among the regions that the eye gaze didn’t visit, 88% of them were not visited by the mouse cursor, either.

This has no temporal quality to it. All it says is that 84% of the time, if a user’s mouse moved over a given region of the screen, the user’s eyes also crossed that region at some point (and the contrapositive holds 88% of the time). One of the questions eye tracking is really good at answering is “what’s the most eye-catching thing on my page?” Knowing where the user’s mouse went at some point doesn’t really answer that question.

Second, the article also cites two other papers doing similar research that report lower numbers (it neglects to mention those figures, using the papers to support other points instead).

Finally, ten seconds’ worth of Googling pulls up a more recent paper that paints a much grimmer (though anecdotal) picture. Even looking only at the Y axis (the direction in which the mouse more commonly follows the eye), a mere 56 of 175 Google result pages had mouse movement that mimicked eye movement. That’s only 32%.

Cons of Eye Tracking

The post also mentions three cons of eye tracking (cost, scope, intrusiveness). These are completely valid criticisms of traditional eye tracking hardware: it’s expensive and time-consuming. In fact, those three problems are the reasons we started GazeHawk in the first place. With GazeHawk’s technology, all three are significantly less relevant:

  • Cost – I think our pricing speaks for itself. We bring the cost of eye tracking down to $49/user.
  • Limited Scope – When using custom hardware to run an eye tracking study, you are forced into a particular monitor size as well as a smaller sample size. GazeHawk is limited by neither, since participants use their own computers and webcams and test themselves in a self-moderated study.
  • Strong Observer Effect – The Hawthorne effect is the problem of subjects changing their behavior because they’re aware they’re being watched. Bring a user into a lab and you run straight into it. We’ve heard explicitly from a number of testers that when trying out our software they completely forget they’re being tracked. The only differences between viewing a page normally and viewing it during a GazeHawk study are that you calibrate first (follow a dot around the screen with your eyes) and that there’s a little light on your webcam telling you it’s on. That’s not as passive as tracking a user without their knowledge, but it’s pretty close.

Conclusion

In the end the big question is “should I use eye tracking or mouse tracking?” Mouse tracking does have the advantage of being passive: you can silently track all of your users, while eye tracking requires running an opt-in study. While we’re obviously not objective, I would argue that eye tracking gives you information that cannot be inferred from mouse data alone. Of course, you have to factor in costs and budgets, but as these methods become cheaper and easier it becomes more viable to employ more than one of them.

Eye tracking is not the holy grail of usability studies, but it’s definitely a unique and valuable tool to have in your UX/usability/conversion rate optimization arsenal.

Thanks to my cofounder, Joe, for proofreading, and Max Hutchinson for helping with research.


GazeHawk Open House Sunday

GazeHawk will be participating in Y Combinator’s distributed open house tomorrow. Stop by the 500Startups incubator space at 444 Castro St in Mountain View between 11am and 4pm. There will be drinks, food, and ping pong, and Joe and I will be there to answer any questions you have about GazeHawk, Y Combinator, and startups in general.

If you’re looking for a job, definitely stop by – we’d love to talk to you! Everyone else is of course welcome too, especially if you’re up for a game of ping pong.

See you there!


Help Send GazeHawk to Speak at SXSW ’11

Needless to say, we’re very passionate about eye tracking technology: it’s an incredibly powerful tool that is just now becoming affordable to individuals and small/medium businesses.

In fact, we’re so excited that we offered to speak at SXSW Interactive.  Not specifically about GazeHawk, but about the past, present, and future of eye tracking technology.  We feel the technology has been underutilized, mostly due to its cost.  Now that the price of running these studies is being brought down, it’s time to remind the UX world how useful eye tracking can be.

We’d really appreciate if you could spare a few moments to vote for us!


On Accuracy

The past two days have been incredible, in no small part thanks to a terrific writeup by the folks at TechCrunch. It’s been great to put our product out in front of people and start getting feedback from the community. One theme that comes up a lot is the accuracy of our eye-tracking software, so I thought I’d take some time to address that here.

Where am I looking?

There are a number of ways of measuring accuracy in eye tracking, but ultimately they’re all concerned with the same metric: how far is the calculated focus of the user’s gaze from where the user was actually looking? You can delve even further by measuring the drift – the amount by which the accuracy degrades over time.

A good hardware eye tracker like the Tobii T60 reports around 0.5 degrees of accuracy and 0.3 degrees of drift. If you’re sitting 24″ from the screen, a little trigonometry (24″ × tan(0.8°) ≈ 0.34″, taking accuracy and drift together) shows that you’ll be off by about a third of an inch. The T60 has a pixel density of about 100 pixels per inch, so you end up with about 35 pixels of error.
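
If you want to plug in your own numbers, the conversion is simple trigonometry. Here’s a quick sketch (an illustration, not part of our tracking software):

// Convert an angular gaze error into on-screen pixels.
function gaze_error_pixels($error_degrees, $distance_inches, $ppi) {
    return tan(deg2rad($error_degrees)) * $distance_inches * $ppi;
}

echo gaze_error_pixels(0.5 + 0.3, 24, 100);  // ~33.5, i.e. the T60's ~35 pixels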

Testing on a MacBook Pro, we found that our software currently achieves an error of less than 70 pixels. The MacBook Pro has a higher pixel density than a standard display – 128 pixels per inch – so that error works out to a little more than half an inch, and on lower-density screens we can often do much better. As you might expect, accuracy improves under good lighting conditions and degrades in poor lighting or with excessive head movement (which we take some steps to mitigate, see below); these figures are for a typical study.

We have a couple of projects underway at the moment to bring this error down even further using a few machine learning techniques (look for more posts on this subject!). If you’re doing a web usability study, though, knowing where someone looked plus or minus 70 pixels is generally enough to tell which component they were inspecting. And when you aggregate data from a larger number of users, you get predictably better results…

Who can you track?

A dark secret of the eye-tracking industry is that not everyone tracks well. Even using custom hardware in a controlled environment, 10% or more of people just don’t track well. This number goes up for us, since our whole goal is to let people run eye tracking studies in their own homes with off-the-shelf hardware.

GazeHawk will never charge our customers for anyone who doesn’t track well. As a result, we often send out more study invitations to our participants than the customer actually purchased. We then review each result to make sure its accuracy is good enough to include in the customer’s report. We also pay a bonus to the participants whose data we use, as an incentive to improve lighting conditions and help us give our customers the best results possible.

This is really just an introduction – there’s a lot more to be said about accuracy in eye tracking, especially when it comes to getting useful results out of a study.

Coming soon:
I weigh in on the debate about how many users you should have in an eye-tracking study, and some of the difficulties we’ve faced in providing valuable feedback to our customers.


Introducing GazeHawk: Eye Tracking for Everyone

Hello, world! After months of development, we’re proud to show you GazeHawk: a new technology startup aimed at making eye tracking affordable for everyone.

We’ve always thought that eye tracking was an incredible technology – who wouldn’t want to see exactly where people look on their website? – but it always cost too much to use. At GazeHawk, we’ve developed an innovative way of solving this problem.

GazeHawk provides eye tracking services using ordinary webcams, so there’s no need to purchase expensive hardware. Our valued community of testers helps us run eye-tracking studies from the comfort of their own homes (and if you’re interested in being a paid tester, here’s a good place to get started). These steps allow us to offer professional eye tracking studies for a fraction of the price, and we pass that advantage right on to our customers.

Our goal is to find out exactly how eye tracking can help you and your website or business. So please, take a look around and let us know what you think! We’re excited to hear from you.
