Is HCI a Cargo Cult Science?

I'm sad, because I have to skip the CHI conference this year. But I wanted to contribute remotely, hoping that this piece may trigger conversations about the kind of papers we should see more of at CHI. And I hope it helps CHI-goers choose among the multiple tracks and gazillion presentations.

Disclaimer: I deliberately use the "loaded" term cargo cult - not in order to make a derogatory statement about the indigenous people or HCI, but to reflect upon areas where HCI must aim higher.


Cargo Cult Runway by ~Tomoran on deviantART

Richard Feynman, in his Caltech commencement address given in 1974, describes "cargo cult science", referring to psychology and educational science. The metaphor is now famous, but let me quote the original words (read the full speech here): 

"I think the educational and psychological studies I mentioned are examples of what I would like to call cargo cult science. In the South Seas there is a cargo cult of people. During the war they saw airplanes land with lots of good materials, and they want the same thing to happen now. So they've arranged to imitate things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas--he's the controller--and they wait for the airplanes to land. They're doing everything right. The form is perfect. It looks exactly the  way it looked before. But it doesn't work. No airplanes land. So I call these things cargo cult science, because they follow all the  apparent precepts and forms of scientific investigation, but they're missing something essential, because the planes don't land."

Are we delivering - are our planes landing? Well, eh, at least we have room to do better. 

To me, the most glaring example of where planes do not land is this: the most commonly used user interfaces were invented decades ago - and not within the field and community we recognize as human-computer interaction. The Qwerty layout was invented in the 1870s, the mouse and the touchscreen in the 1960s. The era of HCI we see now has not been able to replace or fundamentally improve these interfaces, despite thousands (!) of academic papers. Beyond the hype we have created, can HCI look in the mirror and be proud of its achievements as an academic discipline? If the plane had landed, HCI would have had a major effect on the everyday use of computers. Compare HCI to computer graphics, for instance. Advances in that field are seen in games, movies, and photo editing software.

Feynman goes on and exposes the main reason that makes a field pseudoscientific. There are two minor points and a major one. The two minor points deal with issues we already know about: reporting of details in studies and replication:

"For example, if you're doing an experiment, you should report everything that you think might make it invalid--not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked--to make sure the other fellow can tell they have been eliminated."

These are important but not trivial issues. One cannot report every detail of an experiment - and, I must add, of a system or technique. No "replication" is ever a perfect reinstantiation of the original one. A recent debacle in psychology is a good reminder that the two points are still valid. The debate started from a recent failure to replicate a famous study by John Bargh, which was then interpreted by a science journalist as invalidating the original results. But I don't have much to add to this: Everybody knows this is important, and we ought to improve.

The more important point of Feynman concerns scientific integrity:

"I'm talking about a specific, extra type of integrity that is not lying, but bending over backwards to show how you are maybe wrong, that you ought to have when acting as a scientist. And this is our responsibility as scientists, certainly to other scientists, and I think to laymen." Scientific integrity means that we do our best also to invalidate theories.

Following this line of thinking, we see three kinds of papers in HCI:

1. The pre cargo cultists. I call these papers pre cargo cultists, because they report a study, design, or a novel piece of interactive technology without a grounding in any scientific debate/theory/method whatsoever. "Look, here is a technique/study we did. It works. End of story." There is not even an attempt to stage the runway for the planes. This is a frustrating type of "contribution" to HCI, because there is hardly any value beyond the presented instance itself.

I have to admit that some of these "show and tell" type of papers can be useful and have merit, especially when they exhibit a high level of competence in solving a practical problem of importance. For example, there's hardly anything scientific in a solution on how to infer the touch area on a touchscreen. It's a trick that requires some competence. Simply showing how it can be done is useful for others. But do these papers belong in the archival body of HCI? They do not advance our understanding of human-computer interaction. If they have practical value, wouldn't a better place to publish them be on the web so that anybody could utilize and build on them?

2. The cargo cultists. These papers look scientific: they refer to relevant work in the mother ("real") sciences (or in HCI) and replicate the superficial exterior of the original work. However, they do so blindly, without critically examining its suitability to the specific context of HCI. The theory, method, or technique borrowed from another field is simply a convenient wrapping that makes the paper sufficiently sciency that the ACs do not need to be embarrassed. When we accept such papers, we actually harm the field a lot, because we saturate it with pseudoscientific papers that new papers must refer to and address, even if the original work was done haphazardly. Suddenly there's a "technique" or "finding" that "already did it".

In the worst case, the cargo cultists try to build new runways. I've dubbed this phenomenon hyperdisciplinarity, as opposed to (healthy) multidisciplinarity. I'm afraid that we may have introduced too many relevant-looking theories from mother sciences without doing the foundational work properly. Not all of them can possibly be pertinent to a given problem. And some of them are even contradictory. Well, in fairness, I must add that sometimes these "new runway papers" can also advance the field, because they open our eyes to a phenomenon that we might not have noticed otherwise. But who is going to close the chapter and do the critical study? And do our publication forums encourage such work over "novelty"?

3. The post cargo cultists. These are the Feynmanian papers that not only produce evidence for a finding/technique but methodically collect counter-evidence to develop it further. Unfortunately, we rarely see more than one variation of a study/method/technique reported in a paper. And the short paper format we have in conferences discourages this type of contribution.

I think we have many promising examples of work where we try to reach the third stage. Raising just one example does not do justice to the increasing amount of good work, but let me use work on the Fisheye lens as an example. For years, the fisheye was the prime example of information visualization taught in textbooks and courses, but critical empirical study of the technique was lagging behind. If this technique is so excellent, why don't we see normal users using it? Why didn't the planes land? I'm not actively following Fisheye research, but I was positively surprised to see a review paper that summarized what is known and outlined what remains to be studied.

We ought to encourage more of these kinds of papers. Instead of pooling more and more papers on a topic, we need to encourage critical, synthesizing reviews as key contributions to HCI.

I hope you have a nice CHI experience! Email me when you witness a post cargo cultist on stage.

The cargo cult phone. A picture by Eric Wilde from SF MOMA. Original at http://www.flickr.com/photos/dret/5197053749/

Why tablets will not replace PCs

Tablet computers have assumed a whopping share of the consumer market and already overshadow PCs in sales. Some have taken this trend to imply that tablets will replace the PC in the near future. Will the PC die? I think not.

Tablet-enthusiasts are guilty of two fallacies. First, they forget what many use their computers for: documents, multimedia, games, and … multitasking. Second, they ignore the fundamental limitations of the touchscreen as an input device. Usability is worse by an order of magnitude in many of the tasks we hold dear.

Tablet-advocates are not stupid. Most realize that slow text entry is a key issue, but they overlook just how appallingly bad the ergonomics are. I estimate that typing speed is in the range of 10 to 30 words per minute (WPM) for regular touchscreen users. (If you read this blog, you are not a regular user.) Compare this to the ca. 40 to 70 WPM for a physical Qwerty keyboard.

There are many natural reasons for the bad performance that tablet-fans should know. I believe these issues cannot be fixed within the present-day paradigm of tablets.

Regular users use only the index finger or two thumbs for text entry, which offer much lower information capacity to begin with than the 5+ fingers that even the inexperienced PC user is able to mobilize on a physical keyboard. Preparatory movements in typing are hampered by the lack of static and dynamic feedback provided by physical buttons. You cannot rest your finger on top of the next key, waiting for its turn to come. Button edges cannot be felt. There are no button release forces.

The critical implication is that text entry must be guided by visual attention, which makes any complex writing/editing task a chore. Bret Victor, in his rant against the touchscreen, notes that users essentially poke a glassy surface with fingertips. Not only that, they have to watch themselves do it all the time. Now, watching your fingers move on top of buttons is not something people look forward to.

But users are clever, too. Two tactics are a common sight. By laying the device on a surface like a table they enable the use of 4-8 fingers of the two hands. Does this solve the problem? No. It demands an unnatural posture where the wrists are brought laterally close to each other and flexed in the ulnar direction. The 'X' posture. Try it out, it's far from comfortable. Of the people who I've observed using this, only one reached a level of performance comparable to Qwerty. Furthermore, because the keyboard and the display are on the same surface, either the angle of the keyboard or the viewing angle is not optimal. The slope of the keyboard is important for typing. Finally, because the tablet is on a level surface, and the direction of the force from taps is not toward the ground, the device often lacks stability and the user must control the force carefully so as not to push it back on the surface. The designers of iPad 2 were ingenious when they introduced the flexible cover, but because of these limitations, I really don't foresee this tactic taking off en masse.

The other tactic is to hold the device with two hands from the sides and use the thumbs. This, however, is not as effective as the advocates fantasize. Two thumbs can reach perhaps 20-40 WPM on a tablet device with a split keyboard, so some gradual improvement is possible. However, the index finger, which you usually use for pointing, selecting, panning, zooming etc., is now on the back side of the device. The cost of switching it is considerable, because both hands may need to change their position: The pointing hand needs to move on top of the display and the support hand needs to change posture to provide proper angle and support for pointing. Alas, performance is still bad despite sacrificing an important function of the hand.

In principle, physical keyboards with Bluetooth connection or docks could change this, but I haven't seen many people actually start using these. Besides, carrying around a keyboard is the antithesis of the supposed mobility of the tablet.

So, complex editing and multimedia tasks are next to hopeless. What about games? The most popular genre of games at the moment must be the first-person shooter. Even if we got the graphics right, there's a simple reason why FPS gamers will not switch to tablets: touchscreens are still inadequate for this task. For FPS games, you need to support at least aiming (continuous control in two dimensions), movement (discrete control in two dimensions), and shooting. Even though reasonable techniques exist that map these to the two thumbs, the sensitivity of the display and the lack of haptic feedback limit the level of precision. To FPS-gamers even the smallest decreases in performance matter.

The small size of the display hampers multitasking. We know from studies in human-computer interaction that simply having a physically larger monitor is beneficial for multitasking. With the tablets of today you show only one application foregrounded at a time. We lose all the benefits of a windowing system.

In sum, the present-day tablets offer nowhere near sufficient usability for many complex everyday tasks. Statistics of time use show that these are frequent and I'm reluctant to believe that they will go away. I don't deny that iPads have brought about new uses, but are we witnessing the birth of a new niche rather than the elimination of an existing one?

I believe that the issues that I mentioned can and will be overcome. There is promising work in the area of mobile and ubiquitous human-computer interaction addressing almost every single one of them. But the outcome of those developments will create something that we will not recognize as the tablet we know today... 

Rage Comics goes HCI

With special focus on nose-control!


BTW: Nose control is no joke. Check out this link forwarded to me by Johannes Schöning.

+1

The most widely used version of Fitts' law in HCI is that of Mackenzie. We find it in most HCI textbooks and in Wikipedia:

MT = a + b log2(A/W+1)

But why +1? Although it appears innocent, it is in fact decisive. The alternatives either do not have that or they include another constant.
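
To see what the +1 changes in practice, here is a minimal Python sketch that computes the index of difficulty (ID) under the version above and under Fitts' original form, log2(2A/W). The function names and the amplitude/width pairs are made up for illustration; they do not come from any of the cited papers.

```python
import math

def id_mackenzie(amplitude: float, width: float) -> float:
    """ID as in the formula above: log2(A/W + 1)."""
    return math.log2(amplitude / width + 1)

def id_fitts_original(amplitude: float, width: float) -> float:
    """ID in Fitts' original form: log2(2A/W)."""
    return math.log2(2 * amplitude / width)

# Hypothetical (A, W) pairs, both in the same arbitrary unit.
for a, w in [(256, 16), (64, 16), (16, 16), (4, 16)]:
    print(f"A={a:>3} W={w}  +1 form: {id_mackenzie(a, w):5.2f}  "
          f"original: {id_fitts_original(a, w):5.2f}")
```

The two forms agree at A = W and drift apart elsewhere. The +1 form stays positive for every positive A and W, whereas the original expression drops below zero once A < W/2, that is, when the starting point already lies inside the target.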

Indoctrinated as I was, I never challenged the +1 after I had read Mackenzie's 1989 paper where he draws a convincing parallel. Before exposing where the problem is, let me reiterate the three main rationales why Mackenzie's version was commonly accepted:

  1. ID values are neither zero nor negative.
  2. R-squared improves for Fitts' own data and some other datasets.
  3. It has a direct interpretation within Shannon's information theory, while Fitts' original didn't. 

Point 1 holds trivially. Point 2 is backed up by some evidence, although it is often possible to find a better fitting model for an individual case. Point number 3 is the key. Let me quote the original paragraph justifying the claim:

“Shannon's Theorem 17 expresses the effective information capacity C (in bits × 1/s) of a communications channel of band B (in 1/s) as C = B log2( (P + N ) / N ) where P is the signal power and N is the noise power (Shannon & Weaver, 1949, pp. 100-103). It is the purpose of this note to suggest that Fitts' model contains an unnecessary deviation from Shannon's Theorem 17 and that a model based on an exact adaptation provides a better fit with empirical data. The variation of Fitts' law suggested by direct analogy with Shannon's Theorem 17 is MT = a + b log2( (A + W ) / W )." 

Why should we care whether this analogy is correct or not? The link to information theory is critical when one needs a meaningful interpretation of ID and IP/TP. And this you need if you want to compare user performance across conditions. Zhai (2004) dubbed it a "tempting but naive approach" to compare performance measurements directly (which is what we nevertheless see in most HCI papers!). If one condition (say, joystick) had distant small targets and the other condition (say, mouse) large close-by targets, errors and movement times are incommensurable. The same goes for comparisons across studies with different devices and acquisition tasks. The power of Fitts' law is that this issue of incomparable data can be addressed with ID. Using ID we can obtain a unified metric - index of performance IP, sometimes called throughput TP - for speed and accuracy in varying target conditions. (The idea, by the way, is similar to d' used in signal detection theory, but I wouldn't dare to say it's analogous!) If you want to use IP or TP, the +1 is critical, because it is part of the ID you calculate. My personal stake is that I've published a couple of papers on application of the Mackenzie formulation (e.g., to augmented reality pointing).
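
A small worked example may help here. The sketch below assumes one common definition of throughput, TP = ID / MT, with ID computed using the +1 form. The two conditions and all of their numbers are invented purely to mirror the joystick/mouse scenario above; they are not measurements.

```python
import math

def index_of_difficulty(amplitude, width):
    """ID in bits, using the +1 form discussed in this post."""
    return math.log2(amplitude / width + 1)

def throughput(amplitude, width, movement_time_s):
    """One common definition of throughput: ID / MT, in bits per second."""
    return index_of_difficulty(amplitude, width) / movement_time_s

# Hypothetical conditions, invented for illustration only.
conditions = {
    "joystick (distant, small targets)": dict(amplitude=400, width=20, movement_time_s=1.5),
    "mouse (close, large targets)": dict(amplitude=100, width=50, movement_time_s=0.6),
}

for name, c in conditions.items():
    print(f"{name}: MT = {c['movement_time_s']:.2f} s, "
          f"TP = {throughput(**c):.2f} bits/s")
```

With these invented numbers the mouse looks much faster on raw movement time, yet the joystick comes out slightly ahead on throughput. The ranking depends entirely on how ID is computed, which is why the +1 is not a cosmetic detail.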

And now to the problem. I had the pleasure to meet Heiko Drewes in Munich last week and he educated me on this issue after I used the Mackenzie version in my talk. Heiko is a physicist who did his PhD in HCI and published the paper "Only One Fitts' Law Formula - please!" at CHI 2010.

The paper deals with issues in the application and interpretation of Fitts' law in HCI. Despite its merit, the paper has remained in the margin. You can read the paper yourself, but in a nutshell the problem is summarized in this paragraph, which refers to the quote from Mackenzie 1989 that I pasted above:

"The simple questions regarding the direct analogy are: Why does the power of the noise map to the target width, which means the diameter, and not to the radius? Amplitudes should map to radius or half of the target width respectively. The power of the noise is proportional to the square of the noise amplitude (or variance of the noise, see also the footnote in Fitts’ publication [2]). Therefore, a further question is what happened to the square? In the case of Goldman’s Equation the square can be drawn out of the logarithm and doubles B. Is it possible that the direct analogy – take the distance as the signal power and the diameter as the power of the noise – is a little bit too direct or in other words naïve?"

Even the units are different: Channel capacity is expressed in bits per second, MT in seconds. Fitts' formulation does not lead to negative IDs. The paper goes on and on with these issues...

There are two logical possibilities:

  1. Mackenzie's analogy is correct. He simply skipped some steps in deriving it due to short space. If so, Mackenzie or somebody should show what those intermediate steps are.
  2. Mackenzie's analogy is incorrect, as Heiko claims. 

In both scenarios we're screwed! In the former, we, as a community, don't understand the theoretical basis of the most commonly used variant of our flagship (and only?) law. Some scholars may get it, but this wisdom has not trickled down to our textbooks. In the latter scenario, we're in an anomalous state because we don't have a commonly agreed-upon alternative. People just use whatever version they personally find best - which is typically the Mackenzie one as in my case - and get their papers published depending on whether the reviewers happen to be fans of it too.

A cynical person would say that the link to information theory has been essential for our egos because it associates our flagship theory with something others call Science. Frankly, linear regression on heavily categorized data is lame, but if it has an interpretation in the prestigious information theory... Alas, the real information theorists don't seem to pay attention to applications in HCI (although they do look at cybernetics and robotics). So here we are, stuck alone with what-may-turn-out-to-be pseudo-science...

What should we do? Fitts' law has had merit both as a predictive tool and as an information theoretic interpretation of user performance. On the one hand, if one only cares about prediction, then why use Fitts' law at all? Everybody who has done this knows that achieving good fit requires aggressive categorization, which loses a lot of information and makes prediction child's play (see Heiko's paper for a good example). Fitts' law is a simple regression model, and we nowadays have more sophisticated tools for prediction. Why not use them instead? On the other hand, if the information theoretic measurement of user performance is the goal, Mackenzie's variant is not safe ground until this issue is solved. For the information theoretic interpretations, we could still use, for instance, Fitts' original equation, but with the risk of compromising predictive fit. No free lunch. However, even if we do so, we should avoid overly interpreting throughputs. Believing that constant throughputs can be obtained for, say, an input device, is fallacious. Psychologists stopped claiming many decades ago that human abilities are limited by fixed capacity information channels. If that is the case then the bits/s we measure in an interactive task are malleable as well (and they are!). My view is that problems arise when the two goals - prediction and theory - are pursued within the same study, because R-squared and information theoretic interpretation are so closely tied together in ID.
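
The remark about aggressive categorization can be illustrated with a toy calculation. This is only a sketch on made-up numbers, not a reanalysis of anyone's data: it fits the same straight line once to individual trials and once to per-ID means, and compares the R-squared values. The helper `fit_line` and all data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

# Hypothetical trials: five ID levels, five trials each, with a fixed
# spread of trial-to-trial deviations (seconds) around a linear trend.
ids = [1, 2, 3, 4, 5]
deviations = [-0.30, -0.10, 0.00, 0.15, 0.25]
level_shift = {1: 0.02, 2: -0.03, 3: 0.01, 4: 0.03, 5: -0.02}

raw_x, raw_y = [], []
for i in ids:
    for d in deviations:
        raw_x.append(i)
        raw_y.append(0.2 + 0.15 * i + level_shift[i] + d)

# "Categorized" data: one mean movement time per ID level.
mean_y = [sum(y for x, y in zip(raw_x, raw_y) if x == i) / len(deviations)
          for i in ids]

_, _, r2_raw = fit_line(raw_x, raw_y)
_, _, r2_means = fit_line(ids, mean_y)
print(f"R^2 on raw trials:   {r2_raw:.3f}")
print(f"R^2 on per-ID means: {r2_means:.3f}")
```

Averaging within each ID level removes the trial-to-trial variance from the denominator, so in a balanced design like this one the fit on means can never look worse than the fit on raw trials. The impressive R-squared says little about how well single movements are predicted.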

It's not on my personal agenda to solve this issue, but it's clear that as a community we should stop kicking the can down the road. If you know a good paper that solves this issue, please email me.

Wunderkind's long lost cousin

I've got considerable feedback from German HCI researchers for my earlier post on the Wunderkind of HCI. Some comments were confirmatory: German HCI is indeed strongly driven by engineering and especially CS. "Real" computer scientists, I heard, do not consider HCI as computer science or a scientific discipline at all (déjà vu, anyone?). I was also corrected re my claim that the scene here is competitive, learning that only Aachen is passionate about rankings, and might not even use commonly agreed criteria in doing them...
Yesterday, however, the plot thickened. I got an eye-opening email from Prof. Matthias Rauterberg (quoted with his kind permission): 
"To  a certain degree I can agree, because I’m one who left Germany long time ago. But to understand the history and impact of HCI in Germany and from there worldwide, you really have to learn the German language, and read all the past achievements starting in 1980 under the term ‘Mensch Maschine Kommunikation’ and ‘Software Ergonomie’. The German HCI research arena is one of the oldest worldwide. All the work done by German researchers in the past had internationally a low visibility due to the language barrier, but that doesn’t mean it wasn’t there, and quite a lot of the most influential concepts made it into the world."
There are two German HCIs? I was intrigued. One is a young Padawan, while the other is in its thirties.

Although I had in fact been aware of the existence of Mensch-Maschine-Interaktion research from my studies in cognitive ergonomics, I was left wondering if there is something more to MMK/MMI than, well, copycatting of the Anglo-Americans. True, the Software-Ergonomie conference started in 1983, only 2 years later than CHI. It was later renamed the Mensch Computer conference, and when I asked around about this conference, mostly among the new generation of German HCI researchers, their answers left me with the impression that the conference is geared more toward networking than toward being a serious outlet for original research. For this reason I had dismissed the conference. Moreover, coming from a country of 5 million people, publishing work in anything other than English is by default considered academic suicide. However, there are circa 100 million German-speaking people in Europe, so the rewards are larger by an order of magnitude...
Hence, in order to give the "long lost cousin" the benefit of the doubt, I interviewed Matthias more and I did some googling. Unfortunately, my high school German is insufficient for thorough appreciation of the original papers, and I must leave that exercise for a later time. But here's what I've learned thus far.
The Son of Psyche
The clearest denominator of this line of work is that it is intellectually indebted to psychology rather than computer science. International psychology has strong historical roots in Germany. I can't tell what the exact disciplinary influences from psychology have been, but in the present-day MMK/MMI I see influences to/by:
  1. Human factors and ergonomics, as the term Software Ergonomie attests, and especially its information processing wing
  2. Organizational and occupational psychology (Arbeitspsychologie in German)
  3. Activity theory (see Greif, 1991)
  4. Usability engineering (the wikipage of Software Ergonomie is direct about this) that has some influence from cognitive and experimental psychology.
Moreover, a major part of this work has been done in collaboration with the industry. Matthias wrote:
"The situation in Germany, in particular in industrial practice, is strongly influenced by different unions; therefore the German HCI scene has developed a strong track record in this framing, reflected in the name of the conference series ‘Software Ergonomie’. The UPA chapter in Germany is part of this development."
The different disciplinary roots might explain why I overlooked MMI/MMK in the first place when writing about "the Wunderkind."
I still wonder how much real influence the two schools have on each other. One email that I got earlier suggested that, at least in the eyes of German CS department heads, HCI is considered too psychological and alien. I would say that it is not the first time others are envious of the beauty of Psyche :)

Impact on "the HCI proper"
By far the most interesting question, though, is this: what is there to learn from the German MMK/MMI research? Before WW2, there were areas in science where the most important works were published solely in German, and there still are areas like mechanical engineering where German language is important. And German psychology was and is still very influential (Alexander De Luca reminded me of the Gestalt laws that are taught in every 101 textbook on HCI.)

But what about MMK/MMI? 

I've been mainly reading Anglo-American texts on HCI, and couldn't recall any key concepts being of German origin, which could of course be due to my ignorance or poor memory. So I asked Matthias and he named two highlights:
  1. "One of my highlights is ISO 9241 based on DIN 66234 (reflected in European Norm 90/270/EWG), the ground breaking work of Cakir and Dzida"
  2. "Another concept/theory is ‘Activity Theory’ or ‘Handlungsregulationstheorie’ much more elaborated than all the work of Don Norman and other Anglo-Americans in this respect (it was not Bonnie Nardi or Victor Kaptelin)." 
These are important achievements, no doubt. I presently don't have time to confirm these claims by reading the original sources, but I have no reason to doubt them.

The present status
Here's what Matthias writes about the present status:
"You should attend the next German HCI conference Mensch & Computer and you will meet all of them. And there is more; I’m still deeply impressed by the innovative power of the MMK workshop." 
When my German skills improve, I will participate. Alas, presently, I struggle to even order Döner Kebab from Ali Baba downstairs...

My sentiment
While research in MMK/MMI has certainly been active, and even produced some important results during its three decades, I think it is a pity that they have not decided to publish in English. Balkanization harms both communities. 

Some starting points to the MMK/MMI literature
Please let me know if there is a good overview written in English!
Lists of books and proceedings of the conference:
A list of introductions Matthias forwarded from Horst Oberquelle:
  • Greif, S. (1991). The Role of German Work Psychology in the Design of Artefacts. In J. M. Carroll (Ed.), Designing interaction. Psychology at the human-computer interface (pp. 203-226). New York: Cambridge University Press.
  • Greif, S. & Keller, H. (1990). Innovation and the design of work and learning environments: the concept of exploration in human-computer interaction. In M. West & J. Farr (Eds.), Innovation and Creativity at Work: Psychological Approaches (pp. 231-249). N.Y.: Wiley.
  • Greif, S. (1989). Psychological Approaches of Software-Design and Computer-Training in West Germany. The Industrial-Organizational Psychologist, Newsletter of the Society for Industrial & Organizational Psychology (Inc. Div. 14 of the APA), Vol. 26 (No. 4), 45-48.

Why your paper was rejected

1,214 CHI submissions were rejected this year. Why? Surely the CHI community must be a modern rendering of the Freemasons, or a kleptocracy scheming how to torpedo non-members' splendid papers? Or, is the reason rather that a sizable proportion of submissions was infested with trivial errors? Read on...


Some background first. Yesterday, I left Boston where I had had the honor to serve on the Usability, Accessibility and User Experience Subcommittee for CHI 2012. The 48 hours we spent preparing, making, and finalizing decisions with David Gilmore and 19 ACs were intense. Looking back, I have mixed feelings. On the one hand, I'm proud of the review process that the CHI community has honed over the years, as well as of the ACs and reviewers who dedicated countless hours to executing that process. In our subcommittee alone, circa 900 reviews were entered and the subcommittee discussed 80+ papers. The conference program that Captain Höök and others put together on Saturday is just top-notch. On the other hand, I'm downhearted because of the countless forlorn papers that got rejected. Most meta-reviews state up front that the topic of a submission is timely and/or important--so what's the matter?


To answer this question, and to help the authors to get feedback on what kinds of flaws CHI reviewers pay attention to, I spent my Saturday reading the meta-reviews of the 100 lowest-scoring submissions to our subcommittee and listed the rationale for rejecting papers. The scores range between 1.0 and 2.5 and typically the same issues were raised by more than one reviewer. And, today, I clustered the issues into a few super-categories. To scope this effort, I here concentrate on commonly observed flaws in empirical work, leaving aside the issues addressed in the Guide for Successful Submissions (originality, significance, benefit, ...).  


1. RESEARCH STRATEGY

Research strategy refers to the selection of the best possible method to address a given problem (e.g., see the research strategy circumplex of McGrath, as taught in social sciences). Here are the issues I listed under this topic:

  • Not explaining motivation for choice of research strategy. For example, why a laboratory experiment and not a field study?
  • No/poor justification for methodological choices. For example, why were users given a particular kind of feedback after each task?
  • No justification for a new method. No analysis of its strengths and weaknesses.
  • Wrong/suboptimal method. Sometimes reviewers think that the chosen method is not suitable at all or is inadequate in the light of previous work. For example, a survey may have limited use for studying real-world practices of users.


2. STATISTICAL CONCLUSION VALIDITY

Statistical conclusion validity refers to the reliability with which we can infer a relationship between two or more variables. The following validity-related categories (2-5) follow the taxonomy of Cook & Campbell (1979). The issues in this category:

  • No statistical testing but claiming quantitative differences among conditions/groups. Sometimes authors simply report descriptive statistics per user/group/condition, and draw conclusions based on means only.
  • Wrong statistical test. This is a common reason for rejection and can refer to a mismatch of the test with the experimental design (see Kirk, 1995), a mismatch of the test with the levels of measurement (nominal, ordinal, interval, ratio), or a violation of the test's assumptions.
  • Using an unconventional statistical test without explaining the basis of selecting it. The reader must then either become familiar with the test or take a leap of faith; both result in frustration.
  • Claiming significant results although statistical test shows otherwise. For many reviewers this is a show-stopper.
  • Low statistical power. Issue: Claiming no effect although sample size is small. Reviewers know that absence of evidence is not evidence of absence. Following APA's recommendation of reporting statistical power would protect authors from this criticism.
  • No post hoc testing. Omnibus testing is not sufficient when one wants to pinpoint effects for a variable(s) with multiple levels.
  • Fishing. Statistical testing for multiple variables (or their levels). Should utilize the correct post hoc test that accounts for inflated probability of Type I error.
  • Cherry picking. Sometimes reviewers find it irritating that authors report a whole bunch of significant effects, but concentrate on only those that are relevant to their conclusions.
  • Collapsing data from multiple conditions/groups into one, in order to get a statistically significant effect. Don't do this.
  • Using measurements that are noisy. A recurring issue is that authors following "grounded theory" code highly subjective categories but ignore inter-coder reliability assessment. Another example is use of noisy logging data.
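
To make the "fishing" item concrete: one generic way to keep the overall Type I error in check across several tests is the Holm-Bonferroni step-down correction. This is my own illustration rather than a method the reviewers named, and the p-values are made up. A minimal sketch in Python:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = 0, 1, ...) is compared to alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail as well
    return reject


# Hypothetical p-values from four pairwise comparisons.
example = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
```

Here the comparisons with p = 0.03 and p = 0.04 would pass an uncorrected 0.05 threshold but do not survive the correction.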

3. INTERNAL VALIDITY

Internal validity refers to the plausibility of a causal relationship between two variables. The list:

  • Breaking the "ceteris paribus logic". Experimentation often relies on the logic of "all other things being equal." Conditions/groups in a poorly designed experiment differ in more than one dimension. For example, maybe participants in two interface groups also completed different tasks.
  • Manipulation check missing. For example, you claim to induce an emotional state by showing pictures before a usability test, but fail to check that the manipulation actually has the desired effect. This particular rationale for rejection was rare, though.
  • Confounding/nuisance variables. The "classic" nuisance variables in HCI are order effects. Failing to randomize or counter-balance the order of experimental conditions/groups is often a show-stopper. Reviewers point out more sophisticated nuisance variables as well, but since these are very study-specific, I do not list them here.
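
Counter-balancing the order of conditions is easy to get right with a little code. The sketch below builds a balanced Latin square (a Williams design), assuming an even number of conditions labeled 0..n-1; the function name is my own. Every condition appears once in every position, and every condition directly follows every other condition exactly once.

```python
def balanced_latin_square(n):
    """Return n presentation orders for an even number n of conditions (0..n-1)."""
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch only handles an even number of conditions")
    # First row zigzags between the low and high ends: 0, 1, n-1, 2, n-2, ...
    first, lo, hi = [0], 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    # Every further row shifts all conditions of the first row by one.
    return [[(c + shift) % n for c in first] for shift in range(n)]


# Four conditions: participant k gets the order in row k % 4.
orders = balanced_latin_square(4)
```

With an odd number of conditions a single square cannot balance first-order carry-over effects; a common workaround is to add the mirror image of each row, which this sketch does not do.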

I'd like to add a fallacy that is very common but not often pointed out by reviewers: selection bias. In other words, users are not assigned to experimental conditions/groups randomly. Why this is not brought up as an issue I don't understand.
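
Random assignment takes only a few lines. A minimal sketch, assuming a between-subjects design; the participant IDs, condition names, and the function itself are hypothetical:

```python
import random


def assign_randomly(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        # Round-robin dealing keeps group sizes within one of each other.
        groups[conditions[i % len(conditions)]].append(person)
    return groups


participants = [f"P{i}" for i in range(1, 11)]
groups = assign_randomly(participants, ["old UI", "new UI"], seed=42)
```

Because the shuffle decides who lands in which group, neither the experimenter nor the order of sign-up can bias the groups.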

4. CONSTRUCT VALIDITY

Construct validity refers to the cause and effect construct that explains the causal relationship. The list:

  • Choosing wrong/old model for data. Papers that use a model to explain obtained data may choose one that is wrong in the eyes of the reviewer. Papers aspiring to Fitts' law modeling sometimes face this critique.
  • Unconvincing/insufficient explanations for obtained effects. Reports of quantitative relationships between IVs and DVs are unconvincing unless accompanied by explanations based on qualitative sources, such as interviews. Arguments like this point toward favoring mixed methods research in HCI.
  • Mono-operation bias: Using only one simple measure to gauge a complex phenomenon. E.g., measuring "user preference" when aspiring to measure "user experience". Or ignoring errors in a measure of typing speed.
  • Inadequate or incorrect choice of variable levels. For example, claiming to study the effect of "aesthetics" but having only two conditions to compare. Reviewers often point out that there are too few levels in the chosen IVs.
  • Inadequate operationalization of constructs. For example, authors want to measure user experience but collect data several weeks or even months after use, ignoring the effects of forgetting and interference on the veridicality of user experience accounts.

5. EXTERNAL VALIDITY

External validity refers to the generalizability of the causal relationship across persons, settings, and times. The list:

  • Overstatements: Overstating the generalizability of the finding. For example, using students but drawing implications to all healthy users.
  • Limited generalizability is among the most common reasons for rejection and it comes in many flavors. The criticism focuses most often on sample, tasks, user interface, method. For example, convenience sampling (using people from own lab), contrived tasks, short duration of study, unrepresentative user interfaces / systems, or even wizard-of-oz or paper mock-ups instead of working prototypes.
  • Inadequate operationalization of a special group. For example, using blindfolded healthy adults and claiming generalizability to blind users. Authors with little training in accessibility often fantasize that a piece of technology would be useful for a particular user group but do not care to use them as test subjects or even ask them.
  • Adopting "a straw-man" as a baseline condition.
  • Unrepresentative operationalization of interaction styles. For example, you claim to study multi-device interaction, but in the experiment the experimenter sets a pace for alternating attention between two displays.
  • Unbalanced sample; for example, demographics differ radically among groups.
  • Unconvincing experimental analogue. Experimental analogue refers to the resemblance between the setup in the lab and the target use conditions "in the wild" to which the results should generalize. For example, decorating the usability lab to look like a living room with couches etc. may not convince reviewers that it is an effective analogue for inducing home-like behavior.
  • Partial generalization. Overusing one DV in generalization while downplaying the others.

6. SCIENTIFIC COMMUNICATION

Scientific communication refers to the ability of a writer to convey complex phenomena correctly. The lowest-scoring papers are almost without exception guilty of this. Recurring issues:

  • Incomplete description of how a system/interface is used, or how a key algorithm works.
  • Missing figures or tables. Only idiots have their papers rejected for this.
  • Unreadable labels in figures.
  • Poor presentation of complex data. For example, putting in big tables for descriptive or inferential statistics.
  • Overly complex analysis of multivariate experimental design. If you have more than 2 DVs and/or more than 2 IVs, presentation of results requires serious thought!
  • Missing or partial descriptive statistics. Jumping into inferential statistics without descriptive statistics leaves the reader no chance of evaluating what you did.
  • Incomplete reporting of statistical test values. See APA Manual.
  • Ambiguous terminology describing the method or results. Define and label your key variables, and use them consistently.
  • Unjustified selection of measures. Why did you use that measurement instead of the other ones available?
  • Cramming two contributions into one paper. Leads to inadequate space ("the Superpaper syndrome").

7. REPLICABILITY

Method sections of empirical papers should be written such that the reader can replicate the study. Common issues:

  • User demographics not reported. In the worst case we witnessed, the paper did not report even the sample size.
  • Insufficient description of method. Parts missing. I strongly advise following the APA template for description of method, and deviating from it only with good reason. Reviewers are familiar with this format and can easily follow it; plus, you ensure that all elements of your experiment are described.
  • Incomplete reporting of measurements.
  • Key parts of analysis method missing. For example, conjuring coding categories in "grounded theory" is often guilty of this.

8. APPLICABILITY OF RESULTS

Let's assume the paper was so good that reviewers found no flaws related to validity or communication. Does that mean you can start booking hotels for CHI? Absolutely not! The final stretch where papers get killed--no, massacred--is the interpretation of findings. Here's my list:

  • Null effect: No significant effect was found, but authors nevertheless argue about implications.
  • No surprising finding: The finding is predictable in light of common sense or previous work.
  • Minuscule effect: The effect size is negligible, yet authors argue for real-world implications. It is advisable to report effect sizes, as APA recommends, to avoid speculation on the readers' side.
  • It turns out that the presented technology is not better than existing means.
  • Not analyzing the finding, just reporting/repeating it. Not situating findings within existing literature.
  • Not answering the research question. Didn't we learn in high school that this is a no-no?
  • Absence of a control group or condition makes it impossible to contextualize the finding.
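
On the effect-size item: a standardized effect size is cheap to compute and report. The sketch below uses Cohen's d with a pooled standard deviation for two independent groups. That choice is my own illustration (other measures may suit your design better), and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores from two interface conditions.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

Both groups here have a standard deviation of 2 and their means differ by 1, so d is 0.5: the difference is half a standard deviation.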

EPILOGUE

I hope you find this list useful. I sure spent an awful lot of time compiling it. If you think there was nothing new, you know a lot already! Print the list out and use it as a checklist for your submissions. Or, if you know newbies who aspire to publish in CHI but have no proper training, send it to them. I might extend this in the future into a proper paper.

My final tip is the easiest: 

Shouting

Ranking of HCI conferences based on average citations per paper

I've always been curious to see a ranking of HCI conferences based on expected citations per paper. Is CHI the King as most believe? How do the thematic conferences like CSCW and MobileHCI do? And what about regional editions of CHI? Unfortunately, the usual suspects in bibliometric ranking, such as ISI, do a poor job in covering our diverse conferences. There are many good uses for a ranking (and even more bad ones, I know). For instance, PhD students and others who are unfamiliar with the ecology of conferences in HCI would benefit from knowing which conferences give most "bang for the buck."

Although there are many bibliometric analyses of HCI, I was not able to unearth a ranking of conferences (please email me if you know one and I'll link it here!). However, thanks to Bernt Schiele's tip, I found out that Microsoft Academic Search provides citation and publication numbers for HCI conferences, and one can constrain the view to the last 10 years. So, I took that list and removed conferences that have very low citations, as well as those that had published fewer than 100 papers. I think pruning is justified, because these conferences are either newborns or dead men walking. I then, using my lousy Excel skills, calculated the index citations/paper, and ranked the conferences accordingly. Voilà!
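
For those who prefer code to Excel, the calculation can be sketched in a few lines of Python. The three real rows are copied from the table in this post; "TinyConf" is a hypothetical venue added only to show the 100-paper filter, and the function name is my own. The cut-off for "very low citations" is not modeled.

```python
def rank_by_citations_per_paper(rows, min_papers=100):
    """rows: (name, publications, citations) tuples; returns (name, index) pairs."""
    # Drop venues with fewer than min_papers publications.
    kept = [row for row in rows if row[1] >= min_papers]
    scored = [(name, citations / papers) for name, papers, citations in kept]
    # Highest citations/paper first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


rows = [
    ("CHI", 5224, 43789),
    ("UIST", 390, 8145),
    ("ECSCW", 104, 1892),
    ("TinyConf", 40, 900),  # hypothetical venue, removed by the filter
]
ranking = rank_by_citations_per_paper(rows)
```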

This simplistic measure has some credibility. First, Microsoft Research has done a good job expanding the coverage of HCI venues and the numbers start to speak for themselves. Nevertheless, some important conferences, such as Pervasive, cannot be found in the list. Second, while we can guess the limits of this index - which must be analogous to the limits of using GDP per capita for ranking countries - it is (deceptively?) easy to understand, and it has value as a predictor of expected citations for an accepted paper. But common sense must be applied when drawing conclusions from it. For instance, I would not trust anything other than very clear differences. Conferences that have proportionately more posters and short papers are handicapped against those that favor full papers. (It'd make sense to redo the calculation with data from full papers only.) And, of course, newer conferences, like Persuasive, are handicapped when calculating a raw ten-year average.

Some of the results were surprising to me and they got me to reconsider my publication goals. My first observations:

  1. CHI is not the King. Actually, it's not even in the Top 5. The common misperception of CHI being the best is based on the fact that it has the largest total impact on the field thanks to its massive annual volume of publications. But it's not the best bet for any single submission. 
  2. UIST is the King. Long live the new King! Actually, ECSCW and UbiComp are so close behind UIST that we should enthrone the three together. I'm not surprised to see UbiComp here. However, take note that ECSCW has a much lower total volume than UbiComp.
  3. Then there's a group of a dozen or so conferences that compare to CHI. It was surprising to me that this group is so big; I thought the distribution of rich/poor conferences would be more polarized.
  4. Some positive surprises within that dozen. For instance, I'm glad to see that DIS and IDC are so high; I know they have worked hard for high quality and it's paying off. One thing makes me wonder, though: Why is the Australasian User Interface Conference (AUIC) so high up on the list? It's a regional conference, and this is the first time I've heard of it.
  5. One negative surprise is INTERACT IFIP, which is very low on the list. What happened? I heard that in the 1990s, it was "the other CHI". But I'm not very surprised to see that MobileHCI is far behind the top lot, at 4.6 citations/paper and thereby on par with NordiCHI. I'm on the paper committee of MobileHCI'12, and I hope we can discuss ways to shape up a bit...
  6. Then there's a big group of conferences that publish a lot, but almost for nothing. I know that some of these conferences deliberately serve other purposes than citations, and that's fine. But then there are some that are just utterly hopeless. In my view, they should either do something radical or reconsider their reason for existence.

Update: I'm happy to see that the post gained a lot of attention on Twitter. Researchers are passionate about conference rankings, and I fully understand why: Careers may depend on where you publish, and there are many strong preconceptions about the quality of conferences. I have received a number of useful suggestions on how to improve this ranking. Trying to do a better ranking would be useful for our community, because it would force us to reflect on the nature and goals of our field. Unfortunately, I don't have time to do it myself. But if somebody is serious about doing it, I can forward the ideas that I received.

Table: Ranking of HCI conferences based on average citations per paper during the last 10 years.

Rank | Conference | Publications | Citations | Citations/paper
1 | UIST - User Interface Software and Technology | 390 | 8145 | 20.88
2 | ECSCW - European Conference on Computer Supported Cooperative Work | 104 | 1892 | 18.19
3 | UbiComp(HUC) - Ubiquitous Computing/Handheld and Ubiquitous Computing | 408 | 7184 | 17.61
4 | CSCW - Conference on Computer Supported Cooperative Work | 475 | 6407 | 13.49
5 | ISWC - International Symposium on Wearable Computers | 260 | 2770 | 10.65
6 | DIS - Designing Interactive Systems | 289 | 2526 | 8.74
7 | CHI - Computer Human Interaction | 5224 | 43789 | 8.38
8 | GROUP - International Conference on Supporting Group Work | 309 | 2577 | 8.34
9 | IUI - Intelligent User Interfaces | 810 | 6609 | 8.16
10 | IDC - Interaction Design And Children | 114 | 909 | 7.97
11 | ICMI - Int. Conf. on Multimodal Interfaces | 515 | 3691 | 7.17
12 | MLMI - Machine Learning for Multimodal Interaction | 175 | 1191 | 6.81
13 | ICAD - International Conference on Auditory Display | 239 | 1621 | 6.78
14 | NIME - New Interfaces for Musical Expression | 367 | 2351 | 6.41
15 | UM - User Modeling | 377 | 2375 | 6.30
16 | AUIC - Australasian User Interface Conference | 103 | 645 | 6.26
17 | DSV-IS - Design, Specification, and Verification of Interactive Systems | 144 | 856 | 5.94
18 | AVI - Working Conference on Advanced Visual Interfaces | 410 | 2379 | 5.80
19 | ETRA - Eye Tracking Research & Application | 212 | 1211 | 5.71
20 | GW - Gesture Workshop | 194 | 903 | 4.65
21 | ASSETS - ACM Conference on Assistive Technologies | 450 | 2083 | 4.63
22 | Mobile HCI - Mobile HCI | 781 | 3564 | 4.56
23 | NORDICHI - Nordic Conference on Human-Computer Interaction | 441 | 1960 | 4.44
24 | RO-MAN - IEEE International Symposium on Robot and Human Interactive Communication | 259 | 1076 | 4.15
25 | ICCM - International Conference on Cognitive Modelling | 130 | 538 | 4.14
26 | TAMODIA - Task Models and Diagrams for User Interface Design | 164 | 604 | 3.68
27 | PDC - Participatory Design | 185 | 680 | 3.68
28 | INTERACT - IFIP Conference on Human-Computer Interaction | 772 | 2766 | 3.58
29 | W4A - Workshop on Web Accessibility | 191 | 670 | 3.51
30 | DIGRA - Conference of the Digital Games Research Association | 349 | 1153 | 3.30
31 | ACII - Affective Computing and Intelligent Interaction | 225 | 736 | 3.27
32 | Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems | 862 | 2724 | 3.16
33 | CISS - Conference on Information Sciences and Systems | 905 | 2583 | 2.85
34 | PERSUASIVE - Persuasive Technology | 180 | 504 | 2.80
35 | International conference on tangible and embedded interaction | 366 | 948 | 2.59
36 | SIGDOC - ACM Special Interest Group for Design of Communication | 388 | 912 | 2.35
37 | ACMACE - Advances in Computer Entertainment Technology | 581 | 1314 | 2.26
38 | CANDC - Creativity & Cognition | 278 | 572 | 2.06
39 | OZCHI - Australasian Computer-Human Interaction Conference | 448 | 907 | 2.02
40 | ICEC - International Workshop on Entertainment Computing | 521 | 831 | 1.60
41 | BCS HCI Conference | 292 | 443 | 1.52
42 | C5 - Conference on Creating, Connecting and Collaborating through Computing | 203 | 289 | 1.42
43 | ICCHP - International Conference on Computers for Handicapped Persons | 921 | 1285 | 1.40
44 | APCHI - Asia-Pacific Computer and Human Interaction | 135 | 176 | 1.30
45 | USAB - Usability Symposium | 189 | 220 | 1.16
46 | HCI - Human-Computer Interaction | 3462 | 4004 | 1.16
47 | DIMEA - International Conference on Digital Interactive Media in Entertainment and Arts | 135 | 156 | 1.16
48 | Multikonferenz Wirtschaftsinformatik | 266 | 245 | 0.92
49 | MVA - Machine Vision Applications | 439 | 403 | 0.92
50 | Mensch & Computer | 338 | 304 | 0.90
51 | ACHI - International Conference on Advances in Computer-Human Interaction | 152 | 129 | 0.85
52 | IHM - Interaction Homme-Machine | 327 | 228 | 0.70
53 | Active Media Technology | 240 | 129 | 0.54
54 | ICNSC - International Conference on Networking, Sensing and Control | 515 | 157 | 0.30
55 | ICIS - International Conference on Interaction Sciences | 271 | 28 | 0.10
56 | ICETET - International Conference on Emerging Trends in Engineering & Technology | 225 | 19 | 0.08
57 | MVHI - Machine Vision and Human-machine Interface | 208 | 0 | 0.00

Why smartphone is the next HUGE thing for targeted advertisements

The big boys of targeted advertisement are turning their attention to smartphones to take the next big step. In this blog post, I explain what they aim to achieve and give a researcher's perspective on this development.

First, I want to explain the business logic of targeted advertisements. Facebook and Google get much of their revenue from targeted advertisements. The logic is simple: The better the profile, the better the advertisements, the more influence they have on consumer choice, and the more willing advertisers are to invest. Facebook's selling point thus far has been that it sits on top of a treasure chest: social networks, status updates, photos, locations, word-of-mouth etc. - for hundreds of millions of people. And to improve its profiling, it's bringing in more applications and features that get users to spend time expressing themselves (data!), and it has been expanding its reach to external websites. Google doesn't do any worse. Beyond queries and clickthroughs in search engine use, which is where it all started, Google of late follows a two-fold strategy to improve profiling: 1) It designs/acquires "free" services where users create and consume content, such as YouTube, Google Docs and Google+; 2) It uses cookies to track the user elsewhere on the Web.

The second fact that must be known is that both players (among others) have been moving into the mobile space during the last five years. According to news reports, Facebook is thinking about its own smartphone/OS. And Google is expanding its tracking to mobile devices by promoting Android as a "free" OS. It is no secret that Google tries to gain data from Android devices. And Apple is moving into mobile tracking, too. The incident reported earlier this year revealed its intention to transfer users' locations to its servers, but it is most likely examining options for stronger tracking enabled by iPhones and iPads.

The next logical question is how mobile tracking can improve customer profiling. Why, for example, isn't Facebook happy with having its app running on smartphones and tablets? I believe the chase is on for two kinds of prey: First, people spend an increasing amount of time per day using mobile devices. Mobile devices are our browsers, email clients, music players, social networking hubs etc. Some trends indicate that desktop PCs and laptops are dying. Ergo, in order to continue doing the business these businesses already do, mobile use must be tracked. Second, and more critical, present-day mobile devices are equipped with sensors that are powerful indicators of what users do outside the realm of the Internet. This has the potential to extend tracking from the Internet to the real world and the real you.

Well, what can be inferred about the user based on sensors, then? A lot! Beyond the familiar GPS, there are many great opportunities. The ambient light sensor tells about lighting conditions and where the phone is - for example, in the pocket versus in the hand. The gyroscope augments GPS by telling where the user is heading and can disambiguate what the user is interested in, say, public space or a mall. Bluetooth logs other people nearby, assuming they have Bluetooth switched on; like WiFi, it can also be employed for indoor positioning. The accelerometer responds to users' movement, which is useful for inferring physical activity (e.g., walking, running). The microphone can be used to detect contexts like being on the bus or in a meeting, as well as to recognize conversation partners and perhaps even topics of talk. The cameras on the back and front of the phone can be used to take visual samples of the user and his/her environment. In addition, we can of course log any interactive event in the phone from application launches and media listening to HTTP requests and communication transactions. We can go as far as capturing the individual keypresses and coordinates from taps on the display.

But why should we be alarmed? Every corporate research unit I know of is investigating logging on mobile devices. And, as a consequence of increasing industry interest, the scientific area of sensor-based inference has grown immensely during the last ten years. It is presently most vibrant in such fields as ubiquitous and pervasive computing. People following these fields know that machine inference of "higher order contexts" from data like this is far from foolproof. But are these limitations due to the fact that researchers tend to focus on a single sensor at a time and do not assume a wealth of background information about the users? Many inference problems will be child's play when one can combine multiple data sources and knows the historical tendencies of the user.

Add to this the fact that mobile devices bear potential for presenting the targeted advertisements. There's nothing new in this idea - in fact, the mobile device has been an advertisement platform for more than a decade now. However, there are many reasons why it hasn't become BIG business yet. Simplifying a bit, it boils down to a poor cost-utility trade-off: Present-day mobile advertisements "cost" a lot to the user because they take up the precious little screen space and fragment the already scarce attention they have [3]. They annoy! By contrast, the great promise of advertisements that are based on mobile sensor data combined with historical data is that the product promotions will be more spot-on and feel worthy of one's attention. I anticipate that the first reaction of people will be to freak out, but they will soon learn to neglect the privacy loss and consider highly personalized advertisements as a natural state of affairs. Will the ads actually be better with this data? I believe so. My colleagues Petteri Nurmi (HIIT, Finland) and Antonio Krüger (DFKI, Germany) are already seeing first positive evidence from empirical studies, indicating that personalized mobile advertisements can increase purchases in retail context. Consumers might actually like them when compared to traditional forms of in-store advertising.

Many of us working on the scientific side of this development may have missed the weak signal of the future we might be heading toward. Many of us genuinely work toward nobler goals. For example, algorithms are developed to detect falling/tripping of a user and can be used for quicker medical response. And social media applications are developed where peers can opt in to share real-time sensor-based information [2]. I have been involved in making smartphone logging available as a tool for other scientists [1]. For social scientific research, the potential is extraordinary, and in one paper we dubbed the smartphone "the fMRI of social sciences" [4]. Although applications like these might benefit our societies, I'm afraid that the true societal impact is somewhere else.

While my colleagues working in these areas have been aware of this development, my feeling is that they have not understood the potential gravity of mobile tracking. It is obvious that the companies are after a much deeper penetration into what the consumer is doing in the real world. In the worst case, future generations will live their lives such that their moment-by-moment actions are tied to advertisements. Consumerism will happen every second.

My pessimistic prediction is that, because the companies have a head start on this, this future will happen in some form or another. I'm afraid that the ethically sustainable future, although still possible, is less likely. Here are my grim predictions for the coming five years:

  1. Sensor data collection will be extended from location data to include more and more sensors.
  2. Criminals will be chased and caught using such data.
  3. "Privacy bundles" - Consumers get their phones cheaper if they allow their data to be collected and accept personalized advertisements.
  4. Mobile devices used increasingly as a platform for targeted advertisements, tying them to OS-level functioning.
  5. Hacker groups and malevolent individuals find ways to snoop personal mobile data.
  6. Governments use this data against activists and "terrorists."
  7. NGOs and scholars start fighting this development, after the fact as usual.


REFERENCES

[1] Oulasvirta, A., & Hasu, T. (2011). ContextLogger2: A Logger Construction Kit. Works in Progress. IEEE Pervasive Computing.
See http://contextlogger.org/contextlogger2/features.html

[2] Oulasvirta, A., Petit, R., Raento, M., & Tiitta, S. (2007). Interpreting and acting on mobile awareness cues. Human-Computer Interaction, 22 (1&2), 97-135.
http://www.leaonline.com/doi/abs/10.1080/07370020701307799

[3] Oulasvirta, A., Tamminen, S., Roto, V., & Kuorelahti, J. (2005). Interaction in 4-second bursts: The fragmented nature of attentional resources in mobile HCI. Proceedings of CHI 2005, ACM Press, New York, pp. 919-928.
http://www.cs.helsinki.fi/u/oulasvir/scipubs/chi2005_oulasvirta.pdf

[4] Raento, M., Oulasvirta, A., & Eagle, N. (2009). Smartphones: An emerging tool for social scientists. Sociological Methods and Research, 37 (2), 426-454.
http://www.mpi-inf.mpg.de/~oantti/pubs/smartphones_SMR.pdf

Germany: The Wunderkind of HCI?

Three weeks ago I started working at the Max Planck Institute for Informatics in Saarbrücken, Germany, with the goal of establishing their first group on HCI. After just three weeks, my understanding of the German HCI scene is still limited, but I wanted to share my first impression.

In 2006, I was a research intern at Deutsche Telekom's newly established T-Labs in Berlin, located in flashy premises in a central location in western Berlin. At that time, which is just five years ago, it was clear that Germany was taking its baby steps in HCI. There were only a handful of groups and most of them did not appear very well funded. The HCI researchers I talked to seemed to have low self-esteem, complaining that HCI is not viewed as "proper science" and they had to fight prejudice in the host departments and funding organizations. Actually, the most promising Germans doing HCI had fled the country. And, if the number of CHI papers is a valid measure of anything, I believe that in the first half of the 2000s even Finland dwarfed Germany.

Starting in my new job three weeks ago, I was glad to find myself in a changed country. My former employer T-Labs has become one of the front-runners of German HCI, along with Hasso Plattner Institute, LMU, RWTH Aachen, Duisburg-Essen, and DFKI. These and other German groups are active and well-known in all top conferences. These groups even compete over which gets the most CHI papers [1]. My sense is that the then toxic atmosphere has changed as well, from prejudice to tolerance, and is now changing to embrace. New HCI professors have started throughout the country. And some top people who fled the country, like Patrick Baudisch, have returned. There are even serious plans to get the CHI conference to Berlin. Achieving all this in just five years makes Germany the Wunderkind of HCI.

However, the most promising aspect is that Germans are now investing in HCI, and at scales most others can only dream of. In Finland, for instance, there is no Center of Excellence focusing on HCI, which is the flagship of academic funding in Finland. And even if there were, it would get funding in the order of 1M€/year. Compare this to the German Clusters of Excellence. The Multimodal Computing and Interaction Cluster of Excellence [2], located in Saarbrücken, enjoys funding in the order of 40M€! This money is used to establish 20 (!) new research groups.

I see two shadows dimming the child prodigy's path to glory. First, research groups are not very multi-disciplinary. Most groups are located within computer science departments and driven by computer scientists or engineers. This is fine--there's a lot of good HCI work that can be done by computer scientists. But will this become a limiting factor? After all, many of the most successful groups in HCI's past have also included behavioral and social scientists. Second, the HCI curricula in German universities are not very comprehensive. I believe that a modern Master's program in HCI should involve a combination of HCI's "classic" elements, such as human factors and prototyping, as well as more timely elements, such as multimodal user interfaces and video games. Obviously many combinations are valid, but what worries me is that German curricula typically cover the classic bit with just 1-2 courses. Without the classic elements, will the students have transferable skills that help them address novel problems outside their specialization? (Update: There's at least one HCI-specific Master's program, at Uni Siegen.)

In my view, these issues are structural and can undermine any amount of euros pumped into HCI research. Solving the issue boils down to one critical question about "Germany's HCI identity": what is it that German HCI research can do better than others? I'll keep you posted on what I learn.

Links:
[1] http://hci.rwth-aachen.de/chi-ranking
[2] http://www.mmci.uni-saarland.de/

What's there to educate about HCI?

This morning I filled out a survey of HCI curricula and made an
observation about the sorry status of education in this field that I
want to share. With "sorry status," I'm not disparaging these important
efforts. In fact, I think that the survey is timely and can be commended
for meticulously listing the topics, technology areas, technologies, and
mother disciplines that have become simply innumerable.

And that's exactly the problem! Even the smartest Ivy League student
cannot be expected to become truly skilled in computer science and
engineering, design, statistics, behavioral and cognitive sciences,
social sciences and economics--and these are just the most obvious
culprits. And to survive the swamp that contemporary HCI is, the poor
student should also master numerous ephemeral research topics from
haptics to ICT4D, methods ranging from non-parametric statistical tests
to cultural probes, and UI technologies from speech recognition to
virtual reality. Life's simply too short!

HCI has become so absurdly diverse and multi-multi-disciplinary that
it's more aptly called hyper-disciplinary. And I'm afraid that we have
not found a way to cope with this in HCI education. This is well
reflected in present-day education producing so-called "HCI
generalists." If you read a contemporary HCI textbook, you will find
hardly any theory that fulfills two criteria: 1) it is unique to HCI and
not in intellectual debt to mother disciplines, and 2) it is transferable
across the different problems of HCI and therefore useful. Instead, we find
stripped-down versions of methods borrowed from the science proper, and
we find wishy-washy design concepts and ideals that provide no better
foundation than a healthy dose of common sense.

What should we do, then? Here's my favorite quote from Richard Feynman
to contextualize my view: "If, in some cataclysm, all scientific
knowledge were to be destroyed, and only one sentence passed on to the
next generation of creatures, what statement would contain the most
information in the fewest words? I believe it is the atomic hypothesis
(or atomic fact, or whatever you wish to call it) that all things are
made of atoms — little particles that move around in perpetual motion,
attracting each other when they are a little distance apart, but
repelling upon being squeezed into one another. In that one sentence you
will see an enormous amount of information about the world, if just a
little imagination and thinking are applied."

My view is that HCI education should be built on those scientific
principles that reveal something surprising and actionable about the way
humans use technological artefacts. But what would "the atomic fact" of
HCI be? I don't know. The best candidate I could entertain is Paul
Fitts's (1954) observation that humans are limited in the
"ability to produce consistently one class of movement from among
several alternative movement classes." This fundamental property of
human motor capacity receives an exact formulation in information
theory. From this formulation one can derive principles like the
speed-accuracy trade-off, as well as measurements like throughput. This
principle has fueled one of the most scientifically successful
sub-enterprises of HCI: the development of input devices and novel
interaction techniques. I'm sure we will be able to identify other
atomic facts for heavily researched core topics. I'd certainly like to
see what people more intelligent than I am would come up with if given
this exercise!
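
To make the Fitts example concrete, here is a minimal sketch of how the index of difficulty, a predicted movement time, and throughput can be computed. It assumes the "Shannon" formulation that is common in HCI, ID = log2(D/W + 1), which is not the exact expression in Fitts's 1954 paper. The function names and the intercept and slope values are hypothetical and chosen only for illustration; in practice a and b are fitted to measured data for a given device and task.

```python
import math


def index_of_difficulty(distance, width):
    """Index of difficulty in bits, Shannon formulation (an assumption here)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def predicted_movement_time(distance, width, a, b):
    """Linear model MT = a + b * ID; a (s) and b (s/bit) come from a fit."""
    return a + b * index_of_difficulty(distance, width)


def throughput(distance, width, movement_time):
    """Throughput in bits per second for one observed movement time (s)."""
    if movement_time <= 0:
        raise ValueError("movement_time must be > 0")
    return index_of_difficulty(distance, width) / movement_time


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    a, b = 0.1, 0.15
    for width in (40, 20, 10):
        bits = index_of_difficulty(70, width)
        seconds = predicted_movement_time(70, width, a, b)
        print(f"W={width:>2}  ID={bits:.2f} bits  MT={seconds:.2f} s")
```

The loop illustrates the speed-accuracy trade-off: at a fixed distance, shrinking the target raises the index of difficulty and, with a positive slope b, the predicted movement time.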