On August 3, 2010 I wrote that my first paper reached exactly 100 citations on Google Scholar. Now my second publication, a UIST 2004 paper, has exactly 100 citations as well. Coincidentally, we will host UIST 2013 in St Andrews next year.
I recently attended the 2011 Artificial Intelligence in Education (AIED 2011) conference in Auckland, New Zealand.
The main focus of the AIED conference series is various approaches to designing and evaluating intelligent tutors. An intelligent tutor is basically a computer program that uses AI techniques (such as planning algorithms) to teach students new concepts or techniques based on a model of what the students already know, or are supposed to know. The conference series is quite old; this was the 15th biennial conference. It is organized by the International AI in Education Society. The AIED conference also appears to coordinate with the Intelligent Tutoring Systems (ITS) conference so that ITS runs every even year and AIED every odd year. Collectively, ITS and AIED publish the vast majority of new research in the intelligent tutoring field.
I was at AIED to present a poster on the first steps towards designing intelligent tutors for text entry methods. The idea is to increase user adoption of novel text entry methods by using highly effective and engaging intelligent tutoring. I wouldn’t classify myself as an intelligent tutoring researcher so attending this conference was a good opportunity to get an insight into the field’s frontier.
The conference was held at the Faculty of Engineering at the University of Auckland’s city campus. The conference had a poster session, an interactive event, and three parallel paper tracks. In my opinion the overall organization of the conference and the program worked out very well.
The presented papers fell into three categories: systems, user studies, and data mining. I learned that the latter is apparently so popular that people have felt the need to start a conference on just that, the Educational Data Mining (EDM) conference. I also found that machine learning is becoming increasingly popular in intelligent tutoring systems; among the techniques people used were POMDPs and Bayesian models. An interesting concept I didn’t know about before was Open Learner Models (OLMs). OLMs exploit the fact that while a student is using an intelligent tutor, the system builds up a model of that student. The idea behind OLMs is to figure out how to present the entire model (or aspects of it) back to the student. Presumably this can aid the student’s learning.
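The student-modeling idea behind OLMs can be made concrete with Bayesian Knowledge Tracing (BKT), one of the classic Bayesian student models used in intelligent tutors. The sketch below is only a minimal illustration with made-up parameter values, not code from any system presented at the conference:

```python
# Minimal Bayesian Knowledge Tracing (BKT) update -- a classic Bayesian
# student model. All parameter values here are illustrative assumptions.

def bkt_update(p_known, correct, p_transit=0.1, p_slip=0.1, p_guess=0.2):
    """Return the updated probability that the student knows the skill
    after observing one correct/incorrect answer."""
    if correct:
        # Bayes rule: the student either knew the skill and didn't slip,
        # or didn't know it and guessed correctly.
        posterior = (p_known * (1 - p_slip)) / (
            p_known * (1 - p_slip) + (1 - p_known) * p_guess)
    else:
        # The student either knew the skill but slipped,
        # or didn't know it and failed to guess.
        posterior = (p_known * p_slip) / (
            p_known * p_slip + (1 - p_known) * (1 - p_guess))
    # Account for the chance the student learned the skill on this step.
    return posterior + (1 - posterior) * p_transit

# A tutor would run this once per answer; an OLM would then present the
# resulting p_known estimate back to the student.
p = 0.3  # prior belief that the skill is known
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
```

The appeal of BKT for open learner modeling is that the single number it maintains per skill is easy to visualize, for example as a skill meter.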
Some of the presented papers intrigued me. In particular, a paper on motivational processes seemed to point in promising directions, and another paper on deliberately confusing students (!) led to some rather surprising results. I also found out that another active research area is authoring tools for ITSs, some of which use techniques from the end-user programming field.
Overall I very much enjoyed going to this conference. It was refreshing to learn about an area of research outside of my comfort zone. Time will tell if I will submit a full paper to AIED 2013!
Update (June 24, 2011):
Firefox 5 brought back the status bar, at least for Windows.
So Firefox prompted me to update to the new version, which I did. And the first thing I noticed was that the menu bar was gone. However, an option enabled me to get it back relatively quickly.
The second thing I noticed was that the status bar was gone. As it turns out, there is no option to get it back, and this is by design.
This puzzles me because there are at least three pieces of information I am relying on the status bar for:
- Which URL I will load if I click on a link.
- Whether or not I am on a secure link (the padlock icon).
- Status of long downloads.
Now these pieces of information are not visible, save for item 1 above. The way Firefox 4 solves item 1 without a status bar is to implement a “temporary status bar” which is only visible when you hover over a link, or when a page loads. Given how many links there are on websites nowadays this results in a lot of flicker in the bottom-left corner of my browsing window when my cursor inadvertently hovers over some of the links.
With the new design I lose information (padlock icon and download progress) and I have to endure distracting flicker in the bottom-left corner. Further, the fact that the status bar is not really gone shows how a mantra (I would guess minimalism) has taken precedence over function.
This makes me think about the philosophy of user interface redesigns and software updates. Do we really want to design software that forces new interface redesigns upon users? Shouldn’t we convince users of the benefits of a new user interface design rather than force it upon them? Unlike software bug fixes, user interface updates always come with retraining costs. Habitual patterns break and users must invest time and effort in figuring out how to achieve the same tasks—assuming those tasks are even possible to achieve with the new design! In the case of Firefox 4, wouldn’t a more democratic and user-friendly approach have been to keep the status bar as it was and gently ask users upon installation whether or not they want to try the new design with its flickering temporary status bar?
Economists say it is (in some instances) challenging to convince users to give up suboptimal technologies because of path dependence. Under this interpretation, forcing interface redesigns upon users opens up a way to escape a local optimum in which users refuse to upgrade poor technology they consider “good enough”. However, interface redesigns always come at a cost for users, even if those redesigns prove to be better later on. Therefore, the expected gains of an interface redesign must always be weighed against the non-negligible adaptation and retraining costs imposed on users. And sometimes an interface redesign is just a step backwards, such as the removal of the status bar in Firefox 4.
I just noticed that according to Google Scholar my first publication, a CHI 2003 paper, has exactly 100 citations now. It seems to be my most cited paper so far.
A recent paper by Gonzalo Génova in the Communications of the ACM discusses the role of empirical evidence in evaluating computer science research. The article talks about computer science in general, but it reminds me of Henry Lieberman’s 2003 paper The Tyranny of Evaluation, which attacks the tendency in HCI to reject papers describing groundbreaking systems and techniques solely due to their lack of empirical evidence. Henry makes a comparison to Galileo’s experiment of dropping balls from the Tower of Pisa. As he eloquently puts it: “Trouble is, people aren’t balls.”
There is a trend to use citation counts as an estimator of scientific esteem of journals, university departments, and even individual researchers. Douglas Arnold has written an interesting editorial on the danger of relying on such citation counts to evaluate anything (pdf copy). The editorial provides evidence of just how easy it is to manipulate citation counts. I find the examples provided very disturbing. I would encourage anyone concerned with bibliometrics to read this article.
In this month’s issue of Communications of the ACM there is a paper showing that selective ACM conference papers are on par with, or better than, journal articles in terms of citation counts.
From the paper:
“First and foremost, computing researchers are right to view conferences as an important archival venue and use acceptance rate as an indicator of future impact. Papers in highly selective conferences—acceptance rates of 30% or less—should continue to be treated as first-class research contributions with impact comparable to, or better than, journal papers.”
Considering that the authors only compared these conference papers against the top-tier journals (ACM Transactions), their finding is surprisingly strong. It also strengthens my view that in computer science, selective conference papers are as good as, if not better than, journal articles.
From the website, this is what the CHI Academy is about:
“The CHI Academy is an honorary group of individuals who have made extensive contributions to the study of HCI and who have led the shaping of the field.”
Also from the website, this is the citation:
“Shumin Zhai is a Research Staff Member at the IBM Almaden Research Center. Shumin is a leading researcher in applying quantitative and engineering methods in HCI, and has made fundamental contributions to text entry optimization, physical input device design, eye-tracking interfaces, and the understanding of human performance. His contributions to text entry techniques for mobile and touch screen devices include the ShapeWriter gesture keyboard which has been commercialized. Shumin has also been a visiting professor at universities in Europe and China. He has served on many editorial boards and conference committees and is currently the Editor-in-Chief of ACM Transactions on Computer-Human Interaction.”
What are the grand goals in HCI? Do we have any grand goals at all? When you read HCI literature you certainly do not get the impression that some sort of grand goal of an ultimate user interface design was even considered by the authors. In fact, most papers I read do not even seem to have a clear idea of what to do next beyond the paper. They did something “novel” (meaning: not published in the HCI literature before; HCI researchers are notorious for not reading work in the neighboring fields of design, anthropology, comparative literature, psychology and engineering). Then they ran a study and published the results. Now they are doing something completely different… This widespread behavior is leading me to think there is no real progress in the field. Researchers just do a little bit of this and a little bit of that and hope something meaningful eventually comes out of it.
Most of my own work is engineering-focused. I am essentially creating new interfaces that optimize some measurable dependent variable (such as time, error, “insight”, etc.). In my specific work on text entry the primary goal is pretty obvious: I want to create text entry methods that let users write as fast as possible with few errors. Is this a grand goal in HCI then?
I have some reservations. First, HCI is much, much more than text entry. Even I do research in other sub-fields, such as visualization and decision theory. If you narrow down your research objective then sure, you can define a goal, but will it be a grand goal? Second, the above notion only captures the end goal. However, in my opinion, the research trajectory is as important as, if not more important than, the end goal. It is by repeated trial and error that we learn the subtleties behind excellent user interface design. The end goal is fixed, while the research trajectory is “elastic” and can be twisted and turned so that you can reach new goals based on it.
This reasoning leads me to think that perhaps I was originally asking the wrong question. Perhaps HCI is actually doing fine, although the grand goals are hidden in the fog and I sometimes wonder if they exist.
Peer review is often highlighted as a cornerstone of good scientific practice, at least in engineering and the natural sciences. The logic behind peer review is that peers (i.e. other researchers knowledgeable in your research field) review your manuscript to make sure the research is valid, interesting, cites related work, etc.
However, what if reviewers do not really qualify as your peers? Then this validation process isn’t really something that can be called peer review, is it?
I have been submitting and reviewing research papers for the major human-computer interaction (HCI) conferences for six years now, this year as an associate chair (AC) for CHI 2010. I have to say our peer review process leaves something to be desired. A typical outcome is that 1-2 reviewers are actually experts (peers!) and the remaining 2-3 reviewers have never worked in the submission’s particular research area at all. Sometimes the ignorance is so glaringly obvious it is disheartening. For example, my note at CHI 2009 had two reviewers, who rated themselves “expert” and “knowledgeable” respectively, who argued for rejection because my study “was stating what was already known” [paraphrased]. However, the truth is that the result in this study contradicted what was generally believed in the literature, something I made clear in the rebuttal. In the end, the paper was accepted, but it is hard for me to argue that my paper was “peer reviewed”. In this case only one reviewer knew what he or she was talking about, and the rest (including the primary and secondary AC) clearly had no research expertise in the area.
In order to have a paper accepted at CHI I have found that above everything else you need to ensure you educate non-peers about your research area. You can safely assume several of the reviewers do not know your research area very well at all (sometimes they even rate themselves as having no knowledge in the area). This is a problem because it means that many good papers get rejected for superficial reasons. It also means that many bad papers end up being accepted. The latter tends to happen for well-written “visually beautiful” papers that either show nothing new or are methodologically invalid. If you are not an expert, you probably won’t spot the subtle methodological flaws that invalidate a paper’s conclusions. Likewise, you won’t realize that the research has already been done much better in a previous paper the authors didn’t know about, or chose not to cite.
CHI tries to fix the issue of reviewer incompetence by adding a second stage to the review process – the program committee meeting. However, this stage is even more flawed because the associate chairs at the committee meeting cannot possibly represent all research areas. As an example, in my committee I was the only one active in text entry research. Therefore my power to reject or accept a particular submission involving text entry was immense (even though I chose not to exercise this power much). In the committee meeting the primary and secondary AC are supposed to argue for rejection or acceptance of their assigned submissions. However, if your AC is not an expert he or she will most likely rely completely on the reviewers’ judgments – reviewers who themselves are often non-experts. This means that the one and only expert AC in the committee (if there is even one!) needs to speak up in order to save a good paper from being rejected because of AC/reviewer ignorance. And vice versa: bad papers end up being accepted unless someone speaks up at the committee meeting. There is also a third alternative: an AC who for whatever reason does not like a particular paper can kill it at will by raising superficial concerns. This is possible because most likely there is not enough expertise on that paper’s topic in the committee room to properly defend it from such attacks (and the authors obviously have no way to address concerns raised at this late stage of the reviewing process).
I think a useful self-assessment indicator would be to ask each reviewer (including the ACs) to indicate how many of the references in the submission they had actually read before starting the review. In many cases, I strongly suspect many honest reviewers would be forced to admit they haven’t read a single reference in the reference list! Are such reviewers really peers? No!
This problem of non-expertise among reviewers is probably hard to solve. One huge problem is our insistence on viewing conference publications as the primary publication venue. It means the reviewing system is swamped at one particular point in time each year. As an AC I know how hard it is to find competent reviewers when all the well-qualified candidates you can think of are already busy reviewing other work. Publishing in journals with a rapid turnaround process would be an obvious way to spread the reviewing load over the entire year and therefore maximize the availability of expert reviewers at any given point in time. However, to my surprise, I find that this idea meets a lot of resistance so I am not optimistic this problem is going away anytime soon.