A recent paper by Gonzalo Génova in the Communications of the ACM discusses the role of empirical evidence in evaluating computer science research. The article addresses computer science in general, but it reminds me of Henry Lieberman’s 2003 paper The Tyranny of Evaluation, which attacks the tendency in HCI to reject papers describing groundbreaking systems and techniques solely because they lack empirical evidence. Henry makes a comparison to Galileo’s experiments of dropping balls from the Tower of Pisa. As he eloquently puts it: “Trouble is, people aren’t balls.”
A recent paper in the Proceedings of the National Academy of Sciences shows that 97-98% of active climate scientists agree with the IPCC. Perhaps not particularly surprising (data is always good though). This is slightly more amusing: the paper also shows that those in disagreement have substantially lower climate expertise and scientific prominence. I suppose this settles the question of whether there is consensus among climate scientists.
There is a trend to use citation counts as an estimator of scientific esteem of journals, university departments, and even individual researchers. Douglas Arnold has written an interesting editorial on the danger of relying on such citation counts to evaluate anything (pdf copy). The editorial provides evidence of just how easy it is to manipulate citation counts. I find the examples provided very disturbing. I would encourage anyone concerned with bibliometrics to read this article.
In this month’s issue of Communications of the ACM there is a paper showing that papers from selective ACM conferences are on par with, or better than, journal articles in terms of citation counts.
From the paper:
“First and foremost, computing researchers are right to view conferences as an important archival venue and use acceptance rate as an indicator of future impact. Papers in highly selective conferences—acceptance rates of 30% or less—should continue to be treated as first-class research contributions with impact comparable to, or better than, journal papers.”
Considering that the authors only compared these conference papers against the top-tier journals (ACM Transactions), their finding is surprisingly strong. It also strengthens my view that in computer science, selective conference papers are as good as, if not better than, journal articles.
Swedish universities with Nobel Prize-winning faculty:
|Karolinska Institute, Stockholm||5||1982|
|Royal Institute of Technology, Stockholm||1||1970|
|Stockholm School of Economics||1||1977|
|University of Gothenburg||1||2000|
Swedish cities with Nobel Prize-winning faculty:
- Stockholm: 10 Nobel Prizes
- Uppsala: 5 Nobel Prizes
- Gothenburg: 1 Nobel Prize
- Researchers in Uppsala and Stockholm have won all Swedish Nobel Prizes awarded to faculty except one.
- Uppsala University is the only Swedish university to have Nobel Prize winners in more than one category. They have won in Chemistry, Physics, and Physiology or Medicine. All winners at Karolinska Institute are in Physiology or Medicine, and all winners at Stockholm University are in Chemistry.
- The rate of Nobel Prizes won per decade reached a peak in the 1970s and 1980s. Thereafter it dropped to the same rate as in the 1930s and 1940s. See the figure below.
It seems the results obtained from climate science are indeed reliable. It is amazing that scientists are held in such low regard nowadays that British MPs feel they need to jump in and “investigate” a bunch of leaked internal emails from the University of East Anglia. Yes, the peer-review process has problems; any scientist can probably tell you that. Yet it is so much better than the alternative: a flood of bad articles and uninformed opinions swamping reports of actual scientific progress.
For anyone who wants a quick overview of why global warming is highly probable, I recommend The Economist’s nice summary.
From the website, this is what the CHI Academy is about:
“The CHI Academy is an honorary group of individuals who have made extensive contributions to the study of HCI and who have led the shaping of the field.”
Also from the website, this is the citation:
“Shumin Zhai is a Research Staff Member at the IBM Almaden Research Center. Shumin is a leading researcher in applying quantitative and engineering methods in HCI, and has made fundamental contributions to text entry optimization, physical input device design, eye-tracking interfaces, and the understanding of human performance. His contributions to text entry techniques for mobile and touch screen devices include the ShapeWriter gesture keyboard which has been commercialized. Shumin has also been a visiting professor at universities in Europe and China. He has served on many editorial boards and conference committees and is currently the Editor-in-Chief of ACM Transactions on Computer-Human Interaction.”
What are the grand goals in HCI? Do we have any grand goals at all? When you read HCI literature you certainly do not get the impression that some sort of grand goal of an ultimate user interface design was even considered by the authors. In fact, most papers I read do not even seem to have a clear idea of what to do next beyond the paper. They did something “novel” (meaning: not published in the HCI literature before; HCI researchers are notorious for not reading work in the neighboring fields of design, anthropology, comparative literature, psychology and engineering). Then they ran a study and published the results. Now they are doing something completely different… This widespread behavior is leading me to think there is no real progress in the field. Researchers just do a little bit of this and a little bit of that and hope something meaningful eventually comes out of it.
Most of my own work is engineering-focused. I am essentially creating new interfaces that optimize some measurable dependent variable (such as time, error, “insight”, etc.). In my specific work on text entry the primary goal is pretty obvious: I want to create text entry methods that let users write as fast as possible with few errors. Is this a grand goal in HCI then?
I have some reservations. First, HCI is much, much more than text entry. Even I do research in other sub-fields, such as visualization and decision theory. If you narrow down your research objective then sure, you can define a goal, but will it be a grand goal? Second, the above notion only captures the end goal. However, in my opinion, the research trajectory is as important as, if not more important than, the end goal. It is by repeated trial and error that we learn the subtleties behind excellent user interface design. The end goal is fixed while the research trajectory is “elastic” and can be twisted and turned so that you can reach new goals based on it.
This reasoning leads me to think that perhaps my original question was the wrong one to ask. Perhaps HCI is actually doing fine, although the grand goals are hidden in the fog and I sometimes wonder if they exist.
Peer review is often highlighted as a cornerstone of good scientific practice, at least in engineering and the natural sciences. The logic behind peer review is that peers (i.e. other researchers knowledgeable in your research field) review your manuscript to make sure the research is valid, interesting, cites related work, etc.
However, what if reviewers do not really qualify as your peers? Then this validation process isn’t really something that can be called peer review, is it?
I have been submitting and reviewing research papers for the major human-computer interaction (HCI) conferences for six years now, this year as an associate chair (AC) for CHI 2010. I have to say our peer review process leaves something to be desired. A typical outcome is that 1-2 reviewers are actually experts (peers!) and the remaining 2-3 reviewers have never worked in the submission’s particular research area at all. Sometimes the ignorance is so glaringly obvious it is disheartening. For example, my note at CHI 2009 had two reviewers, who rated themselves “expert” and “knowledgeable” respectively, who argued for rejection because my study “was stating what was already known” [paraphrased]. However, the truth is that the result in this study contradicted what was generally believed in the literature, something I made clear in the rebuttal. In the end the paper was accepted, but it is hard for me to argue that my paper was “peer reviewed”. In this case only one reviewer knew what he or she was talking about and the rest (including the primary and secondary AC) clearly had no research expertise in the area.
In order to have a paper accepted at CHI I have found that above everything else you need to ensure you educate non-peers about your research area. You can safely assume several of the reviewers do not know your research area very well at all (sometimes they even rate themselves as having no knowledge in the area). This is a problem because it means that many good papers get rejected for superficial reasons. It also means that many bad papers end up being accepted. The latter tends to happen for well-written “visually beautiful” papers that either show nothing new or are methodologically invalid. If you are not an expert, you probably won’t spot the subtle methodological flaws that invalidate a paper’s conclusions. Likewise, you won’t realize that the research has already been done much better in a previous paper the authors didn’t know about, or chose not to cite.
CHI tries to fix the issue of reviewer incompetence by having a second stage of the review process: the program committee meeting. However, this is even more flawed because the associate chairs at the committee meeting cannot possibly represent all research areas. As an example, in my committee I was the only one who was active in text entry research. Therefore my power to reject or accept a particular submission involving text entry was immense (even though I chose not to exercise this power much). In the committee meeting the primary and secondary AC are supposed to argue for rejection or acceptance of their assigned submissions. However, if your AC is not an expert he or she will most likely rely completely on the reviewers’ judgments, and those reviewers are themselves often non-experts. This means that the one and only expert AC in the committee (if there is even one!) needs to speak up in order to save a good paper from being rejected because of AC/reviewer ignorance. Conversely, bad papers end up being accepted unless someone speaks up at the committee meeting. There is also a third alternative. An AC who for whatever reason does not like a particular paper can kill it at will by raising superficial concerns. This is possible because most likely there is not enough expertise on a particular paper’s topic in the committee room to properly defend it from such attacks (and the authors obviously have no way to address concerns raised at this late stage of the reviewing process).
I think a useful self-assessment indicator would be to ask each reviewer (including the AC) to indicate how many of the references in the submission the reviewer has read before they started to review a particular paper. In many cases, I strongly suspect many honest reviewers would be forced to state they haven’t read a single reference in the reference list! Are such reviewers really peers? No!
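To make the idea concrete, here is a rough sketch of how such an indicator could be tallied. Everything in it is an assumption for illustration only: the function name `reference_familiarity`, the 0.0 to 1.0 scale, and the made-up self-reports are mine and are not part of any actual CHI review form.

```python
def reference_familiarity(refs_read: int, refs_total: int) -> float:
    """Fraction of a submission's references the reviewer had read
    before starting the review (hypothetical indicator)."""
    if refs_total <= 0:
        raise ValueError("submission must cite at least one reference")
    if not 0 <= refs_read <= refs_total:
        raise ValueError("refs_read must be between 0 and refs_total")
    return refs_read / refs_total


# Made-up self-reports for one submission that cites 30 references.
reports = {"R1": 12, "R2": 0, "R3": 3, "AC": 0}
scores = {who: reference_familiarity(n, 30) for who, n in reports.items()}

# Reviewers who had not read a single cited reference beforehand.
unfamiliar = [who for who, score in scores.items() if score == 0.0]
```

A score of zero would not prove a reviewer is unqualified, but it would at least flag the reviews where the "peer" label deserves a second look.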
This problem of non-expertise among reviewers is probably hard to solve. One huge problem is our insistence on viewing conference publications as the primary publication venue. It means the reviewing system is swamped at one particular point in time each year. As an AC I know how hard it is to find competent reviewers when all the well-qualified candidates you can think of are already busy reviewing other work. Publishing in journals with a rapid turnaround process would be an obvious way to spread the reviewing load over the entire year and therefore maximize the availability of expert reviewers at any given point in time. However, to my surprise, I find that this idea meets a lot of resistance so I am not optimistic this problem is going away anytime soon.