The Lumiere project: The origins and science behind Microsoft’s Office Assistant

I think by now just about everyone has heard about the most annoying feature ever included in a commercial piece of software. I am talking, of course, about Clippy, the Microsoft Office Assistant that many loved to hate. Clippy first appeared in the 1997 release of the Office suite and remained part of the product line until 2007, when it was permanently removed.

Many people know Clippy as a major nuisance, but few know the story behind the technology and why it worked so badly. Keep reading, because I am about to tell you all about it. (If you don’t want to read the whole story, you can jump to the video in the middle of this post; watch the last minute and you will get a glimpse of Clippy’s grandfather.)


In 1993, Microsoft researchers from the Decision Theory & Adaptive Systems Group established the Lumiere project to study and improve human-computer interaction using Bayesian methods. The group wanted to create smart technologies that could observe a user interacting with a computer program, infer the user’s goals and needs, and provide valuable feedback and assistance as necessary. Developing such a technology makes sense, since many people are intimidated by complex software interfaces. I won’t bore you with the details of what Bayesian methods are and why they are good; the mathematics behind them is solid and has had many useful applications to date.

So, is this an easy problem to solve?

Actually, inferring a user’s intent is a very hard problem no matter how good your math is. The Microsoft team had to infer intent from the user’s interaction with the program: mouse movements, which menu items were selected, context (what the user is trying to do; remember how Clippy always popped up saying something like “I think you are trying to write a letter. Would you like some help?”), and specific text queries from the user, e.g., “how do I print a document?”

Any user model that can adequately capture all the relevant information will necessarily have many variables, and the values of these variables must be estimated over time. Moreover, different users interact with a piece of software differently. For example, an experienced user is likely to need less help overall, but may also need help with more obscure features of the software compared to a novice. Personalization is therefore a very important factor in making such systems work well.
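To make this concrete, here is a minimal, entirely hypothetical sketch of the kind of Bayesian reasoning involved: a naive Bayes model that updates a belief over possible user goals as interface events are observed. The goal names, event names, and all the probabilities below are invented for illustration; Lumiere’s actual models were far richer than this.

```python
# Hypothetical naive Bayes sketch: infer a user's goal from observed UI events.
# Goals, events, and all probabilities are invented for illustration only.

PRIOR = {"write_letter": 0.2, "make_table": 0.3, "print_doc": 0.5}

# P(event | goal): likelihood of observing each interface event given each goal.
LIKELIHOOD = {
    "open_format_menu":  {"write_letter": 0.6, "make_table": 0.3, "print_doc": 0.1},
    "type_salutation":   {"write_letter": 0.8, "make_table": 0.1, "print_doc": 0.1},
    "open_print_dialog": {"write_letter": 0.1, "make_table": 0.1, "print_doc": 0.9},
}

def posterior(events, prior=PRIOR):
    """Apply Bayes' rule once per observed event, then normalize."""
    belief = dict(prior)
    for event in events:
        for goal in belief:
            belief[goal] *= LIKELIHOOD[event][goal]
    total = sum(belief.values())
    return {goal: p / total for goal, p in belief.items()}

# After seeing a format-menu visit and a typed salutation, the belief
# should shift strongly toward "write_letter".
belief = posterior(["open_format_menu", "type_salutation"])
```

Even this toy version hints at the real difficulty: the genuine system had to estimate many such probabilities per user, over time, from sparse and noisy event streams.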

To make a long story short, the Microsoft researchers, led by senior scientist Dr. Eric Horvitz, were making good progress, and in two years’ time they already had a nice system working. So, in 1995, as the team had already started collaborating with the Microsoft Office product team, they put together a demonstration of Lumiere’s inference engine for Excel. The video below is a 9-minute tour of Lumiere working in Excel. In it, Horvitz explains how the inference engine worked in 1995 and how they envisioned it working in later versions using a cartoon-character front end. Watch the last minute of the video for a glimpse of Clippy’s grandfather.

After the video, I explain using evidence from a number of Microsoft Research publications and personal knowledge why Clippy worked so poorly in the 1997 release of Microsoft Office.

Clippy debuts

The Bayesian inference engine demonstrated in the above video works like a charm, monitoring the user’s behavior, inferring intent, and providing help in a contextual and personalized fashion.

Two years after this video was recorded, and after much collaboration between the research and product teams, the Lumiere project debuted as a well-advertised feature of Microsoft Office 97. Clippy was one of many cartoon characters available as the engine’s front end for interacting with the user.

Unfortunately and as we all know, Clippy worked so poorly that it was not long before users started complaining about its behavior. So, what went wrong?

The reasons behind Clippy’s massive failure

Well, after doing some research I found out what went wrong. In a paper published in 1998 at the Conference on Uncertainty in Artificial Intelligence (UAI), the Lumiere team described the inner workings of the Assistant’s inference engine and how much of it was included in the released version of Office 97. Below is a list of the features that were excluded from the product release. (Those keen enough can cross-reference the list with what was demoed in the video above.)

  • No persistent user profiles.
  • No reasoning about user competence, i.e., novice versus experienced user.
  • Small event queue with emphasis only on the most recent interactions of the user with the software interface (this means the engine was trying to guess the values of many variables using very little data).
  • Separation between user interface events and word-based queries; for word-based queries the engine ignored any context and user actions.
  • Last, and possibly most important, I quote from the paper: “The automated facility of providing assistance based on the likelihood that a user may need assistance or on the expected utility of such autonomous action was not employed.” Instead, “The Office team has employed a relatively simple rule-based system on top of the Bayesian query analysis system to bring the agent to the foreground with a variety of tips.” This is why Clippy kept popping up all the time: it was not using the mathematically sound engine that the researchers had designed, but a rule-based system that one or more developers from the product team thought was a reasonable substitute.
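To illustrate the difference the paper is pointing at, here is a hypothetical contrast between the two triggering policies: a shipped-style rule that fires on simple event matches, versus a decision-theoretic trigger that only interrupts when the estimated probability that the user needs help makes interruption worthwhile. The event names, thresholds, and utility numbers are all invented for illustration, not taken from Lumiere.

```python
# Hypothetical contrast between two policies for deciding when an
# assistant should interrupt. All names and numbers are invented.

def rule_based_trigger(event):
    """Shipped-style policy: fire whenever an event matches a fixed rule,
    ignoring any belief about whether the user actually needs help."""
    return event in {"typed_dear", "opened_wizard", "paused_typing"}

def expected_utility_trigger(p_needs_help,
                             benefit_when_needed=1.0,
                             cost_of_interruption=0.7):
    """Decision-theoretic policy: interrupt only when the expected benefit
    of offering help exceeds the expected cost of a needless interruption."""
    expected_gain = p_needs_help * benefit_when_needed
    expected_cost = (1.0 - p_needs_help) * cost_of_interruption
    return expected_gain > expected_cost

# With these numbers the agent stays quiet unless it is reasonably
# confident (p > 0.7 / 1.7, roughly 0.41) that help is needed, whereas
# the rule-based policy fires every time a matching event occurs.
```

The rule-based version is exactly the kind of policy that pops up every time you type “Dear”, no matter how many times you have dismissed it before.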

Why did Microsoft cripple Clippy?

Obviously for some reason many of the features in Lumiere’s Bayesian inference engine never made it into Office 97.


I have not been able to find an official document that explains why most if not all of the inference engine’s features were not included in the Office 97 release. However, I can provide some informal evidence based on personal knowledge.

Some time in 2000 or 2001, when I was still a graduate student, Dr. Horvitz gave an invited talk at my university. He talked at length about his HCI research and the Bayesian modeling techniques he had been studying for years. A question about Clippy was eventually, and unavoidably, asked: what the heck happened with that?

I recall his response being that, as noted earlier, much of their careful mathematical modeling of users never made it into the final product. He explained that the reason was a lack of disk space. You see, the Office suite ended up much more bloated than originally expected, and since most of the more mundane features were considered essential, the product team decided to limit the amount of space available for the Office Assistant component. That is why so many features had to be removed: they simply did not have space for it all.

This is the story of the Microsoft Office Assistant, or Clippy as it is most widely known. Microsoft discontinued the Office Assistant (more accurately, it turned the feature off by default, and I very much doubt anyone bothered to turn it back on) with the release of Office XP in 2001, and so Clippy is now resting in peace somewhere on a backup drive in Redmond.

The story of Clippy’s humble beginnings, rise to fame, and tragic downfall is now part of history. But at least you now know the truth about the reasons behind its unfortunate demise.
