KDD 2006 August 20 - 23, 2006    Philadelphia, USA

The Twelfth Annual SIGKDD International Conference on
Knowledge Discovery and Data Mining

Is There a Grand Challenge or X-Prize for Data Mining?

Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Zaki

Summary: This panel will discuss possible exciting and motivating Grand Challenge problems for Data Mining, focusing on bioinformatics, multimedia mining, link mining, text mining, and web mining.

Several recent scientific and engineering advances were stimulated by a grand challenge or prize [1, 2]. The DARPA Grand Challenge produced great advances in robotic car navigation in 2005; the X-prize led to the first successful commercial spaceflight; and RoboCup, whose goal is to develop a team of humanoid robots that can win against the human world soccer champion team by 2050, has greatly advanced robotic performance and created many enthusiasts.

The question is timely: the X-prize foundation is looking for additional fields in which a prize could be created. A good grand challenge problem should satisfy several criteria:
  1. It should be relevant to data mining and knowledge discovery and be based on analysis of large volumes of data, preferably publicly available data.
  2. It should be sufficiently important and difficult that its solution will advance the field and benefit society at large.
  3. It should be interesting and exciting enough to attract researchers, public and press attention, and funding. This requires a simple and concise problem statement of one or two sentences.
  4. The required domain knowledge should be relatively accessible.
  5. It should not be a problem that other groups are already actively working on.
Some potential ideas for a grand challenge include:
  • Automatic tagging and classification of digital photos and images on the web
  • Identifying all genes and potential therapeutic targets for cancer
  • A text-mining and understanding system that can use the web to pass standard tests such as the SAT
  • Discovering how, when, and where genes are expressed
We examine several currently active research areas, including web mining, text mining, link analysis, video and image mining, and bioinformatics, and discuss possible proposals for an exciting and worthwhile grand challenge / X-prize for data mining.

