| 2003 | A Web page prediction model based on click-stream tree representation of user behavior. Sule Gündüz, M. Tamer Özsu |
| 2003 | A bag of paths model for measuring structural similarity in Web documents. Sachindra Joshi, Neeraj Agrawal, Raghu Krishnapuram, Sumit Negi |
| 2003 | A two-way visualization method for clustered data. Yehuda Koren, David Harel |
| 2003 | Accurate decision trees for mining high-speed data streams. João Gama, Ricardo Rocha, Pedro Medas |
| 2003 | Adaptive duplicate detection using learnable string similarity measures. Mikhail Bilenko, Raymond J. Mooney |
| 2003 | Aggregation-based feature invention and relational concept classes. Claudia Perlich, Foster J. Provost |
| 2003 | Algorithms for estimating relative importance in networks. Scott White, Padhraic Smyth |
| 2003 | An adaptive nearest neighbor search for a parts acquisition ePortal. Rafael Alonso, Jeffrey A. Bloom, Hua Li, Chumki Basu |
| 2003 | An iterative hypothesis-testing strategy for pattern discovery. Richard J. Bolton, Niall M. Adams |
| 2003 | Analyzing customer behavior at Amazon.com. Andreas S. Weigend |
| 2003 | Applications of sampling and fractional factorial designs to model-free data squashing. William DuMouchel, Deepak K. Agarwal |
| 2003 | Applying data mining in investigating money laundering crimes. Zhongfei (Mark) Zhang, John J. Salerno, Philip S. Yu |
| 2003 | Architecting a knowledge discovery engine for military commanders utilizing massive runs of simulations. Philip S. Barry, Jianping Zhang, Mary McDonald |
| 2003 | Assessment and pruning of hierarchical model based clustering. Jeremy Tantrum, Alejandro Murua, Werner Stuetzle |
| 2003 | CLOSET+: searching for the best strategies for mining frequent closed itemsets. Jianyong Wang, Jiawei Han, Jian Pei |
| 2003 | Capturing best practice for microarray gene expression data analysis. Gregory Piatetsky-Shapiro, Tom Khabaza, Sridhar Ramaswamy |
| 2003 | Carpenter: finding closed patterns in long biological datasets. Feng Pan, Gao Cong, Anthony K. H. Tung, Jiong Yang, Mohammed Javeed Zaki |
| 2003 | Classifying large data sets using SVMs with hierarchical clusters. Hwanjo Yu, Jiong Yang, Jiawei Han |
| 2003 | Clinical and financial outcomes analysis with existing hospital patient records. R. Bharat Rao, Sathyakama Sandilya, Radu Stefan Niculescu, Colin Germond, Harsha Rao |
| 2003 | CloseGraph: mining closed frequent graph patterns. Xifeng Yan, Jiawei Han |
| 2003 | Correlating synchronous and asynchronous data streams. Sudipto Guha, Dimitrios Gunopulos, Nick Koudas |
| 2003 | Critical event prediction for proactive management in large-scale computer clusters. Ramendra K. Sahoo, Adam J. Oliner, Irina Rish, Manish Gupta, José E. Moreira, Sheng Ma, Ricardo Vilalta, Anand Sivasubramaniam |
| 2003 | Cross-training: learning probabilistic mappings between topics. Sunita Sarawagi, Soumen Chakrabarti, Shantanu Godbole |
| 2003 | Data quality through knowledge engineering. Tamraparni Dasu, Gregg T. Vesonder, Jon R. Wright |
| 2003 | Data-driven validation, completion and construction of event relationship networks. Chang-Shing Perng, David Thoenen, Genady Grabarnik, Sheng Ma, Joseph L. Hellerstein |
| 2003 | Discovery of climate indices using clustering. Michael S. Steinbach, Pang-Ning Tan, Vipin Kumar, Steven A. Klooster, Christopher Potter |
| 2003 | Distributed cooperative mining for information consortia. Satoshi Morinaga, Kenji Yamanishi, Jun'ichi Takeuchi |
| 2003 | Distributed multivariate regression based on influential observations. Hang Yu, Ee-Chien Chang |
| 2003 | Efficient data reduction with EASE. Hervé Brönnimann, Bin Chen, Manoranjan Dash, Peter J. Haas, Peter Scheuermann |
| 2003 | Efficient decision tree construction on streaming data. Ruoming Jin, Gagan Agrawal |
| 2003 | Efficient elastic burst detection in data streams. Yunyue Zhu, Dennis E. Shasha |
| 2003 | Efficiently handling feature redundancy in high-dimensional data. Lei Yu, Huan Liu |
| 2003 | Eliminating noisy information in Web pages for data mining. Lan Yi, Bing Liu, Xiaoli Li |
| 2003 | Empirical Bayesian data mining for discovering patterns in post-marketing drug safety. David M. Fram, June S. Almenoff, William DuMouchel |
| 2003 | Empirical comparisons of various voting methods in bagging. Kelvin T. Leung, Douglas Stott Parker Jr. |
| 2003 | Experimental design for solicitation campaigns. Uwe F. Mayer, Armand Sarkissian |
| 2003 | Experimental study of discovering essential information from customer inquiry. Keiko Shimazu, Atsuhito Momma, Koichi Furukawa |
| 2003 | Experiments with random projections for machine learning. Dmitriy Fradkin, David Madigan |
| 2003 | Extracting semantics from data cubes using cube transversals and closures. Alain Casali, Rosine Cicchetti, Lotfi Lakhal |
| 2003 | Fast vertical mining using diffsets. Mohammed Javeed Zaki, Karam Gouda |
| 2003 | Finding recent frequent itemsets adaptively over online data streams. Joong Hyuk Chang, Won Suk Lee |
| 2003 | Fragments of order. Aristides Gionis, Teija Kujala, Heikki Mannila |
| 2003 | Frequent-subsequence-based prediction of outer membrane proteins. Rong She, Fei Chen, Ke Wang, Martin Ester, Jennifer L. Gardy, Fiona S. L. Brinkman |
| 2003 | Generating English summaries of time series data using the Gricean maxims. Somayajulu Sripada, Ehud Reiter, Jim Hunter, Jin Yu |
| 2003 | Generative model-based clustering of directional data. Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Suvrit Sra |
| 2003 | Golden Path Analyzer: using divide-and-conquer to cluster Web clickstreams. Kamal Ali, Steven P. Ketchpel |
| 2003 | Graph-based anomaly detection. Caleb C. Noble, Diane J. Cook |
| 2003 | Improving spatial locality of programs via data mining. Karlton Sequeira, Mohammed Javeed Zaki, Boleslaw K. Szymanski, Christopher D. Carothers |
| 2003 | Indexing multi-dimensional time-series with support for multiple distance measures. Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, Eamonn J. Keogh |
| 2003 | Information awareness: a prospective technical assessment. David D. Jensen, Matthew J. Rattigan, Hannah Blau |
| 2003 | Information-theoretic co-clustering. Inderjit S. Dhillon, Subramanyam Mallela, Dharmendra S. Modha |
| 2003 | Interactive exploration of coherent patterns in time-series gene expression data. Daxin Jiang, Jian Pei, Aidong Zhang |
| 2003 | Inverted matrix: efficient discovery of frequent items in large datasets in the context of interactive mining. Mohammad El-Hajj, Osmar R. Zaïane |
| 2003 | Knowledge-based data mining. Sholom M. Weiss, Stephen J. Buckley, Shubir Kapoor, Søren Damgaard |
| 2003 | Learning relational probability trees. Jennifer Neville, David D. Jensen, Lisa Friedland, Michael Hay |
| 2003 | Maximizing the spread of influence through a social network. David Kempe, Jon M. Kleinberg, Éva Tardos |
| 2003 | Mining concept-drifting data streams using ensemble classifiers. Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han |
| 2003 | Mining data records in Web pages. Bing Liu, Robert L. Grossman, Yanhong Zhai |
| 2003 | Mining distance-based outliers in near linear time with randomization and a simple pruning rule. Stephen D. Bay, Mark Schwabacher |
| 2003 | Mining hepatitis data with temporal abstraction. Tu Bao Ho, Trong Dung Nguyen, Saori Kawasaki, Si Quang Le, DucDung Nguyen, Hideto Yokoi, Katsuhiko Takabayashi |
| 2003 | Mining high dimensional data for classifier knowledge. Raj Bhatnagar, Goutham Kurra, Wen Niu |
| 2003 | Mining phenotypes and informative genes from gene expression data. Chun Tang, Aidong Zhang, Jian Pei |
| 2003 | Mining unexpected rules by pushing user dynamics. Ke Wang, Yuelong Jiang, Laks V. S. Lakshmanan |
| 2003 | Mining viewpoint patterns in image databases. Wynne Hsu, Jing Dai, Mong-Li Lee |
| 2003 | Nantonac collaborative filtering: recommendation based on order responses. Toshihiro Kamishima |
| 2003 | Natural communities in large linked networks. John E. Hopcroft, Omar Khan, Brian Kulis, Bart Selman |
| 2003 | Navigating massive data sets via local clustering. Michael E. Houle |
| 2003 | New unsupervised clustering algorithm for large datasets. William Peter, John Chiochetti, Clare Giardina |
| 2003 | On computing, storing and querying frequent patterns. Guimei Liu, Hongjun Lu, Wenwu Lou, Jeffrey Xu Yu |
| 2003 | On detecting differences between groups. Geoffrey I. Webb, Shane M. Butler, Douglas A. Newlands |
| 2003 | On-line science: the world-wide telescope as a prototype for the new computational science. Jim Gray |
| 2003 | Online novelty detection on temporal sequences. Junshui Ma, Simon Perkins |
| 2003 | PROXIMUS: a framework for analyzing very high dimensional discrete-attributed datasets. Mehmet Koyutürk, Ananth Grama |
| 2003 | PaintingClass: interactive construction, visualization and exploration of decision trees. Soon Tee Teoh, Kwan-Liu Ma |
| 2003 | Passenger-based predictive modeling of airline no-show rates. Richard D. Lawrence, Se June Hong, Jacques Cherrier |
| 2003 | Playing hide-and-seek with correlations. Chris Jermaine |
| 2003 | Privacy-preserving k-means clustering over vertically partitioned data. Jaideep Vaidya, Chris Clifton |
| 2003 | Probabilistic discovery of time series motifs. Bill Yuan-chi Chiu, Eamonn J. Keogh, Stefano Lonardi |
| 2003 | Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24 - 27, 2003. Lise Getoor, Ted E. Senator, Pedro M. Domingos, Christos Faloutsos |
| 2003 | SEWeP: using site semantics and a taxonomy to enhance the Web personalization process. Magdalini Eirinaki, Michalis Vazirgiannis, Iraklis Varlamis |
| 2003 | Screening and interpreting multi-item associations based on log-linear modeling. Xintao Wu, Daniel Barbará, Yong Ye |
| 2003 | Similarity analysis on government regulations. Gloria T. Lau, Kincho H. Law, Gio Wiederhold |
| 2003 | Statistical learning from relational data. Daphne Koller |
| 2003 | Style mining of electronic messages for multiple authorship discrimination: first results. Shlomo Argamon, Marin Saric, Sterling Stuart Stein |
| 2003 | The anatomy of a multimodal information filter. Yi-Leh Wu, Kingshy Goh, Beitao Li, Huaxin You, Edward Y. Chang |
| 2003 | The data mining approach to automated software testing. Mark Last, Menahem Friedman, Abraham Kandel |
| 2003 | Time and sample efficient discovery of Markov blankets and direct causal relations. Ioannis Tsamardinos, Constantin F. Aliferis, Alexander R. Statnikov |
| 2003 | To buy or not to buy: mining airfare data to minimize ticket purchase price. Oren Etzioni, Rattapoom Tuchinda, Craig A. Knoblock, Alexander Yates |
| 2003 | Towards NIC-based intrusion detection. Matthew Eric Otey, Srinivasan Parthasarathy, Amol Ghoting, G. Li, Sundeep Narravula, Dhabaleswar K. Panda |
| 2003 | Towards systematic design of distance functions for data mining applications. Charu C. Aggarwal |
| 2003 | Translation-invariant mixture models for curve clustering. Darya Chudova, Scott Gaffney, Eric Mjolsness, Padhraic Smyth |
| 2003 | Understanding captions in biomedical publications. William W. Cohen, Richard C. Wang, Robert F. Murphy |
| 2003 | Using randomized response techniques for privacy-preserving data mining. Wenliang Du, Justin Zhijun Zhan |
| 2003 | Visualizing changes in the structure of data for exploratory feature selection. Elias Pampalk, Werner Goebl, Gerhard Widmer |
| 2003 | Visualizing concept drift. Kevin B. Pratt, Gleb Tschapek |
| 2003 | Weighted Association Rule Mining using weighted support and significance framework. Feng Tao, Fionn Murtagh, Mohsen M. Farid |
| 2003 | XRules: an effective structural classifier for XML data. Mohammed Javeed Zaki, Charu C. Aggarwal |