| 2004 | 1-dimensional splines as building blocks for improving accuracy of risk outcomes models. David S. Vogel, Morgan C. Wang |
| 2004 | 2PXMiner: an efficient two pass mining of frequent XML query patterns. Liang Huai Yang, Mong-Li Lee, Wynne Hsu, Xinyu Guo |
| 2004 | A Bayesian network framework for reject inference. Andrew T. Smith, Charles Elkan |
| 2004 | A DEA approach for model combination. Zhiqiang (Eric) Zheng, Balaji Padmanabhan, Haoqiang Zheng |
| 2004 | A cross-collection mixture model for comparative text mining. ChengXiang Zhai, Atulya Velivelli, Bei Yu |
| 2004 | A data mining approach to modeling relationships among categories in image collection. Ruofei Zhang, Zhongfei (Mark) Zhang, Sandeep Khanzode |
| 2004 | A framework for ontology-driven subspace clustering. Jinze Liu, Wei Wang, Jiong Yang |
| 2004 | A general approach to incorporate data quality matrices into data mining algorithms. Ian Davidson, Ashish Grover, Ashwin Satyanarayana, Giri Kumar Tayi |
| 2004 | A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Srujana Merugu, Dharmendra S. Modha |
| 2004 | A generative probabilistic approach to visualizing sets of symbolic sequences. Peter Tiño, Ata Kabán, Yi Sun |
| 2004 | A graph-theoretic approach to extract storylines from search results. Ravi Kumar, Uma Mahadevan, D. Sivakumar |
| 2004 | A microeconomic data mining problem: customer-oriented catalog segmentation. Martin Ester, Rong Ge, Wen Jin, Zengjian Hu |
| 2004 | A probabilistic framework for semi-supervised clustering. Sugato Basu, Mikhail Bilenko, Raymond J. Mooney |
| 2004 | A quickstart in frequent structure mining can make a difference. Siegfried Nijssen, Joost N. Kok |
| 2004 | A rank sum test method for informative gene discovery. Lin Deng, Jian Pei, Jinwen Ma, Dik Lun Lee |
| 2004 | A system for automated mapping of bill-of-materials part numbers. Jayant Kalagnanam, Moninder Singh, Sudhir Verma, Michael Patek, Yuk Wah Wong |
| 2004 | ANN quality diagnostic models for packaging manufacturing: an industrial data mining case study. Nicolás de Abajo, Alberto B. Diez, Vanesa Lobato, Sergio R. Cuesta |
| 2004 | Adversarial classification. Nilesh N. Dalvi, Pedro M. Domingos, Mausam, Sumit K. Sanghai, Deepak Verma |
| 2004 | An iterative method for multi-class cost-sensitive learning. Naoki Abe, Bianca Zadrozny, John Langford |
| 2004 | An objective evaluation criterion for clustering. Arindam Banerjee, John Langford |
| 2004 | Analytical view of business data. Adam Yeh, Jonathan Tang, Youxuan Jin, Sam Skrivan |
| 2004 | Approximating a collection of frequent sets. Foto N. Afrati, Aristides Gionis, Heikki Mannila |
| 2004 | Automatic multimedia cross-modal correlation discovery. Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, Pinar Duygulu |
| 2004 | Belief state approaches to signaling alarms in surveillance systems. Kaustav Das, Andrew W. Moore, Jeff G. Schneider |
| 2004 | Cluster-based concept invention for statistical relational learning. Alexandrin Popescul, Lyle H. Ungar |
| 2004 | Clustering moving objects. Yifan Li, Jiawei Han, Jiong Yang |
| 2004 | Clustering time series from ARMA models with clipped data. Anthony J. Bagnall, Gareth J. Janacek |
| 2004 | Column-generation boosting methods for mixture of kernels. Jinbo Bi, Tong Zhang, Kristin P. Bennett |
| 2004 | Cross channel optimized marketing by reinforcement learning. Naoki Abe, Naval K. Verma, Chidanand Apté, Robert Schroko |
| 2004 | Cyclic pattern kernels for predictive graph mining. Tamás Horváth, Thomas Gärtner, Stefan Wrobel |
| 2004 | Data mining in metric space: an empirical analysis of supervised learning performance criteria. Rich Caruana, Alexandru Niculescu-Mizil |
| 2004 | Dense itemsets. Jouni K. Seppänen, Heikki Mannila |
| 2004 | Density-based spam detector. Kenichi Yoshida, Fuminori Adachi, Takashi Washio, Hiroshi Motoda, Teruaki Homma, Akihiro Nakashima, Hiromitsu Fujikawa, Katsuyuki Yamazaki |
| 2004 | Diagnosing extrapolation: tree-based density estimation. Giles Hooker |
| 2004 | Discovering additive structure in black box functions. Giles Hooker |
| 2004 | Discovering complex matchings across web query interfaces: a correlation mining approach. Bin He, Kevin Chen-Chuan Chang, Jiawei Han |
| 2004 | Document preprocessing for naive Bayes classification and clustering with mixture of multinomials. Dmitry Pavlov, Ramnath Balasubramanyan, Byron Dom, Shyam Kapur, Jignashu Parikh |
| 2004 | Early detection of insider trading in option markets. Steve Donoho |
| 2004 | Effective localized regression for damage detection in large complex mechanical structures. Aleksandar Lazarevic, Ramdev Kanapady, Chandrika Kamath |
| 2004 | Efficient closed pattern mining in the presence of tough block constraints. Krishna Gade, Jianyong Wang, George Karypis |
| 2004 | Eigenspace-based anomaly detection in computer systems. Tsuyoshi Idé, Hisashi Kashima |
| 2004 | Estimating the size of the telephone universe: a Bayesian Mark-recapture approach. David Poole |
| 2004 | Exploiting a support-based upper bound of Pearson's correlation coefficient for efficiently identifying strongly correlated pairs. Hui Xiong, Shashi Shekhar, Pang-Ning Tan, Vipin Kumar |
| 2004 | Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods. William W. Cohen, Sunita Sarawagi |
| 2004 | Exploring the community structure of newsgroups. Christian Borgs, Jennifer T. Chayes, Mohammad Mahdian, Amin Saberi |
| 2004 | Fast discovery of connection subgraphs. Christos Faloutsos, Kevin S. McCurley, Andrew Tomkins |
| 2004 | Fast mining of spatial collocations. Xin Zhang, Nikos Mamoulis, David W. Cheung, Yutao Shou |
| 2004 | Fast nonlinear regression via eigenimages applied to galactic morphology. Brigham S. Anderson, Andrew W. Moore, Andrew J. Connolly, Robert Nichol |
| 2004 | Feature selection in scientific applications. Erick Cantú-Paz, Shawn D. Newsam, Chandrika Kamath |
| 2004 | Fully automatic cross-associations. Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra S. Modha, Christos Faloutsos |
| 2004 | GPCA: an efficient dimension reduction scheme for image compression and retrieval. Jieping Ye, Ravi Janardan, Qi Li |
| 2004 | Generalizing the notion of support. Michael S. Steinbach, Pang-Ning Tan, Hui Xiong, Vipin Kumar |
| 2004 | Graphical models for data mining. David Heckerman |
| 2004 | IDR/QR: an incremental dimension reduction algorithm via QR decomposition. Jieping Ye, Qi Li, Hui Xiong, Haesun Park, Ravi Janardan, Vipin Kumar |
| 2004 | IMMC: incremental maximum margin criterion. Jun Yan, Benyu Zhang, Shuicheng Yan, Qiang Yang, Hua Li, Zheng Chen, Wensi Xi, Weiguo Fan, Wei-Ying Ma, QianSheng Cheng |
| 2004 | Identifying early buyers from purchase data. Paat Rusmevichientong, Shenghuo Zhu, David Selinger |
| 2004 | Improved robustness of signature-based near-replica detection via lexicon randomization. Aleksander Kolcz, Abdur Chowdhury, Joshua Alspector |
| 2004 | IncSpan: incremental mining of sequential patterns in large database. Hong Cheng, Xifeng Yan, Jiawei Han |
| 2004 | Incorporating prior knowledge with weighted margin support vector machines. Xiaoyun Wu, Rohini K. Srihari |
| 2004 | Incremental maintenance of quotient cube for median. Cuiping Li, Gao Cong, Anthony K. H. Tung, Shan Wang |
| 2004 | Interactive training of advanced classifiers for mining remote sensing image archives. Selim Aksoy, Krzysztof Koperski, Carsten Tusk, Giovanni B. Marchisio |
| 2004 | Interestingness of frequent itemsets using Bayesian networks as background knowledge. Szymon Jaroszewicz, Dan A. Simovici |
| 2004 | Kernel k-means: spectral clustering and normalized cuts. Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis |
| 2004 | Learning a complex metabolomic dataset using random forests and support vector machines. Young Truong, Xiaodong Lin, Chris Beecher |
| 2004 | Learning spatially variant dissimilarity (SVaD) measures. Krishna Kummamuru, Raghu Krishnapuram, Rakesh Agrawal |
| 2004 | Learning to detect malicious executables in the wild. Jeremy Z. Kolter, Marcus A. Maloof |
| 2004 | Locating secret messages in images. Ian Davidson, Goutam Paul |
| 2004 | Machine learning for online query relaxation. Ion Muslea |
| 2004 | Mining and summarizing customer reviews. Minqing Hu, Bing Liu |
| 2004 | Mining coherent gene clusters from gene-sample-time microarray data. Daxin Jiang, Jian Pei, Murali Ramanathan, Chun Tang, Aidong Zhang |
| 2004 | Mining reference tables for automatic text segmentation. Eugene Agichtein, Venkatesh Ganti |
| 2004 | Mining scale-free networks using geodesic clustering. Andrew Y. Wu, Michael Garland, Jiawei Han |
| 2004 | Mining the space of graph properties. Glen Jeh, Jennifer Widom |
| 2004 | Mining traffic data from probe-car system for travel time prediction. Takayuki Nakata, Jun'ichi Takeuchi |
| 2004 | Mining, indexing, and querying historical spatiotemporal data. Nikos Mamoulis, Huiping Cao, George Kollios, Marios Hadjieleftheriou, Yufei Tao, David W. Cheung |
| 2004 | On demand classification of data streams. Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu |
| 2004 | On detecting space-time clusters. Vijay S. Iyengar |
| 2004 | On the discovery of significant statistical quantitative rules. Hong Zhang, Balaji Padmanabhan, Alexander Tuzhilin |
| 2004 | Optimal randomization for privacy preserving data mining. Michael Yu Zhu, Lei Liu |
| 2004 | Ordering patterns by combining opinions from multiple sources. Pang-Ning Tan, Rong Jin |
| 2004 | Parallel computation of high dimensional robust correlation and covariance matrices. James Chilson, Raymond T. Ng, Alan Wagner, Ruben H. Zamar |
| 2004 | Predicting customer shopping lists from point-of-sale purchase data. Chad M. Cumby, Andrew E. Fano, Rayid Ghani, Marko Krema |
| 2004 | Predicting prostate cancer recurrence via maximizing the concordance index. Lian Yan, David Verbel, Olivier Saidi |
| 2004 | Privacy preserving regression modelling via distributed computation. Ashish P. Sanil, Alan F. Karr, Xiaodong Lin, Jerome P. Reiter |
| 2004 | Privacy-preserving Bayesian network structure computation on distributed heterogeneous data. Rebecca N. Wright, Zhiqiang Yang |
| 2004 | Probabilistic author-topic models for information discovery. Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, Thomas Griffiths |
| 2004 | Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, August 22-25, 2004. Won Kim, Ron Kohavi, Johannes Gehrke, William DuMouchel |
| 2004 | Programming the K-means clustering algorithm in SQL. Carlos Ordonez |
| 2004 | Rapid detection of significant spatial clusters. Daniel B. Neill, Andrew W. Moore |
| 2004 | Recovering latent time-series from their observed sums: network tomography with particle filters. Edoardo M. Airoldi, Christos Faloutsos |
| 2004 | Redundancy based feature selection for microarray data. Lei Yu, Huan Liu |
| 2004 | Regularized multi-task learning. Theodoros Evgeniou, Massimiliano Pontil |
| 2004 | Rotation invariant distance measures for trajectories. Michail Vlachos, Dimitrios Gunopulos, Gautam Das |
| 2004 | SPIN: mining maximal frequent subgraphs from graph databases. Jun Huan, Wei Wang, Jan F. Prins, Jiong Yang |
| 2004 | Scalable mining of large disk-based graph databases. Chen Wang, Wei Wang, Jian Pei, Yongtai Zhu, Baile Shi |
| 2004 | Selection, combination, and evaluation of effective software sensors for detecting abnormal computer usage. Jude W. Shavlik, Mark Shavlik |
| 2004 | Semantic representation: search and mining of multimedia content. Apostol Natsev, Milind R. Naphade, John R. Smith |
| 2004 | Sleeved coclustering. Avraham A. Melkman, Eran Shaham |
| 2004 | Support envelopes: a technique for exploring the structure of association patterns. Michael S. Steinbach, Pang-Ning Tan, Vipin Kumar |
| 2004 | Systematic data selection to mine concept-drifting data streams. Wei Fan |
| 2004 | The IOC algorithm: efficient many-class non-parametric classification for high-dimensional data. Ting Liu, Ke Yang, Andrew W. Moore |
| 2004 | The complexity of mining maximal frequent itemsets and maximal frequent patterns. Guizhen Yang |
| 2004 | TiVo: making show recommendations using a distributed collaborative filtering architecture. Kamal Ali, Wijnand van Stam |
| 2004 | Towards parameter-free data mining. Eamonn J. Keogh, Stefano Lonardi, Chotirat (Ann) Ratanamahatana |
| 2004 | Tracking dynamics of topic trends using a finite mixture model. Satoshi Morinaga, Kenji Yamanishi |
| 2004 | Turning CARTwheels: an alternating algorithm for mining redescriptions. Naren Ramakrishnan, Deept Kumar, Bud Mishra, Malcolm Potts, Richard F. Helm |
| 2004 | User-centered design for KDD. Eric Haseltine |
| 2004 | V-Miner: using enhanced parallel coordinates to mine product design and test data. Kaidi Zhao, Bing Liu, Thomas M. Tirpak, Andreas Schaller |
| 2004 | Visually mining and monitoring massive time series. Jessica Lin, Eamonn J. Keogh, Stefano Lonardi, Jeffrey P. Lankford, Donna M. Nystrom |
| 2004 | Web usage mining based on probabilistic latent semantic analysis. Xin Jin, Yanzan Zhou, Bamshad Mobasher |
| 2004 | When do data mining results violate privacy? Murat Kantarcioglu, Jiashun Jin, Chris Clifton |
| 2004 | Why collective inference improves relational classification. David D. Jensen, Jennifer Neville, Brian Gallagher |
| 2004 | k-TTP: a new privacy model for large-scale distributed environments. Bobi Gilburd, Assaf Schuster, Ran Wolff |