| 2007 | A concept-based model for enhancing text categorization. Shady Shehata, Fakhri Karray, Mohamed Kamel |
| 2007 | A fast algorithm for finding frequent episodes in event streams. Srivatsan Laxman, P. S. Sastry, K. P. Unnikrishnan |
| 2007 | A framework for classification and segmentation of massive audio data streams. Charu C. Aggarwal |
| 2007 | A framework for community identification in dynamic social networks. Chayant Tantipathananandh, Tanya Y. Berger-Wolf, David Kempe |
| 2007 | A framework for simultaneous co-clustering and learning from complex data. Meghana Deodhar, Joydeep Ghosh |
| 2007 | A learning framework using Green's function and kernel regularization with application to recommender system. Chris H. Q. Ding, Rong Jin, Tao Li, Horst D. Simon |
| 2007 | A probabilistic framework for relational clustering. Bo Long, Zhongfei (Mark) Zhang, Philip S. Yu |
| 2007 | A scalable modular convex solver for regularized risk minimization. Choon Hui Teo, Alexander J. Smola, S. V. N. Vishwanathan, Quoc V. Le |
| 2007 | A spectral clustering approach to optimally combining numerical vectors with a modular network. Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka |
| 2007 | Active exploration for learning rankings from clickthrough data. Filip Radlinski, Thorsten Joachims |
| 2007 | An event-based framework for characterizing the evolutionary behavior of interaction graphs. Sitaram Asur, Srinivasan Parthasarathy, Duygu Ucar |
| 2007 | Applying collaborative filtering techniques to movie search for better ranking and browsing. Seung-Taek Park, David M. Pennock |
| 2007 | Association analysis-based transformations for protein interaction networks: a function prediction case study. Gaurav Pandey, Michael S. Steinbach, Rohit Gupta, Tushar Garg, Vipin Kumar |
| 2007 | Automatic labeling of multinomial topic models. Qiaozhu Mei, Xuehua Shen, ChengXiang Zhai |
| 2007 | BoostCluster: boosting clustering by pairwise constraints. Yi Liu, Rong Jin, Anil K. Jain |
| 2007 | Calculating latent demand in the long tail. Chris Anderson |
| 2007 | Canonicalization of database records using adaptive similarity measures. Aron Culotta, Michael L. Wick, Robert J. Hall, Matthew Marzilli, Andrew McCallum |
| 2007 | Challenges in mining social network data: processes, privacy, and paradoxes. Jon M. Kleinberg |
| 2007 | Characterising the difference. Jilles Vreeken, Matthijs van Leeuwen, Arno Siebes |
| 2007 | Cleaning disguised missing data: a heuristic approach. Ming Hua, Jian Pei |
| 2007 | Co-clustering based classification for out-of-domain documents. Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu |
| 2007 | Constraint-driven clustering. Rong Ge, Martin Ester, Wen Jin, Ian Davidson |
| 2007 | Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus. Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra |
| 2007 | Correlation search in graph databases. Yiping Ke, James Cheng, Wilfred Ng |
| 2007 | Corroborate and learn facts from the web. Shubin Zhao, Jonathan Betz |
| 2007 | Cost-effective outbreak detection in networks. Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne M. VanBriesen, Natalie S. Glance |
| 2007 | Cross-language information retrieval using PARAFAC2. Peter A. Chew, Brett W. Bader, Tamara G. Kolda, Ahmed Abdelali |
| 2007 | Data mining at the crossroads: successes, failures and learning from them. Srinivasan Parthasarathy |
| 2007 | Density-based clustering for real-time stream data. Yixin Chen, Li Tu |
| 2007 | Detecting anomalous records in categorical datasets. Kaustav Das, Jeff G. Schneider |
| 2007 | Detecting changes in large data sets of payment card data: a case study. Chris Curry, Robert L. Grossman, David Locke, Steve Vejcik, Joseph Bugajski |
| 2007 | Detecting research topics via the correlation between graphs and texts. Yookyung Jo, Carl Lagoze, C. Lee Giles |
| 2007 | Detecting time series motifs under uniform scaling. Dragomir Yankov, Eamonn J. Keogh, Jose Medina, Bill Yuan-chi Chiu, Victor B. Zordan |
| 2007 | Development of NeuroElectroMagnetic ontologies (NEMO): a framework for mining brainwave ontologies. Dejing Dou, Gwen A. Frishkoff, Jiawei Rong, Robert M. Frank, Allen D. Malony, Don M. Tucker |
| 2007 | Discovering the hidden structure of house prices with a non-parametric latent manifold model. Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, Yann LeCun |
| 2007 | Distributed classification in peer-to-peer networks. Ping Luo, Hui Xiong, Kevin Lü, Zhongzhi Shi |
| 2007 | Domain-constrained semi-supervised mining of tracking models in sensor networks. Rong Pan, Junhui Zhao, Vincent Wenchen Zheng, Jeffrey Junfeng Pan, Dou Shen, Sinno Jialin Pan, Qiang Yang |
| 2007 | Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis. Frizo A. L. Janssens, Wolfgang Glänzel, Bart De Moor |
| 2007 | Efficient and effective explanation of change in hierarchical summaries. Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos, Neal E. Young, Flip Korn, Divesh Srivastava |
| 2007 | Efficient incremental constrained clustering. Ian Davidson, S. S. Ravi, Martin Ester |
| 2007 | Efficient mining of iterative patterns for software specification discovery. David Lo, Siau-Cheng Khoo, Chao Liu |
| 2007 | Enhanced max margin learning on multimodal data mining in a multimedia database. Zhen Guo, Zhongfei Zhang, Eric P. Xing, Christos Faloutsos |
| 2007 | Enhancing semi-supervised clustering: a feature projection perspective. Wei Tang, Hui Xiong, Shi Zhong, Jie Wu |
| 2007 | Estimating rates of rare events at multiple resolutions. Deepak Agarwal, Andrei Z. Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian |
| 2007 | Event summarization for system management. Wei Peng, Charles Perng, Tao Li, Haixun Wang |
| 2007 | Evolutionary spectral clustering by incorporating temporal smoothness. Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, Belle L. Tseng |
| 2007 | Expertise modeling for matching papers with reviewers. David M. Mimno, Andrew McCallum |
| 2007 | Exploiting duality in summarization with deterministic guarantees. Panagiotis Karras, Dimitris Sacharidis, Nikos Mamoulis |
| 2007 | Exploiting underrepresented query aspects for automatic query expansion. Daniel Crabtree, Peter Andreae, Xiaoying Gao |
| 2007 | Extracting relevant named entities for automated expense reimbursement. Guangyu Zhu, Timothy J. Bethea, Vikas Krishna |
| 2007 | Extracting semantic relations from query logs. Ricardo A. Baeza-Yates, Alessandro Tiberi |
| 2007 | Fast best-effort pattern matching in large attributed graphs. Hanghang Tong, Christos Faloutsos, Brian Gallagher, Tina Eliassi-Rad |
| 2007 | Fast direction-aware proximity for graph mining. Hanghang Tong, Christos Faloutsos, Yehuda Koren |
| 2007 | Feature selection methods for text classification. Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney |
| 2007 | Finding low-entropy sets and trees from binary data. Hannes Heikinheimo, Jouni K. Seppänen, Eino Hinkkanen, Heikki Mannila, Taneli Mielikäinen |
| 2007 | Finding tribes: identifying close-knit individuals from employment patterns. Lisa Friedland, David D. Jensen |
| 2007 | From frequent itemsets to semantically meaningful visual patterns. Junsong Yuan, Ying Wu, Ming Yang |
| 2007 | From mining the web to inventing the new sciences underlying the internet. Usama M. Fayyad |
| 2007 | Generalized component analysis for text with heterogeneous attributes. Xuerui Wang, Chris Pal, Andrew McCallum |
| 2007 | GraphScope: parameter-free mining of large time-evolving graphs. Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, Philip S. Yu |
| 2007 | Hierarchical mixture models: a probabilistic analysis. Mark Sandler |
| 2007 | High-quantile modeling for customer wallet estimation and other applications. Claudia Perlich, Saharon Rosset, Richard D. Lawrence, Bianca Zadrozny |
| 2007 | IMDS: intelligent malware detection system. Yanfang Ye, Dingding Wang, Tao Li, Dongyi Ye |
| 2007 | Information distance from a question to an answer. Xian Zhang, Yu Hao, Xiaoyan Zhu, Ming Li, David R. Cheriton |
| 2007 | Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases. Benyah Shaparenko, Thorsten Joachims |
| 2007 | Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters. Flavia Moser, Rong Ge, Martin Ester |
| 2007 | Joint optimization of wrapper generation and template detection. Shuyi Zheng, Ruihua Song, Ji-Rong Wen, Di Wu |
| 2007 | Knowledge discovery of multiple-topic document using parametric mixture model with Dirichlet prior. Issei Sato, Hiroshi Nakagawa |
| 2007 | Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming. Jieping Ye, Shuiwang Ji, Jianhui Chen |
| 2007 | Local decomposition for rare class analysis. Junjie Wu, Hui Xiong, Peng Wu, Jian Chen |
| 2007 | LungCAD: a clinically approved, machine learning system for lung cancer detection. R. Bharat Rao, Jinbo Bi, Glenn Fung, Marcos Salganicoff, Nancy Obuchowski, David P. Naidich |
| 2007 | Machine learning for stock selection. Robert J. Yan, Charles X. Ling |
| 2007 | Making generative classifiers robust to selection bias. Andrew T. Smith, Charles Elkan |
| 2007 | Mining complex power networks for blackout prevention. Junhua Zhao, Zhao Yang Dong, Pei Zhang |
| 2007 | Mining correlated bursty topic patterns from coordinated text streams. Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sproat |
| 2007 | Mining favorable facets. Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, Ke Wang |
| 2007 | Mining optimal decision trees from itemset lattices. Siegfried Nijssen, Élisa Fromont |
| 2007 | Mining statistically important equivalence classes and delta-discriminative emerging patterns. Jinyan Li, Guimei Liu, Limsoon Wong |
| 2007 | Mining templates from search result records of search engines. Hongkun Zhao, Weiyi Meng, Clement T. Yu |
| 2007 | Model-shared subspace boosting for multi-label classification. Rong Yan, Jelena Tesic, John R. Smith |
| 2007 | Modeling relationships at multiple scales to improve accuracy of large recommender systems. Robert M. Bell, Yehuda Koren, Chris Volinsky |
| 2007 | Multiscale topic tomography. Ramesh Nallapati, Susan Ditmore, John D. Lafferty, Kin Ung |
| 2007 | Nestedness and segmented nestedness. Heikki Mannila, Evimaria Terzi |
| 2007 | Nonlinear adaptive distance metric learning for clustering. Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu |
| 2007 | On string classification in data streams. Charu C. Aggarwal, Philip S. Yu |
| 2007 | On-board analysis of uncalibrated data for a spacecraft at Mars. Rebecca Castaño, Kiri Wagstaff, Steve A. Chien, Timothy M. Stough, Benyang Tang |
| 2007 | Partial example acquisition in cost-sensitive learning. Victor S. Sheng, Charles X. Ling |
| 2007 | Practical guide to controlled experiments on the web: listen to your customers not to the hippo. Ron Kohavi, Randal M. Henne, Dan Sommerfield |
| 2007 | Practical learning from one-sided feedback. D. Sculley |
| 2007 | Predictive discrete latent factor models for large scale dyadic data. Deepak Agarwal, Srujana Merugu |
| 2007 | Privacy-preservation for gradient descent methods. Li Wan, Wee Keong Ng, Shuguo Han, Vincent C. S. Lee |
| 2007 | Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007. Pavel Berkhin, Rich Caruana, Xindong Wu |
| 2007 | Raising the baseline for high-precision text classifiers. Aleksander Kolcz, Wen-tau Yih |
| 2007 | Real-time ranking with concept drift using expert advice. Hila Becker, Marta Arias |
| 2007 | Relational data pre-processing techniques for improved securities fraud detection. Andrew S. Fast, Lisa Friedland, Marc E. Maier, Brian J. Taylor, David D. Jensen, Henry G. Goldberg, John Komoroske |
| 2007 | SCAN: a structural clustering algorithm for networks. Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, Thomas A. J. Schweiger |
| 2007 | Scalable look-ahead linear regression trees. David S. Vogel, Ognian Asparouhov, Tobias Scheffer |
| 2007 | Semi-supervised classification with hybrid generative/discriminative methods. Gregory Druck, Chris Pal, Andrew McCallum, Xiaojin Zhu |
| 2007 | Show me the money!: deriving the pricing power of product features by mining consumer reviews. Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis |
| 2007 | Statistical change detection for multi-dimensional data. Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka |
| 2007 | Stochastic processes and temporal data mining. Paul Cotofrei, Kilian Stoffel |
| 2007 | Structural and temporal analysis of the blogosphere through community factorization. Yun Chi, Shenghuo Zhu, Xiaodan Song, Jun'ichi Tatemura, Belle L. Tseng |
| 2007 | Support feature machine for classification of abnormal brain activity. Wanpracha Art Chaovalitwongse, Ya-Ju Fan, Rajesh C. Sachdeo |
| 2007 | Temporal causal modeling with graphical Granger methods. Andrew Arnold, Yan Liu, Naoki Abe |
| 2007 | The minimum consistent subset cover problem and its applications in data mining. Byron J. Gao, Martin Ester, Jin-Yi Cai, Oliver Schulte, Hui Xiong |
| 2007 | Time-dependent event hierarchy construction. Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Huan Liu, Philip S. Yu |
| 2007 | Tracking multiple topics for finding interesting articles. Raymond K. Pon, Alfonso F. Cardenas, David Buttler, Terence Critchlow |
| 2007 | Trajectory pattern mining. Fosca Giannotti, Mirco Nanni, Fabio Pinelli, Dino Pedreschi |
| 2007 | Truth discovery with multiple conflicting information providers on the web. Xiaoxin Yin, Jiawei Han, Philip S. Yu |
| 2007 | Use of ranked cross document evidence trails for hypothesis generation. Rohini K. Srihari, Li Xu, Tushar Saxena |
| 2007 | Using hierarchical clustering for learning the ontologies used in recommendation systems. Vincent Schickel-Zuber, Boi Faltings |
| 2007 | Very sparse stable random projections for dimension reduction in Ping Li |
| 2007 | Webpage understanding: an integrated approach. Jun Zhu, Bo Zhang, Zaiqing Nie, Ji-Rong Wen, Hsiao-Wuen Hon |
| 2007 | Weighting versus pruning in rule validation for detecting network and host anomalies. Gaurav Tandon, Philip K. Chan |
| 2007 | Xproj: a framework for projected structural clustering of XML documents. Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua Feng, Mohammed Javeed Zaki |