| 2021 | 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021, Madrid, Spain, May 17-19, 2021 |
| 2021 | A Replication Study on the Usability of Code Vocabulary in Predicting Flaky Tests. Guillaume Haben, Sarra Habchi, Mike Papadakis, Maxime Cordy, Yves Le Traon |
| 2021 | A Traceability Dataset for Open Source Systems. Mouna Hammoudi, Christoph Mayr-Dorn, Atif Mashkoor, Alexander Egyed |
| 2021 | A large-scale study on human-cloned changes for automated program repair. Fernanda Madeiral, Thomas Durieux |
| 2021 | An Empirical Study of Developer Discussions on Low-Code Software Development Challenges. Md. Abdullah Al Alamin, Sanjay Malakar, Gias Uddin, Sadia Afroz, Tameem Bin Haider, Anindya Iqbal |
| 2021 | An Empirical Study of OSS-Fuzz Bugs. Zhen Yu Ding, Claire Le Goues |
| 2021 | An Empirical Study on the Usage of BERT Models for Code Completion. Matteo Ciniselli, Nathan Cooper, Luca Pascarella, Denys Poshyvanyk, Massimiliano Di Penta, Gabriele Bavota |
| 2021 | An Exploratory Study of Log Placement Recommendation in an Enterprise System. Jeanderson Cândido, Jan Haesen, Maurício Aniche, Arie van Deursen |
| 2021 | An Exploratory Study of Project Activity Changepoints in Open Source Software Evolution. James Walden, Noah Burgin, Kuljit Kaur |
| 2021 | AndroCT: Ten Years of App Call Traces in Android. Wen Li, Xiaoqin Fu, Haipeng Cai |
| 2021 | AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories. Sebastian Nielebock, Paul Blockhaus, Jacob Krüger, Frank Ortmeier |
| 2021 | Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution. Ruben Opdebeeck, Ahmed Zerouali, Coen De Roover |
| 2021 | Andror2: A Dataset of Manually-Reproduced Bug Reports for Android apps. Tyler Wendland, Jingyang Sun, Junayed Mahmud, S. M. Hasan Mansur, Steven Huang, Kevin Moran, Julia Rubin, Mattia Fazzini |
| 2021 | Apache Software Foundation Incubator Project Sustainability Dataset. Likang Yin, Zhiyuan Zhang, Qi Xuan, Vladimir Filkov |
| 2021 | Applying CodeBERT for Automated Program Repair of Java Simple Bugs. Ehsan Mashhadi, Hadi Hemmati |
| 2021 | Architecture Smells and Pareto Principle: A Preliminary Empirical Exploration. Alexandra-Maria Chaniotaki, Tushar Sharma |
| 2021 | Attention-based model for predicting question relatedness on Stack Overflow. Jiayan Pei, Yimin Wu, Zishan Qin, Yao Cong, Jingtao Guan |
| 2021 | Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions. Sofonias Yitagesu, Xiaowang Zhang, Zhiyong Feng, Xiaohong Li, Zhenchang Xing |
| 2021 | Automatically Selecting Follow-up Questions for Deficient Bug Reports. Mia Mohammad Imran, Agnieszka Ciborowska, Kostadin Damevski |
| 2021 | Building the Collaboration Graph of Open-Source Software Ecosystem. Elena Lyulina, Mahmoud Jahanshahi |
| 2021 | Can I Solve It? Identifying APIs Required to Complete OSS Tasks. Fabio Santos, Igor Wiese, Bianca Trinkenreich, Igor Steinmacher, Anita Sarma, Marco Aurélio Gerosa |
| 2021 | Challenges in Developing Desktop Web Apps: a Study of Stack Overflow and GitHub. Gian Luca Scoccia, Patrizio Migliarini, Marco Autili |
| 2021 | Characterising the Knowledge about Primitive Variables in Java Code Comments. Mahfouth Alghamdi, Shinpei Hayashi, Takashi Kobayashi, Christoph Treude |
| 2021 | Comparative Study of Feature Reduction Techniques in Software Change Prediction. Ruchika Malhotra, Ritvik Kapoor, Deepti Aggarwal, Priya Garg |
| 2021 | Data Balancing Improves Self-Admitted Technical Debt Detection. Murali Sridharan, Mika Mäntylä, Leevi Rantala, Maëlick Claes |
| 2021 | Denchmark: A Bug Benchmark of Deep Learning-related Software. Misoo Kim, Youngkyoung Kim, Eunseok Lee |
| 2021 | Does Code Review Promote Conformance? A Study of OpenStack Patches. Panyawut Sri-Iesaranusorn, Raula Gaikovina Kula, Takashi Ishio |
| 2021 | Duets: A Dataset of Reproducible Pairs of Java Library-Clients. Thomas Durieux, César Soto-Valero, Benoit Baudry |
| 2021 | EqBench: A Dataset of Equivalent and Non-equivalent Program Pairs. Sahar Badihi, Yi Li, Julia Rubin |
| 2021 | Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data. Samuel W. Flint, Jigyasa Chauhan, Robert Dyer |
| 2021 | Fast and Memory-Efficient Neural Code Completion. Alexey Svyatkovskiy, Sebastian Lee, Anna Hadjitofi, Maik Riechert, Juliana Vicente Franco, Miltiadis Allamanis |
| 2021 | GE526: A Dataset of Open-Source Game Engines. Dheeraj Vagavolu, Vartika Agrahari, Sridhar Chimalakonda, Akhila Sri Manasa Venigalla |
| 2021 | Googling for Software Development: What Developers Search For and What They Find. André C. Hora |
| 2021 | How Do Software Developers Use GitHub Actions to Automate Their Workflows? Timothy Kinsman, Mairieli Santos Wessel, Marco Aurélio Gerosa, Christoph Treude |
| 2021 | How Effective is Continuous Integration in Indicating Single-Statement Bugs? Jasmine Latendresse, Rabe Abdalkareem, Diego Elias Costa, Emad Shihab |
| 2021 | How Java Programmers Test Exceptional Behavior. Diego Marcilio, Carlo A. Furia |
| 2021 | Identifying Critical Projects via PageRank and Truck Factor. Rolf-Helge Pfeiffer |
| 2021 | Identifying Versions of Libraries used in Stack Overflow Code Snippets. Ahmed Zerouali, Camilo Velázquez-Rodríguez, Coen De Roover |
| 2021 | JITLine: A Simpler, Better, Faster, Finer-grained Just-In-Time Defect Prediction. Chanathip Pornprasit, Chakkrit Tantithamthavorn |
| 2021 | KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle. Luigi Quaranta, Fabio Calefato, Filippo Lanubile |
| 2021 | Learning Off-By-One Mistakes: An Empirical Study. Hendrig Sellik, Onno van Paridon, Georgios Gousios, Maurício Aniche |
| 2021 | Leveraging Models to Reduce Test Cases in Software Repositories. Golnaz Gharachorlu, Nick Sumner |
| 2021 | ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inference. Amir M. Mir, Evaldas Latoskinas, Georgios Gousios |
| 2021 | Mea culpa: How developers fix their own simple bugs differently from other developers. Wenhan Zhu, Michael W. Godfrey |
| 2021 | Mining API Interactions to Analyze Software Revisions for the Evolution of Energy Consumption. Andreas Schuler, Gabriele Kotsis |
| 2021 | Mining DEV for social and technical insights about software development. Maria Papoutsoglou, Johannes Wachs, Georgia M. Kapitsaki |
| 2021 | Mining Energy-Related Practices in Robotics Software. Michel Albonico, Ivano Malavolta, Gustavo Pinto, Emitza Guzman, Katerina Chinnappan, Patricia Lago |
| 2021 | Mining Workflows for Anomalous Data Transfers. Huy Tu, George Papadimitriou, Mariam Kiran, Cong Wang, Anirban Mandal, Ewa Deelman, Tim Menzies |
| 2021 | Mining the ROS ecosystem for Green Architectural Tactics in Robotics and an Empirical Evaluation. Ivano Malavolta, Katerina Chinnappan, Stan Swanborn, Grace A. Lewis, Patricia Lago |
| 2021 | On Improving Deep Learning Trace Analysis with System Call Arguments. Quentin Fournier, Daniel Aloise, Seyed Vahid Azhari, François Tetreault |
| 2021 | On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study. Anthony Peruma, Christian D. Newman |
| 2021 | On the Effectiveness of Deep Vulnerability Detectors to Simple Stupid Bug Detection. Jiayi Hua, Haoyu Wang |
| 2021 | On the Naturalness and Localness of Software Logs. Sina Gholamian, Paul A. S. Ward |
| 2021 | On the Rise and Fall of Simple Stupid Bugs: a Life-Cycle Analysis of SStuBs. Balázs Mosolygó, Norbert Vándor, Gábor Antal, Péter Hegedüs |
| 2021 | On the Use of Dependabot Security Pull Requests. Mahmoud Alfadel, Diego Elias Costa, Emad Shihab, Mouafak Mkhallalati |
| 2021 | PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code. Egor Spirin, Egor Bogomolov, Vladimir Kovalenko, Timofey Bryksin |
| 2021 | Practitioners' Perceptions of the Goals and Visual Explanations of Defect Prediction Models. Jirayus Jiarpakdee, Chakkrit Tantithamthavorn, John C. Grundy |
| 2021 | Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study. Anderson G. Uchôa, Caio Barbosa, Daniel Coutinho, Willian Nalepa Oizumi, Wesley K. G. Assunção, Silvia Regina Vergilio, Juliana Alves Pereira, Anderson Oliveira, Alessandro F. Garcia |
| 2021 | PySStuBs: Characterizing Single-Statement Bugs in Popular Open-Source Python Projects. Arthur V. Kamienski, Luisa Palechor, Cor-Paul Bezemer, Abram Hindle |
| 2021 | QScored: A Large Dataset of Code Smells and Quality Metrics. Tushar Sharma, Marouane Kessentini |
| 2021 | Revisiting Dockerfiles in Open Source Software Over Time. Kalvin Eng, Abram Hindle |
| 2021 | Rollback Edit Inconsistencies in Developer Forum. Saikat Mondal, Gias Uddin, Chanchal K. Roy |
| 2021 | S3M: Siamese Stack (Trace) Similarity Measure. Aleksandr Khvorov, Roman Vasiliev, George A. Chernishev, Irving Muller Rodrigues, Dmitrij V. Koznov, Nikita Povarov |
| 2021 | Sampling Projects in GitHub for MSR Studies. Ozren Dabic, Emad Aghajani, Gabriele Bavota |
| 2021 | Search4Code: Code Search Intent Classification Using Weak Supervision. Nikitha Rao, Chetan Bansal, Joe Guan |
| 2021 | Studying the Change Histories of Stack Overflow and GitHub Snippets. Saraj Singh Manes, Olga Baysal |
| 2021 | TNM: A Tool for Mining of Socio-Technical Data from Git Repositories. Nikolai Sviridov, Mikhail Evtikhiev, Vladimir Kovalenko |
| 2021 | Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study. Zadia Codabux, Melina C. Vidoni, Fatemeh H. Fard |
| 2021 | The Diversity-Innovation Paradox in Open-Source Software. Mengchen Sam Yong, Lavínia Paganini, Huilian Sophie Qiu, José Bayoán Santiago Calderón |
| 2021 | The Secret Life of Hackathon Code: Where does it come from and where does it go? Ahmed Imam, Tapajit Dey, Alexander Nolte, Audris Mockus, James D. Herbsleb |
| 2021 | The Wonderless Dataset for Serverless Computing. Nafise Eskandani, Guido Salvaneschi |
| 2021 | Tracing Vulnerable Code Lineage. David Reid, Kalvin Eng, Chris Bogart, Adam Tutko |
| 2021 | Tracking Hackathon Code Creation and Reuse. Ahmed Imam, Tapajit Dey |
| 2021 | Waiting around or job half-done? Sentiment in self-admitted technical debt. Gianmarco Fucci, Nathan Cassee, Fiorella Zampetti, Nicole Novielli, Alexander Serebrenik, Massimiliano Di Penta |
| 2021 | What Code Is Deliberately Excluded from Test Coverage and Why? André C. Hora |
| 2021 | Which contributions count? Analysis of attribution in open source. Jean-Gabriel Young, Amanda Casari, Katie McLaughlin, Milo Z. Trujillo, Laurent Hébert-Dufresne, James P. Bagrow |
| 2021 | gambit - An Open Source Name Disambiguation Tool for Version Control Systems. Christoph Gote, Christian Zingg |