| 2025 | 22nd IEEE/ACM International Conference on Mining Software Repositories, MSR@ICSE 2025, Ottawa, ON, Canada, April 28-29, 2025 |
| 2025 | 50 Years of Programming Language Evolution through the Software Heritage looking glass. Adèle Desmazières, Roberto Di Cosmo, Valentin Lorentz |
| 2025 | A Dataset of Contributor Activities in the NumFocus Open-Source Community. Youness Hourri, Alexandre Decan, Tom Mens |
| 2025 | A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools. Rio Kishimoto, Tetsuya Kanda, Yuki Manabe, Katsuro Inoue, Shi Qiu, Yoshiki Higo |
| 2025 | A Public Benchmark of REST APIs. Alix Decrop, Sara Eraso, Xavier Devroey, Gilles Perrouin |
| 2025 | An Empirical Study on Leveraging Images in Automated Bug Report Reproduction. Dingbang Wang, Zhaoxu Zhang, Sidong Feng, William G. J. Halfond, Tingting Yu |
| 2025 | Analyzing Dependency Clusters and Security Risks in the Maven Central Repository. George Lake, Minhaz F. Zibran |
| 2025 | Analyzing Vulnerability Overestimation in the Maven Ecosystem. Taha Draoui, Faten Jebari, Chawki Ben Slimen, Munjaap Uppal, Mohamed Wiem Mkaouer |
| 2025 | Are the Majority of Public Computational Notebooks Pathologically Non-Executable? Tien Nguyen, Waris Gill, Muhammad Ali Gulzar |
| 2025 | Automatic High-Level Test Case Generation using Large Language Models. Navid Bin Hasan, Md. Ashraful Islam, Junaed Younus Khan, Sanjida Senjik, Anindya Iqbal |
| 2025 | Build Code Needs Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems. Anwar Ghammam, Dhia Elhaq Rzig, Mohamed Almukhtar, Rania Khalsi, Foyzul Hassan, Marouane Kessentini |
| 2025 | CARDS: A collection of package, revision, and miscellaneous dependency graphs. Euxane Tran-Girard, Laurent Bulteau, Pierre-Yves David |
| 2025 | Can LLMs Generate Higher Quality Code Than Humans? An Empirical Study. Mohammad Talal Jamil, Shamsa Abid, Shafay Shamail |
| 2025 | Can LLMs Replace Manual Annotation of Software Engineering Artifacts? Toufique Ahmed, Premkumar T. Devanbu, Christoph Treude, Michael Pradel |
| 2025 | Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem. Mina Shehata, Saidmakhmud Makhkamjonoov, Mahad Syed, Esteban Parra |
| 2025 | Characterizing Packages for Vulnerability Prediction. Saviour Owolabi, Francesco Rosati, Ahmad Abdellatif, Lorenzo De Carli |
| 2025 | Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem? Md. Fazle Rabbi, Arifa Islam Champa, Rajshakhar Paul, Minhaz F. Zibran |
| 2025 | CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance. Kunal Suresh Pai, Premkumar T. Devanbu, Toufique Ahmed |
| 2025 | CoMRAT: Commit Message Rationale Analysis Tool. Mouna Dhaouadi, Bentley Oakes, Michalis Famelis |
| 2025 | CoPhi - Mining C/C++ Packages for Conan Ecosystem Analysis. Vivek Sarkar, Anemone Kampkötter, Ben Hermann |
| 2025 | CoUpJava: A Dataset of Code Upgrade Histories in Open-Source Java Repositories. Kaihang Jiang, Bihui Jin, Pengyu Nie |
| 2025 | Combining Large Language Models with Static Analyzers for Code Review Generation. Imen Jaoua, Oussama Ben Sghaier, Houari A. Sahraoui |
| 2025 | DPy: Code Smells Detection Tool for Python. Aryan Boloori, Tushar Sharma |
| 2025 | DataTD: A Dataset of Java Projects Including Test Doubles. Mengzhen Li, Mattia Fazzini |
| 2025 | Decoding Dependency Risks: A Quantitative Study of Vulnerabilities in the Maven Ecosystem. Costain Nachuma, Md Mosharaf Hossan, Asif Kamal Turzo, Minhaz F. Zibran |
| 2025 | Dependency Dilemmas: A Comparative Study of Independent and Dependent Artifacts in Maven Central Ecosystem. Mehedi Hasan Shanto, Muhammad Asaduzzaman, Manishankar Mondal, Shaiful Alam Chowdhury |
| 2025 | Dependency Update Adoption Patterns in the Maven Software Ecosystem. Baltasar Berretta, Augustus Thomas, Heather Guarnera |
| 2025 | Do Developers Depend on Deprecated Library Versions? A Mining Study of Log4j. Haruhiko Yoshioka, Sila Lertbanjongngam, Masayuki Inaba, Youmei Fan, Takashi Nakano, Kazumasa Shimari, Raula Gaikovina Kula, Kenichi Matsumoto |
| 2025 | Do LLMs Provide Links to Code Similar to What They Generate? A Study with Gemini and Bing CoPilot. Daniele Bifolco, Pietro Cassieri, Giuseppe Scanniello, Massimiliano Di Penta, Fiorella Zampetti |
| 2025 | Does Functional Package Management Enable Reproducible Builds at Scale? Yes. Julien Malka, Stefano Zacchiroli, Théo Zimmermann |
| 2025 | Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code. Timur Galimzyanov, Sergey Titov, Yaroslav Golubev, Egor Bogomolov |
| 2025 | E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects. Sergio Di Meglio, Luigi Libero Lucio Starace, Valeria Pontillo, Ruben Opdebeeck, Coen De Roover, Sergio Di Martino |
| 2025 | Enhancing Just-In-Time Defect Prediction Models with Developer-Centric Features. Emanuela Guglielmi, Andrea D'Aguanno, Rocco Oliveto, Simone Scalabrino |
| 2025 | Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance? Shaiful Alam Chowdhury, Hisham Kidwai, Muhammad Asaduzzaman |
| 2025 | EvoChain: A Framework for Tracking and Visualizing Smart Contract Evolution. Ilham A. Qasse, Mohammad Hamdaqa, Björn Þór Jónsson |
| 2025 | Faster Releases, Fewer Risks: A Study on Maven Artifact Vulnerabilities and Lifecycle Management. Md Shafiullah Shafin, Md. Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran |
| 2025 | FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs. Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi |
| 2025 | From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice. Zhuang Liu, Xing Hu, Jiayuan Zhou, Xin Xia |
| 2025 | GHALogs: Large-Scale Dataset of GitHub Actions Runs. Florent Moriconi, Thomas Durieux, Jean-Rémy Falleri, Raphaël Troncy, Aurélien Francillon |
| 2025 | GitProjectHealth: an Extensible Framework for Git Social Platform Mining. Nicolas Hlad, Benoît Verhaeghe, Kilian Bauvent |
| 2025 | Good practice versus reality: A landscape analysis of Research Software metadata adoption in European Open Science Clusters. Anas El Hounsri, Daniel Garijo |
| 2025 | HaPy-Bug - Human Annotated Python Bug Resolution Dataset. Piotr Przymus, Mikolaj Fejzer, Jakub Narebski, Radoslaw Wozniak, Lukasz Halada, Aleksander Kazecki, Mykhailo Molchanov, Krzysztof Stencel |
| 2025 | Harnessing Large Language Models for Curated Code Reviews. Oussama Ben Sghaier, Martin Weyssow, Houari A. Sahraoui |
| 2025 | How Do Infrastructure-as-Code Practitioners Update Their Dependencies? An Empirical Study on Terraform Module Updates. Mahi Begoug, Ali Ouni, Moataz Chouchen |
| 2025 | How Effective are LLMs for Data Science Coding? A Controlled Experiment. Nathalia Nascimento, Everton Guimarães, Sai Sanjna Chintakunta, Santhosh Anitha Boominathan |
| 2025 | How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks Before and After Fine-tuning. Fabio Salerno, Ali Al-Kaswan, Maliheh Izadi |
| 2025 | Human-In-The-Loop Software Development Agents: Challenges and Future Directions. Jirat Pasuksmit, Wannita Takerngsaksiri, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Ruixiong Zhang, Shiyan Wang, Fan Jiang, Jing Li, Evan Cook, Kun Chen, Ming Wu |
| 2025 | HyperAST: Incrementally Mining Large Source Code Repositories. Quentin Le Dilavrec, Andy Zaidman |
| 2025 | ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs. Chaomeng Lu, Tianyu Li, Toon Dehaene, Bert Lagaisse |
| 2025 | Inferring Questions from Programming Screenshots. Faiz Ahmed, Xuchen Tan, Folajinmi Adewole, Suprakash Datta, Maleknaz Nayebi |
| 2025 | Insights into Dependency Maintenance Trends in the Maven Ecosystem. Barisha Chowdhury, Md. Fazle Rabbi, S. M. Mahedy Hasan, Minhaz F. Zibran |
| 2025 | Insights into Vulnerability Trends in Maven Artifacts: Recurrence, Popularity, and User Behavior. Courtney Bodily, Eric Hill, Andreas Kramer, Leslie Kerby, Minhaz F. Zibran |
| 2025 | Intelligent Semantic Matching (ISM) for Video Tutorial Search using Transformer Models. Ahmad J. Tayeb, Sonia Haiduc |
| 2025 | Investigating the Understandability of Review Comments on Code Change Requests. Md Shamimur Rahman, Zadia Codabux, Chanchal K. Roy |
| 2025 | Is it Really Fun? Detecting Low Engagement Events in Video Games. Emanuela Guglielmi, Gabriele Bavota, Nicole Novielli, Rocco Oliveto, Simone Scalabrino |
| 2025 | It Works (only) on My Machine: A Study on Reproducibility Smells in Ansible Scripts. Ghazal Sobhani, Israat Haque, Tushar Sharma |
| 2025 | It's About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software. Shrey Tiwari, Serena Chen, Alexander Joukov, Peter Vandervelde, Ao Li, Rohan Padhye |
| 2025 | JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects. Kaveh Shahedi, Maxime Lamothe, Foutse Khomh, Heng Li |
| 2025 | Jupyter Notebook Activity Dataset. Tomoki Nakamaru, Tomomasa Matsunaga, Tetsuro Yamazaki |
| 2025 | LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations. Ziyang Ye, Triet Huynh Minh Le, M. Ali Babar |
| 2025 | Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy. Negar Alizadeh, Boris Belchev, Nishant Saurabh, Patricia Kelbert, Fernando Castor |
| 2025 | Learning from Mistakes: Understanding Ad-hoc Logs through Analyzing Accidental Commits. Yi-Hung Chou, Yiyang Min, April Yi Wang, James A. Jones |
| 2025 | MARIN: A Research-Centric Interface for Querying Software Artifacts on Maven Repositories. Johannes Düsing, Jared Chiaramonte, Ben Hermann |
| 2025 | MYRIAD PEOPLE Open Source Software for New Media Arts. Benoit Baudry, Erik Natanael Gustafsson, Roni Kaufman, Maria Kling |
| 2025 | MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs). Bikash Saha, Nanda Rani, Sandeep Kumar Shukla |
| 2025 | Measuring InnerSource Value. Chamindra de Silva, Daniel Izquierdo-Cortazar |
| 2025 | Mining Bug Repositories for Multi-Fault Programs. Dylan Callaghan, Bernd Fischer |
| 2025 | Mining a Decade of Event Impacts on Contributor Dynamics in Ethereum: A Longitudinal Study. Matteo Vaccargiu, Sabrina Aufiero, C. Ba, Silvia Bartolucci, Richard G. Clegg, Daniel Graziotin, Rumyana Neykova, Roberto Tonelli, Giuseppe Destefanis |
| 2025 | Mining for Lags in Updating Critical Security Threats: A Case Study of Log4j Library. Hidetake Tanaka, Kazuma Yamasaki, Momoka Hirose, Takashi Nakano, Youmei Fan, Kazumasa Shimari, Raula Gaikovina Kula, Kenichi Matsumoto |
| 2025 | Navigating and Exploring Software Dependency Graphs Using Goblin. Damien Jaime, Joyce El Haddad, Pascal Poizat |
| 2025 | OSPtrack: A Labeled Dataset Targeting Simulated Execution of Open-Source Software. Zhuoran Tan, Christos Anagnostopoulos, Jeremy Singer |
| 2025 | OSS License Identification at Scale: A Comprehensive Dataset Using World of Code. Mahmoud Jahanshahi, David Reid, Adam McDaniel, Audris Mockus |
| 2025 | On the Evolution of Unused Dependencies in Java Project Releases: An Empirical Study. Nabhan Suwanachote, Yagut Shakizada, Yutaro Kashiwa, Bin Lin, Hajimu Iida |
| 2025 | On the calibration of Just-in-time Defect Prediction. Xhulja Shahini, Jone Bartel, Klaus Pohl |
| 2025 | OpenMent: A Dataset of Mentor-Mentee Interactions in Google Summer of Code. Erfan Raoofian, Fatemeh H. Fard, Ifeoma Adaji, Gema Rodríguez-Pérez |
| 2025 | Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven. Piotr Przymus, Mikolaj Fejzer, Jakub Narebski, Krzysztof Rykaczewski, Krzysztof Stencel |
| 2025 | Patch Me If You Can - Securing the Linux Kernel. Gunnar Kudrjavets |
| 2025 | Popularity and Innovation in Maven Central. Nkiru Ede, Jens Dietrich, Ulrich Zülicke |
| 2025 | Prompt Engineering or Fine-Tuning: An Empirical Assessment of LLMs for Code. Jiho Shin, Clark Tang, Tahmineh Mohati, Maleknaz Nayebi, Song Wang, Hadi Hemmati |
| 2025 | Prompting in the Wild: An Empirical Study of Prompt Evolution in Software Repositories. Mahan Tafreshipour, Aaron Imani, Eric Huang, Eduardo Santana de Almeida, Thomas Zimmermann, Iftekhar Ahmed |
| 2025 | PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python. Karthik Shivashankar, Antonio Martini |
| 2025 | RefExpo: Unveiling Software Project Structures through Advanced Dependency Graph Extraction. Vahid Haratian, Pouria Derakhshanfar, Vladimir Kovalenko, Eray Tüzün |
| 2025 | Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential. Emna Ksontini, Meriem Mastouri, Rania Khalsi, Wael Kessentini |
| 2025 | RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering. Samuel Abedu, Laurine Menneron, SayedHassan Khatoonabadi, Emad Shihab |
| 2025 | Revisiting Defects4J for Fault Localization in Diverse Development Scenarios. Md Nakhla Rafi, An Ran Chen, Tse-Hsun Peter Chen, Shaohua Wang |
| 2025 | SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset. Chavhan Sujeet Yashavant, MitrajSinh Chavda, Saurabh Kumar, Amey Karkare, Angshuman Karmakar |
| 2025 | SMATCH-M-LLM: Semantic Similarity in Metamodel Matching With Large Language Models. Nafisa Ahmed, Hin Chi Kwok, Mohammad Hamdaqa, Wesley K. G. Assunção |
| 2025 | SPRINT: An Assistant for Issue Report Management. Ahmed Adnan, Antu Saha, Oscar Chaparro |
| 2025 | Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks. Kyi Shin Khant, Hong Yi Lin, Patanamon Thongtanunam |
| 2025 | Smells-sus: Sustainability Smells in IaC. Seif Kosbar, Mohammad Hamdaqa |
| 2025 | SnipGen: A Mining Repository Framework for Evaluating LLMs for Code. Daniel Rodríguez-Cárdenas, Alejandro Velasco, Denys Poshyvanyk |
| 2025 | Software Bills of Materials in Maven Central. Yogya Gamage, Nadia Gonzalez Fernandez, Martin Monperrus, Benoit Baudry |
| 2025 | Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study. Sabato Nocera, Sira Vegas, Giuseppe Scanniello, Natalia Juristo |
| 2025 | TerraDS: A Dataset for Terraform HCL Programs. Christoph Bühler, David Spielmann, Roland Meier, Guido Salvaneschi |
| 2025 | TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest. Altino Alves, André C. Hora |
| 2025 | The Ecosystem of Open-Source Music Production Software - A Mining Study on the Development Practices of VST Plugins on GitHub. Andrei Bogdan, Mauricio Verano Merino, Ivano Malavolta |
| 2025 | The Ripple Effect of Vulnerabilities in Maven Central: Prevalence, Propagation, and Mitigation Challenges. Ehtisham Ul Haq, Song Wang, Robert S. Allison |
| 2025 | Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation. Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam |
| 2025 | Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution. Ramtin Ehsani, Sakshi Pathak, Preetha Chatterjee |
| 2025 | Tracing Vulnerabilities in Maven: A Study of CVE lifecycles and Dependency Networks. Corey Yang-Smith, Ahmad Abdellatif |
| 2025 | TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data. Anisha Islam, Abram Hindle |
| 2025 | Under the Blueprints: Parsing Unreal Engine's Visual Scripting at Scale. Kalvin Eng, Abram Hindle |
| 2025 | Understanding Abandonment and Slowdown Dynamics in the Maven Ecosystem. Kazi Amit Hasan, Jerin Yasmin, Huizi Hao, Yuan Tian, Safwat Hassan, Steven H. H. Ding |
| 2025 | Understanding Software Vulnerabilities in the Maven Ecosystem: Patterns, Timelines, and Risks. Md. Fazle Rabbi, Rajshakhar Paul, Arifa Islam Champa, Minhaz F. Zibran |
| 2025 | Understanding Test Deletion in Java Applications. Suraj Bhatta, Frank Kendemah, Ajay Kumar Jha |
| 2025 | Understanding the Popularity of Packages in Maven Ecosystem. Sadman Jashim Sakib, Muhammad Asaduzzaman, Curtis Bright, Cole Morgan |
| 2025 | What Do Contribution Guidelines Say About Software Testing? Bruna Falcucci, Felipe Gomide, André C. Hora |
| 2025 | Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code. Luis Soeiro, Thomas Robert, Stefano Zacchiroli |
| 2025 | Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack. Piotr Przymus, Thomas Durieux |
| 2025 | pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods. Idriss Abdelmadjid, Robert Dyer |