| 2020 | A Framework for Extracted View Maintenance. Besat Kassaie, Frank Wm. Tompa |
| 2020 | A Framework to Evaluate Webpage Segment Recognizers. Nicola R. Di Matteo, James Blustein |
| 2020 | An Assessment of Sentence Simplification Methods in Extractive Text Summarization. Rafaella F. Vale, Rafael Dueire Lins, Rafael Ferreira |
| 2020 | Assessing Causality Structures learned from Digital Text Media. Mariano Maisonnave, Fernando Delbianco, Fernando Tohmé, Ana Gabriela Maguitman, Evangelos E. Milios |
| 2020 | Automatic Generation of Electrical Plan Documents from Architectural Data. Melissa Cote, Alireza Rezvanifar, Alexandra Branzan Albu |
| 2020 | COVID-19 Kaggle Literature Organization. Maksim Ekin Eren, Nick Solovyev, Edward Raff, Charles Nicholas, Ben Johnson |
| 2020 | COVIDSeer: Extending the CORD-19 Dataset. Shaurya Rohatgi, Zeba Karishma, Jason Chhay, Sai Raghav Reddy Keesara, Jian Wu, Cornelia Caragea, C. Lee Giles |
| 2020 | Cardinal Graph Convolution Framework for Document Information Extraction. Rinon Gal, Shai Ardazi, Roy Shilkrot |
| 2020 | Change Detection on JATS Academic Articles: An XML Diff Comparison Study. Milos Cuculovic, Frédéric Fondement, Maxime Devanne, Jonathan Weber, Michel Hassenforder |
| 2020 | Direct Sampling of Multiview Line Drawings for Document Retrieval. Cristopher Flagg, Ophir Frieder |
| 2020 | DocEng '20: ACM Symposium on Document Engineering 2020, Virtual Event, CA, USA, September 29 - October 1, 2020 |
| 2020 | DocEng'2020 Competition on Extractive Text Summarization. Rafael Dueire Lins, Rafael Ferreira Leite de Mello, Steven J. Simske |
| 2020 | DocEng'2020 Time-Quality Competition on Binarizing Photographed Documents. Rafael Dueire Lins, Steven J. Simske, Rodrigo Barros Bernardino |
| 2020 | HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models. Arthur Flor de Sousa Neto, Byron Leite Dantas Bezerra, Alejandro Héctor Toselli, Estanislau Baptista Lima |
| 2020 | Improving query expansion strategies with word embeddings. Alfredo Silva, Marcelo Mendoza |
| 2020 | Interactive and Scalable visualization framework for Version-aware XML documents. Ahmed S. Shatnawi, Ethan V. Munson |
| 2020 | Machine Interpretation of Sketched Documents. Alexandra Bonnici, Kenneth P. Camilleri |
| 2020 | Order out of Chaos: Construction of Knowledge Models from PDF Textbooks. Isaac Alpizar Chacon, Sergey A. Sosnovsky |
| 2020 | PDF2LaTeX: A Deep Learning System to Convert Mathematical Documents from PDF to LaTeX. Zelun Wang, Jyh-Charn Liu |
| 2020 | Parsing a markup language that supports overlap and discontinuity. Ronald Haentjens Dekker, Bram Buitendijk, Elli Bleeker |
| 2020 | ServiceMarq: Extracting Service Contributions from Call for Papers. Shi Tian, Abhinav Ramesh Kashyap, Min-Yen Kan |
| 2020 | Short Text Stream Clustering via Frequent Word Pairs and Reassignment of Outliers to Clusters. Md. Rashadul Hasan Rakib, Norbert Zeh, Evangelos E. Milios |
| 2020 | The Old Bailey and OCR: Benchmarking AWS, Azure, and GCP with 180,000 Page Images. William Ughetta, Brian W. Kernighan |