| 2021 | 20 years of physical document and product protection using digital methods. Justin Picard |
| 2021 | A comparative study on methods and tools for handwritten mathematical expression recognition. Daniela S. Costa, Carlos A. B. Mello, Marcelo d'Amorim |
| 2021 | A large-scale exploration of terms of service documents on the web. Soundarya Nurani Sundareswara, Mukund Srinath, Shomir Wilson, C. Lee Giles |
| 2021 | A novel approach on the joint de-identification of textual and relational data with a modified mondrian algorithm. Fabian Singhofer, Aygul Garifullina, Mathias Kern, Ansgar Scherp |
| 2021 | ALiBERT: improved automated list inspection (ALI) with BERT. Rajkumar Ramamurthy, Maren Pielka, Robin Stenzel, Christian Bauckhage, Rafet Sifa, Tim Dilmaghani Khameneh, Ulrich Warning, Bernd Kliem, Rüdiger Loitz |
| 2021 | Binarisation of photographed documents image quality and processing time assessment. Rafael Dueire Lins, Steven J. Simske, Rodrigo Barros Bernardino |
| 2021 | COVID-19 multidimensional kaggle literature organization. Maksim Ekin Eren, Nick Solovyev, Chris Hamer, Renee McDonald, Boian S. Alexandrov, Charles Nicholas |
| 2021 | Challenges in chart image classification: a comparative study of different deep learning methods. Jennil Thiyam, Sanasam Ranbir Singh, Prabin Kumar Bora |
| 2021 | Counterfeit detection with QR codes. Justin Picard, Paul Landry, Michael Bolay |
| 2021 | Direct binarization a quality-and-time efficient binarization strategy. Rafael Dueire Lins, Rodrigo Barros Bernardino, Ricardo da Silva Barboza, Zanoni Dueire Lins |
| 2021 | DocEng '21: ACM Symposium on Document Engineering 2021, Limerick, Ireland, August 24-27, 2021. Patrick Healy, Mihai Bilauca, Alexandra Bonnici |
| 2021 | Document engineering issues in malware analysis. Charles Nicholas, Robert J. Joyce, Steve Simske |
| 2021 | Domain-specific modeling in document engineering. Verislav Djukic, Juha-Pekka Tolvanen |
| 2021 | ELSKE: efficient large-scale keyphrase extraction. Johannes Knittel, Steffen Koch, Thomas Ertl |
| 2021 | Efficient clustering of short text streams using online-offline clustering. Md. Rashadul Hasan Rakib, Norbert Zeh, Evangelos E. Milios |
| 2021 | Efficient sparse spherical k-means for document clustering. Johannes Knittel, Steffen Koch, Thomas Ertl |
| 2021 | Engineering of an artificial intelligence safety data sheet document processing system for environmental, health, and safety compliance. Kevin Fenton, Steven Simske |
| 2021 | Evaluating deep neural networks for image document enhancement. Lucas N. Kirsten, Ricardo Piccoli, Ricardo Ribani |
| 2021 | Heuristic stopping rules for technology-assisted review. Eugene Yang, David D. Lewis, Ophir Frieder |
| 2021 | MTLV: a library for building deep multi-task learning architectures. Fatemeh Rahimi, Evangelos E. Milios, Stan Matwin |
| 2021 | Metadata-driven eye tracking for real-time applications. Yasith Jayawardana, Gavindya Jayawardena, Andrew T. Duchowski, Sampath Jayarathna |
| 2021 | On minimizing cost in legal document review workflows. Eugene Yang, David D. Lewis, Ophir Frieder |
| 2021 | Ordering sentences and paragraphs with pre-trained encoder-decoder transformers and pointer ensembles. Rémi Calizzano, Malte Ostendorff, Georg Rehm |
| 2021 | Pornographic content classification using deep-learning. André Tabone, Kenneth P. Camilleri, Alexandra Bonnici, Stefania Cristina, Reuben A. Farrugia, Mark Borg |
| 2021 | Recognizing creative visual design: multiscale design characteristics in free-form web curation documents. Ajit Jain, Andruid Kerne, Nic Lupfer, Gabriel Britain, Aaron Perrine, Yoonsuck Choe, John Keyser, Ruihong Huang |
| 2021 | Rescuing historical climate observations to support hydrological research: a case study of solar radiation data. Ogundepo Odunayo, Naveela N. Sookoo, Gautam Bathla, Anthony Cavallin, Bhaleka D. Persaud, Kathy Szigeti, Philippe Van Cappellen, Jimmy Lin |
| 2021 | Searching harsh documents. Ophir Frieder |
| 2021 | Shock wave: a graph layout algorithm for text analyzing. Maxime Cauz, Julien Albert, Anne Wallemacq, Isabelle Linden, Bruno Dumas |
| 2021 | SlideGen: an abstractive section-based slide generator for scholarly documents. Athar Sefid, Prasenjit Mitra, C. Lee Giles |
| 2021 | Small-step pipelines reduce the complexity of XSLT/XPath programs. Marcel Schaeben, Gioele Barabucci |
| 2021 | Table-structure recognition method using neural networks for implicit ruled line estimation and cell estimation. Manabu Ohta, Ryoya Yamada, Teruhito Kanazawa, Atsuhiro Takasu |
| 2021 | Text line extraction using deep learning and minimal sub seams. Adi Azran, Alon Schclar, Raid Saabni |
| 2021 | Towards extraction of theorems and proofs in scholarly articles. Shrey Mishra, Lucas Pluvinage, Pierre Senellart |
| 2021 | Trustworthiness of spam email addresses using machine learning. Francisco Jáñez-Martino, Rocío Alaíz-Rodríguez, Víctor González-Castro, Eduardo Fidalgo |