Lost in OCR Translation? Vision-Based Approaches to Robust Document Retrieval
Retrieval-Augmented Generation (RAG) has become a popular technique for enhancing the reliability and utility of Large Language Models (LLMs) by groundi...
Published writings, invited talks, and other work - Conference Paper
Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain-experts and non-dom...
The advent of large language models (LLMs) has significantly advanced the field of code translation, enabling automated translation between programming ...
Web archives are sources of big data. When presenting human visitors with archived web pages, or mementos, web archives often apply user interface augme...
Much computer vision research has focused on natural images, but technical documents typically consist of abstract images, such as charts, drawings, dia...
As web archives' holdings grow, archivists subdivide them into collections so they are easier to understand and manage. In this work, we review the coll...
Referencing resources on the web has become an integral part of our digital scholarship. However, the long-term availability and accessibility of these ...
In a perfect world, all articles consistently contain sufficient metadata to describe the resource. We know this is not the reality, so we are motivated...
To allow previewing a web page, social media platforms have developed social cards: visualizations consisting of vital information about the underlying ...