Date Thesis Awarded

5-2021

Access Type

Honors Thesis -- Open Access

Degree Name

Bachelor of Arts (BA)

Department

Interdisciplinary Studies

Advisor

Daniel Runfola

Committee Members

Anthony Stefanidis

Maurits van der Veen

Abstract

Building on insights from two years of manually extracting events information from online news media, we developed an interactive information extraction environment (IIEE). SCOPE, the Scientific Collection of Open-source Policy Evidence, is a Python Django-based tool divided into specialized modules for extracting structured events data from unstructured text. These modules are grouped into a flexible framework that lets users tailor the tool to their needs. Following principles of user-oriented learning for information extraction (IE), SCOPE offers an alternative approach to developing AI-assisted IE systems. In this piece, we detail the ongoing development of the SCOPE tool, present the methods and results of tests of SCOPE's efficacy relative to past methods, and provide a novel framework for future tests of AI-assisted IE tasks. Information gathered over a four-week period of use was analyzed to evaluate the initial utility of the tool and to establish baseline accuracy metrics. Using the SCOPE tool, 15 users extracted 529 summaries and 362 structured events from 207 news articles, achieving an accuracy of 31.8% with time held constant at 4 minutes per source. To demonstrate how fully or partially automated AI processes can be integrated into SCOPE, a baseline AI was implemented; it achieved 4.8% accuracy at 3.25 seconds per source. These results illustrate SCOPE's ability to expose the relative strengths and weaknesses of manual users and AI, and they establish a precedent and methods for integrating the two.
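
The speed and accuracy trade-off reported in the abstract can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted above, and the constant and function names are hypothetical and are not part of the SCOPE codebase.

```python
# Figures quoted in the abstract; all names below are hypothetical
# and are not taken from the SCOPE codebase.
MANUAL_ACCURACY = 0.318           # manual users, 31.8%
MANUAL_SECONDS_PER_SOURCE = 240   # 4 minutes per source
AI_ACCURACY = 0.048               # baseline AI, 4.8%
AI_SECONDS_PER_SOURCE = 3.25

ARTICLES = 207
SUMMARIES = 529
EVENTS = 362


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


if __name__ == "__main__":
    # How many times faster the baseline AI processes one source.
    speedup = ratio(MANUAL_SECONDS_PER_SOURCE, AI_SECONDS_PER_SOURCE)
    # How many times more accurate the manual users were.
    accuracy_gap = ratio(MANUAL_ACCURACY, AI_ACCURACY)

    print(f"AI speed-up per source:      {speedup:.1f}x")
    print(f"Manual accuracy advantage:   {accuracy_gap:.1f}x")
    print(f"Summaries per article:       {ratio(SUMMARIES, ARTICLES):.2f}")
    print(f"Structured events per article: {ratio(EVENTS, ARTICLES):.2f}")
```

On these figures, the baseline AI is roughly 74 times faster per source, while manual users are roughly 6.6 times more accurate.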

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
