Date Thesis Awarded
5-2018
Access Type
Honors Thesis -- Access Restricted On-Campus Only
Degree Name
Bachelor of Science (BS)
Department
Computer Science
Advisor
Denys Poshyvanyk
Committee Members
Robert M Lewis
Peter Kemper
Ross Iaci
Abstract
Upon installing a mobile application, users can usually tell at a glance what each component of a screen does. They know which buttons return them to the previous screen, which submit their login information, and which bring up the menu. This is the result of a combination of intuitive design and cross-platform design standards, which allow users to draw on previous experience. The fact that humans can understand the functionality of screen components at a glance suggests that semantic information is encoded in a mobile application’s GUI. In this work, we present an automated approach to exploring the nature of that semantic information. We do this using three modalities: (1) a screenshot of the screen, (2) text descriptions of the functionality of GUI components sourced through Amazon’s Mechanical Turk, and (3) parsed information from the screen hierarchy’s XML dump. The first two modalities are aligned using a convolutional neural network, which detects objects in the screenshot and extracts salient features, paired with a bidirectional recurrent neural network, which serves as a language model. Each of these models maps its modality into a common semantic space, in which the two representations are then aligned. The third modality is incorporated using a Seq2Seq model, which maps the screen’s XML dump directly to reasonable descriptions of the functionality of the screen. Our experiments reveal that the semantic information extracted from these representations of a mobile application’s GUI is comparable to that of real-world images such as those found in the MSCOCO dataset. We compare our results with those of similar models trained on that dataset, and compare the results obtained from the different screen representations against one another.
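As a rough illustration of the third modality, the sketch below shows one way a screen hierarchy XML dump could be parsed into GUI components and flattened into a token sequence of the kind a Seq2Seq encoder might consume. It is not taken from the thesis: the sample dump, the assumption that the dump follows the Android uiautomator layout (`node` elements with `class`, `text`, `content-desc`, `clickable`, and `bounds` attributes), and the helper names `parse_components` and `to_token_sequence` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample, written in the style of an Android uiautomator dump.
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" content-desc="" clickable="false" bounds="[0,0][1080,1920]">
    <node class="android.widget.ImageButton" text="" content-desc="Navigate up" clickable="true" bounds="[0,63][147,210]" />
    <node class="android.widget.Button" text="Log in" content-desc="" clickable="true" bounds="[90,900][990,1040]" />
  </node>
</hierarchy>"""

# Bounds are assumed to be encoded as "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def parse_components(xml_text):
    """Return one dict per <node> element, in document order."""
    root = ET.fromstring(xml_text)
    components = []
    for node in root.iter("node"):
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # skip nodes without usable bounds
        components.append({
            # Keep only the short widget name, e.g. "Button".
            "widget": node.get("class", "").rsplit(".", 1)[-1],
            # Prefer visible text, fall back to the accessibility label.
            "label": node.get("text") or node.get("content-desc") or "",
            "clickable": node.get("clickable") == "true",
            "bounds": tuple(int(v) for v in match.groups()),
        })
    return components


def to_token_sequence(components):
    """Flatten components into tokens: widget name, then lowercased label words."""
    tokens = []
    for comp in components:
        tokens.append(comp["widget"])
        tokens.extend(comp["label"].lower().split())
    return tokens


components = parse_components(SAMPLE_DUMP)
tokens = to_token_sequence(components)
```

The bounding boxes recovered here could also be used to crop the matching regions out of the screenshot, which is one plausible way to tie the XML modality back to the image modality; whether the thesis does so is not stated in the abstract.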
Recommended Citation
Curcio, Michael J., "Clarity: An Exploration of Semantic Information Encoded in Mobile Application GUIs" (2018). Undergraduate Honors Theses. William & Mary. Paper 1267.
https://scholarworks.wm.edu/honorstheses/1267