Date Awarded

1994

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.)

Department

Computer Science

Advisor

W Robert Collins

Abstract

During the past decade, there has been a resurgence of interest in applying mathematical methods to problems in artificial intelligence. Much work has been done in the field of machine learning, but it is not always clear how the results of this research should be applied to practical problems. Our aim is to help bridge the gap between theory and practice by addressing the question: "If we are given a machine learning algorithm, how should we go about formally analyzing it?" as opposed to the usual question: "How do we write a learning algorithm we can analyze?"

We will consider algorithms that accept randomly drawn training data as input, and produce classification rules as their outputs. For the most part our analyses will be based on the syntactic structure of these classification rules; for example, if we know that the algorithm we want to analyze will only output logical expressions that are conjunctions of variables, we can use this fact to facilitate our analysis.

We use a probabilistic framework for machine learning, often called the pac model. In this framework, one asks whether or not a machine learning algorithm has a high probability of generating classification rules that "usually" make the right classification (pac means probably approximately correct). Research in the pac framework can be divided into two subfields. The first field is concerned with the amount of training data that is needed for successful learning to take place (success being defined in terms of generalization ability); the second field is concerned with the computational complexity of learning once the training data have been selected. Since most existing algorithms use heuristics to deal with the problem of complexity, we are primarily concerned with the amount of training data that algorithms require.
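As an illustration of the kind of sample-complexity question the abstract describes, the sketch below evaluates the standard textbook bound for a finite hypothesis class: a learner that outputs a rule consistent with m random examples is probably approximately correct once m ≥ (ln|H| + ln(1/δ)) / ε. This is not taken from the dissertation itself. The function names are hypothetical, and the sketch assumes that "conjunctions of variables" means monotone conjunctions over n Boolean variables, so that |H| = 2^n.

```python
import math


def finite_class_sample_bound(hypothesis_count, epsilon, delta):
    """Number of examples sufficient for a consistent learner over a finite class.

    Standard bound: m >= (ln|H| + ln(1/delta)) / epsilon.
    epsilon is the allowed error rate, delta the allowed failure probability.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


def monotone_conjunction_count(n_variables):
    """Assumed class size: each variable is either in the conjunction or not."""
    return 2 ** n_variables


# Example: 10 Boolean variables, at most 10% error, with 95% confidence.
m = finite_class_sample_bound(monotone_conjunction_count(10), epsilon=0.1, delta=0.05)
print(m)
```

Knowing the syntactic form of the output rules is what makes |H| countable here; a richer rule language would enlarge |H| and, with it, the number of examples the bound asks for.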

DOI

https://dx.doi.org/doi:10.21220/s2-ngq1-gy69

Recommended Citation

Michael, Christoph Cornelius, "General methods for analyzing machine learning sample complexity" (1994). Dissertations, Theses, and Masters Projects. William & Mary. Paper 1539623860.
https://dx.doi.org/doi:10.21220/s2-ngq1-gy69

Dissertations, Theses, and Masters Projects

General methods for analyzing machine learning sample complexity
