Stephen Winters-Hilt

Informatics and Machine Learning


theory and variational/statistical modeling has significant roots in variational calculus. Chapter 3 describes information theory ideas and the information “calculus” description (and related anomaly detection methods). The involvement of variational calculus methods, and the possible parallels with the nascent development of a new (modern) “calculus of information,” motivate the detailed overview of the highly successful development and application of the calculus of variations in physics (Appendix B). Using variational calculus, for example, it is possible to establish a link between a choice of information measure and a statistical formalism (maximum entropy, Section 3.1). Maximizing the entropy of a distribution subject to moment constraints leads to the classic distributions seen in mathematics and nature (the Gaussian for fixed mean and variance, etc.). Not surprisingly, variational methods also help to establish and refine some of the main ML methods, including Neural Nets (NNs) (Chapters 9, 13) and Support Vector Machines (SVMs) (Chapter 10). SVMs are the main tool presented for both classification (supervised learning) and clustering (unsupervised learning), and everything in between (such as bag learning).
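As a brief illustration of the maximum entropy remark above, the Gaussian case can be sketched with Lagrange multipliers. This is a minimal sketch that assumes the Shannon (differential) entropy as the information measure; the multiplier names \lambda_0, \lambda_1, \lambda_2 are notation introduced here, not taken from the text.

```latex
% Maximize the entropy subject to normalization, fixed mean, and fixed variance.
\begin{aligned}
\mathcal{L}[p] &= -\int p(x)\,\ln p(x)\,dx
  + \lambda_0\Big(\int p(x)\,dx - 1\Big)
  + \lambda_1\Big(\int x\,p(x)\,dx - \mu\Big)
  + \lambda_2\Big(\int (x-\mu)^2\,p(x)\,dx - \sigma^2\Big), \\[4pt]
\frac{\delta \mathcal{L}}{\delta p(x)} &= -\ln p(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x-\mu)^2 = 0
\;\;\Longrightarrow\;\;
p(x) = \exp\!\big(\lambda_0 - 1 + \lambda_1 x + \lambda_2 (x-\mu)^2\big), \\[4pt]
\text{constraints} &\;\Longrightarrow\;
\lambda_1 = 0,\quad \lambda_2 = -\frac{1}{2\sigma^2},\quad e^{\lambda_0 - 1} = \frac{1}{\sqrt{2\pi\sigma^2}}, \\[4pt]
p(x) &= \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\Big(-\frac{(x-\mu)^2}{2\sigma^2}\Big).
\end{aligned}
```

Setting the functional derivative to zero fixes the exponential form of p(x), and the three constraints then determine the multipliers, giving the Gaussian with mean \mu and variance \sigma^2.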

      All of the tFSA signal acquisition methods described in Chapters 2–4 are O(L), i.e. they scan the data with a computational complexity no greater than that of simply seeing the data once via a “read” or “touch” command (O(L) is “order of,” or “big‐oh,” notation, with L the length of the data). Because signal acquisition is only O(L), it is not significantly costly, computationally, to repeat the acquisition analysis multiple times, with a more informed process at each iteration, and thereby arrive at a “bootstrap” signal acquisition process. In such a setting, the first acquisition pass is often biased toward very high specificity (at the cost of very poor sensitivity) to obtain a “gold standard” set of highly likely true signals that can be data mined for their attributes. With a filter stage trained on those attributes, later scan passes can admit suspected signals with very weak specificity (and now very high sensitivity), with high specificity then recovered by the filter. This allows a bootstrap process to reach very high specificity (SP) and sensitivity (SN) at the tFSA acquisition stage on the signals of interest.
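The two-pass idea described above can be sketched in a few lines of Python. This is an illustrative toy, not the tFSA implementation from the text: the names `make_trace`, `scan`, `train_filter`, and `sn_sp`, the thresholds, the synthetic trace, and the duration-based filter are all assumptions made for the example. It also assumes SN = TP/(TP+FN) and SP = TP/(TP+FP), a convention common in the gene-finding literature; the text's own definitions may differ.

```python
def make_trace():
    """Hypothetical synthetic trace: four 5-sample pulses plus three 1-sample spikes."""
    trace = [0.0] * 60
    truth = []
    for start, height in [(5, 10.0), (20, 4.0), (35, 10.0), (50, 4.0)]:
        for i in range(start, start + 5):
            trace[i] = height
        truth.append((start, start + 5))
    for i in (14, 30, 45):
        trace[i] = 5.0  # short noise spikes (false signals)
    return trace, truth


def scan(trace, threshold):
    """Single O(L) pass: return (start, end) spans where the trace exceeds threshold."""
    spans, start = [], None
    for i, x in enumerate(trace):
        if x > threshold and start is None:
            start = i                    # span opens
        elif x <= threshold and start is not None:
            spans.append((start, i))     # span closes (end is exclusive)
            start = None
    if start is not None:
        spans.append((start, len(trace)))
    return spans


def train_filter(gold_spans):
    """Learn a duration cutoff from the gold-standard spans (an assumed, simple filter)."""
    min_len = min(end - start for start, end in gold_spans)
    return lambda span: (span[1] - span[0]) >= 0.5 * min_len


def sn_sp(called, truth):
    """Sensitivity TP/(TP+FN) and specificity TP/(TP+FP), using exact span matches."""
    tp = len(set(called) & set(truth))
    fp = len(called) - tp
    fn = len(truth) - tp
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp


if __name__ == "__main__":
    trace, truth = make_trace()
    gold = scan(trace, threshold=8.0)    # pass 1: strict, high SP, poor SN
    keep = train_filter(gold)            # mine the gold set for an attribute
    loose = scan(trace, threshold=2.0)   # pass 2: permissive, high SN, poor SP
    final = [s for s in loose if keep(s)]
    for name, spans in [("strict", gold), ("loose", loose), ("filtered", final)]:
        print(name, sn_sp(spans, truth))
```

Each pass touches every sample once, so the whole procedure stays O(L) for a fixed number of passes. In this toy case the strict pass finds only the strong pulses, the permissive pass also admits the noise spikes, and the duration filter learned from the strict pass removes them.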

      Ad hoc signal acquisition refers to finding the solution for “this” situation (whatever “this” is) without consideration of wider application. In other words, the solution is strongly data dependent. Data dependent methodologies are, by definition, not defined at the outset, but must be invented as the data begins to be understood. As with data dependency in non‐evolutionary search metaheuristics, where no search method is guaranteed to always work well, here there is no optimal signal acquisition method known in advance. This simply restates, in another form, a fundamental limit from non‐evolutionary search metaheuristics [1, 3]. What can be done, however, is to assemble the core tools and techniques from which a solution can be constructed, and to perform a bootstrap algorithmic learning process with those tools (examples in what follows) to arrive at a functional signal acquisition on the data being analyzed. A universal, automated, bootstrap learning process may eventually be possible using evolutionary learning algorithms. This is related to the co‐evolutionary Free Lunch Theorem [1, 3], which is discussed in Chapter 12.

      “Bootstrap” refers to a method of problem solving in which the problem is solved by seemingly paradoxical measures (the name references Baron von Munchausen, who freed the horse he was riding from a bog by pulling himself, and the horse with him, up by his bootstraps). Such algorithmic methods often involve repeated passes over the data sequence, with improved priors or a trained filter, among other things, to improve performance. The bootstrap amplifier from electrical engineering is an amplifier circuit where part