Stephen Winters-Hilt

Informatics and Machine Learning



      2.6.2.3 Significant Distributions That Are Not Gaussian or Geometric

Nongeometric duration distributions occur in many familiar settings, such as the lengths of spoken words in phone conversations, as well as elsewhere in voice recognition. Although the Gaussian distribution occurs in many scientific fields (an observed embodiment of the central limit theorem, among other things), there are a huge number of significant (observed) skewed distributions, such as heavy‐tailed (or long‐tailed) distributions, multimodal distributions, etc.

[Figure: Schematic illustration of the Gaussian distribution, aka Normal, shown with mean zero and variance equal to one.]

      2.6.3 Series

      A series is a mathematical object consisting of a sequence of numbers, variables, or observation values. When observations describe equilibrium or “steady state” (emergent phenomena familiar from physical reality), we often see series that are martingale. The martingale sequence property can be seen in systems reaching equilibrium in both the physical setting and the algorithmic learning setting.

      A discrete‐time martingale is a stochastic process in which a sequence of r.v. {X1, …, Xn} has conditional expected value of the next observation equal to the last observation: E(Xn+1 | X1, …, Xn) = Xn, where E(|Xn|) < ∞. Similarly, one sequence, say {Y1, …, Yn}, is said to be martingale with respect to another, say {X1, …, Xn}, if for all n: E(Yn+1 | X1, …, Xn) = Yn, where E(|Yn|) < ∞. Examples of martingales are rife in gambling. For our purposes, the most critical example is likelihood‐ratio testing in statistics, with test statistic the “likelihood ratio”: Yn = ∏i=1..n g(Xi)/f(Xi), where f and g are the population densities considered for the data. If the better (actual) distribution is f, then Yn is martingale with respect to Xn. This scenario arises throughout the hidden Markov model (HMM) Viterbi derivation when local “sensors” are used, such as profile‐HMMs or position‐dependent Markov models in the vicinity of transitions between states. It also arises in HMM Viterbi recognition of regions (versus transitions out of those regions), where length‐martingale side information will be shown explicitly in Chapter 7, providing a pathway for incorporating any martingale‐series side information (this fits naturally with the clique‐HMM generalizations described in Chapter 7). Given that the core ratio of cumulant probabilities employed is itself a martingale, this provides a means for incorporating side information in general (further details in Appendix C).
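      Since the martingale property of the likelihood ratio underlies the side‐information methods just described, a small simulation may help fix the idea. The following is a minimal sketch (illustrative, not from the text's codebase): die rolls are drawn from the true distribution f, and the sample mean of Yn is checked against the martingale prediction E(Yn) = Y0 = 1 for all n.

import random

# Minimal sketch (illustrative): empirically check the martingale property
# of the likelihood ratio Yn = prod_{i=1..n} g(Xi)/f(Xi) when the data
# X1..Xn are actually drawn from f. Under f, E(Yn) = Y0 = 1 for all n.
f = [1/6] * 6                       # true (actual) die distribution
g = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]  # alternative (loaded) distribution

def likelihood_ratio(rolls):
    """Yn = product over the observed rolls of g(x)/f(x)."""
    y = 1.0
    for x in rolls:
        y *= g[x] / f[x]
    return y

random.seed(0)
trials = 100000
for n in (1, 5, 10):
    mean_y = sum(
        likelihood_ratio(random.choices(range(6), weights=f, k=n))
        for _ in range(trials)
    ) / trials
    print(f"n = {n:2d}: E(Yn) = {mean_y:.3f}  (martingale predicts 1.0)")

      Under g (the wrong model for the data drawn from f), by contrast, E(Yn) grows with n, which is what makes the likelihood ratio useful as a test statistic.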

      2.1 Evaluate the Shannon entropy, by hand, for the fair‐die probability distribution (1/6, 1/6, 1/6, 1/6, 1/6, 1/6), i.e. the probability of rolling each of 1 through 6 is the same, 1/6, for the uniform probability distribution. Also evaluate it for the loaded die: (1/10, 1/10, 1/10, 1/10, 1/10, 1/2).

      2.2 Evaluate the Shannon entropy for the fair and loaded probability distributions of Exercise 2.1 computationally, by running the program described in Section 2.1 (a minimal stand‐in sketch follows below).
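      The Section 2.1 program is referenced rather than reproduced in this excerpt; as a minimal stand‐in sketch (assuming a direct −Σ p log2 p implementation), the Exercise 2.2 computation might look like:

import math

def shannon_entropy(probs):
    """H = -sum p*log2(p), in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([1/6] * 6))          # fair die: log2(6), about 2.585 bits
print(shannon_entropy([0.1] * 5 + [0.5]))  # loaded die: about 2.161 bits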

      2.3 Now consider you have two dice, where each separately rolls “fair,” but together they do not roll “fair,” i.e. each specific pair of die rolls does not have probability 1/36, but instead has the probability shown in the table below:

Die 1 roll   Die 2 roll   Probability
1            1            (1/6) × (0.001)
1            2            (1/6) × (0.125)
1            3            (1/6) × (0.125)
1            4            (1/6) × (0.125)
1            5            (1/6) × (0.124)
1            6            (1/6) × (0.5)
2            Any          (1/6) × (1/6)
3            Any          (1/6) × (1/6)
4            Any          (1/6) × (1/6)
5            Any          (1/6) × (1/6)
6            1            (1/6) × (0.5)
6            2            (1/6) × (0.125)
6            3            (1/6) × (0.125)
6            4            (1/6) × (0.125)
6            5            (1/6) × (0.124)
6            6            (1/6) × (0.001)

What is the Shannon entropy of the Die 1 outcomes (call it H(1))? What is the Shannon entropy of the Die 2 outcomes (call it H(2))? What is the Shannon entropy of the two‐dice outcomes with the probabilities shown in the table above (denote it H(1,2))? Compute the function MI(Die 1, Die 2) = H(1) + H(2) − H(1,2). Is it positive? (A computational check is sketched below.)
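      As one hedged way to check the Exercise 2.3 hand computation (the joint table is from the exercise; the variable names are illustrative):

import math

def H(probs):
    """Shannon entropy in bits: -sum p*log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint table from Exercise 2.3: rows are Die 1 outcomes, columns Die 2.
row1 = [0.001, 0.125, 0.125, 0.125, 0.124, 0.5]  # Die 1 = 1 row (times 1/6)
row6 = [0.5, 0.125, 0.125, 0.125, 0.124, 0.001]  # Die 1 = 6 row (times 1/6)
joint = [[(1/6) * p for p in row1]]              # Die 1 = 1
joint += [[1/36] * 6 for _ in range(4)]          # Die 1 = 2..5 (independent)
joint += [[(1/6) * p for p in row6]]             # Die 1 = 6

marg1 = [sum(row) for row in joint]                       # P(Die 1 = i)
marg2 = [sum(row[j] for row in joint) for j in range(6)]  # P(Die 2 = j)
h1, h2 = H(marg1), H(marg2)
h12 = H([p for row in joint for p in row])
print(f"H(1) = {h1:.4f}, H(2) = {h2:.4f}, H(1,2) = {h12:.4f}")
print(f"MI = H(1) + H(2) - H(1,2) = {h1 + h2 - h12:.4f}")

      Note that the middle rows (Die 1 = 2 through 5) factor as (1/6) × (1/6), so all of the dependence, and hence the mutual information, comes from the Die 1 = 1 and Die 1 = 6 rows.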

      2.4 Go to GenBank (https://www.ncbi.nlm.nih.gov/genbank/) and select the genome of a small virus (~10 kb). Using the Python code shown in Section 2.1, determine the base frequencies for {a,c,g,t}. What is the Shannon entropy (if those frequencies are taken to be the probabilities of the associated outcomes)? (One way to count the frequencies is sketched after Exercise 2.5.)

      2.5 Go to GenBank (https://www.ncbi.nlm.nih.gov/genbank/) and select the genomes of three medium‐sized viruses (~100 kb each). Using the Python code shown in Section 2.1, determine the trinucleotide frequencies.
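      Again, the Section 2.1 code is referenced rather than reproduced in this excerpt; a minimal sketch of the frequency counting needed for Exercises 2.4 and 2.5 (the file name and placeholder sequence are illustrative assumptions) might be:

from collections import Counter

def kmer_frequencies(seq, k=1):
    """Frequencies of overlapping k-mers: k=1 gives base frequencies,
    k=3 gives trinucleotide frequencies."""
    seq = seq.lower()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

# Usage: read the downloaded FASTA sequence into a string first, e.g.
#   genome = "".join(line.strip() for line in open("virus.fasta")
#                    if not line.startswith(">"))
genome = "acgtacgtaggctaacgt"          # placeholder sequence for illustration
print(kmer_frequencies(genome, k=1))   # base frequencies {a,c,g,t}
print(kmer_frequencies(genome, k=3))   # trinucleotide frequencies

      The resulting k = 1 frequencies can be passed directly to a Shannon entropy routine such as the one sketched after Exercise 2.2.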