
Computational Statistics in Data Science



parameter for logistic regression [56, 57].

6 See Nishimura and Suchard [57] and references therein for the role and design of a preconditioner.
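To make the footnote concrete: a preconditioner M ≈ A transforms the linear systems that dominate such computations (for example, Fisher-information solves in large-scale logistic regression) so that iterative solvers converge in far fewer steps. The sketch below is a minimal, generic illustration of a preconditioned conjugate gradient solve with a Jacobi (diagonal) preconditioner; it is not the preconditioner design analyzed in [57], and all function and variable names are illustrative assumptions.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv, tol=1e-10, max_iter=1000):
    """Solve the SPD system A x = b by preconditioned conjugate gradient.

    M_inv applies the inverse preconditioner M^{-1}; a good M makes
    M^{-1} A better conditioned than A itself, speeding convergence.
    """
    x = np.zeros_like(b)
    r = b - A @ x                 # residual
    z = M_inv(r)                  # preconditioned residual
    p = z.copy()                  # search direction
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)     # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        beta = rz_new / rz        # standard PCG coefficient for the new direction
        p = z + beta * p
        rz = rz_new
    return x

# Toy system resembling a penalized logistic-regression normal equation
# (X, A, b are hypothetical, not taken from the text):
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
A = X.T @ X + np.eye(10)          # symmetric positive definite
b = rng.standard_normal(10)
M_inv = lambda r: r / np.diag(A)  # Jacobi preconditioner: M = diag(A)
x = preconditioned_cg(A, b, M_inv)
print(np.linalg.norm(A @ x - b))  # ~0: solver converged
```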

1 Davenport, T.H. and Patil, D. (2012) Data scientist. Harvard Bus. Rev., 90, 70–76.

2 Google Trends (2020) Data source: Google Trends. https://trends.google.com/trends (accessed 12 July 2020).

3 American Statistical Association (2020) Statistics Degrees Total and By Gender, https://ww2.amstat.org/misc/StatTable1987-Current.pdf (accessed 01 June 2020).

4 Cleveland, W.S. (2001) Data science: an action plan for expanding the technical areas of the field of statistics. Int. Stat. Rev., 69, 21–26.

5 Donoho, D. (2017) 50 years of data science. J. Comput. Graph. Stat., 26, 745–766.

6 Fisher, R.A. (1936) Design of experiments. Br. Med. J., 1 (3923), 554.

7 Fisher, R.A. (1992) Statistical methods for research workers, in Breakthroughs in Statistics. Springer Series in Statistics (Perspectives in Statistics) (eds S. Kotz and N.L. Johnson), Springer, New York, NY (especially Section 21.02). doi: 10.1007/978-1-4612-4380-9_6.

8 Wald, A. and Wolfowitz, J. (1944) Statistical tests based on permutations of the observations. Ann. Math. Stat., 15, 358–372.

9 Efron, B. (1992) Bootstrap methods: another look at the jackknife, in Breakthroughs in Statistics. Springer Series in Statistics (Perspectives in Statistics) (eds S. Kotz and N.L. Johnson), Springer, New York, NY, pp. 569–593. doi: 10.1007/978-1-4612-4380-9_41.

10 Efron, B. and Tibshirani, R.J. (1994) An Introduction to the Bootstrap, CRC Press.

11 Bliss, C.I. (1935) The comparison of dosage‐mortality data. Ann. Appl. Biol., 22, 307–333 (Fisher introduces his scoring method in the appendix).

12 McCullagh, P. and Nelder, J. (1989) Generalized Linear Models, 2nd edn, Chapman and Hall, London (standard book on generalized linear models).

13 Tierney, L. (1994) Markov chains for exploring posterior distributions. Ann. Stat., 22, 1701–1728.

14 Brooks, S., Gelman, A., Jones, G., and Meng, X.‐L. (2011) Handbook of Markov Chain Monte Carlo, CRC Press.

15 Chavan, V. and Phursule, R.N. (2014) Survey paper on big data. Int. J. Comput. Sci. Inf. Technol., 5, 7932–7939.

16 Williams, C.K. and Rasmussen, C.E. (1996) Gaussian processes for regression, in Advances in Neural Information Processing Systems, pp. 514–520.

17 Williams, C.K. and Rasmussen, C.E. (2006) Gaussian Processes for Machine Learning, vol. 2, MIT Press, Cambridge, MA.

18 Gelman, A., Carlin, J.B., Stern, H.S. et al. (2013) Bayesian Data Analysis, CRC Press.

19 Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N. et al. (1953) Equation of state calculations by fast computing machines. J. Chem. Phys., 21, 1087–1092.

20 Hastings, W.K. (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57 (1), 97–109. doi: 10.1093/biomet/57.1.97.

21 Holbrook, A.J., Lemey, P., Baele, G. et al. (2020) Massive parallelization boosts big Bayesian multidimensional scaling. J. Comput. Graph. Stat., 1–34.

22 Holbrook, A.J., Loeffler, C.E., Flaxman, S.R. et al. (2021) Scalable Bayesian inference for self‐excitatory stochastic processes applied to big American gunfire data. Stat. Comput., 31, 4.

23 Seber, G.A. and Lee, A.J. (2012) Linear Regression Analysis, vol. 329, John Wiley & Sons.

24 Trefethen, L.N. and Bau, D. (1997) Numerical Linear Algebra, Society for Industrial and Applied Mathematics.

25 Gelman, A., Roberts, G.O., and Gilks, W.R. (1996) Efficient Metropolis jumping rules. Bayesian Stat., 5, 42.

26 Van Dyk, D.A. and Meng, X.‐L. (2001) The art of data augmentation. J. Comput. Graph. Stat., 10, 1–50.

27 Neal, R.M. (2011) MCMC using Hamiltonian dynamics, in Handbook of Markov Chain Monte Carlo (eds S. Brooks, A. Gelman, G. Jones, and X.-L. Meng), Chapman and Hall/CRC Press, pp. 113–162.

28 Holbrook, A., Vandenberg‐Rodes, A., Fortin, N., and Shahbaba, B. (2017) A Bayesian supervised dual‐dimensionality reduction model for simultaneous decoding of LFP and spike train signals. Stat, 6, 53–67.

29 Bouchard‐Côté, A., Vollmer, S.J., and Doucet, A. (2018) The bouncy particle sampler: a nonreversible rejection‐free Markov chain Monte Carlo method. J. Am. Stat. Assoc., 113, 855–867.

30 Murty, K.G. and Kabadi, S.N. (1985) Some NP‐Complete Problems in Quadratic and Nonlinear Programming. Tech. Rep.

31 Kennedy, J. and Eberhart, R. (1995) Particle Swarm Optimization. Proceedings of ICNN'95 – International Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE.

32 Davis, L. (1991) Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York.

33 Hunter, D.R. and Lange, K. (2004) A tutorial on MM algorithms. Am. Stat., 58, 30–37.

34 Boyd, S. and Vandenberghe, L. (2004) Convex Optimization, Cambridge University Press.

35 Fisher, R.A. (1922) On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. London, Ser. A, 222, 309–368.

36 Beale, E., Kendall, M., and Mann, D. (1967) The discarding of variables in multivariate analysis. Biometrika, 54, 357–366.

37 Hocking, R.R. and Leslie, R. (1967) Selection of the best subset in regression analysis. Technometrics, 9, 531–540.

38 Tibshirani, R. (1996) Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B, 58, 267–288.

39 Geyer, C. (1991) Markov Chain Monte Carlo Maximum Likelihood. Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, Interface Foundation, Fairfax Station, pp. 156–163.

40 Tjelmeland, H. and Hegstad, B.K. (2001) Mode jumping proposals in MCMC. Scand. J. Stat., 28, 205–223.

41 Lan, S., Streets, J., and Shahbaba, B. (2014) Wormhole Hamiltonian Monte Carlo. Twenty‐Eighth AAAI Conference on Artificial Intelligence.

42 Nishimura, A. and Dunson, D. (2016) Geometrically tempered Hamiltonian Monte Carlo. arXiv preprint arXiv:1604.00872.

43 Mitchell, T.J. and Beauchamp, J.J. (1988) Bayesian variable selection in linear regression. J. Am. Stat. Assoc., 83, 1023–1032.

44 Madigan, D. and Raftery, A.E. (1994) Model selection and accounting for model uncertainty in graphical models using Occam's window. J. Am. Stat. Assoc., 89, 1535–1546.

45 George, E.I. and McCulloch, R.E. (1997) Approaches for Bayesian variable selection. Statistica Sinica, 7, 339–373.

46 Hastie, T., Tibshirani, R., and Wainwright, M. (2015) Statistical Learning with Sparsity: The Lasso and Generalizations, CRC Press.

47 Friedman, J., Hastie, T., and Tibshirani, R. (2010) Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw., 33, 1.

48 Bhattacharya, A., Chakraborty, A., and Mallick, B.K. (2016) Fast sampling with Gaussian scale mixture priors in high‐dimensional regression. Biometrika, 103, 985–991.

49 Suchard, M.A., Schuemie, M.J., Krumholz, H.M. et al. (2019) Comprehensive comparative effectiveness and safety of first‐line antihypertensive drug classes: a systematic, multinational, large‐scale analysis. The Lancet, 394, 1816–1826.

50 Passos, I.C., Mwangi, B., and Kapczinski, F. (2019) Personalized Psychiatry: Big Data Analytics in Mental Health, Springer.

51
      51 51