Group of authors

Computational Statistics in Data Science




       Alfred G. Schissler and Alexander D. Knudson

       The University of Nevada, Reno, NV, USA

      Next, we briefly survey an array of software used for statistical applications. For each tool, we discuss its specific purpose and the need it fills for data scientists. The aim is to be reasonably complete, giving a comprehensive view of the statistical software ecosystem and leaving readers with some familiarity with the most prevalent languages and software.

      After presenting the noteworthy software, we turn to a handful of emerging and promising statistical computing technologies. Our goal in these sections is to guide users who wish to be early adopters of a software application, as well as readers whose current statistical programming language limits the scale of their work. These sections also discuss some of the latest tools for big data statistical applications.

| Software | Open source | Classification | Style | Notes |
| --- | --- | --- | --- | --- |
| Python | Y | Popular | Programming | Versatile, popular |
| R | Y | Popular | Programming | Academia/Industry, active community |
| SAS | N | Popular | Programming | Strong historical following |
| SPSS | N | Popular | GUI: menu, dialogs | Popular in scholarly work |
| C++ | Y | Notable | Programming | Fast, low‐level |
| Excel | N | Notable | GUI: menu, dialogs | Simple, works well for rectangular data |
| GNU Octave | Y | Notable | Mixed | Open source counterpart to MATLAB |
| Java | Y | Notable | Programming | Cross‐platform, portable |
| JavaScript, TypeScript | Y | Notable | Programming | Popular, cross‐platform |
| Maple | N | Notable | Mixed | Academia, algebraic manipulation |
| MATLAB | N | Notable | Mixed | Speedy, popular among engineers |
| Minitab | N | Notable | GUI: menu, dialogs | Suitable for teaching and simple analysis |
| SQL | Y | Notable | Programming | Necessary tool for databases |
| Stata | N | Notable | GUI: menu, dialogs | |