Group of authors

Computational Statistics in Data Science


and computational power in the face of an unbounded generation of data with high velocity and brief life span. To cope with these requirements, approximate computing, which aims at low latency at the expense of an acceptable loss of quality, has been a practical solution [110]. Although approximate computing has been used extensively for processing data streams, combining it with distributed processing models opens new research directions. These include approximation with heterogeneous resources, pricing models with approximation, intelligent data processing, and energy‐aware approximation.
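As a rough illustration of the latency-for-quality trade-off described above, the sketch below estimates the mean of a stream from a fixed-size uniform sample instead of the full data. It uses reservoir sampling (Algorithm R), chosen here only as a familiar example of stream approximation; it is not the specific method of [110]. The function names `reservoir_sample` and `approximate_mean`, the sample size `k`, and the `seed` parameter are assumptions introduced for this example.

```python
import random
from typing import Iterable, List


def reservoir_sample(stream: Iterable[float], k: int, seed: int = 0) -> List[float]:
    """Keep a uniform random sample of at most k items from a stream of unknown length."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    reservoir: List[float] = []
    for i, x in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(x)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = x
    return reservoir


def approximate_mean(stream: Iterable[float], k: int, seed: int = 0) -> float:
    """Estimate the stream mean from a k-item sample, using O(k) memory."""
    sample = reservoir_sample(stream, k, seed)
    if not sample:
        raise ValueError("cannot estimate the mean of an empty stream")
    return sum(sample) / len(sample)


if __name__ == "__main__":
    # Hypothetical stream: the values 0.0 .. 99999.0, whose exact mean is 49999.5.
    estimate = approximate_mean((float(i) for i in range(100_000)), k=1_000, seed=42)
    print(f"approximate mean: {estimate:.1f}")
```

A larger `k` reduces the estimation error at the cost of more memory and more work per query, which is the kind of tunable quality loss that approximate computing accepts.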

      1 World Economic Forum (2019) How Much Data is Generated Each Day? Visual Capitalist, https://www.visualcapitalist.com/how‐much‐data‐is‐generated‐each‐day.

      2 Huynh, V. and Phung, D. (2017) Streaming clustering with Bayesian nonparametric models. Neurocomputing, 258, 52–62. doi: 10.1016/j.neucom.2017.02.078.

      3 Ray, I., Adaikkalavan, R., Xie, X., and Gamble, R. (2015) Stream Processing with Secure Information Flow Constraints. 29th IFIP Annual Conference on Data and Applications Security and Privacy. Fairfax, USA, pp. 311–329. doi: 10.1007/978‐3‐319‐20810‐7_22.

      4 Sibai, R.E., Chabchoub, Y., Demerjian, J. et al. (2016) Sampling Algorithms in Data Stream Environment. 2016 International Conference on Digital Economy Carthage. IEEE, Tunisia, pp. 29–36. doi: 10.1109/ICDEC.2016.7563142.

      5 Youn, J., Shim, J., and Lee, S.G. (2018) Efficient data stream clustering with sliding windows based on locality sensitive hashing. IEEE Access, 6, 63757–63776. doi: 10.1109/ACCESS.2018.2877138.

      6 Das, S., Behera, R.K., Kumar, M., and Rath, S.K. (2018) Real‐time sentiment analysis of Twitter streaming data for stock prediction. Procedia Comput. Sci., 132, 956–964.

      7 Wang, J., Zhu, R., and Liu, S. (2018) A differentially private unscented Kalman filter for streaming data in IoT. IEEE Access, 6 (1), 6487–6495. doi: 10.1109/ACCESS.2018.2797159.

      8 Kolchinsky, I. and Schuster, A. (2019) Real‐Time Multi‐Pattern Detection Over Event Streams. Proceedings of the 2019 International Conference on Management of Data, Amsterdam, Netherlands. New York, NY, USA: ACM, pp. 589–606. doi: 10.1145/3299869.3319869.

      9 Tozi, C. (2017) Dummy's Guide to Batch vs Streaming. Retrieved from Trillium Software, https://www.precisely.com/blog/big‐data/big‐data‐101‐batch‐stream‐processing.

      10 Kolajo, T., Daramola, O., and Adebiyi, A. (2019) Big data stream analysis: a systematic literature review. J. Big Data, 6, 47.

      11 Kusumakumari, V., Sherigar, D., Chandran, R., and Patil, N. (2017) Frequent pattern mining on stream data using Hadoop CanTree‐GTree. Procedia Comput. Sci., 115, 266–273.

      12 Giustozzi, F., Saunier, J., and Zanni‐Merk, C. (2019) Abnormal situations interpretation in industry 4.0 using stream reasoning. Procedia Comput. Sci., 159, 620–629.

      13 Liu, R., Li, Q., Li, F. et al. (2014) Big Data Architecture for IT Incident Management. Proceedings of IEEE International Conference on Service Operations and Logistics, and Informatics. Qingdao, China, pp. 424–429.

      14 Sakr, S. (2013) An Introduction to Infosphere Streams: A Platform for Analyzing Big Data in Motion, IBM, https://www.ibm.com/developerworks/library/bd‐streamsintro/index.html.

      15 Inoubli, W., Aridhi, S., Mezni, H. et al. (2018) An experimental survey on big data frameworks. Futur. Gener. Comput. Syst., 86, 546–564. doi: 10.1016/j.future.2018.04.032.

      16 International Business Machines (2019) Stream Computing Platforms, Applications and Analytics, https://researcher.watson.ibm.com/researcher/view_group.php?id=2531.

      17 Vidyasankar, K. (2017) On continuous queries in stream processing. Procedia Comput. Sci., 109C, 640–647.

      18 Joseph, S., Jasmin, E.A., and Chandran, S. (2015) Stream computing: opportunities and challenges in smart grid. Procedia Tech., 21, 49–53.

      19 Wozniak, M., Ksieniewicz, P., Cyganek, B. et al. (2016) Active learning classification of drifted streaming data. Procedia Comput. Sci., 80, 1724–1733.

      20 Kim, T. and Park, C.H. (2020) Anomaly pattern detection for streaming data. Expert Syst. Appl., 149, 113252. doi: 10.1016/j.eswa.2020.113252.

      21 Sethi, T.S. and Kantardzic, M. (2018) Handling adversarial concept drift in streaming data. Expert Syst. Appl., 97, 18–40.

      22 Toor, A.A., Usman, M., Younas, F. et al. (2020) Mining massive e‐health data streams for IoMT enabled healthcare systems. Sensors, 20 (7), 2131. doi: 10.3390/s20072131.

      23 Shan, J., Luo, J., Ni, G. et al. (2016) CVS: fast cardinality estimation for large‐scale data streams over sliding windows. Neurocomputing, 194, 107–116.

      24 Liu, W., Wang, Z., Liu, X. et al. (2017) A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11–26.

      25 Priya, S. and Uthra, R.A. (2020) Comprehensive analysis for class imbalance data with concept drift using ensemble based classification. J. Ambient Intell. Humaniz. Comput. doi: 10.1007/s12652‐020‐01934‐y.

      26 Zhou, L., Pan, S., Wang, J., and Vasilakos, A.V. (2017) Machine learning on big data: opportunities and challenges. Neurocomputing, 237, 350–361. doi: 10.1016/j.neucom.2017.01.026.

      27 O'Donovan, P., Leahy, K., Bruton, K., and O'Sullivan, D.T.J. (2015) An industrial big data pipeline for data‐driven analytics maintenance applications in large‐scale smart manufacturing facilities. J. Big Data, 2, 25. doi: 10.1186/s40537‐015‐0034‐z.

      28 Zaharia, M., Das, T., Li, H. et al. (2013) Discretized Streams: Fault‐Tolerant Streaming Computation at Scale. Proceedings of the 24th ACM Symposium on Operating System Principles (SOSP 2013), Farmington: ACM Press, pp. 423–438.

      29 Jayasekara, S., Harwood, A., and Karunasekera, S. (2020) A utilization model for optimization of checkpoint intervals in distributed stream processing systems. Futur. Gener. Comput. Syst., 110, 68–79. doi: 10.1016/j.future.2020.04.019.

      30 Chong, D. and Shi, H. (2015) Big data analytics: a literature review. J. Manag. Anal., 2 (3), 175–201.

      31 Qian, Z., He, Y., Su, C. et al. (2013) TimeStream: Reliable Stream Computation in the Cloud. Proceedings of the 8th ACM European Conference on Computer Systems. ACM, Prague, pp. 1–14. doi: 10.1145/2465351.2465353.

      32 Shi, P., Cui, Y., Xu, K. et al. (2019) Data consistency theory and case study for scientific big data. Information, 10, 137. doi: 10.3390/info10040137.

      33 Santipantakis, G., Kotis, K., and Vouros, G.A. (2017) OBDAIR: ontology‐based distributed framework for accessing, integrating and reasoning with data in disparate data sources. Expert Syst. Appl., 90, 464–483.

      34 Cortes, R., Bonnaire, X., Marin, O., and Sens, P. (2015) Stream processing of healthcare sensor data: studying user traces to identify challenges from a big data perspective. Procedia Comput. Sci., 52, 1004–1009.

      35 D'Argenio, V. (2018) The high‐throughput analyses era: are we ready for the data struggle. High Throughput, 7 (1), 8. doi: 10.3390/ht7010008.

      36 Qiu, Y. and Ma, M. (2018) Secure group mobility support for 6LoWPAN networks. IEEE Internet Things J., 5 (2), 1131–1141.

      37 Wang, J., Luo, J., Liu, X. et al. (2019) Improved Kalman filter based differentially private streaming data release in cognitive computing. Futur. Gener. Comput. Syst., 98, 541–549.

      38 Denham, B., Pears, R., and Naeem, A.M. (2020) Enhancing random projection with independent and cumulative additive noise for privacy‐preserving