and foremost, we express our heartfelt appreciation to all the authors. We thank them for considering and trusting this edited book as the platform for publishing their valuable work, and for their kind cooperation during the various stages of manuscript processing. This edited book will serve as a motivating factor for researchers who have spent years working as crime analysts, data analysts, and statisticians, as well as for budding researchers.
Dr. Sunil Kumar Dhal Professor IT, Sri Sri University, Odisha, India
Dr. Subhendu Kumar Pani Principal, Krupajal Computer Academy, BPUT, India
Dr. Srinivas Prasad GITAM Institute of Technology, Visakhapatnam Campus, India
Dr. Sudhir Kumar Mohapatra Addis Ababa Science and Technology University, Addis Ababa, Ethiopia
March 2022
1
An Introduction to Big Data Analytics Techniques in Healthcare
Anil Audumbar Pise*
University of the Witwatersrand, Johannesburg, South Africa
Abstract
The amount of data generated in the healthcare industry is rising notably. Healthcare providers have taken great interest in improving health outcomes and cutting costs through better utilization of healthcare data, and the abundance of data has made this shift possible. However, the nature of healthcare data presents specific problems for processing and analyzing it at scale. This chapter discusses some new ideas for dealing with these problems. It identifies two ways in which advances in processing healthcare data over the last 10 years may make better predictions from medical data feasible: first, the use of advanced technological methods of analysis, and second, the development of novel models that can handle large quantities of data.
Keywords: Healthcare analytics, predictive analytics, healthcare informatics, big data
1.1 Introduction
Big Data has the potential to transform all sorts of business sectors, from the wellness of individuals to the provision of healthcare. In current usage, Big Data is defined as “storing, arranging, and processing, the current huge amounts of heterogeneous data, getting results, and then reorganized and measured data is called Clean/Big Data”. This growth arises because businesses use technology to accomplish more, and consumers in turn generate ever larger volumes of data, notably in social networks. A variety of new developments involving more modern data sources and different ways of processing data is currently emerging in the healthcare and medical industries. From a research point of view, clear examples are the field of ‘omics’, the reuse of existing data such as e-health records, open data, and ‘quantified self’ methodologies, all of which offer new approaches for enhancing data analytics. Tremendous advances have also been made in text data extraction, which unlocks a great deal of information in medical records for analytics. On the other hand, the use of big data in healthcare and the adoption of new medical and healthcare practices are progressing more slowly than might be expected. These difficulties can be traced to varying levels of data complexity, to data, organizational, and regulatory issues, and to ethical concerns. New ideas and better practices for data acquisition and data analysis are likely to emerge as big data accumulates at larger scales. This chapter takes a comprehensive look at the possibilities that Big Data holds for the medical and healthcare professions.
Although the role of big data analytics in healthcare is relatively new and still in flux, it is already having a significant impact on practice and research. It has given healthcare researchers the ability to gather, store, manage, and analyze disparate structured and unstructured data generated by current healthcare systems. Larger databases and powerful computer software have recently been used in medical research to support care delivery and disease exploration. Despite these advances, some fundamental challenges of big data remain, and as long as they persist they may hinder further development in this sector. This chapter addresses the obstacles encountered in three promising and emerging areas of medical research: genomic data analysis, signal detection, and medical image processing. Recent studies have focused on high-volume medical data that integrates multimodal information from diverse sources. To evaluate the capabilities and opportunities for healthcare delivery, research focuses on areas that have both the potential and the ability to make a positive difference.
The remainder of this chapter is organized as follows. Section 1.2 briefly introduces Big Data in Healthcare, including the concept of the five Vs of big data and aspects of the use of big data in the medical field. Section 1.3 discusses Areas of Big Data Analytics in medicine. Section 1.4 briefly explains the concept of Healthcare as a Big Data Repository. Section 1.5 presents Applications of Healthcare Big Data with examples, and Section 1.6 describes Challenges in Big Data Analytics. Big Data Privacy and Security policies are explained in Section 1.7. The remaining sections provide a conclusion and future work.
1.2 Big Data in Healthcare
In healthcare, the term “Big Data” refers to the high-volume, high-velocity, and varied data that healthcare providers generate over time and that contains information pertinent to a patient’s care, such as demographics, diagnoses, medical procedures, medications, vital signs, immunizations, laboratory results, and radiology images. Figure 1.1 depicts the above-mentioned healthcare entities.
Figure 1.1 Big data in healthcare.
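The categories of patient information named above (demographics, diagnoses, procedures, medications, vital signs, immunizations, laboratory results, and radiology images) can be pictured as fields of a single record. The following Python sketch is only an illustration under stated assumptions: the class name `PatientRecord`, the helper `populated_categories`, the field types, and the sample values are all hypothetical and do not come from any particular health record system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatientRecord:
    """Hypothetical container for the data categories listed in the text."""
    patient_id: str
    demographics: Dict[str, str] = field(default_factory=dict)    # structured key/value data
    diagnoses: List[str] = field(default_factory=list)            # codes or free text
    procedures: List[str] = field(default_factory=list)
    medications: List[str] = field(default_factory=list)
    vital_signs: Dict[str, float] = field(default_factory=dict)   # numeric readings
    immunizations: List[str] = field(default_factory=list)
    lab_results: Dict[str, float] = field(default_factory=dict)
    radiology_images: List[str] = field(default_factory=list)     # file references, not pixel data


def populated_categories(record: PatientRecord) -> List[str]:
    """Return the names of the data categories that hold at least one entry."""
    names = []
    for name, value in vars(record).items():
        if name != "patient_id" and value:
            names.append(name)
    return names


# Illustrative values only; this is not real patient data.
record = PatientRecord(
    patient_id="example-001",
    demographics={"sex": "F", "birth_year": "1980"},
    vital_signs={"heart_rate_bpm": 72.0},
    radiology_images=["chest_xray_001.dcm"],
)
print(populated_categories(record))
```

Even in this small sketch, a single record mixes key/value attributes, numeric measurements, and references to image files, which hints at why healthcare data is heterogeneous.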
Figure 1.2 Five Vs of big data.
According to Thota et al. [1], electronic health sources such as sensor devices, streaming machines, and high-throughput instruments are accumulating more data as medical data collection advances. This big data in healthcare is used for a variety of purposes, including diagnosis, drug discovery, precision medicine, and disease prediction. Big data has been critical in a variety of fields, including healthcare, scientific research, industry, social networking, and government administration [1]. The five Vs of big data, shown in Figure 1.2, are as follows:
1. Variety: Without a doubt, the variety of data represents big data. For instance, the various data formats include database, Excel, and CSV, all of which can be stored in a plain text file. Additionally, structured, unstructured,