events [6].
Figure 3.2 Big data value creation flow.
3.4.1 Sources of Medical Big Data
The medical care industry produces a vast amount of patient information. Figure 3.3 shows the main sources of hospital and patient data.
EHRs, electronic medical records (EMRs), hospital reports, radiology images, sensor devices, and the IoT are the most important sources of medical data. These data are used to diagnose and treat human diseases. Healthcare data are diverse in nature, so integrating the big data of the healthcare industry is a difficult and challenging process. Technological innovation helps transform healthcare big data into useful information.
Internet of Things
The IoT consists of physical devices and sensors that are connected to the internet. These interconnected devices gather data and transfer it over the internet. IoT devices enable remote monitoring of patients, which avoids unnecessary hospital visits. Wearable devices sense a person's heart rate, blood pressure (BP), weight, stress level, physical activity, glucose level, etc. IoT-enabled devices therefore let people track their own health conditions. These sensor devices continuously gather a large volume of health data. Apart from wearable devices, smartphone applications and other medical devices can also gather patient information and send it to the cloud on a continual basis.
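The continuous stream of wearable readings described above can be pictured with a minimal Python sketch. It assumes a hypothetical `Reading` record and a hypothetical `summarize` helper (neither comes from any particular IoT platform) and shows how a device or phone application might condense raw samples per patient and metric before sending them to the cloud.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """One sensor sample; the field names are illustrative assumptions."""
    patient_id: str
    metric: str       # e.g. "heart_rate", "glucose"
    value: float
    timestamp: int    # seconds since the start of monitoring


def summarize(readings):
    """Group samples by (patient, metric) and compute basic statistics."""
    groups = {}
    for r in readings:
        groups.setdefault((r.patient_id, r.metric), []).append(r.value)
    return {
        key: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Made-up sample values, for illustration only.
readings = [
    Reading("p1", "heart_rate", 72, 0),
    Reading("p1", "heart_rate", 78, 60),
    Reading("p1", "glucose", 5.4, 60),
]
summary = summarize(readings)
```

Summarizing on the device is only one possible design; a real system might instead upload every raw sample and aggregate in the cloud.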
Figure 3.3 Different sources of healthcare data.
Electronic Medical Records and Electronic Health Records
EMRs and EHRs are the most widespread sources of big data in healthcare. An EHR holds a person’s health status information in digital format. This information is shared among healthcare professionals, doctors, and hospitals. Patients’ EHRs can be used to improve the healthcare process and to help healthcare professionals treat patients well.
The EHR consists of the following information about patients:
• Patient’s demographic information, medical history
• Billing data
• Insurance information
• Laboratory results
• Clinical information
• Medication records
• Medical imaging data
An EMR contains the data held by a particular physician’s office, including the patient’s treatment history at that office. EMRs are private and confidential documents.
The differences between EHRs and EMRs are summarized in Table 3.3.
The main difference between an EHR and an EMR is that the EMR holds the patient’s history with one provider. The EHR goes beyond the information from one provider and contains the patient’s wide-ranging history, which can be shared among all providers.
Table 3.3 Difference between electronic health record and electronic medical record.
Electronic health records | Electronic medical records |
All of the patient’s health information in digital form. | The patient’s health information from one physician’s office in digital form. |
Electronic health records are shared among medical professionals. | Electronic medical records are private and confidential documents. |
Electronic health records are used by many providers for diagnosis and better treatment. | Electronic medical records are used by one provider for the patient’s treatment. |
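The distinction in Table 3.3 can be illustrated with a short Python sketch. The `EMR` and `EHR` classes and the `build_ehr` function are hypothetical names chosen for this example, not part of any real records system. The sketch assumes an EHR can be viewed as the combined history drawn from each provider's EMR for the same patient.

```python
from dataclasses import dataclass, field


@dataclass
class EMR:
    """Record kept by a single provider (illustrative fields only)."""
    provider: str
    patient_id: str
    treatments: list


@dataclass
class EHR:
    """Patient-wide record combining entries from many providers."""
    patient_id: str
    history: list = field(default_factory=list)  # (provider, treatment) pairs


def build_ehr(patient_id, emrs):
    """Collect one patient's treatments from every provider's EMR."""
    ehr = EHR(patient_id)
    for emr in emrs:
        if emr.patient_id != patient_id:
            continue  # skip records that belong to other patients
        for treatment in emr.treatments:
            ehr.history.append((emr.provider, treatment))
    return ehr


# Made-up records, for illustration only.
emrs = [
    EMR("clinic_a", "p1", ["flu vaccine"]),
    EMR("clinic_b", "p1", ["x-ray", "cast"]),
    EMR("clinic_a", "p2", ["blood test"]),
]
ehr = build_ehr("p1", emrs)
```

Each `EMR` here stays with its own provider, while the resulting `EHR` is the object that would be shared between providers.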
The EHR provides enormous amounts of data that, combined with advanced analytics, enable doctors to make better clinical decisions.
Apart from the IoT and EHRs, other sources of big data in healthcare are as follows:
• Insurance data
• Other clinical data
• Healthcare-related research studies data
• Social media and search engine data
All these medical data exist in many forms: laboratory results, physicians’ clinical notes, medical images, sensor data, etc. There is no standard format across these sources. Medical data is large in volume and comprises both structured and unstructured data.
Structured data: Structured data is information that is stored and presented in an organized way. Examples of structured health data are height, weight, BP, blood group, and the stage of a diagnosed disease [5, 8].
Unstructured data: Unstructured data is data that is not organized in any standard format. Examples of unstructured health data are doctors’ clinical notes, images, and search engine data. The main problem of big data in healthcare is that around 80% of healthcare data is in unstructured form [9]. Deep learning, artificial intelligence, cloud computing, data mining, machine learning algorithms, text analysis, and natural language processing tools are used to transform unstructured data into meaningful knowledge. Table 3.4 summarizes the different sources of healthcare data.
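As a small illustration of turning unstructured text into structured data, the following Python sketch pulls a blood pressure reading out of a free-text clinical note with a regular expression. This is a deliberately simple rule-based stand-in for the text analysis and natural language processing tools mentioned above. The pattern, the `extract_bp` function, and the sample note are all assumptions made for this example.

```python
import re

# Matches forms such as "BP 138/86", "BP: 120/80", or "bp of 110/70".
BP_PATTERN = re.compile(
    r"\bBP\s*(?:of|:)?\s*(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)


def extract_bp(note):
    """Return the first BP reading in the note as a dict, or None."""
    match = BP_PATTERN.search(note)
    if match is None:
        return None
    return {
        "systolic": int(match.group(1)),
        "diastolic": int(match.group(2)),
    }


# Made-up clinical note, for illustration only.
note = "Patient reports mild headache. BP 138/86, advised to reduce salt."
result = extract_bp(note)
```

A single pattern like this misses many real phrasings (for example "blood pressure was 138 over 86"), which is one reason practical systems rely on more capable language processing.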
3.4.2 Knowledge in Healthcare
In general, the knowledge is classified into the following two categories [5]:
• Explicit knowledge
• Tacit knowledge
Explicit knowledge: Explicit knowledge refers to clear, unambiguous, and coherent information that can be written down and is easy to distribute. Studies of the life cycle of a parasite and research on cancer are examples of explicit knowledge. In healthcare, identifying the medication that treats a disease is explicit knowledge.
Table 3.4 Summary of different sources of healthcare data [13].
Sources of healthcare data | Details | Format |
Medical records |
The medical records consist of electronic health records (EHRs) and the electronic medical records (EMRs). These records offer information
|