Group of authors

Machine Learning Techniques and Analytics for Cloud Security


the present article, our focus is on using infrastructure services in cloud computing, specifically storage as a service. Our study processes biological data and produces biomarkers as its result. We have developed a model that analyzes gene expression data from both normal and cancerous states and identifies cancer-mediating genes; these genes can be valuable to medical science and can help biologists from several perspectives.

      The biggest challenge for the genomic science research community is to build an infrastructure, with a large number of computers and efficient software tools, that can analyze genomic datasets more exhaustively for biomedical research and, to some extent, for clinical practice. Researchers in this domain are therefore moving toward the cloud. Solving biomedical problems depends on analyzing data effectively, so integrating data from genomics, systems biology, and biomedical data mining is a promising direction [24]. In our proposed model, we worked on a dataset supplied as a file (.csv format); after processing it with the developed methodology, we produced a resultant dataset that is again stored in a text file. Our concern, then, is how all these data can be made available in a cloud environment so that other users in the research community can access them for further work. Some parameters, however, remain a concern [25].
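      The data flow described above (a .csv file in, a text file of results out) can be sketched as follows. This is only an illustration under stated assumptions: the column names `gene`, `normal`, and `cancer`, the function name `rank_genes`, and the scoring rule (absolute difference in expression between the two states) are hypothetical placeholders, not the methodology developed in this chapter.

```python
import csv


def rank_genes(csv_path, out_path):
    """Read a gene expression CSV and write a ranked gene list to a text file.

    Assumed (hypothetical) layout: columns `gene`, `normal`, `cancer`,
    each row holding one expression value per state.
    """
    rows = []
    with open(csv_path, newline="") as handle:
        for record in csv.DictReader(handle):
            normal = float(record["normal"])
            cancer = float(record["cancer"])
            # Placeholder score: absolute change in expression between states.
            rows.append((record["gene"], abs(cancer - normal)))

    # Largest change first; ties are broken alphabetically for stable output.
    rows.sort(key=lambda item: (-item[1], item[0]))

    # One tab-separated line per gene: name, then score with three decimals.
    with open(out_path, "w") as handle:
        for gene, score in rows:
            handle.write(f"{gene}\t{score:.3f}\n")

    return [gene for gene, _ in rows]
```

      The resulting text file is the artifact that would then be placed in cloud storage for other researchers to retrieve.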

      In cloud computing, maintaining the secrecy of data is a major concern that needs attention with the utmost priority. Here we are concerned only with the confidentiality of the data, treated in a simplified manner without going into architectural detail. Using the cloud also brings its other benefits, such as lower cost and greater efficiency, while data security and confidentiality remain the main points of concern. Cloud services come in many commercial offerings, but these are heterogeneous and address different needs depending on the customer. The primary contenders in this field include Microsoft Azure, Google App Engine, Amazon Web Services (AWS), and IBM Cloud, among others. Amazon Simple Storage Service, also known as Amazon S3, is an object-based storage service that offers industry-leading scalability, security, performance, and data availability. Since our requirement is to store files and secure the dataset, AWS is a good choice because of S3. S3 gives software developers and IT teams object storage that is highly secure, scalable, and durable. It offers an easy-to-use web interface for storing and retrieving any amount of data from anywhere on the web. Dropbox, for instance, acts as a layer built on top of S3, simplifying the S3 user interface for storing files on the AWS cloud. Data is spread across multiple devices and facilities. Although S3 can serve many purposes, in the present context it is used to store files in buckets/folders in a secure way.

Schematic illustration of storing and accessing the data values in Amazon S3.
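      Storing a result file in an S3 bucket and giving a collaborator access to it might look like the following minimal sketch. It assumes the `boto3` library is installed and AWS credentials are already configured; the helper names `build_upload_args` and `store_and_share`, the `results` key prefix, and the choice of S3-managed server-side encryption are illustrative assumptions, not details taken from this chapter.

```python
from pathlib import Path


def build_upload_args(bucket, local_path, prefix="results"):
    """Return keyword arguments for the boto3 S3 client's `upload_file` call."""
    # Object key: hypothetical "results/" prefix plus the local file name.
    key = f"{prefix}/{Path(local_path).name}"
    return {
        "Filename": str(local_path),
        "Bucket": bucket,
        "Key": key,
        # Ask S3 to encrypt the object at rest with S3-managed keys (SSE-S3).
        "ExtraArgs": {"ServerSideEncryption": "AES256"},
    }


def store_and_share(bucket, local_path, expires_in=3600):
    """Upload a file to S3 and return a time-limited download URL."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")
    args = build_upload_args(bucket, local_path)
    s3.upload_file(**args)

    # A presigned URL lets another researcher fetch the object without
    # holding AWS credentials; it stops working after `expires_in` seconds.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": args["Bucket"], "Key": args["Key"]},
        ExpiresIn=expires_in,
    )
```

      Keeping the bucket private and handing out short-lived URLs is one simple way to address the confidentiality concern raised above; bucket policies or IAM roles would be the alternative for a fixed group of users.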

      3.6 Conclusion

      Our work can be extended in the future to identify more genes that might be correlated with mutations. Identifying interactions among those genes can further help in prognosis, cancer prevention, and treatment. Analyzing gene-gene interactions will help in finding more TP genes that play a key role in mediating cancer. Extending our study with other omics data might help researchers and biologists concentrate on cancer study in a targeted way.

      References

      1. Soh, K.P., Szczurek, E., Sakoparnig, T., Beerenwinkel, N., Predicting cancer type from tumour DNA signatures. Genome Med., 9, 104, 2017.

      2. Hao, X., Luo, H., Krawczyk, M., Wei, W., Wang, W., Wang, J. et al., DNA methylation markers for diagnosis and prognosis of common cancers. Proc. Natl. Acad. Sci. U.S.A., 114, 28, 7414–19, 2017.

      3. Yang, Z., Jin, M., Zhang, Z., Lu, J., Hao, K., Classification based on feature extraction for hepatocellular carcinoma diagnosis using high-throughput DNA methylation sequencing data. Procedia Comput. Sci., 107, 412–417, 2017.

      4. Rachman, A.A. and Rustam, Z., Cancer classification using Fuzzy C-Means with feature selection. 12th International Conference on Mathematics, Statistics, and Their Applications (ICMSA), pp. 31–34, 2016.

      5. Ghosh, A. and De, R.K., Fuzzy Correlated Association Mining: Selecting altered associations among the genes, and some possible marker genes mediating certain cancers. Appl. Soft Comput., 38, 587–605, 2015.

      6. Mao, Z.-Y., Cai, W.-S., Shao, X.-G., Selecting significant genes by randomization test for cancer classification using gene expression data. J. Biomed. Inform., 46, 4, 549–601, 2013.

      7.