In this article, three clustering algorithms were applied in turn to both datasets: first k-means with k = 3, then hierarchical clustering, and finally fuzzy c-means with three clusters. Every dataset contains the same 442 glycans. These glycans are displayed on host cell surfaces, where they act as sensory receptors that recognize the glycoproteins of the viral surface. For example, glycan structures terminated by α2,3- or α2,6-linked sialic acid (N-acetylneuraminic acid) act as receptors for H1N1. The human upper respiratory tract mainly displays sialylated glycan receptors terminated with α2,6-linked sialic acid. Moreover, various types of glycan receptors recognize the hemagglutinin (HA) glycoprotein on the outer surface of influenza A viruses. In this way, humans become infected, and H1N1 viruses are transmitted between humans via respiratory droplets. After the three clustering algorithms were applied, nineteen of the 442 glycans were found to be differentially expressed. A t-test was then applied to the infected and non-infected (normal) datasets for statistical validation. Table 2.1 lists the nineteen differentially expressed glycans that remain after t-test validation.
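The t-test validation step can be illustrated with a small sketch. The source does not say which t-test variant was used, so Welch's unequal-variance form is assumed here, and the intensity values and the function name `welch_t` are hypothetical. The sketch only computes the test statistic and its approximate degrees of freedom; turning these into a p-value requires a t-distribution table or a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each group mean (sample variance / group size).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Hypothetical binding intensities of one glycan in each condition.
infected = [5.1, 4.9, 5.3, 5.0]
normal = [2.0, 2.2, 1.9, 2.1]

t_stat, dof = welch_t(infected, normal)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.2f}")
```

A glycan would be treated as differentially expressed when the absolute value of the statistic exceeds the critical value for the chosen significance level, which the source does not specify.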
Figure 2.10 Fuzzy c-means clustering algorithm of Influenza A (H1N1) infected human.
Figure 2.11 Fuzzy c-means clustering algorithm of Influenza A (H1N1) infected human.
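As an illustration of the fuzzy c-means step with three clusters, the following is a minimal sketch on one-dimensional toy data. The fuzzifier m = 2, the initial centers, the fixed iteration count, and the sample values are all assumptions made for this example and are not taken from the source; real glycan array data would be multi-dimensional.

```python
def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50, eps=1e-12):
    """1-D fuzzy c-means sketch: returns (centers, memberships)."""
    centers = list(initial_centers)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: points closer to a center get a larger share.
        memberships = []
        for x in points:
            # eps guards against division by zero when a point sits on a center.
            dists = [max(abs(x - c), eps) for c in centers]
            row = [
                1.0 / sum((d_j / d_k) ** power for d_k in dists)
                for d_j in dists
            ]
            memberships.append(row)
        # Center update: mean of the points weighted by membership ** m.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            centers[j] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, memberships


# Hypothetical intensities forming three loose groups.
points = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 8.9]
centers, memberships = fuzzy_c_means(points, initial_centers=[0.0, 5.0, 10.0])

# Hard label per point: the cluster with the highest membership.
labels = [row.index(max(row)) for row in memberships]
print(centers)
print(labels)
```

Unlike k-means, each point keeps a degree of membership in every cluster, and the memberships of a point sum to 1.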
Type-I and type-II errors are then used to assess accuracy by comparing the predicted values with the actual values. In Table 2.2, the rows represent our experiment and the columns represent the gold set. A true positive (TP) is a glycan identified by our experiment that is also reported in the gold set. A true negative (TN) is a glycan that is neither identified by our experiment nor reported in the gold set. A false negative (FN), or type-II error, is a glycan reported in the gold set that our experiment misses. A false positive (FP), or type-I error, is a glycan identified as positive by our experiment that is absent from the gold set. Type-I and type-II errors are illustrated in Figure 2.12.
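In set terms, the four outcomes can be counted as in the sketch below. It uses the standard definitions, in which a false positive is a type-I error and a false negative is a type-II error. The function name `confusion_counts` and the glycan identifiers are hypothetical and chosen only for illustration.

```python
def confusion_counts(all_glycans, experiment_hits, gold_hits):
    """Count TP, FP, FN, and TN using set operations."""
    universe = set(all_glycans)
    experiment_hits = set(experiment_hits) & universe
    gold_hits = set(gold_hits) & universe
    return {
        "TP": len(experiment_hits & gold_hits),  # found by both
        "FP": len(experiment_hits - gold_hits),  # type-I error
        "FN": len(gold_hits - experiment_hits),  # type-II error
        "TN": len(universe - experiment_hits - gold_hits),  # found by neither
    }


# Hypothetical glycan identifiers 1..10.
counts = confusion_counts(
    all_glycans=range(1, 11),
    experiment_hits={1, 2, 3, 4},
    gold_hits={3, 4, 5},
)
print(counts)
```

The four counts always add up to the size of the full glycan set, which gives a quick consistency check on any reported confusion matrix.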
Table 2.1 Significant glycan list.
Sr. no. | Structure |
---|---|
1 | Neu5Aca2-3(6-O-Su)Galb1-4(Fuca1-3)GlcNAcb-Sp8 |
2 | Neu5Aca2-6Galb1-4GlcNAcb1-3Galb1-4(Fuca1-3)GlcNAcb1-3Galb1-4(Fuca1-3)GlcNAcb-Sp0 |
3 | Galb1-4GlcNAcb1-2Mana1-3(Neu5Aca2-6Galb1-4GlcNAcb1-2Mana1-6)Manb1-4GlcNAcb1-4GlcNAcb-Sp12 |
4 | GlcAb1-3GlcNAcb-Sp8 |
5 | Mana1-2Mana1-2Mana1-3(Mana1-2Mana1-6(Mana1-2Mana1-3)Mana1-6)Mana-Sp9 |
6 | GlcNAcb1-2Mana1-3(Galb1-4GlcNAcb1-2Mana1-6)Manb1-4GlcNAcb1-4GlcNAc-Sp12 |
7 | Galb1-4GlcNacb1-2(Galb1-4GlcNacb1-4)Mana1-3(Galb1-4GlcNacb1-2(Galb1-4GlcNacb1-6)Mana1-6)Manb1-4GlcNacb1-4GlcNacb-Sp21 |
8 | Galb1-3Galb1-4GlcNAcb-Sp8 |
9 | Galb1-3(Neu5Aca2-6)GalNAca-Sp14 |
10 | Neu5Aca2-6Galb1-4Glcb-Sp0 |
11 | Neu5Aca2-3Galb1-4GlcNAcb1-2Mana1-3(Neu5Aca2-3Galb1-4GlcNAcb1-2Mana1-6)Manb1-4GlcNAcb1-4GlcNAcb-Sp12 |
12 | Neu5Aca2-6Galb1-4GlcNAcb1-2Mana1-3(Neu5Aca2-3Galb1-4GlcNAcb1-2Mana1-6)Manb1-4GlcNAcb1-4GlcNAcb-Sp12 |
13 | Neu5Aca2-6GlcNAcb1-4GlcNAcb1-4GlcNAc-Sp21 |
14 | Neu5Aca2-3Galb1-4GlcNAcb-Sp8 |
15 | Neu5Aca2-6Galb1-4GlcNAcb-Sp0 |
16 | Neu5Gca2-3Galb1-4(Fuca1-3)GlcNAcb-Sp0 |
17 | Neu5Aca2-3Galb1-4GlcNAcb1-3Galb1-3GlcNAcb-Sp0 |
18 | Neu5Aca2-6Galb1-4GlcNAcb1-3Galb1-6GlcNAcb-Sp8 |
19 | Neu5Aca2-3Galb1-3GalNAcb1-4(Neu5Aca2-8Neu5Aca2-3)Galb1-4Glcb-Sp0 |
Table 2.2 The tabular format has been created from the above diagram.
Our experiment | Gold set: Positive (+) | Gold set: Negative (−) |
---|---|---|
Positive (+) | T_Pos | F_Pos |
Negative (−) | F_Neg | T_Neg |
The performance of the method was validated using various statistical measures and metrics; see Table 2.3 for details. A visual representation of the method's performance is given in Figure 2.13.
In our experiment, T_Pos = 14, F_Pos = 5, F_Neg = 5, and T_Neg = 418, which together account for all 442 glycans.
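The metrics can be derived from these four counts as sketched below. The formulas are the standard textbook definitions, which is an assumption about how the reported figures were obtained; a different definition or rounding convention in the source could give different values. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics computed from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "true_negative_rate": tn / (tn + fp),  # specificity
        "precision": tp / (tp + fp),  # positive-predictive value
        "negative_predictive_value": tn / (tn + fn),
        "miss_rate": fn / (fn + tp),  # false negative rate
    }


# Counts reported for the experiment.
metrics = classification_metrics(tp=14, fp=5, fn=5, tn=418)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and miss rate share the same denominator, so they always sum to 1.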
Figure 2.12 Concepts of type-I and type-II errors in terms of sets.
Table 2.3 Performance of the method using various metrics.
Parameter | Value |
---|---|
Sensitivity | 0.736 |
True negative rate | 0.976 |
Precision | 0.736 |
Negative-predictive value | 0.163 |
Miss
|