Group of authors

Muography


standard deviation and the daily number of eruptions with gray impulses for the data collection period from 20 October 2018 to 29 June 2020. The muon flux images and average flux values are excluded from this analysis for those days on which the MMOS did not operate reliably or maintenance work was performed in SMO.

Average muon flux values plotted with 1σ standard deviation (black error bars) for the erupting Minamidake crater (a), the deactivated Showa crater (b), and the Surface region of Sakurajima volcano (c) for the data collection period between 20 October 2018 and 29 June 2020.

Averaged relative flux values for eruption days and the two days preceding them for the erupting Minamidake crater (a), the deactivated Showa crater (b), and the Surface region (c).

      In this section, we present the application of three different ML techniques to the processing of muographic image data. The significant variation of the averaged relative flux values through the erupting Minamidake crater (Fig. 4.4) suggested that conventional classifier models could be applied to the averaged flux values. Thus, we applied an SVM and an artificial neural network (ANN) to the average fluxes calculated for the three selected regions. For the processing of daily muon flux images, a convolutional neural network (CNN) model was utilized. Each model was implemented, trained, and tested with scikit-learn version 0.22.1 (Pedregosa et al., 2011), Keras version 2.4.3 (Keras, 2020), and TensorFlow version 2.3.0 (Tensorflow, 2020). Receiver operating characteristic (ROC) analysis was performed on each trained model to evaluate its eruption forecasting performance on the test data (Fawcett, 2006). The performance of each model was quantified by the area under the ROC curve (AUC). The optimal cutoff points, i.e. the sensitivity and specificity values on the ROC curves, were determined with the Youden index (Youden, 1950). In this study, sensitivity (true positive rate) is the probability of forecasting an eruption that actually occurred, and specificity (1 − false positive rate) is the probability of not raising an alarm on a day without an eruption; the false alarm probability is therefore 1 − specificity.
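      The ROC quantities described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline, which relied on scikit-learn; the labels (1 = eruption day, 0 = no eruption), the scores, and all function names below are hypothetical and chosen only for illustration. The Youden index is taken in its standard form, J = sensitivity + specificity − 1.

```python
# Hypothetical toy data: 1 = eruption day, 0 = no eruption.
# Scores stand in for a classifier's output; they are not real results.
y_true = [1, 1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.35, 0.3, 0.2, 0.1]


def roc_points(y_true, scores):
    """Return (threshold, FPR, TPR) triples, from strictest to loosest cutoff."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(float("inf"), 0.0, 0.0)]  # nothing is flagged as an eruption
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_cutoff(points):
    """Cutoff maximizing J = sensitivity + specificity - 1 = TPR - FPR."""
    thr, fpr, tpr = max(points, key=lambda p: p[2] - p[1])
    return thr, tpr, 1.0 - fpr  # threshold, sensitivity, specificity


points = roc_points(y_true, scores)
area = auc(points)
best_thr, sens, spec = youden_cutoff(points)
print(f"AUC={area:.3f}, cutoff={best_thr}, "
      f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

      In practice, scikit-learn's `roc_curve` and `roc_auc_score` provide the same quantities; the explicit loops here only make the definitions visible.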

      In the following subsections, we briefly present the core concepts, the results of hyperparameter tuning, and the eruption forecasting performance achieved by each model.

      4.4.1 Processing of Average Fluxes with Support Vector Machine

      SVM is a versatile model that constructs a set of hyperplanes to separate multi-dimensional input data for classification, regression, or outlier detection (Vapnik, 1995). In this analysis, the SVM was implemented with a radial basis function kernel, which has two hyperparameters: the regularization parameter C and the kernel parameter γ. C sets the cost of misclassifying training samples and thereby regulates the trade-off between maximizing the margin, i.e. the distance of the data points from the hyperplane, and maximizing the number of correctly classified training samples. With a smaller C, the margin is wider and the SVM classifies the training data less accurately. With a larger C, the margin is narrower and the SVM classifies the training data better; however, it becomes more sensitive to the peculiarities of the training data, which can lower the classification accuracy on test data. The γ parameter is the inverse of the radius of influence of the samples selected by the SVM as support vectors. If γ is too small, the SVM cannot capture the complexity of the training data. If γ is too large, the region of influence of each support vector includes only the support vector itself, which results in over-fitting for any value of C.
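      The role of γ can be made concrete with a short sketch of the radial basis function kernel in its standard form, K(x, z) = exp(−γ‖x − z‖²), which is assumed here to be the kernel meant in the text. The two sample points and the γ values are hypothetical and serve only to show how the similarity between two fixed points changes with γ.

```python
import math


def rbf_kernel(x, z, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical feature vectors at squared distance 2.
a = (0.0, 0.0)
b = (1.0, 1.0)

# Small gamma: distant points still look similar (very smooth boundary).
# Large gamma: every point is similar only to itself (over-fitting).
for gamma in (0.01, 1.0, 100.0):
    print(f"gamma={gamma:>6}: K(a, b)={rbf_kernel(a, b, gamma):.3e}, "
          f"K(a, a)={rbf_kernel(a, a, gamma):.1f}")
```

      With γ = 0.01 the kernel value for the two points stays close to 1, so the model can barely distinguish them, whereas with γ = 100 it is practically zero and each sample influences only its immediate neighbourhood.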

      4.4.2 Processing of Average Fluxes with Neural Network

      An ANN model was also constructed to process the average flux values. Conceptually, an ANN is a directed, weighted graph of neurons. Each neuron has multiple inputs and produces one output, which can be connected to multiple neurons. The inputs of the first layer are the input data, such as a series of numbers or image data. The last layer represents the output of the ANN, which accomplishes the required task, such as prediction or classification. The connections between neurons correspond to the synapses of a biological brain. The input of a neuron is determined by an activation function, typically by ReLU or sigmoid, that is applied