The WPT‐based node energy SF, $SF_{WPT}(u)$ [16], can then be expressed as in (2.15).
where
$u$ $u$th wavelet packet node at level $L$, $u = 1, 2, \ldots, 2^L$;
$v$ subband length for each wavelet packet node at level $L$, $v = N/2^L$.
The signal’s energy distribution in a specific frequency band is calculated from all $c_L[n]$ in each wavelet packet node using (2.15) and can be used as an SF [16], which provides more useful information than using $c_L[n]$ directly.
In this way, the WPT technique localizes the information in non‐stationary signals in both the time and frequency domains, and it is therefore widely applied to mechanical fault diagnosis.
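As a minimal sketch of the node‐energy idea, the following Python code decomposes a signal into the $2^L$ wavelet packet nodes at level $L$ and computes one energy value per node. It rests on assumptions that are not from the text: a Haar wavelet is used for simplicity, the energy is taken as the plain sum of squared coefficients (the exact form and normalization of (2.15) may differ), and all function names are illustrative.

```python
import math


def haar_step(x):
    """Split x into low-pass and high-pass halves (orthonormal Haar filters)."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / s for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / s for i in range(half)]
    return low, high


def wavelet_packet_nodes(signal, level):
    """Return the 2**level coefficient lists at the given level (natural order)."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            # Unlike the plain DWT, both the low and high branches are split again.
            low, high = haar_step(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


def node_energies(signal, level):
    """Assumed energy feature: sum of squared coefficients in each node."""
    return [sum(c * c for c in node) for node in wavelet_packet_nodes(signal, level)]


# Example: N = 8 samples at level L = 2 give 4 nodes of length v = 8 / 4 = 2.
signal = [1.0] * 8
nodes = wavelet_packet_nodes(signal, 2)
energies = node_energies(signal, 2)
```

Because the Haar filters used here are orthonormal, the node energies sum to the energy of the original signal. The nodes are returned in natural (filter‐bank) order, which is not the same as increasing frequency order.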
2.3.3.4 Autoencoder
Recently, the AEN has become an important and popular technique for efficiently reducing the dimensionality of large volumes of data and generating an abstract representation of them [11, 12]. An AEN is an unsupervised backpropagation neural network consisting of three fully connected layers: encoder (input), code (middle), and decoder (output).
The encoder layer encodes and compresses the data into the code layer, and the decoder layer then reconstructs the output from the compressed internal representation in the code layer so that it is as close as possible to the original input. As depicted in Figure 2.22, the encoder, code, and decoder can each be designed with one or more layers.
Figure 2.22 Architecture of the AEN.
Let $x$ be one variable of the input set and $\hat{x}$ its output. The mathematical relationships between the layers can then be defined as (2.16) and (2.17):

$$h = f_{EN}(x) = f_a(W_{EN}\,x + b_{EN}) \tag{2.16}$$

$$\hat{x} = f_{DE}(h) = f_a(W_{DE}\,h + b_{DE}) \tag{2.17}$$

where
$h$ compressed code of the middle layer;
$\hat{x}$ output reconstructed from $h$ in the middle layer;
$f_{EN}$ encoder layer;
$f_{DE}$ decoder layer;
$f_a$ activation function;
$W_{EN}$ network weight for node in the encoder;
$W_{DE}$ network weight for node in the decoder;
$b_{EN}$ bias for node in the encoder layer;
$b_{DE}$ bias for node in the decoder layer.
The numbers of input and output nodes depend on the size of the raw data, while the number of nodes in the code layer is a hyperparameter that, like the other hyperparameters, varies with the AEN architecture and the input data format.
All weights and biases are usually initialized randomly; the learning procedure then iteratively updates the weights through the back‐propagation algorithm, which minimizes the reconstruction error between $x$ and its reconstructed output $\hat{x}$.
Instead of adopting the entire AEN, the compressed code $h$ is widely used as a set of condensed SFs to represent the original input set. If there are $cp$ components in the code layer, then the SF set $SF_{AEN}$ can be defined as $\{h_1, h_2, \ldots, h_{cp}\}$. This feature extraction method is very similar to another well‐known dimensionality reduction technique: principal component analysis (PCA).
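The encoder and decoder mappings and the back‐propagation update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the text: a single code layer, a sigmoid activation for the activation function, a squared reconstruction error, plain stochastic gradient descent, and a small made‐up data set. The class and function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Activation function f_a (assumed to be a sigmoid here)."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """Single-code-layer autoencoder: x -> h -> x_hat (illustrative only)."""

    def __init__(self, n_in, n_code, seed=0):
        rng = random.Random(seed)
        # Weights and biases are initialized randomly, as the text describes.
        self.W_en = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_code)]
        self.b_en = [rng.uniform(-0.5, 0.5) for _ in range(n_code)]
        self.W_de = [[rng.uniform(-0.5, 0.5) for _ in range(n_code)] for _ in range(n_in)]
        self.b_de = [rng.uniform(-0.5, 0.5) for _ in range(n_in)]

    def encode(self, x):
        """Compressed code: h = f_a(W_EN x + b_EN)."""
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W_en, self.b_en)]

    def decode(self, h):
        """Reconstruction: x_hat = f_a(W_DE h + b_DE)."""
        return [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
                for row, b in zip(self.W_de, self.b_de)]

    def train_step(self, x, lr):
        """One back-propagation update on the error 0.5 * sum((x_hat - x)**2)."""
        h = self.encode(x)
        x_hat = self.decode(h)
        # Error terms at the decoder output and at the code layer.
        d_out = [(xh - xi) * xh * (1.0 - xh) for xh, xi in zip(x_hat, x)]
        d_code = [sum(d_out[i] * self.W_de[i][j] for i in range(len(x)))
                  * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                self.W_de[i][j] -= lr * d_out[i] * h[j]
            self.b_de[i] -= lr * d_out[i]
        for j in range(len(h)):
            for k in range(len(x)):
                self.W_en[j][k] -= lr * d_code[j] * x[k]
            self.b_en[j] -= lr * d_code[j]


def mean_reconstruction_error(model, data):
    """Average of 0.5 * sum((x_hat - x)**2) over all samples."""
    total = 0.0
    for x in data:
        x_hat = model.decode(model.encode(x))
        total += 0.5 * sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x))
    return total / len(data)


# Made-up 4-dimensional samples, compressed to cp = 2 code components.
data = [[0.9, 0.8, 0.2, 0.1], [0.8, 0.9, 0.1, 0.2],
        [0.9, 0.9, 0.1, 0.1], [0.2, 0.1, 0.9, 0.8]]
model = TinyAutoencoder(n_in=4, n_code=2)
error_before = mean_reconstruction_error(model, data)
for _ in range(2000):
    for x in data:
        model.train_step(x, lr=0.5)
error_after = mean_reconstruction_error(model, data)

# Only the encoder is kept for feature extraction: one code vector per sample.
sf_aen = [model.encode(x) for x in data]
```

After training, only `encode` is needed to produce the condensed features, which mirrors the use of the code layer rather than the entire network. The learning rate, number of epochs, and initialization range are arbitrary choices for this toy example.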
2.4 Case Studies
Four practical examples using real‐world data are presented to validate the data acquisition and data preprocessing techniques addressed in the previous sections. Details are described below.
2.4.1 Detrending of the Thermal Effect in Strain Gauge Data
To detect force and torque during machining, a smart tool holder is developed and used in a CNC milling machine. Strain gauges attached to the holder detect variations in its bending and torsion, because the resistance of a strain gauge changes in proportion to its change in length. As illustrated in Figure 2.23, the strain values on the tool holder are sensed by Wheatstone bridges, digitized using an ADC, processed by a microprocessor, and transmitted to an edge computer using the message queuing telemetry transport (MQTT) protocol through a Wi‐Fi module.
Figure 2.23 Using a smart tool holder to detect tool state.
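The first stages of this chain, from ADC reading to strain value, can be sketched as below. The sketch is only illustrative and rests on assumptions that are not from the text: a quarter‐bridge configuration with the usual small‐strain approximation, a signed ADC reading behind an amplifier, and made‐up values for the reference voltage, resolution, gain, excitation voltage, and gauge factor. The function names are hypothetical.

```python
def adc_counts_to_voltage(counts, v_ref, bits, gain):
    """Convert a signed ADC reading to the bridge output voltage in volts.

    Assumes a bipolar ADC whose full scale is 2**(bits - 1) counts and an
    amplifier with the given gain in front of it.
    """
    full_scale = 2 ** (bits - 1)
    return counts / full_scale * v_ref / gain


def bridge_voltage_to_strain(v_out, v_excitation, gauge_factor):
    """Quarter-bridge, small-strain approximation: v_out / v_ex = GF * strain / 4."""
    return 4.0 * v_out / (gauge_factor * v_excitation)


def relative_resistance_change(strain, gauge_factor):
    """Gauge-factor relation: delta_R / R = GF * strain."""
    return gauge_factor * strain


# Made-up example values (not taken from the smart tool holder in the text).
v_out = adc_counts_to_voltage(counts=16384, v_ref=2.0, bits=16, gain=100.0)
strain = bridge_voltage_to_strain(v_out, v_excitation=5.0, gauge_factor=2.0)
delta_r_over_r = relative_resistance_change(strain, gauge_factor=2.0)
```

Other bridge configurations (half or full bridge) change the factor 4 in the voltage‐to‐strain relation, so the constant must match the actual wiring of the gauges.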
The edge computer, located near the CNC machine, receives and processes the strain values and issues tool events to the controller when tool breakage is detected or the tool’s RUL is short. A tool holder is stiff enough to clamp a tool under various machining conditions, so machining causes only tiny variations in the length and resistance of a strain gauge. Because a high‐gauge‐factor sensor is employed, a length difference of less than 1 μm in the tool holder can still be detected during machining. However, the strain gauge exhibits considerable thermal variation even in a stationary state. Thus, one challenge is how to remove the thermal effect from the strain‐gauge data to derive effective strain values; the details are described in [17] via IEEE DataPort.
For example, Figure 2.24a depicts raw signals collected during machining for 2.8 s (sampling rate of 10 kHz). Because of the high heat capacity of a real machine, the thermal variation can be assumed to be constant within a short period such as 5 s. After applying the wavelet de‐noising method with five levels (denoted DB5), the thermal trend can be derived as illustrated in Figure 2.24b. Then, the de‐trended data can be obtained by subtracting the thermal trend from the raw data; the result is depicted in Figure 2.24c.
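The detrending step can be sketched as follows: estimate the slowly varying trend as the level‐5 wavelet approximation (all detail coefficients set to zero) and subtract it from the raw data. The sketch assumes a Haar wavelet rather than the wavelet used in the text, in which case each trend value is simply the mean of a block of 2^5 = 32 samples. The function names and the synthetic signal are illustrative and are not the data of [17].

```python
import math


def haar_trend(x, levels):
    """Level-`levels` Haar approximation of x, reconstructed to full length."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    s = math.sqrt(2.0)
    approx = list(x)
    # Decomposition: keep only the low-pass (approximation) branch.
    for _ in range(levels):
        approx = [(approx[2 * i] + approx[2 * i + 1]) / s
                  for i in range(len(approx) // 2)]
    # Reconstruction with every detail coefficient set to zero.
    for _ in range(levels):
        upsampled = []
        for a in approx:
            upsampled.extend([a / s, a / s])
        approx = upsampled
    return approx


def detrend(x, levels=5):
    """Return (trend, detrended) where detrended = x - trend."""
    trend = haar_trend(x, levels)
    detrended = [xi - ti for xi, ti in zip(x, trend)]
    return trend, detrended


# Synthetic example: a constant offset of 5.0 (standing in for the thermal
# level) plus a fast alternating component (standing in for the strain signal).
raw = [5.0 + (1.0 if n % 2 == 0 else -1.0) for n in range(64)]
trend, detrended = detrend(raw, levels=5)
```

With a smoother wavelet the trend would no longer be piecewise constant, but the subtraction step stays the same.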