maximization
PF: particle filter
PID: proportional‐integral‐derivative
POMDP: partially‐observable Markov decision process
RBPF: Rao‐Blackwellised particle filter
ReLU: rectified linear unit
R‐NEM: relational neural expectation maximization
RNN: recurrent neural network
SGVB: stochastic gradient variational Bayes
SIR: sampling importance resampling
SLAM: simultaneous localization and mapping
SMAUG: single molecule analysis by unsupervised Gibbs sampling
SMC: sequential Monte Carlo
SoC: state of charge
SoH: state of health
SVSF: smooth variable‐structure filter
TD learning: temporal‐difference learning
UIO: unknown‐input observer
UKF: unscented Kalman filter
VAE: variational autoencoder
VFEM: variational filtering expectation maximization
VSC: variable‐structure control
wILI: weighted influenza‐like illness
1 Introduction
1.1 State of a Dynamic System
In many branches of science and engineering, deriving a probabilistic model for sequential data plays a key role. System theory provides guidelines for studying the underlying dynamics of sequential data (time series). In describing a dynamic system, the notion of state is a key concept [1]:
Definition 1.1 The state of a dynamic system is the smallest collection of variables that must be specified at a time instant $k_0$ in order to be able to predict the behavior of the system for any time instant $k \geq k_0$. To be more precise, the state is the minimal record of the past history, which is required to predict the future behavior.
According to the principle of causality, any dynamic system may be described from the state perspective. Deploying a state‐transition model allows for determining the future state of a system, $\mathbf{x}_k$, at any time instant $k > k_0$, given its initial state, $\mathbf{x}_{k_0}$, at time instant $k_0$ as well as the inputs to the system, $\mathbf{u}_k$, for $k \geq k_0$. The output of the system, $\mathbf{y}_k$, is a function of the state, which can be computed using a measurement model. In this regard, state‐space models are powerful tools for analysis and control of dynamic systems.
1.2 State Estimation
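The estimation problem is posed against the state‐transition and measurement models introduced in Section 1.1. In a generic discrete‐time form they can be written as below. The function names $\mathbf{f}$ and $\mathbf{g}$, the time index $k$, and the symbols $\mathbf{x}_k$, $\mathbf{u}_k$, and $\mathbf{y}_k$ for state, input, and output are notation assumed here for illustration, and the dependence of the output on $\mathbf{u}_k$ drops out when the input has no direct effect on the measurement.

```latex
\begin{align}
  \mathbf{x}_{k+1} &= \mathbf{f}(\mathbf{x}_k, \mathbf{u}_k)
    && \text{(state-transition model)} \\
  \mathbf{y}_k &= \mathbf{g}(\mathbf{x}_k, \mathbf{u}_k)
    && \text{(measurement model)}
\end{align}
```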
Observability is a key concept in system theory, which refers to the ability to reconstruct the hidden or latent state variables that cannot be directly measured, from the measured variables in the minimum possible length of time [1]. In building state‐space models, two key questions deserve special attention [2]:
(i) Is it possible to identify the governing dynamics from data?
(ii) Is it possible to perform inference from observables to the latent state variables?
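For the special case of a linear time‐invariant model, question (ii) has a standard answer: the state can be reconstructed from the outputs when the observability matrix, formed by stacking $C, CA, \dots, CA^{n-1}$, has rank $n$. The sketch below illustrates that rank test in plain Python. The linear model, the matrix names `A` and `C`, and the constant‐velocity example are assumptions introduced for illustration, not a model taken from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m, tol=1e-9):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]  # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) for an n-state linear system."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rows


def is_observable(A, C):
    """True when the observability matrix has full column rank."""
    return rank(observability_matrix(A, C)) == len(A)


if __name__ == "__main__":
    # Hypothetical constant-velocity model: state = [position, velocity].
    A = [[1.0, 0.1],
         [0.0, 1.0]]
    print(is_observable(A, [[1.0, 0.0]]))  # position is measured
    print(is_observable(A, [[0.0, 1.0]]))  # only velocity is measured
```

Measuring position makes both state variables recoverable, because velocity shows up in successive position readings. Measuring velocity alone leaves the position undetermined, so the test fails for the second measurement matrix.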
At time instant $k$, the inference problem to be solved is to find the estimate of $\mathbf{x}_{k+\alpha}$ in the presence of noise, which is denoted by $\hat{\mathbf{x}}_{k+\alpha|k}$. Depending on the value of $\alpha$, estimation algorithms are categorized into three groups [3]:
(i) Prediction: $\alpha > 0$,
(ii) Filtering: $\alpha = 0$,
(iii) Smoothing: $\alpha < 0$.
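The distinction between filtering and prediction can be made concrete with a scalar Kalman filter, used here as a stand‐in estimator rather than as the method of the text. Take $\alpha$ as the offset between the time of the estimated state and the time of the latest measurement. The sketch assumes a linear‐Gaussian model $x_{k+1} = a x_k + w_k$, $y_k = c x_k + v_k$, where the parameter names `a`, `c`, `q`, `r` and their default values are hypothetical. It returns the filtered estimate ($\alpha = 0$) and the one‐step prediction ($\alpha = 1$) at every step. Smoothing ($\alpha < 0$) would need an additional backward pass and is not shown.

```python
import random


def simulate(n, a=0.9, c=1.0, q=0.1, r=1.0, seed=0):
    """Generate states and noisy measurements from the assumed scalar model."""
    rng = random.Random(seed)
    x, states, measurements = 0.0, [], []
    for _ in range(n):
        states.append(x)
        measurements.append(c * x + rng.gauss(0.0, r ** 0.5))
        x = a * x + rng.gauss(0.0, q ** 0.5)
    return states, measurements


def kalman_filter(measurements, a=0.9, c=1.0, q=0.1, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter; q and r are the process and measurement variances.

    filtered[k]  estimates x[k]   from y[0..k]  (alpha = 0)
    predicted[k] estimates x[k+1] from y[0..k]  (alpha = 1)
    """
    x_pred, p_pred = x0, p0  # prior on x[0] before any measurement arrives
    filtered, predicted = [], []
    for y in measurements:
        # Measurement update: correct the prediction with the new output.
        gain = p_pred * c / (c * c * p_pred + r)
        x_filt = x_pred + gain * (y - c * x_pred)
        p_filt = (1.0 - gain * c) * p_pred
        filtered.append(x_filt)
        # Time update: propagate the estimate through the state-transition model.
        x_pred = a * x_filt
        p_pred = a * a * p_filt + q
        predicted.append(x_pred)
    return filtered, predicted


if __name__ == "__main__":
    states, ys = simulate(20)
    filtered, predicted = kalman_filter(ys)
    for k in range(3):
        print(k, round(states[k], 3), round(filtered[k], 3), round(predicted[k], 3))
```

The prediction is obtained from the filtered estimate by applying the state‐transition model once more without a new measurement, so its error variance is never smaller than what the model and the process noise allow.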
With regard to these two questions, more sophisticated representations of the system under study can be deployed to improve performance. However, the corresponding inference algorithms may become computationally demanding. Hence, for designing efficient data‐driven inference algorithms, the following points must be taken into account [2]:
(i) The underlying assumptions for building a state‐space model must allow for reliable system identification and plausible long‐term prediction of the system behavior.
(ii) The inference mechanism must be able to capture rich dependencies.
(iii) The algorithm must inherit the merit of learning machines, namely that they can be trained on raw data such as sensory inputs in a control system.
(iv) The algorithm must scale to big data, with the model parameters optimized by stochastic gradient descent.
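Points (i) and (iv) can be illustrated together with a small example: identifying one parameter of a state‐transition model from recorded data by mini‐batch stochastic gradient descent. The sketch below is a toy under stated assumptions, not an algorithm from the text. The model $x_{k+1} = a x_k + u_k$, the function names `make_data` and `fit_sgd`, and the learning rate, batch size, and noise level are all hypothetical choices.

```python
import random


def make_data(n=50, a_true=0.8, noise_std=0.01, seed=0):
    """Record (x, u, x_next) triples from the assumed model with a unit input."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n):
        u = 1.0
        x_next = a_true * x + u + rng.gauss(0.0, noise_std)
        samples.append((x, u, x_next))
        x = x_next
    return samples


def fit_sgd(samples, lr=0.02, epochs=20, batch_size=10, seed=1):
    """Estimate the parameter a by mini-batch stochastic gradient descent."""
    rng = random.Random(seed)
    data = list(samples)
    a_hat = 0.0  # initial guess
    for _ in range(epochs):
        rng.shuffle(data)  # new random mini-batches every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean of 0.5 * (x_next - a_hat * x - u)**2
            # with respect to a_hat, averaged over the mini-batch.
            grad = -sum((xn - a_hat * x - u) * x
                        for x, u, xn in batch) / len(batch)
            a_hat -= lr * grad
    return a_hat


if __name__ == "__main__":
    print(fit_sgd(make_data()))
```

Each update touches only one mini‐batch, so the cost per step does not grow with the size of the data set, which is the sense in which such schemes scale to large data.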
Given the important role of computation in inference problems, Section 1.3 provides a brief account of the foundations of computing.
1.3 Construals of Computing
According to [4], a comprehensive theory of computing must meet three criteria:
(i) Empirical criterion: Doing justice to practice by keeping the analysis grounded in real‐world examples.
(ii) Conceptual criterion: Being understandable in terms of what it says, where it comes from, and what it costs.
(iii) Cognitive criterion: Providing an intelligible foundation for the computational theory of mind that underlies both artificial intelligence and cognitive science.
Following this line of thinking, it was proposed in [4] to distinguish the following construals of computation:
1 Formal symbol manipulation is rooted in formal logic and metamathematics. The idea is to build machines that are capable of manipulating symbolic or meaningful expressions regardless of their interpretation or semantic content.
2 Effective computability deals with the question of what can be done, and how hard it is to do it mechanically.
3 Execution of an algorithm or rule following focuses on what is involved in following a set of rules or instructions, and what behavior would be produced.
4 Calculation of a function considers the behavior of producing the value of a mathematical function as output, when a set of arguments is given as input.
5 Digital state machine is based on the idea of a finite‐state automaton.
6 Information processing focuses on what is involved in storing, manipulating, displaying, and trafficking of information.
7 Physical symbol systems is based on the idea that the way computers interact with symbols depends on their mutual physical embodiment. In this regard, computers may be assumed to be made of symbols.
8 Dynamics must be taken into account in terms of the roles that nonlinear elements, attractors, criticality, and emergence play in computing.
9 Interactive agents are capable of interacting and communicating with other agents and even people.
10 Self‐organizing or complex adaptive systems are capable of adjusting their organization or structure in response to changes in their environment in order to survive and improve their performance.
11 Physical implementation emphasizes the occurrence of computational practice in real‐world systems.