$$\bar{f}(\mathbf{x}) = \sum_{i=1}^{c'} \left[ \left( \boldsymbol{\theta}_i^{T}\mathbf{x} + \theta_{0i} \right) \prod_{j} \mu(x_j, z_{ij}, \Theta_j) \right] \qquad (4.63)$$
where c′ is the number of rules and βᵢ = (θᵢ, θ₀ᵢ, zᵢⱼ, Θⱼ) are the parameters of the i-th rule.
Rather than directly extracting support vectors to generate fuzzy rules, the FM problem is solved by learning the parameters in Eq. (4.63) while taking the experience of Eq. (4.62) into account. Eq. (4.63) is then expected to describe the input–output behavior in the same way as Eq. (4.62). However, the acquired experience depends heavily on the selection of the hyperparameters [103, 104]; an improper choice may lead to poor performance and yield useless experience and information. Here, some selection methods are suggested:
1 The regularization parameter C can be set according to the range of the output values of the training data [103]: C = max(|ȳ + 3σy|, |ȳ − 3σy|), where ȳ and σy are the mean and the standard deviation of the n training outputs yᵢ (see the sketch following this list).
2 ν‐SVR is employed instead of ε‐SVR, since the number of support vectors can be controlled directly by adjusting ν. The value of ν can be determined by an asymptotically optimal procedure; for Gaussian noise, the theoretically optimal value is 0.54 [104]. For the kernel parameter Θ′, the k‐fold cross‐validation method is utilized [104], as in the sketch below.
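To make both prescriptions concrete, here is a minimal sketch in Python. The synthetic data, the grid of γ values, and the choice k = 5 are illustrative assumptions, and scikit-learn's NuSVR stands in for the ν‐SVR learner; none of these specifics come from the original text.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import NuSVR

# Hypothetical training data (any regression set works here).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sinc(X).ravel() + rng.normal(0.0, 0.1, size=200)

# Item 1: prescription of [103] -- C from the mean and standard
# deviation of the n training outputs.
y_mean, y_std = y.mean(), y.std()
C = max(abs(y_mean + 3 * y_std), abs(y_mean - 3 * y_std))

# Item 2: nu-SVR with the asymptotically optimal nu for Gaussian
# noise (0.54), and k-fold cross-validation over the Gaussian
# kernel parameter (gamma plays the role of Theta' here).
search = GridSearchCV(
    NuSVR(nu=0.54, C=C, kernel="rbf"),
    param_grid={"gamma": np.logspace(-2, 2, 9)},
    cv=5,  # k = 5 folds
)
search.fit(X, y)
print(f"C = {C:.3f}, best gamma = {search.best_params_['gamma']:.3g}")
```

After fitting, `search.best_estimator_` is the ν‐SVR whose support vectors and coefficients supply the "experience" discussed above.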
Reduced‐set vectors: In order to share the experience, we are interested in constructing Eq. (4.63) such that the original Eq. (4.62) is approximated. In the following, the kernel of Eq. (4.62) is written as k′(x, xᵢ) rather than k(x, xᵢ), to make its kernel parameters Θ′ explicit. Similarly, in Eq. (4.63),
$$\prod_{j} \mu(x_j, z_{ij}, \Theta_j) = k(\mathbf{x}, \mathbf{z}_i) \qquad (4.64)$$
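As a concrete instance, suppose the membership functions are Gaussian (an illustrative assumption; the section does not fix their form at this point). Then the product in Eq. (4.64) collapses into a single multivariate Gaussian kernel:

$$\prod_{j} \exp\!\left( -\frac{(x_j - z_{ij})^2}{\Theta_j^2} \right) = \exp\!\left( -\sum_{j} \frac{(x_j - z_{ij})^2}{\Theta_j^2} \right) = k(\mathbf{x}, \mathbf{z}_i),$$

so each fuzzy rule contributes one kernel term centered at zᵢ, mirroring the support-vector terms of Eq. (4.62).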
If we let the consequent parameter θ0i be G(x0i), we have
For a smaller upper bound, we assume that Θ′ = Θ. Then, according to the Cauchy–Schwarz inequality, the right-hand side of the above inequality is simplified to
where
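How such a bound is typically obtained can be sketched as follows, under assumptions made here for illustration: Θ′ = Θ as above, a kernel normalized so that k(x, x) = 1, bias terms ignored, ᾱᵢ denoting the expansion coefficients of Eq. (4.62), and γᵢ an effective scalar weight standing in for the consequent of rule i (this notation is assumed, not taken from the text):

$$|f(\mathbf{x}) - \bar{f}(\mathbf{x})| = \left| \left\langle \Phi(\mathbf{x}),\, \Psi - \bar{\Psi} \right\rangle \right| \le \|\Phi(\mathbf{x})\| \, \|\Psi - \bar{\Psi}\| = \|\Psi - \bar{\Psi}\|,$$

with $\Psi = \sum_i \bar{\alpha}_i \Phi(\mathbf{x}_i)$ and $\bar{\Psi} = \sum_i \gamma_i \Phi(\mathbf{z}_i)$, so that

$$\|\Psi - \bar{\Psi}\|^{2} = \sum_{i,j} \bar{\alpha}_i \bar{\alpha}_j k(\mathbf{x}_i, \mathbf{x}_j) - 2 \sum_{i,j} \bar{\alpha}_i \gamma_j k(\mathbf{x}_i, \mathbf{z}_j) + \sum_{i,j} \gamma_i \gamma_j k(\mathbf{z}_i, \mathbf{z}_j).$$

This quantity is computable purely from kernel evaluations, and the reduced-set construction minimizes it over the centers zᵢ and weights γᵢ.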
If we use the notation ‖·‖∞ as ‖g‖∞ = sup_x |g(x)|,