Gene Cheung

Graph Spectral Image Processing


Скачать книгу

has been widely used for various image processing tasks. For example, JPEG and MPEG image/video compression standards still use block-based predictions and transforms, even in their most recent standards. Moreover, graph-based image processing also uses such a subdivision as preprocessing. It can also be combined with the following fast computation approaches.

      A simple solution for image subdivision is the block-based approach. It divides the input image into an equal-sized subblocks (these blocks can be overlapped with an appropriate window function), after which the favorite image processing tasks can be performed. The advantage it provides is simplicity: we only have to consider the size of subblocks to make a trade-off between performance and complexity. Sizes of the consistent image regions vary significantly; as a result, a recursive subdivision, called quadtree decomposition, provides a good trade-off.

      More complex image subdivisions are also possible by utilizing an image segmentation. Although these segmentated sub-images are not rectangular in general, we can directly perform graph-based image processing in such non-rectangular regions by using appropriate graphs.

      1.6.2. Downsampling

      Downsampling is another typical approach for reducing the computation cost, and this has been used everywhere in image processing applications. A challenge for graph-based image processing is to find a good low-resolution graph, which properly reflects the original pixel structure.

      Reducing the size of a graph is called graph reduction or graph coarsening. It is divided into two phases:

       1) Phase 1: reducing the number of nodes;

       2) Phase 2: reconnecting edges for the downsampled pixels.

      In image processing, Phase 1 is relatively straightforward. We can assume the original image pixels are nodes on a uniform grid. As in usual image processing, picking up every other node (when the image signal is downsampled by two) will be reasonable.

      For more general graphs like those used in point cloud processing, we need to select the “best” set of nodes. This problem is called sampling set selection. Although this is beyond our scope in this chapter, please refer to (Tanaka et al. 2020; Sakiyama et al. 2019c) and references therein.

      In contrast to Phase 1, Phase 2 is not very straightforward. If pixels in the original image/block are associated with a graph, the downsampled one should be reconnected by edges, such that the reduced-size graph reflects the original structure. In the general

       1) the reduced-size graph has non-negative edge weights;

       2) the connectivity of the original graph is preserved in the reduced-size graphs;

       3) the spectrum of the reduced-size graph is representative of the original graph;

       4) the reduced-size graph preserves the original structural properties;

       5) if two nodes are connected in the original graph, they should have a similar edge weights in the reduced-size graph;

       6) it is tractable in terms of implementation and computational complexity;

       7) the reduced-size graph preserves the sparsity (i.e. the ratio between the number of nonzero edges and that of pixels) of the original one.

      Existing reconnection methods do not always satisfy all of these simultaneously; however, they do exhibit some of these properties. The order of the desired properties depends on applications considered. Major approaches have been summarized in Shuman et al. (2016a).

      1.6.3. Precomputing GFT

      If an image or subblock has several typical patterns, i.e. graphs, precomputing GFT bases for these graphs may be a reasonable choice to decrease the computational burden. This is because when we use them off the shelf, the computation cost reduces significantly as a result of decreased complexity, with a sacrifice of the storage cost for the GFT matrices and the cost of searching the optimal precomputed GFT from a given image. This precomputing strategy is popular in standard image processing. For example, in modern image/video coding standards, some precomputed transforms, such as DCT and discrete sine transform (DST) with various sizes, are utilized to represent image blocks as sparsely as possible.

      Some precomputing methods have been proposed by Hu et al. (2015) and Zhang and Liang (2017), and they are mainly used for image compression. As expected, the GFT yields sparse transformed coefficients for piecewise smooth images/blocks. For those without such piecewise regions, conventional transforms like the DCT and DST are basically included as a set of precomputed bases.

      1.6.4. Partial eigendecomposition

      [1.50] eq-image

      where ||·||0 represents the number of non-zero elements, i.e. pseudo-norm. Here, without loss of generality, we can assume the first K GFT coefficients are non-zero:

      [1.51] eq-image

      With the GFT basis U, it is equivalently represented as

      [1.52] images

      where × represents some possible non-zero elements and

      [1.53] eq-image

      A partial eigendecomposition proposed in literature gives the following approximation of L:

      [1.54] eq-image

      Evaluating only requires K (< N) eigenvectors and eigenvalues, which is significantly less than that obtained using the full eigendecomposition. In general, its computational complexity will be .

      The previous subsection proposes that we can alleviate the heavy computational burden by assuming the bandlimitedness of the graph signal. However, this requires the assumption on the signal model prior to filtering, but the signal is not bandlimited in general.

      In many application scenarios, we often only need the evaluation of x with a given (linear) matrix function h(L). That is, the eigenvalues and eigenvectors themselves are often unnecessary. The polynomial approximation methods introduced here enable us to calculate an approximation of y = h(L)x without the (partial) decomposition of the variation operator.