Gene Cheung

Graph Spectral Image Processing


Скачать книгу

9.5. References 10 Graph Neural Networks for Image Processing 10.1. Introduction 10.2. Supervised learning problems 10.3. Generative models for point clouds 10.4. Concluding remarks 10.5. References

      7  List of Authors

      8  Index

      9  End User License Agreement

      List of Illustrations

      1 Introduction to Graph Spectral Image ProcessingFigure I.1. Visual comparison of JPEG dequantization methods for a butterfly at QF = 5. The corresponding PSNR values are also given. For a color version of this figure, see www.iste.co.uk/cheung/graph.zip

      2 Chapter 1Figure 1.1. Left: Star graph with N = 7. Right: Textured region of BarbaraFigure 1.2. Comparison of approximation errors in . For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 1.3. Image denoising example using bilateral filters. From left to right: Original, noisy (PSNR: 20.02 dB), bilateral filter in the pixel domain (PSNR: 26.23 dB), and bilateral filter in the graph frequency domain (PSNR: 27.14 dB). Both bilateral filters use the same parametersFigure 1.4. Framework of graph filter bank

      3 Chapter 2Figure 2.1. The different choices of graph lead to different representations of the same signal. The signal forms a smooth representation on the graph (a) as its values vary slowly along the edges of the graph, and it mainly consists of low-frequency components in the graph spectral domain (b). The representation of the same signal is less smooth on the graph (c), which consists of both low and high frequency components in the graph spectral domain (d). Figure from (Dong et al. 2019) with permission. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 2.2. Diffusion processes on the graph defined by a heat diffusion kernel (top right) and a graph shift operator (bottom right). Figure from (Dong et al. 2019) with permission. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 2.3. (a) A graph signal. (b-e) Its decomposition in four localized simple components. Each component is a heat diffusion process (e−τL) at time τ that has started from different network nodes. The size and the color of each ball indicates the value of the signal at each vertex of the graph. Figure from Thanou et al. (2017) with permission. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 2.4. A graph signal x at time step t is modeled as a linear combination of its observations in the past T time steps and a random noise process n[t]. Figure from Dong et al. (2019) with permission. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 2.5. Inferring a graph for image coding. Figure from (Dong et al. 2019) with permission. For a color version of this figure, see www.iste.co.uk/cheung/graph.zip

      4 Chapter 3Figure 3.1. Spatial graph-convolutional layer: an edge function weighs the node feature vectors, and an aggregation function merges them. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 3.2. Edge-conditioned convolution. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 3.3. Circulant approximation of a fully connected layer. On the left: A traditional parameter matrix without any structure. On the right: A stack of partial circulant matrices, where the weights in a row are shifted for a number of following rows before being replaced. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 3.4. Low-rank approximation of the last layer of the network. For a color version of this figure, see www.iste.co.uk/cheung/graph.zip

      5 Chapter 4Figure 4.1. Separable and non-separable transforms: (Left) For an N × N block, separable transforms are composed of two (possibly distinct) N ×N transforms, Urow and Ucol, applied to rows and columns of the block. (Right) Non-separable transforms apply a N2×N2 linear transformation using UFigure 4.2. Building blocks of a typical image/video encoder and decoder consisting of three main steps, which are (i) prediction, (ii) transformation, and (iii) quantization and entropy coding. This figure is adapted from Figure 1 in Egilmez et al. (2020a)Figure 4.3. Illustration of spectral transform design methodology. Traditional methods use a sample covariance to derive an approximation of KLT (sample KLT). This chapter focuses on approaches to find a graph Laplacian from data for designing GFTs (whose steps are represented by red arrows). For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 4.4. 2D GMRF models for intra- and inter-predicted signals. Filled vertices correspond to reference pixels obtained (a) from neighboring blocks and (b) from other frames via motion compensation. Unfilled vertices denote the pixels to be predicted and then transform coded. This figure is adapted from Figure 3 in Egilmez et al. (2020a)Figure 4.5. 1D GMRF models for (a) intra- and (b) inter-predicted signals. Black-filled vertices represent the reference pixels and unfilled vertices denote pixels to be predicted and then transform coded. This figure is adapted from Figure 2 in Egilmez et al. (2020a)Figure 4.6. An illustration of graph construction for a given 8 × 8 residual block signal, where wc = 1 and we = wc/sedge = 0.1 where sedge = 10. The resulting basis patches are also depicted. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 4.7. A 1D graph-based model with an image edge at location l = j. All black colored edges have weights equal to wc, and the gray edge between vertices vj and vj+1 is weighted as we = wc/sedge. This figure is adapted from Figure 5 in Egilmez et al. (2020a)Figure 4.8. Coding gain (cg) versus sedge for block sizes with N = 4, 8, 16, 32, 64. EA-GFT provides better coding gain (i.e. cg is negative) when sedge is larger than 10 across different block sizes. This figure is adapted from Figure 6 in Egilmez et al. (2020a). For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 4.9. Coding gain (cg) versus bits per pixel (R/N) for different edge sharpness parameters sedge = 10, 20, 40, 100, 200. EA-GFT provides better coding gain (i.e. cg is negative) if sedge is larger than 10 for different block sizes. This figure is adapted from Figure 7 in Egilmez et al. (2020a). For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 4.10. BD-rates achieved for coding intra-predicted blocks with the RDOT scheme based on KLT, S-GFT and NS-GFT, which are trained on datasets with fewer number of samples. This figure is adapted from Figure 14 in Egilmez et al. (2020a). For a color version of this figure, see www.iste.co.uk/cheung/graph.zip

      6 Chapter 5Figure 5.1. Example of a 3D point cloud: Stanford Bunny. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 5.2. Example of omnidirectional image: Sopha (Hawary et al. 2020). For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 5.3. Example of light field image: Succulents (Jiang et al. 2017). For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 5.4. General scheme of graph spectral 3D image compressionFigure 5.5. Graph-based encoder’s principles, where x is the signal to encode, ŷ is the previously decoded data, z is the residue, α the transformed coefficients and

their quantized versionFigure 5.6. Illustration of the difference between a compact and a not compact representation. For a color version of this figure, see www.iste.co.uk/cheung/graph.zipFigure 5.7. Graph topology for light fields and