LAMA-WAVELET: IMAGE INPAINTING WITH HIGH QUALITY OF FINE DETAILS AND OBJECT EDGES
DOI: https://doi.org/10.15588/1607-3274-2024-1-19

Keywords: image inpainting, wavelet transform, LaMa network, Daubechies wavelet, Fréchet inception distance, wavelet convolution

Abstract
Context. The problem of image inpainting in computer graphics and computer vision systems is considered. The subject of the research is deep learning convolutional neural networks for image inpainting.
Objective. The objective of the research is to improve the image inpainting performance in computer vision and computer graphics systems by applying wavelet transform in the LaMa-Fourier network architecture.
Method. The basic LaMa-Fourier network decomposes the image into global and local textures. It is proposed to improve the network block that processes the global context of the image, namely the spectral transform block. In this block, the Fourier Unit structure is replaced with the Simple Wavelet Convolution Block elaborated by the authors. In this block, a two-level 3D wavelet transform of the image is first performed using the Daubechies wavelet db4. The obtained 3D wavelet coefficients are split so that each subband represents a separate feature of the image. A convolutional layer, batch normalization, and the ReLU activation function are sequentially applied to the split coefficients at each level of the wavelet transform. The processed subbands of wavelet coefficients are then concatenated and the inverse wavelet transform is applied to them; its result is the output of the block. Note that the wavelet coefficients at different levels are processed separately. This reduces the computational complexity of evaluating the network outputs while preserving the influence of the context of each level on image inpainting. The obtained neural network is named LaMa-Wavelet. The FID, PSNR, and SSIM indices, together with visual analysis, were used to estimate the quality of images inpainted with the LaMa-Wavelet network.
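The data flow described above (forward wavelet transform over two levels, per-subband processing, recombination, inverse transform) can be sketched in a few lines of numpy. This is only an illustrative sketch of the abstract's description, not the authors' implementation: it uses a Haar transform in place of db4 for brevity of boundary handling, and a per-subband scalar gain as a placeholder for the Conv+BN+ReLU stage; the function names `haar_dwt2`, `haar_idwt2`, and `simple_wavelet_conv_block` are ours.

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2-D Haar transform (stand-in for the paper's db4).
    x: (H, W) array with even H and W. Returns (LL, LH, HL, HH) subbands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Exact inverse of haar_dwt2."""
    H, W = LL.shape
    a = np.empty((H, 2 * W)); d = np.empty((H, 2 * W))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    x = np.empty((2 * H, 2 * W))
    x[0::2, :] = a + d; x[1::2, :] = a - d
    return x

def simple_wavelet_conv_block(x, w1=1.0, w2=1.0):
    """Sketch of the block's data flow: two-level DWT, separate processing
    of the subbands at each level (here a scalar gain w1/w2 as a placeholder
    for Conv + BatchNorm + ReLU), concatenation, inverse DWT."""
    LL1, LH1, HL1, HH1 = haar_dwt2(x)          # level 1
    LL2, LH2, HL2, HH2 = haar_dwt2(LL1)        # level 2 on the LL subband
    # level-2 subbands processed separately, then recombined into a new LL1
    LL1p = haar_idwt2(*(w2 * s for s in (LL2, LH2, HL2, HH2)))
    # level-1 detail subbands processed and recombined with the new LL1
    return haar_idwt2(LL1p, w1 * LH1, w1 * HL1, w1 * HH1)
```

With identity weights (`w1 = w2 = 1`) the block reduces to an exact analysis–synthesis round trip, which is a useful sanity check before inserting learned per-subband layers.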
Results. The proposed LaMa-Wavelet network has been implemented in software and evaluated on the problem of image inpainting. The PSNR of images inpainted with LaMa-Wavelet exceeds the results of the LaMa-Fourier network by 4.5% on average for narrow and medium masks, and by 6% on average for large masks. Applying LaMa-Wavelet can enhance SSIM by 2–4% depending on the mask size. However, inpainting one image with LaMa-Wavelet takes three times longer than with the LaMa-Fourier network. Analysis of specific images demonstrates that both networks show similar results when inpainting a homogeneous background. On complex backgrounds with repeating elements, LaMa-Wavelet is often more effective at restoring textures.
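The PSNR figures quoted above follow the standard definition over the mean squared error; a minimal numpy sketch (the function name `psnr` is ours, not taken from the paper):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image
    and a reconstructed image, for pixel values in [0, peak]."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((ref - test) ** 2)
    if mse == 0.0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Since PSNR is logarithmic, a relative gain such as the reported 4.5% is a ratio of dB values (e.g., 26.0 dB improved by 4.5% is about 27.2 dB), not a ratio of pixel errors.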
Conclusions. The obtained LaMa-Wavelet network improves image inpainting with large masks by applying the wavelet transform in the LaMa network architecture. In particular, the quality of reconstruction of image edges and fine details is increased.
License
Copyright (c) 2024 D. O. Kolodochka, M. V. Polyakova
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA).