Posted 2023-02-26 | Updated 2023-03-02 | 8 minutes read (About 1234 words)

Paper | Resolution-robust Large Mask Inpainting with Fourier Convolutions | WACV2022

Info

Title: Resolution-robust Large Mask Inpainting with Fourier Convolutions
Keyword: Large Mask Inpainting
Idea: Fourier Convolutions
Source
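- Paper: submitted on 2021-09-15 and eventually published at WACV 2022. True to the venue's focus on applications of computer vision, it is very practical. Many later CVPR 2022 works on high-resolution image inpainting compare against it. [2109.07161] Resolution-robust Large Mask Inpainting with Fourier Convolutions (arxiv.org)
- Code: a deployment-oriented work with very good results on high-resolution image inpainting. https://github.com/saic-mdal/lama, [Resolution-robust Large Mask Inpainting with Fourier Convolutions (advimman.github.io)](https://advimman.github.io/lama-project/)
- Video: an excellent walkthrough of the paper. It is not by the authors themselves, but the first author was invited for an interview. Resolution-robust Large Mask Inpainting with Fourier Convolutions (w/ Author Interview) - YouTube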

Existing problems:

Modern image inpainting systems often struggle with large missing areas, complex geometric structures, and high-resolution images. (Personally, I think the large-missing-area case is an ill-posed problem that Fourier convolutions cannot solve.)

Hypothesis:

How can this be solved? The authors argue that the main cause is the lack of an effective receptive field in both the inpainting network and the loss function.

Contributions of LaMa (Large mask inpainting):

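Network architecture: an inpainting network architecture built on fast Fourier convolutions, which gives an image-wide receptive field (this is the contribution of the fast Fourier convolution).
Loss function: a high receptive field perceptual loss.
Training strategy: large training masks.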

A large effective receptive field is essential for understanding the global structure of an image.

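First, a high receptive field architecture. The paper proposes a network architecture based on fast Fourier convolutions (FFCs), so that even the early layers of the network have a receptive field covering the whole image. This improves perceptual quality while keeping the network lightweight, and it generalizes well: inference works well even on high-resolution images that the training set does not contain.
Second, a high receptive field loss function. The paper proposes a perceptual loss based on a semantic segmentation network with a large receptive field, which improves the consistency of global structures and shapes.
Third, an aggressive algorithm for training mask generation, which produces larger masks.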
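
To give a feel for what "larger masks" means in practice, here is a minimal sketch that keeps stamping random rectangles onto a mask until a target fraction of the image is hidden. The function name `random_large_mask`, the rectangle size range, and the `min_coverage` parameter are all assumptions made for illustration; this is not the mask generator used by the authors.

```python
import random

def random_large_mask(height, width, min_coverage=0.4, seed=None):
    """Hypothetical large-mask sampler: 1 marks a missing pixel, 0 a known one."""
    rng = random.Random(seed)
    mask = [[0] * width for _ in range(height)]
    covered = 0
    # Keep adding rectangles until enough of the image is hidden.
    while covered < min_coverage * height * width:
        box_h = rng.randint(max(1, height // 4), max(1, height // 2))
        box_w = rng.randint(max(1, width // 4), max(1, width // 2))
        top = rng.randint(0, height - box_h)
        left = rng.randint(0, width - box_w)
        for r in range(top, top + box_h):
            for c in range(left, left + box_w):
                if mask[r][c] == 0:
                    mask[r][c] = 1
                    covered += 1
    return mask

mask = random_large_mask(64, 64, min_coverage=0.4, seed=0)
coverage = sum(map(sum, mask)) / (64 * 64)
print(f"masked fraction: {coverage:.2f}")
```

Raising `min_coverage` in this sketch forces the network to fill wider holes, which is the situation where a small receptive field hurts most.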

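With large masks, if the network still uses conventional 3×3 ResNet convolution kernels, the receptive field in the early layers may lie entirely inside the mask. Many layers of the network then lack global context, which wastes computation and parameters.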
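
The receptive-field argument can be checked with a little arithmetic. The sketch below computes the theoretical receptive field of a stack of convolutions from their kernel sizes and strides. The layer stack is an illustrative assumption, not the LaMa architecture, and the effective receptive field of a trained network is typically smaller than this theoretical bound.

```python
def receptive_field(layers):
    """Theoretical receptive field (in input pixels) of stacked convolutions.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Ten stride-1 3x3 convolutions (a hypothetical early stage of a network).
print(receptive_field([(3, 1)] * 10))  # 21
```

A 21-pixel window centered deep inside a hole that is, say, 100 pixels wide contains no known pixels at all, so those layers have nothing to work with.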

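Fast Fourier convolution (FFC), by contrast, lets the early layers of the network use global context. It has two parallel branches: 1) the local branch uses conventional convolutions; 2) the global branch uses the real FFT, which operates on real-valued signals. The FFT maps the signal into the complex domain (the frequency domain), and the inverse real FFT guarantees that the output is real again.
Here the real and imaginary parts of the complex output of the real FFT are simply concatenated, and a 1×1 convolution is then applied in the frequency domain, i.e. a convolution over components of the same frequency. This helps ensure that periodic signals, that is, repetitive patterns, are inpainted well. The authors' original motivation was that existing methods inpaint repetitive patterns poorly; repetitive patterns suggested periodic signals, which led to using the FFT.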
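
The global branch described above can be sketched in a few lines of NumPy. This is a simplified illustration under assumptions, not the authors' implementation: the function name `spectral_transform`, the random `weight` matrix standing in for a learned 1×1 convolution, the ReLU, and the `"ortho"` normalization are choices made here, and normalization layers and the local branch are left out.

```python
import numpy as np

def spectral_transform(x, weight):
    """Sketch of the global branch: real FFT -> 1x1 conv -> inverse real FFT.

    x:      real array of shape (C, H, W)
    weight: hypothetical 1x1 conv kernel of shape (2 * C_out, 2 * C)
    """
    _, h, w = x.shape
    spec = np.fft.rfft2(x, norm="ortho")                      # complex, (C, H, W//2 + 1)
    stacked = np.concatenate([spec.real, spec.imag], axis=0)  # real, (2C, H, W//2 + 1)
    # A 1x1 convolution applies the same channel-mixing matrix at every frequency.
    mixed = np.einsum("oc,chw->ohw", weight, stacked)
    mixed = np.maximum(mixed, 0.0)                            # ReLU (assumed activation)
    c_out = weight.shape[0] // 2
    spec_out = mixed[:c_out] + 1j * mixed[c_out:]             # back to complex
    return np.fft.irfft2(spec_out, s=(h, w), norm="ortho")    # real, (C_out, H, W)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8))
weight = rng.standard_normal((4, 4))
y = spectral_transform(x, weight)
print(y.shape, y.dtype)
```

Because every frequency coefficient is computed from all pixels, the output at any location can depend on the whole input, even though the learned operation itself is only a 1×1 convolution.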