diff --git a/README.md b/README.md index 8f30bc5981b8bb7094774610db3bdb17989760c3..31d7d89d84b74bf128adbaa1a7825eb7a082867e 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,8 @@ --- -title: Image Enhancement -emoji: šŸ“š -colorFrom: gray -colorTo: blue +title: Real ESRGAN Web App Cpu Test +emoji: šŸ’» +colorFrom: green +colorTo: gray sdk: streamlit sdk_version: 1.35.0 app_file: app.py @@ -11,3 +11,281 @@ license: unknown --- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference + + +
+
+## English | ē®€ä½“äø­ę–‡
+ +
+ +šŸ‘€[**Demos**](#-demos-videos) **|** šŸš©[**Updates**](#-updates) **|** āš”[**Usage**](#-quick-inference) **|** šŸ°[**Model Zoo**](docs/model_zoo.md) **|** šŸ”§[Install](#-dependencies-and-installation) **|** šŸ’»[Train](docs/Training.md) **|** ā“[FAQ](docs/FAQ.md) **|** šŸŽØ[Contribution](docs/CONTRIBUTING.md) + +[![download](https://img.shields.io/github/downloads/xinntao/Real-ESRGAN/total.svg)](https://github.com/xinntao/Real-ESRGAN/releases) +[![PyPI](https://img.shields.io/pypi/v/realesrgan)](https://pypi.org/project/realesrgan/) +[![Open issue](https://img.shields.io/github/issues/xinntao/Real-ESRGAN)](https://github.com/xinntao/Real-ESRGAN/issues) +[![Closed issue](https://img.shields.io/github/issues-closed/xinntao/Real-ESRGAN)](https://github.com/xinntao/Real-ESRGAN/issues) +[![LICENSE](https://img.shields.io/github/license/xinntao/Real-ESRGAN.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/LICENSE) +[![python lint](https://github.com/xinntao/Real-ESRGAN/actions/workflows/pylint.yml/badge.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/.github/workflows/pylint.yml) +[![Publish-pip](https://github.com/xinntao/Real-ESRGAN/actions/workflows/publish-pip.yml/badge.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/.github/workflows/publish-pip.yml) + +
+ +šŸ”„ **AnimeVideo-v3 model (åŠØę¼«č§†é¢‘å°ęؔ型)**. Please see [[*anime video models*](docs/anime_video_model.md)] and [[*comparisons*](docs/anime_comparisons.md)]
+šŸ”„ **RealESRGAN_x4plus_anime_6B** for anime images **(åŠØę¼«ę’å›¾ęؔ型)**. Please see [[*anime_model*](docs/anime_model.md)]
+
+
+1. :boom: **Update** online Replicate demo: [![Replicate](https://img.shields.io/static/v1?label=Demo&message=Replicate&color=blue)](https://replicate.com/xinntao/realesrgan)
+1. Online Colab demo for Real-ESRGAN: [![Colab](https://img.shields.io/static/v1?label=Demo&message=Colab&color=orange)](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) **|** Online Colab demo for Real-ESRGAN (**anime videos**): [![Colab](https://img.shields.io/static/v1?label=Demo&message=Colab&color=orange)](https://colab.research.google.com/drive/1yNl9ORUxxlL4N0keJa2SEPB61imPQd1B?usp=sharing)
+1. Portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-macos.zip) **executable files for Intel/AMD/Nvidia GPU**. You can find more information [here](#portable-executable-files-ncnn). The ncnn implementation is in [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)
+
+
+Real-ESRGAN aims at developing **Practical Algorithms for General Image/Video Restoration**.
+We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
+
+šŸŒŒ Thanks for your valuable feedback and suggestions. All feedback is collected in [feedback.md](docs/feedback.md).
+
+---
+
+If Real-ESRGAN is helpful, please ā­ this repo or recommend it to your friends šŸ˜Š
+Other recommended projects:
+ā–¶ļø [GFPGAN](https://github.com/TencentARC/GFPGAN): A practical algorithm for real-world face restoration
+ā–¶ļø [BasicSR](https://github.com/xinntao/BasicSR): An open-source image and video restoration toolbox
+ā–¶ļø [facexlib](https://github.com/xinntao/facexlib): A collection that provides useful face-relation functions.
+ā–¶ļø [HandyView](https://github.com/xinntao/HandyView): A PyQt5-based image viewer that is handy for view and comparison
+ā–¶ļø [HandyFigure](https://github.com/xinntao/HandyFigure): Open source of paper figures
+ +--- + +### šŸ“– Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data + +> [[Paper](https://arxiv.org/abs/2107.10833)]   [[YouTube Video](https://www.youtube.com/watch?v=fxHWoDSSvSc)]   [[Bē«™č®²č§£](https://www.bilibili.com/video/BV1H34y1m7sS/)]   [[Poster](https://xinntao.github.io/projects/RealESRGAN_src/RealESRGAN_poster.pdf)]   [[PPT slides](https://docs.google.com/presentation/d/1QtW6Iy8rm8rGLsJ0Ldti6kP-7Qyzy6XL/edit?usp=sharing&ouid=109799856763657548160&rtpof=true&sd=true)]
+> [Xintao Wang](https://xinntao.github.io/), Liangbin Xie, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ), [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)
+> [Tencent ARC Lab](https://arc.tencent.com/en/ai-demos/imgRestore); Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences + +
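+
+The key idea of the paper is to train on pairs synthesized by a *high-order* degradation process: classical degradations (blur, resize, noise, JPEG compression) are applied more than once, so that the low-quality inputs resemble real-world photos. A rough, illustrative Python sketch of one such pipeline is below; the helper names are simplified stand-ins, while the actual implementation lives in `realesrgan/data/realesrgan_dataset.py` and `basicsr.data.degradations`:
+
+```python
+import random
+
+import cv2
+import numpy as np
+
+
+def degrade_once(img):
+    """One round of classical degradation: blur -> random resize -> noise -> JPEG."""
+    img = cv2.GaussianBlur(img, (7, 7), sigmaX=random.uniform(0.2, 3.0))
+    s = random.uniform(0.5, 1.5)
+    img = cv2.resize(img, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
+    img = img + np.random.normal(0, random.uniform(1, 10), img.shape)
+    ok, enc = cv2.imencode('.jpg', np.clip(img, 0, 255).astype(np.uint8),
+                           [int(cv2.IMWRITE_JPEG_QUALITY), random.randint(30, 95)])
+    return cv2.imdecode(enc, cv2.IMREAD_COLOR).astype(np.float32)
+
+
+def synthesize_lr(hr, scale=4):
+    """Second-order degradation: apply the classical pipeline twice, then downscale."""
+    lr = degrade_once(degrade_once(hr.astype(np.float32)))
+    h, w = hr.shape[:2]
+    return cv2.resize(lr, (w // scale, h // scale), interpolation=cv2.INTER_LINEAR)
+```
+
+Training pairs are then simply `(hr, synthesize_lr(hr))`; no real low-resolution photos are required.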
+ +--- + + +## šŸš© Updates + +- āœ… Add the **realesr-general-x4v3** model - a tiny small model for general scenes. It also supports the **-dn** option to balance the noise (avoiding over-smooth results). **-dn** is short for denoising strength. +- āœ… Update the **RealESRGAN AnimeVideo-v3** model. Please see [anime video models](docs/anime_video_model.md) and [comparisons](docs/anime_comparisons.md) for more details. +- āœ… Add small models for anime videos. More details are in [anime video models](docs/anime_video_model.md). +- āœ… Add the ncnn implementation [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan). +- āœ… Add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size. More details and comparisons with [waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan) are in [**anime_model.md**](docs/anime_model.md) +- āœ… Support finetuning on your own data or paired data (*i.e.*, finetuning ESRGAN). See [here](docs/Training.md#Finetune-Real-ESRGAN-on-your-own-dataset) +- āœ… Integrate [GFPGAN](https://github.com/TencentARC/GFPGAN) to support **face enhancement**. +- āœ… Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See [Gradio Web Demo](https://huggingface.co/spaces/akhaliq/Real-ESRGAN). Thanks [@AK391](https://github.com/AK391) +- āœ… Support arbitrary scale with `--outscale` (It actually further resizes outputs with `LANCZOS4`). Add *RealESRGAN_x2plus.pth* model. +- āœ… [The inference code](inference_realesrgan.py) supports: 1) **tile** options; 2) images with **alpha channel**; 3) **gray** images; 4) **16-bit** images. +- āœ… The training codes have been released. A detailed guide can be found in [Training.md](docs/Training.md). + +--- + + +## šŸ‘€ Demos Videos + +#### Bilibili + +- [大闹天宫ē‰‡ę®µ](https://www.bilibili.com/video/BV1ja41117zb) +- [Anime dance cut åŠØę¼«é­”ę€§čˆžč¹ˆ](https://www.bilibili.com/video/BV1wY4y1L7hT/) +- [ęµ·č“¼ēŽ‹ē‰‡ę®µ](https://www.bilibili.com/video/BV1i3411L7Gy/) + +#### YouTube + +## šŸ”§ Dependencies and Installation + +- Python >= 3.7 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) +- [PyTorch >= 1.7](https://pytorch.org/) + +### Installation + +1. Clone repo + + ```bash + git clone https://github.com/xinntao/Real-ESRGAN.git + cd Real-ESRGAN + ``` + +1. Install dependent packages + + ```bash + # Install basicsr - https://github.com/xinntao/BasicSR + # We use BasicSR for both training and inference + pip install basicsr + # facexlib and gfpgan are for face enhancement + pip install facexlib + pip install gfpgan + pip install -r requirements.txt + python setup.py develop + ``` + +--- + +## āš” Quick Inference + +There are usually three ways to inference Real-ESRGAN. + +1. [Online inference](#online-inference) +1. [Portable executable files (NCNN)](#portable-executable-files-ncnn) +1. [Python script](#python-script) + +### Online inference + +1. You can try in our website: [ARC Demo](https://arc.tencent.com/en/ai-demos/imgRestore) (now only support RealESRGAN_x4plus_anime_6B) +1. 
[Colab Demo](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) for Real-ESRGAN **|** [Colab Demo](https://colab.research.google.com/drive/1yNl9ORUxxlL4N0keJa2SEPB61imPQd1B?usp=sharing) for Real-ESRGAN (**anime videos**).
+
+### Portable executable files (NCNN)
+
+You can download [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.
+
+These executable files are **portable** and include all the binaries and models required. No CUDA or PyTorch environment is needed.
+
+You can simply run the following command (this is the Windows example; more information is in the README.md of each executable file):
+
+```bash
+./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n model_name
+```
+
+We have provided four models:
+
+1. realesrgan-x4plus (default)
+2. realesrnet-x4plus
+3. realesrgan-x4plus-anime (optimized for anime images, small model size)
+4. realesr-animevideov3 (animation video)
+
+You can use the `-n` argument to select other models, for example, `./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n realesrnet-x4plus`
+
+#### Usage of portable executable files
+
+1. Please refer to [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan#computer-usages) for more details.
+1. Note that it does not support all the functions of the Python script `inference_realesrgan.py` (such as `outscale`).
+
+```console
+Usage: realesrgan-ncnn-vulkan.exe -i infile -o outfile [options]...
+
+  -h                   show this help
+  -i input-path        input image path (jpg/png/webp) or directory
+  -o output-path       output image path (jpg/png/webp) or directory
+  -s scale             upscale ratio (can be 2, 3, 4. default=4)
+  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
+  -m model-path        folder path to the pre-trained models. default=models
+  -n model-name        model name (default=realesr-animevideov3, can be realesr-animevideov3 | realesrgan-x4plus | realesrgan-x4plus-anime | realesrnet-x4plus)
+  -g gpu-id            gpu device to use (default=auto) can be 0,1,2 for multi-gpu
+  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
+  -x                   enable tta mode
+  -f format            output image format (jpg/png/webp, default=ext/png)
+  -v                   verbose output
+```
+
+Note that it may introduce block inconsistency (and also generate slightly different results from the PyTorch implementation), because this executable first crops the input image into several tiles, processes them separately, and finally stitches them back together.
+
+### Python script
+
+#### Usage of the Python script
+
+1. You can use the X4 model for **arbitrary output sizes** with the argument `outscale`. The program performs a further cheap resize operation after the Real-ESRGAN output (a programmatic sketch follows at the end of this section).
+
+```console
+Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...
+
+A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --face_enhance
+
+  -h                   show this help
+  -i --input           Input image or folder. Default: inputs
+  -o --output          Output folder. Default: results
+  -n --model_name      Model name. Default: RealESRGAN_x4plus
+  -s, --outscale       The final upsampling scale of the image. Default: 4
+  --suffix             Suffix of the restored image. Default: out
+  -t, --tile           Tile size, 0 for no tile during testing. Default: 0
+  --face_enhance       Whether to use GFPGAN to enhance faces. Default: False
+  --fp32               Use fp32 precision during inference. Default: fp16 (half precision).
+  --ext                Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
+```
+
+#### Inference general images
+
+Download the pre-trained model: [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth)
+
+```bash
+wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P weights
+```
+
+Inference!
+
+```bash
+python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
+```
+
+Results are in the `results` folder.
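+
+The `--outscale` behaviour is easy to reproduce programmatically: run the fixed x4 network, then apply a cheap `LANCZOS4` resize to reach the exact target scale. Below is a minimal sketch of that two-step idea; the file paths and the 3.5 factor are example values only:
+
+```python
+import cv2
+from basicsr.archs.rrdbnet_arch import RRDBNet
+from realesrgan import RealESRGANer
+
+# x4 RRDBNet backbone matching the RealESRGAN_x4plus weights
+model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
+upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model,
+                         tile=0, tile_pad=10, pre_pad=0, half=True)
+
+img = cv2.imread('inputs/0014.jpg', cv2.IMREAD_UNCHANGED)
+output, _ = upsampler.enhance(img, outscale=4)  # native x4 output
+# cheap resize from x4 down to an effective x3.5, mirroring `--outscale 3.5`
+h, w = img.shape[:2]
+output = cv2.resize(output, (int(w * 3.5), int(h * 3.5)), interpolation=cv2.INTER_LANCZOS4)
+cv2.imwrite('results/0014_out.png', output)
+```
+
+In practice `upsampler.enhance(img, outscale=3.5)` performs this resize for you; the explicit `cv2.resize` above only makes the mechanism visible.
+
+#### Inference anime images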
+
+Pre-trained model: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth)
+More details and comparisons with [waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan) are in [**anime_model.md**](docs/anime_model.md)
+
+```bash
+# download model
+wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P weights
+# inference
+python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
+```
+
+Results are in the `results` folder.
+
+---
+
+## BibTeX
+
+    @InProceedings{wang2021realesrgan,
+        author    = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
+        title     = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
+        booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
+        date      = {2021}
+    }
+
+## šŸ“§ Contact
+
+If you have any questions, please email `xintao.wang@outlook.com` or `xintaowang@tencent.com`.
+
+
+## šŸ§© Projects that use Real-ESRGAN
+
+If you develop or use Real-ESRGAN in your projects, you are welcome to let me know.
+
+- NCNN-Android: [RealSR-NCNN-Android](https://github.com/tumuyan/RealSR-NCNN-Android) by [tumuyan](https://github.com/tumuyan)
+- VapourSynth: [vs-realesrgan](https://github.com/HolyWu/vs-realesrgan) by [HolyWu](https://github.com/HolyWu)
+- NCNN: [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)
+
+    **GUI**
+
+- [Waifu2x-Extension-GUI](https://github.com/AaronFeng753/Waifu2x-Extension-GUI) by [AaronFeng753](https://github.com/AaronFeng753)
+- [Squirrel-RIFE](https://github.com/Justin62628/Squirrel-RIFE) by [Justin62628](https://github.com/Justin62628)
+- [Real-GUI](https://github.com/scifx/Real-GUI) by [scifx](https://github.com/scifx)
+- [Real-ESRGAN_GUI](https://github.com/net2cn/Real-ESRGAN_GUI) by [net2cn](https://github.com/net2cn)
+- [Real-ESRGAN-EGUI](https://github.com/WGzeyu/Real-ESRGAN-EGUI) by [WGzeyu](https://github.com/WGzeyu)
+- [anime_upscaler](https://github.com/shangar21/anime_upscaler) by [shangar21](https://github.com/shangar21)
+- [Upscayl](https://github.com/upscayl/upscayl) by [Nayam Amarshe](https://github.com/NayamAmarshe) and [TGS963](https://github.com/TGS963)
+
+## šŸ¤— Acknowledgement
+
+Thanks to all the contributors.
+
+- [AK391](https://github.com/AK391): Integrate RealESRGAN to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See [Gradio Web Demo](https://huggingface.co/spaces/akhaliq/Real-ESRGAN).
+- [Asiimoviet](https://github.com/Asiimoviet): Translate the README.md to Chinese (äø­ę–‡).
+- [2ji3150](https://github.com/2ji3150): Thanks for the [detailed and valuable feedback/suggestions](https://github.com/xinntao/Real-ESRGAN/issues/131).
+- [Jared-02](https://github.com/Jared-02): Translate the Training.md to Chinese (äø­ę–‡).
+ + + + diff --git a/VERSION b/VERSION new file mode 100644 index 0000000000000000000000000000000000000000..0d91a54c7d439e84e3dd17d3594f1b2b6737f430 --- /dev/null +++ b/VERSION @@ -0,0 +1 @@ +0.3.0 diff --git a/app.py b/app.py new file mode 100644 index 0000000000000000000000000000000000000000..a0a3f39f312e16ee6911fb54148ec387404282e9 --- /dev/null +++ b/app.py @@ -0,0 +1,153 @@ +import streamlit as st +import cv2 +import os +import numpy as np +from basicsr.archs.rrdbnet_arch import RRDBNet +from basicsr.utils.download_util import load_file_from_url +from realesrgan import RealESRGANer +from realesrgan.archs.srvgg_arch import SRVGGNetCompact +from gfpgan import GFPGANer + +# Function to load the model +def load_model(model_name, model_path, denoise_strength, tile, tile_pad, pre_pad, fp32, gpu_id): + if model_name == 'RealESRGAN_x4plus': + model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4) + netscale = 4 + file_url = ['https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth'] + elif model_name == 'RealESRGAN_x4plus_anime_6B': + model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=6, num_grow_ch=32, scale=4) + netscale = 4 + file_url = ['https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth'] + elif model_name == 'RealESRGAN_x2plus': + model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2) + netscale = 2 + file_url = ['https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth'] + + # Determine model paths + if model_path is not None: + model_path = model_path + else: + model_path = os.path.join('weights', model_name + '.pth') + if not os.path.isfile(model_path): + for url in file_url: + # Model_path will be updated + model_path = load_file_from_url( + url=url, model_dir=os.path.join(os.getcwd(), 'weights'), progress=True, file_name=model_name + '.pth') + + dni_weight = None + if model_name == 'realesr-general-x4v3' and denoise_strength != 1: + model_path = [model_path, model_path.replace('realesr-general-x4v3', 'realesr-general-wdn-x4v3')] + dni_weight = [denoise_strength, 1 - denoise_strength] + + # Use DNI to control the denoise strength + dni_weight = None + if model_name == 'realesr-general-x4v3' and denoise_strength != 1: + wdn_model_path = model_path.replace('realesr-general-x4v3', 'realesr-general-wdn-x4v3') + model_path = [model_path, wdn_model_path] + dni_weight = [denoise_strength, 1 - denoise_strength] + + # Restorer + upsampler = RealESRGANer( + scale=netscale, + model_path=model_path, + dni_weight=dni_weight, + model=model, + tile=tile, + tile_pad=tile_pad, + pre_pad=pre_pad, + half=not fp32, + gpu_id=gpu_id) + + return upsampler + +# Function to download model weights if not present +def ensure_model_weights(model_name): + weights_dir = 'weights' + model_file = f"{model_name}.pth" + model_path = os.path.join(weights_dir, model_file) + + if not os.path.exists(weights_dir): + os.makedirs(weights_dir) + + if not os.path.isfile(model_path): + if model_name == 'RealESRGAN_x4plus': + file_url = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth' + elif model_name == 'RealESRGAN_x4plus_anime_6B': + file_url = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth' + elif model_name == 'RealESRGAN_x2plus': + file_url = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth' + + 
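+        # basicsr's load_file_from_url downloads the checkpoint into the
+        # given model_dir (here: weights/) and returns the local file path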
model_path = load_file_from_url( + url=file_url, model_dir=weights_dir, progress=True, file_name=model_file) + + return model_path + +# Streamlit app +st.title("Real-ESRGAN Image Enhancement") + +uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "png", "jpeg"]) + +# User selects model name, denoise strength, and other parameters +model_name = st.selectbox("Model Name", ['RealESRGAN_x4plus', 'RealESRGAN_x4plus_anime_6B', 'RealESRGAN_x2plus']) +denoise_strength = st.slider("Denoise Strength", 0.0, 1.0, 0.5) +outscale = st.slider("Output Scale", 1, 4, 4) +tile = 0 +tile_pad = 10 +pre_pad = 0 +face_enhance = st.checkbox("Face Enhance") +fp32 = st.checkbox("Use FP32 Precision") +gpu_id = None # or set to 0, 1, etc. if you have multiple GPUs + +if uploaded_file is not None: + col1, col2 = st.columns(2) + with col1: + st.write("### Original Image") + st.image(uploaded_file, use_column_width=True) + run_button = st.button("Run") + + # Save uploaded image to disk + input_image_path = os.path.join("temp", "input_image.png") + os.makedirs("temp", exist_ok=True) + with open(input_image_path, "wb") as f: + f.write(uploaded_file.getbuffer()) + + if not run_button: + st.warning("Click the 'Run' button to start the enhancement process.") + + if run_button: + # Ensure model weights are downloaded + model_path = ensure_model_weights(model_name) + + # Load the model + upsampler = load_model(model_name, model_path, denoise_strength, tile, tile_pad, pre_pad, fp32, gpu_id) + + # Load the image + img = cv2.imdecode(np.frombuffer(uploaded_file.read(), np.uint8), cv2.IMREAD_UNCHANGED) + if img is None: + st.error("Error loading image. Please try again.") + else: + img_mode = 'RGBA' if len(img.shape) == 3 and img.shape[2] == 4 else None + + try: + if face_enhance: + face_enhancer = GFPGANer( + model_path='https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth', + upscale=outscale, + arch='clean', + channel_multiplier=2, + bg_upsampler=upsampler) + _, _, output = face_enhancer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True) + else: + output, _ = upsampler.enhance(img, outscale=outscale) + except RuntimeError as error: + st.error(f"Error: {error}") + st.error('If you encounter CUDA out of memory, try to set --tile with a smaller number.') + else: + # Save and display the output image + output_image_path = os.path.join("temp", "output_image.png") + cv2.imwrite(output_image_path, output) + with col2: + st.write("### Enhanced Image") + st.image(output_image_path, use_column_width=True) + if 'output_image_path' in locals(): + st.download_button("Download Enhanced Image", data=open(output_image_path, "rb").read(), file_name="output_image.png", mime="image/png") \ No newline at end of file diff --git a/deploy.bat b/deploy.bat new file mode 100644 index 0000000000000000000000000000000000000000..79c701bc07e92c6e82e7c623c78f85fa1e60fa59 --- /dev/null +++ b/deploy.bat @@ -0,0 +1,7 @@ +@echo off +REM Activate the cuda environment +call "%USERPROFILE%\anaconda3\Scripts\activate.bat" cuda +REM Change directory to Real-ESRGAN-Web-App +cd /d %USERPROFILE%\Real-ESRGAN-Web-App +REM Run localtunnel on port 8501 +lt --port 8501 \ No newline at end of file diff --git a/deploy_install.bat b/deploy_install.bat new file mode 100644 index 0000000000000000000000000000000000000000..6731dcb02b9072276516964e5053878b9f47aa30 --- /dev/null +++ b/deploy_install.bat @@ -0,0 +1,7 @@ +@echo off +REM Activate the cuda environment +call "%USERPROFILE%\anaconda3\Scripts\activate.bat" cuda 
+REM Change directory to Real-ESRGAN-Web-App +cd /d %USERPROFILE%\Real-ESRGAN-Web-App +REM Install localtunnel +npm install -g localtunnel \ No newline at end of file diff --git a/gfpgan/weights/parsing_parsenet.pth b/gfpgan/weights/parsing_parsenet.pth new file mode 100644 index 0000000000000000000000000000000000000000..1ac2efc50360a79c9905dbac57d9d99cbfbe863c --- /dev/null +++ b/gfpgan/weights/parsing_parsenet.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d558d8d0e42c20224f13cf5a29c79eba2d59913419f945545d8cf7b72920de2 +size 85331193 diff --git a/inputs/00003.png b/inputs/00003.png new file mode 100644 index 0000000000000000000000000000000000000000..00cad23adf5d658caf03a0a2874f0c89d96c5ddc Binary files /dev/null and b/inputs/00003.png differ diff --git a/inputs/00017_gray.png b/inputs/00017_gray.png new file mode 100644 index 0000000000000000000000000000000000000000..79af68e8aa0f036211734b7271633d88b2fc8f0d Binary files /dev/null and b/inputs/00017_gray.png differ diff --git a/inputs/0014.jpg b/inputs/0014.jpg new file mode 100644 index 0000000000000000000000000000000000000000..f59554fe3143b3ffa27d6fcb04143124b4d0412b Binary files /dev/null and b/inputs/0014.jpg differ diff --git a/inputs/0030.jpg b/inputs/0030.jpg new file mode 100644 index 0000000000000000000000000000000000000000..61868926af738046e984bcf652134e3ea9b958d9 Binary files /dev/null and b/inputs/0030.jpg differ diff --git a/inputs/ADE_val_00000114.jpg b/inputs/ADE_val_00000114.jpg new file mode 100644 index 0000000000000000000000000000000000000000..b4d9c9067adbcdd153527cef2c0cab4cf40bbfa5 Binary files /dev/null and b/inputs/ADE_val_00000114.jpg differ diff --git a/inputs/OST_009.png b/inputs/OST_009.png new file mode 100644 index 0000000000000000000000000000000000000000..10bbc831acb7065827a14eb7e0538312a8d6f3e2 Binary files /dev/null and b/inputs/OST_009.png differ diff --git a/inputs/children-alpha.png b/inputs/children-alpha.png new file mode 100644 index 0000000000000000000000000000000000000000..41dcc3b6cc7a8a1b073f6dbe09d0c12e18c1b4b3 Binary files /dev/null and b/inputs/children-alpha.png differ diff --git a/inputs/tree_alpha_16bit.png b/inputs/tree_alpha_16bit.png new file mode 100644 index 0000000000000000000000000000000000000000..ca7c2aac2c5c9cdaea66ecc8e06d6b43e3d8bf20 Binary files /dev/null and b/inputs/tree_alpha_16bit.png differ diff --git a/inputs/video/onepiece_demo.mp4 b/inputs/video/onepiece_demo.mp4 new file mode 100644 index 0000000000000000000000000000000000000000..29b4e5246b19008885611c23921fe4423f17e43f Binary files /dev/null and b/inputs/video/onepiece_demo.mp4 differ diff --git a/inputs/wolf_gray.jpg b/inputs/wolf_gray.jpg new file mode 100644 index 0000000000000000000000000000000000000000..614766bdbcaa3730a8191afcb9616305381245ea Binary files /dev/null and b/inputs/wolf_gray.jpg differ diff --git a/realesrgan.egg-info/PKG-INFO b/realesrgan.egg-info/PKG-INFO new file mode 100644 index 0000000000000000000000000000000000000000..655576c751b410ac375f34e3fa4e7fbd4f1bb274 --- /dev/null +++ b/realesrgan.egg-info/PKG-INFO @@ -0,0 +1,298 @@ +Metadata-Version: 2.1 +Name: realesrgan +Version: 0.3.0 +Summary: Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration +Home-page: https://github.com/xinntao/Real-ESRGAN +Author: Xintao Wang +Author-email: xintao.wang@outlook.com +License: BSD-3-Clause License +Keywords: computer vision,pytorch,image restoration,super-resolution,esrgan,real-esrgan +Classifier: Development Status :: 4 - Beta +Classifier: License :: 
OSI Approved :: Apache Software License +Classifier: Operating System :: OS Independent +Classifier: Programming Language :: Python :: 3 +Classifier: Programming Language :: Python :: 3.7 +Classifier: Programming Language :: Python :: 3.8 +Description-Content-Type: text/markdown +Requires-Dist: basicsr>=1.4.2 +Requires-Dist: facexlib>=0.2.5 +Requires-Dist: gfpgan>=1.3.5 +Requires-Dist: numpy +Requires-Dist: opencv-python +Requires-Dist: Pillow +Requires-Dist: torch>=1.7 +Requires-Dist: torchvision +Requires-Dist: tqdm + +
+
+## English | ē®€ä½“äø­ę–‡
+ +
+ +šŸ‘€[**Demos**](#-demos-videos) **|** šŸš©[**Updates**](#-updates) **|** āš”[**Usage**](#-quick-inference) **|** šŸ°[**Model Zoo**](docs/model_zoo.md) **|** šŸ”§[Install](#-dependencies-and-installation) **|** šŸ’»[Train](docs/Training.md) **|** ā“[FAQ](docs/FAQ.md) **|** šŸŽØ[Contribution](docs/CONTRIBUTING.md) + +[![download](https://img.shields.io/github/downloads/xinntao/Real-ESRGAN/total.svg)](https://github.com/xinntao/Real-ESRGAN/releases) +[![PyPI](https://img.shields.io/pypi/v/realesrgan)](https://pypi.org/project/realesrgan/) +[![Open issue](https://img.shields.io/github/issues/xinntao/Real-ESRGAN)](https://github.com/xinntao/Real-ESRGAN/issues) +[![Closed issue](https://img.shields.io/github/issues-closed/xinntao/Real-ESRGAN)](https://github.com/xinntao/Real-ESRGAN/issues) +[![LICENSE](https://img.shields.io/github/license/xinntao/Real-ESRGAN.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/LICENSE) +[![python lint](https://github.com/xinntao/Real-ESRGAN/actions/workflows/pylint.yml/badge.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/.github/workflows/pylint.yml) +[![Publish-pip](https://github.com/xinntao/Real-ESRGAN/actions/workflows/publish-pip.yml/badge.svg)](https://github.com/xinntao/Real-ESRGAN/blob/master/.github/workflows/publish-pip.yml) + +
+ +šŸ”„ **AnimeVideo-v3 model (åŠØę¼«č§†é¢‘å°ęؔ型)**. Please see [[*anime video models*](docs/anime_video_model.md)] and [[*comparisons*](docs/anime_comparisons.md)]
+šŸ”„ **RealESRGAN_x4plus_anime_6B** for anime images **(åŠØę¼«ę’å›¾ęؔ型)**. Please see [[*anime_model*](docs/anime_model.md)]
+
+
+1. :boom: **Update** online Replicate demo: [![Replicate](https://img.shields.io/static/v1?label=Demo&message=Replicate&color=blue)](https://replicate.com/xinntao/realesrgan)
+1. Online Colab demo for Real-ESRGAN: [![Colab](https://img.shields.io/static/v1?label=Demo&message=Colab&color=orange)](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) **|** Online Colab demo for Real-ESRGAN (**anime videos**): [![Colab](https://img.shields.io/static/v1?label=Demo&message=Colab&color=orange)](https://colab.research.google.com/drive/1yNl9ORUxxlL4N0keJa2SEPB61imPQd1B?usp=sharing)
+1. Portable [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-macos.zip) **executable files for Intel/AMD/Nvidia GPU**. You can find more information [here](#portable-executable-files-ncnn). The ncnn implementation is in [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)
+
+
+Real-ESRGAN aims at developing **Practical Algorithms for General Image/Video Restoration**.
+We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
+
+šŸŒŒ Thanks for your valuable feedback and suggestions. All feedback is collected in [feedback.md](docs/feedback.md).
+
+---
+
+If Real-ESRGAN is helpful, please ā­ this repo or recommend it to your friends šŸ˜Š
+Other recommended projects:
+ā–¶ļø [GFPGAN](https://github.com/TencentARC/GFPGAN): A practical algorithm for real-world face restoration
+ā–¶ļø [BasicSR](https://github.com/xinntao/BasicSR): An open-source image and video restoration toolbox
+ā–¶ļø [facexlib](https://github.com/xinntao/facexlib): A collection that provides useful face-relation functions.
+ā–¶ļø [HandyView](https://github.com/xinntao/HandyView): A PyQt5-based image viewer that is handy for view and comparison
+ā–¶ļø [HandyFigure](https://github.com/xinntao/HandyFigure): Open source of paper figures
+ +--- + +### šŸ“– Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data + +> [[Paper](https://arxiv.org/abs/2107.10833)]   [[YouTube Video](https://www.youtube.com/watch?v=fxHWoDSSvSc)]   [[Bē«™č®²č§£](https://www.bilibili.com/video/BV1H34y1m7sS/)]   [[Poster](https://xinntao.github.io/projects/RealESRGAN_src/RealESRGAN_poster.pdf)]   [[PPT slides](https://docs.google.com/presentation/d/1QtW6Iy8rm8rGLsJ0Ldti6kP-7Qyzy6XL/edit?usp=sharing&ouid=109799856763657548160&rtpof=true&sd=true)]
+> [Xintao Wang](https://xinntao.github.io/), Liangbin Xie, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ), [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)
+> [Tencent ARC Lab](https://arc.tencent.com/en/ai-demos/imgRestore); Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences + +
+ +--- + + +## šŸš© Updates + +- āœ… Add the **realesr-general-x4v3** model - a tiny small model for general scenes. It also supports the **-dn** option to balance the noise (avoiding over-smooth results). **-dn** is short for denoising strength. +- āœ… Update the **RealESRGAN AnimeVideo-v3** model. Please see [anime video models](docs/anime_video_model.md) and [comparisons](docs/anime_comparisons.md) for more details. +- āœ… Add small models for anime videos. More details are in [anime video models](docs/anime_video_model.md). +- āœ… Add the ncnn implementation [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan). +- āœ… Add [*RealESRGAN_x4plus_anime_6B.pth*](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth), which is optimized for **anime** images with much smaller model size. More details and comparisons with [waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan) are in [**anime_model.md**](docs/anime_model.md) +- āœ… Support finetuning on your own data or paired data (*i.e.*, finetuning ESRGAN). See [here](docs/Training.md#Finetune-Real-ESRGAN-on-your-own-dataset) +- āœ… Integrate [GFPGAN](https://github.com/TencentARC/GFPGAN) to support **face enhancement**. +- āœ… Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See [Gradio Web Demo](https://huggingface.co/spaces/akhaliq/Real-ESRGAN). Thanks [@AK391](https://github.com/AK391) +- āœ… Support arbitrary scale with `--outscale` (It actually further resizes outputs with `LANCZOS4`). Add *RealESRGAN_x2plus.pth* model. +- āœ… [The inference code](inference_realesrgan.py) supports: 1) **tile** options; 2) images with **alpha channel**; 3) **gray** images; 4) **16-bit** images. +- āœ… The training codes have been released. A detailed guide can be found in [Training.md](docs/Training.md). + +--- + + +## šŸ‘€ Demos Videos + +#### Bilibili + +- [大闹天宫ē‰‡ę®µ](https://www.bilibili.com/video/BV1ja41117zb) +- [Anime dance cut åŠØę¼«é­”ę€§čˆžč¹ˆ](https://www.bilibili.com/video/BV1wY4y1L7hT/) +- [ęµ·č“¼ēŽ‹ē‰‡ę®µ](https://www.bilibili.com/video/BV1i3411L7Gy/) + +#### YouTube + +## šŸ”§ Dependencies and Installation + +- Python >= 3.7 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) +- [PyTorch >= 1.7](https://pytorch.org/) + +### Installation + +1. Clone repo + + ```bash + git clone https://github.com/xinntao/Real-ESRGAN.git + cd Real-ESRGAN + ``` + +1. Install dependent packages + + ```bash + # Install basicsr - https://github.com/xinntao/BasicSR + # We use BasicSR for both training and inference + pip install basicsr + # facexlib and gfpgan are for face enhancement + pip install facexlib + pip install gfpgan + pip install -r requirements.txt + python setup.py develop + ``` + +--- + +## āš” Quick Inference + +There are usually three ways to inference Real-ESRGAN. + +1. [Online inference](#online-inference) +1. [Portable executable files (NCNN)](#portable-executable-files-ncnn) +1. [Python script](#python-script) + +### Online inference + +1. You can try in our website: [ARC Demo](https://arc.tencent.com/en/ai-demos/imgRestore) (now only support RealESRGAN_x4plus_anime_6B) +1. 
[Colab Demo](https://colab.research.google.com/drive/1k2Zod6kSHEvraybHl50Lys0LerhyTMCo?usp=sharing) for Real-ESRGAN **|** [Colab Demo](https://colab.research.google.com/drive/1yNl9ORUxxlL4N0keJa2SEPB61imPQd1B?usp=sharing) for Real-ESRGAN (**anime videos**).
+
+### Portable executable files (NCNN)
+
+You can download [Windows](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-windows.zip) / [Linux](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip) / [MacOS](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-macos.zip) **executable files for Intel/AMD/Nvidia GPU**.
+
+These executable files are **portable** and include all the binaries and models required. No CUDA or PyTorch environment is needed.
+
+You can simply run the following command (this is the Windows example; more information is in the README.md of each executable file):
+
+```bash
+./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n model_name
+```
+
+We have provided four models:
+
+1. realesrgan-x4plus (default)
+2. realesrnet-x4plus
+3. realesrgan-x4plus-anime (optimized for anime images, small model size)
+4. realesr-animevideov3 (animation video)
+
+You can use the `-n` argument to select other models, for example, `./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n realesrnet-x4plus`
+
+#### Usage of portable executable files
+
+1. Please refer to [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan#computer-usages) for more details.
+1. Note that it does not support all the functions of the Python script `inference_realesrgan.py` (such as `outscale`).
+
+```console
+Usage: realesrgan-ncnn-vulkan.exe -i infile -o outfile [options]...
+
+  -h                   show this help
+  -i input-path        input image path (jpg/png/webp) or directory
+  -o output-path       output image path (jpg/png/webp) or directory
+  -s scale             upscale ratio (can be 2, 3, 4. default=4)
+  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
+  -m model-path        folder path to the pre-trained models. default=models
+  -n model-name        model name (default=realesr-animevideov3, can be realesr-animevideov3 | realesrgan-x4plus | realesrgan-x4plus-anime | realesrnet-x4plus)
+  -g gpu-id            gpu device to use (default=auto) can be 0,1,2 for multi-gpu
+  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
+  -x                   enable tta mode
+  -f format            output image format (jpg/png/webp, default=ext/png)
+  -v                   verbose output
+```
+
+Note that it may introduce block inconsistency (and also generate slightly different results from the PyTorch implementation), because this executable first crops the input image into several tiles, processes them separately, and finally stitches them back together.
+
+### Python script
+
+#### Usage of the Python script
+
+1. You can use the X4 model for **arbitrary output sizes** with the argument `outscale`. The program performs a further cheap resize operation after the Real-ESRGAN output.
+
+```console
+Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...
+
+A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --face_enhance
+
+  -h                   show this help
+  -i --input           Input image or folder. Default: inputs
+  -o --output          Output folder. Default: results
+  -n --model_name      Model name. Default: RealESRGAN_x4plus
+  -s, --outscale       The final upsampling scale of the image. Default: 4
+  --suffix             Suffix of the restored image. Default: out
+  -t, --tile           Tile size, 0 for no tile during testing. Default: 0
+  --face_enhance       Whether to use GFPGAN to enhance faces. Default: False
+  --fp32               Use fp32 precision during inference. Default: fp16 (half precision).
+  --ext                Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
+```
+
+#### Inference general images
+
+Download the pre-trained model: [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth)
+
+```bash
+wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P weights
+```
+
+Inference!
+
+```bash
+python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
+```
+
+Results are in the `results` folder.
+
+#### Inference anime images
+
+Pre-trained model: [RealESRGAN_x4plus_anime_6B](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth)
+More details and comparisons with [waifu2x](https://github.com/nihui/waifu2x-ncnn-vulkan) are in [**anime_model.md**](docs/anime_model.md)
+
+```bash
+# download model
+wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P weights
+# inference
+python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
+```
+
+Results are in the `results` folder.
+
+---
+
+## BibTeX
+
+    @InProceedings{wang2021realesrgan,
+        author    = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
+        title     = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
+        booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
+        date      = {2021}
+    }
+
+## šŸ“§ Contact
+
+If you have any questions, please email `xintao.wang@outlook.com` or `xintaowang@tencent.com`.
+
+
+## šŸ§© Projects that use Real-ESRGAN
+
+If you develop or use Real-ESRGAN in your projects, you are welcome to let me know.
+
+- NCNN-Android: [RealSR-NCNN-Android](https://github.com/tumuyan/RealSR-NCNN-Android) by [tumuyan](https://github.com/tumuyan)
+- VapourSynth: [vs-realesrgan](https://github.com/HolyWu/vs-realesrgan) by [HolyWu](https://github.com/HolyWu)
+- NCNN: [Real-ESRGAN-ncnn-vulkan](https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan)
+
+    **GUI**
+
+- [Waifu2x-Extension-GUI](https://github.com/AaronFeng753/Waifu2x-Extension-GUI) by [AaronFeng753](https://github.com/AaronFeng753)
+- [Squirrel-RIFE](https://github.com/Justin62628/Squirrel-RIFE) by [Justin62628](https://github.com/Justin62628)
+- [Real-GUI](https://github.com/scifx/Real-GUI) by [scifx](https://github.com/scifx)
+- [Real-ESRGAN_GUI](https://github.com/net2cn/Real-ESRGAN_GUI) by [net2cn](https://github.com/net2cn)
+- [Real-ESRGAN-EGUI](https://github.com/WGzeyu/Real-ESRGAN-EGUI) by [WGzeyu](https://github.com/WGzeyu)
+- [anime_upscaler](https://github.com/shangar21/anime_upscaler) by [shangar21](https://github.com/shangar21)
+- [Upscayl](https://github.com/upscayl/upscayl) by [Nayam Amarshe](https://github.com/NayamAmarshe) and [TGS963](https://github.com/TGS963)
+
+## šŸ¤— Acknowledgement
+
+Thanks to all the contributors.
+
+- [AK391](https://github.com/AK391): Integrate RealESRGAN to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See [Gradio Web Demo](https://huggingface.co/spaces/akhaliq/Real-ESRGAN).
+- [Asiimoviet](https://github.com/Asiimoviet): Translate the README.md to Chinese (äø­ę–‡).
+- [2ji3150](https://github.com/2ji3150): Thanks for the [detailed and valuable feedback/suggestions](https://github.com/xinntao/Real-ESRGAN/issues/131).
+- [Jared-02](https://github.com/Jared-02): Translate the Training.md to Chinese (äø­ę–‡).
diff --git a/realesrgan.egg-info/SOURCES.txt b/realesrgan.egg-info/SOURCES.txt new file mode 100644 index 0000000000000000000000000000000000000000..4a25a7ab1dfcf47c7ecacfd1498c6db302da2159 --- /dev/null +++ b/realesrgan.egg-info/SOURCES.txt @@ -0,0 +1,21 @@ +README.md +setup.py +realesrgan/__init__.py +realesrgan/train.py +realesrgan/utils.py +realesrgan/version.py +realesrgan.egg-info/PKG-INFO +realesrgan.egg-info/SOURCES.txt +realesrgan.egg-info/dependency_links.txt +realesrgan.egg-info/not-zip-safe +realesrgan.egg-info/requires.txt +realesrgan.egg-info/top_level.txt +realesrgan/archs/__init__.py +realesrgan/archs/discriminator_arch.py +realesrgan/archs/srvgg_arch.py +realesrgan/data/__init__.py +realesrgan/data/realesrgan_dataset.py +realesrgan/data/realesrgan_paired_dataset.py +realesrgan/models/__init__.py +realesrgan/models/realesrgan_model.py +realesrgan/models/realesrnet_model.py \ No newline at end of file diff --git a/realesrgan.egg-info/dependency_links.txt b/realesrgan.egg-info/dependency_links.txt new file mode 100644 index 0000000000000000000000000000000000000000..8b137891791fe96927ad78e64b0aad7bded08bdc --- /dev/null +++ b/realesrgan.egg-info/dependency_links.txt @@ -0,0 +1 @@ + diff --git a/realesrgan.egg-info/not-zip-safe b/realesrgan.egg-info/not-zip-safe new file mode 100644 index 0000000000000000000000000000000000000000..8b137891791fe96927ad78e64b0aad7bded08bdc --- /dev/null +++ b/realesrgan.egg-info/not-zip-safe @@ -0,0 +1 @@ + diff --git a/realesrgan.egg-info/requires.txt b/realesrgan.egg-info/requires.txt new file mode 100644 index 0000000000000000000000000000000000000000..0c8f3f0e75ea0174a4055bdce8255c541187e4b1 --- /dev/null +++ b/realesrgan.egg-info/requires.txt @@ -0,0 +1,9 @@ +basicsr>=1.4.2 +facexlib>=0.2.5 +gfpgan>=1.3.5 +numpy +opencv-python +Pillow +torch>=1.7 +torchvision +tqdm diff --git a/realesrgan.egg-info/top_level.txt b/realesrgan.egg-info/top_level.txt new file mode 100644 index 0000000000000000000000000000000000000000..b90fc83a7bc6040ddb275c9910b6185badfd9774 --- /dev/null +++ b/realesrgan.egg-info/top_level.txt @@ -0,0 +1 @@ +realesrgan diff --git a/realesrgan/__init__.py b/realesrgan/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..2276f1eecded80d1f00ff97b45c66c7a8922b987 --- /dev/null +++ b/realesrgan/__init__.py @@ -0,0 +1,6 @@ +# flake8: noqa +from .archs import * +from .data import * +from .models import * +from .utils import * +from .version import * diff --git a/realesrgan/__pycache__/__init__.cpython-38.pyc b/realesrgan/__pycache__/__init__.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..9eb1ce1873db6b78f5416153c610527c0df59765 Binary files /dev/null and b/realesrgan/__pycache__/__init__.cpython-38.pyc differ diff --git a/realesrgan/__pycache__/utils.cpython-38.pyc b/realesrgan/__pycache__/utils.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..b31d24e9849fc43f4c3c88fecc646b85b53af31c Binary files /dev/null and b/realesrgan/__pycache__/utils.cpython-38.pyc differ diff --git a/realesrgan/__pycache__/version.cpython-38.pyc b/realesrgan/__pycache__/version.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..e7bdbff364b65c33529a7c52c4976cf6dc00eaef Binary files /dev/null and b/realesrgan/__pycache__/version.cpython-38.pyc differ diff --git a/realesrgan/archs/__init__.py b/realesrgan/archs/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..f3fbbf3b78e33b61fd4c33a564a9a617010d90de --- 
/dev/null +++ b/realesrgan/archs/__init__.py @@ -0,0 +1,10 @@ +import importlib +from basicsr.utils import scandir +from os import path as osp + +# automatically scan and import arch modules for registry +# scan all the files that end with '_arch.py' under the archs folder +arch_folder = osp.dirname(osp.abspath(__file__)) +arch_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(arch_folder) if v.endswith('_arch.py')] +# import all the arch modules +_arch_modules = [importlib.import_module(f'realesrgan.archs.{file_name}') for file_name in arch_filenames] diff --git a/realesrgan/archs/__pycache__/__init__.cpython-38.pyc b/realesrgan/archs/__pycache__/__init__.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3f1b69ecb6a4f20e19702890a6f153df47305fc4 Binary files /dev/null and b/realesrgan/archs/__pycache__/__init__.cpython-38.pyc differ diff --git a/realesrgan/archs/__pycache__/discriminator_arch.cpython-38.pyc b/realesrgan/archs/__pycache__/discriminator_arch.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..3f9ef972d5f198c27a11b44b0a2b6cfd603b6bf3 Binary files /dev/null and b/realesrgan/archs/__pycache__/discriminator_arch.cpython-38.pyc differ diff --git a/realesrgan/archs/__pycache__/srvgg_arch.cpython-38.pyc b/realesrgan/archs/__pycache__/srvgg_arch.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..9192ae1d7fd012974727ea34fecf6bc05d39ed12 Binary files /dev/null and b/realesrgan/archs/__pycache__/srvgg_arch.cpython-38.pyc differ diff --git a/realesrgan/archs/discriminator_arch.py b/realesrgan/archs/discriminator_arch.py new file mode 100644 index 0000000000000000000000000000000000000000..4b66ab1226d6793de846bc9828bbe427031a0e2d --- /dev/null +++ b/realesrgan/archs/discriminator_arch.py @@ -0,0 +1,67 @@ +from basicsr.utils.registry import ARCH_REGISTRY +from torch import nn as nn +from torch.nn import functional as F +from torch.nn.utils import spectral_norm + + +@ARCH_REGISTRY.register() +class UNetDiscriminatorSN(nn.Module): + """Defines a U-Net discriminator with spectral normalization (SN) + + It is used in Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. + + Arg: + num_in_ch (int): Channel number of inputs. Default: 3. + num_feat (int): Channel number of base intermediate features. Default: 64. + skip_connection (bool): Whether to use skip connections between U-Net. Default: True. 
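+
+    Example:
+        >>> import torch
+        >>> net = UNetDiscriminatorSN(num_in_ch=3)
+        >>> net(torch.rand(1, 3, 256, 256)).shape  # a per-pixel realness map
+        torch.Size([1, 1, 256, 256])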
+ """ + + def __init__(self, num_in_ch, num_feat=64, skip_connection=True): + super(UNetDiscriminatorSN, self).__init__() + self.skip_connection = skip_connection + norm = spectral_norm + # the first convolution + self.conv0 = nn.Conv2d(num_in_ch, num_feat, kernel_size=3, stride=1, padding=1) + # downsample + self.conv1 = norm(nn.Conv2d(num_feat, num_feat * 2, 4, 2, 1, bias=False)) + self.conv2 = norm(nn.Conv2d(num_feat * 2, num_feat * 4, 4, 2, 1, bias=False)) + self.conv3 = norm(nn.Conv2d(num_feat * 4, num_feat * 8, 4, 2, 1, bias=False)) + # upsample + self.conv4 = norm(nn.Conv2d(num_feat * 8, num_feat * 4, 3, 1, 1, bias=False)) + self.conv5 = norm(nn.Conv2d(num_feat * 4, num_feat * 2, 3, 1, 1, bias=False)) + self.conv6 = norm(nn.Conv2d(num_feat * 2, num_feat, 3, 1, 1, bias=False)) + # extra convolutions + self.conv7 = norm(nn.Conv2d(num_feat, num_feat, 3, 1, 1, bias=False)) + self.conv8 = norm(nn.Conv2d(num_feat, num_feat, 3, 1, 1, bias=False)) + self.conv9 = nn.Conv2d(num_feat, 1, 3, 1, 1) + + def forward(self, x): + # downsample + x0 = F.leaky_relu(self.conv0(x), negative_slope=0.2, inplace=True) + x1 = F.leaky_relu(self.conv1(x0), negative_slope=0.2, inplace=True) + x2 = F.leaky_relu(self.conv2(x1), negative_slope=0.2, inplace=True) + x3 = F.leaky_relu(self.conv3(x2), negative_slope=0.2, inplace=True) + + # upsample + x3 = F.interpolate(x3, scale_factor=2, mode='bilinear', align_corners=False) + x4 = F.leaky_relu(self.conv4(x3), negative_slope=0.2, inplace=True) + + if self.skip_connection: + x4 = x4 + x2 + x4 = F.interpolate(x4, scale_factor=2, mode='bilinear', align_corners=False) + x5 = F.leaky_relu(self.conv5(x4), negative_slope=0.2, inplace=True) + + if self.skip_connection: + x5 = x5 + x1 + x5 = F.interpolate(x5, scale_factor=2, mode='bilinear', align_corners=False) + x6 = F.leaky_relu(self.conv6(x5), negative_slope=0.2, inplace=True) + + if self.skip_connection: + x6 = x6 + x0 + + # extra convolutions + out = F.leaky_relu(self.conv7(x6), negative_slope=0.2, inplace=True) + out = F.leaky_relu(self.conv8(out), negative_slope=0.2, inplace=True) + out = self.conv9(out) + + return out diff --git a/realesrgan/archs/srvgg_arch.py b/realesrgan/archs/srvgg_arch.py new file mode 100644 index 0000000000000000000000000000000000000000..39460965c9c5ee9cd6eb41c50d33574cb8ba6e50 --- /dev/null +++ b/realesrgan/archs/srvgg_arch.py @@ -0,0 +1,69 @@ +from basicsr.utils.registry import ARCH_REGISTRY +from torch import nn as nn +from torch.nn import functional as F + + +@ARCH_REGISTRY.register() +class SRVGGNetCompact(nn.Module): + """A compact VGG-style network structure for super-resolution. + + It is a compact network structure, which performs upsampling in the last layer and no convolution is + conducted on the HR feature space. + + Args: + num_in_ch (int): Channel number of inputs. Default: 3. + num_out_ch (int): Channel number of outputs. Default: 3. + num_feat (int): Channel number of intermediate features. Default: 64. + num_conv (int): Number of convolution layers in the body network. Default: 16. + upscale (int): Upsampling factor. Default: 4. + act_type (str): Activation type, options: 'relu', 'prelu', 'leakyrelu'. Default: prelu. 
+ """ + + def __init__(self, num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=16, upscale=4, act_type='prelu'): + super(SRVGGNetCompact, self).__init__() + self.num_in_ch = num_in_ch + self.num_out_ch = num_out_ch + self.num_feat = num_feat + self.num_conv = num_conv + self.upscale = upscale + self.act_type = act_type + + self.body = nn.ModuleList() + # the first conv + self.body.append(nn.Conv2d(num_in_ch, num_feat, 3, 1, 1)) + # the first activation + if act_type == 'relu': + activation = nn.ReLU(inplace=True) + elif act_type == 'prelu': + activation = nn.PReLU(num_parameters=num_feat) + elif act_type == 'leakyrelu': + activation = nn.LeakyReLU(negative_slope=0.1, inplace=True) + self.body.append(activation) + + # the body structure + for _ in range(num_conv): + self.body.append(nn.Conv2d(num_feat, num_feat, 3, 1, 1)) + # activation + if act_type == 'relu': + activation = nn.ReLU(inplace=True) + elif act_type == 'prelu': + activation = nn.PReLU(num_parameters=num_feat) + elif act_type == 'leakyrelu': + activation = nn.LeakyReLU(negative_slope=0.1, inplace=True) + self.body.append(activation) + + # the last conv + self.body.append(nn.Conv2d(num_feat, num_out_ch * upscale * upscale, 3, 1, 1)) + # upsample + self.upsampler = nn.PixelShuffle(upscale) + + def forward(self, x): + out = x + for i in range(0, len(self.body)): + out = self.body[i](out) + + out = self.upsampler(out) + # add the nearest upsampled image, so that the network learns the residual + base = F.interpolate(x, scale_factor=self.upscale, mode='nearest') + out += base + return out diff --git a/realesrgan/data/__init__.py b/realesrgan/data/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..a3f8fdd1aa47c12de9687c578094303eb7369246 --- /dev/null +++ b/realesrgan/data/__init__.py @@ -0,0 +1,10 @@ +import importlib +from basicsr.utils import scandir +from os import path as osp + +# automatically scan and import dataset modules for registry +# scan all the files that end with '_dataset.py' under the data folder +data_folder = osp.dirname(osp.abspath(__file__)) +dataset_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(data_folder) if v.endswith('_dataset.py')] +# import all the dataset modules +_dataset_modules = [importlib.import_module(f'realesrgan.data.{file_name}') for file_name in dataset_filenames] diff --git a/realesrgan/data/__pycache__/__init__.cpython-38.pyc b/realesrgan/data/__pycache__/__init__.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..9acbb2fba22a6d2cfc5aab888974d381b6547d3e Binary files /dev/null and b/realesrgan/data/__pycache__/__init__.cpython-38.pyc differ diff --git a/realesrgan/data/__pycache__/realesrgan_dataset.cpython-38.pyc b/realesrgan/data/__pycache__/realesrgan_dataset.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..2a3da59ab3bf5c55ed6afd530d5bde0489e57dee Binary files /dev/null and b/realesrgan/data/__pycache__/realesrgan_dataset.cpython-38.pyc differ diff --git a/realesrgan/data/__pycache__/realesrgan_paired_dataset.cpython-38.pyc b/realesrgan/data/__pycache__/realesrgan_paired_dataset.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..2d8997835f3360a91c99670d0066009f021de3c7 Binary files /dev/null and b/realesrgan/data/__pycache__/realesrgan_paired_dataset.cpython-38.pyc differ diff --git a/realesrgan/data/realesrgan_dataset.py b/realesrgan/data/realesrgan_dataset.py new file mode 100644 index 
0000000000000000000000000000000000000000..4cf2d9e6583a6789b771679734ce55bb8a22e628
--- /dev/null
+++ b/realesrgan/data/realesrgan_dataset.py
@@ -0,0 +1,192 @@
+import cv2
+import math
+import numpy as np
+import os
+import os.path as osp
+import random
+import time
+import torch
+from basicsr.data.degradations import circular_lowpass_kernel, random_mixed_kernels
+from basicsr.data.transforms import augment
+from basicsr.utils import FileClient, get_root_logger, imfrombytes, img2tensor
+from basicsr.utils.registry import DATASET_REGISTRY
+from torch.utils import data as data
+
+
+@DATASET_REGISTRY.register()
+class RealESRGANDataset(data.Dataset):
+    """Dataset used for Real-ESRGAN model:
+    Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data.
+
+    It loads gt (Ground-Truth) images and augments them.
+    It also generates blur kernels and sinc kernels for generating low-quality images.
+    Note that the low-quality images are processed in tensors on GPUs for faster processing.
+
+    Args:
+        opt (dict): Config for train datasets. It contains the following keys:
+            dataroot_gt (str): Data root path for gt.
+            meta_info (str): Path for meta information file.
+            io_backend (dict): IO backend type and other kwargs.
+            use_hflip (bool): Use horizontal flips.
+            use_rot (bool): Use rotation (use vertical flip and transposing h and w for implementation).
+            Please see more options in the code.
+    """
+
+    def __init__(self, opt):
+        super(RealESRGANDataset, self).__init__()
+        self.opt = opt
+        self.file_client = None
+        self.io_backend_opt = opt['io_backend']
+        self.gt_folder = opt['dataroot_gt']
+
+        # file client (lmdb io backend)
+        if self.io_backend_opt['type'] == 'lmdb':
+            self.io_backend_opt['db_paths'] = [self.gt_folder]
+            self.io_backend_opt['client_keys'] = ['gt']
+            if not self.gt_folder.endswith('.lmdb'):
+                raise ValueError(f"'dataroot_gt' should end with '.lmdb', but received {self.gt_folder}")
+            with open(osp.join(self.gt_folder, 'meta_info.txt')) as fin:
+                self.paths = [line.split('.')[0] for line in fin]
+        else:
+            # disk backend with meta_info
+            # Each line in the meta_info describes the relative path to an image
+            with open(self.opt['meta_info']) as fin:
+                paths = [line.strip().split(' ')[0] for line in fin]
+                self.paths = [os.path.join(self.gt_folder, v) for v in paths]
+
+        # blur settings for the first degradation
+        self.blur_kernel_size = opt['blur_kernel_size']
+        self.kernel_list = opt['kernel_list']
+        self.kernel_prob = opt['kernel_prob']  # a list for each kernel probability
+        self.blur_sigma = opt['blur_sigma']
+        self.betag_range = opt['betag_range']  # betag used in generalized Gaussian blur kernels
+        self.betap_range = opt['betap_range']  # betap used in plateau blur kernels
+        self.sinc_prob = opt['sinc_prob']  # the probability for sinc filters
+
+        # blur settings for the second degradation
+        self.blur_kernel_size2 = opt['blur_kernel_size2']
+        self.kernel_list2 = opt['kernel_list2']
+        self.kernel_prob2 = opt['kernel_prob2']
+        self.blur_sigma2 = opt['blur_sigma2']
+        self.betag_range2 = opt['betag_range2']
+        self.betap_range2 = opt['betap_range2']
+        self.sinc_prob2 = opt['sinc_prob2']
+
+        # a final sinc filter
+        self.final_sinc_prob = opt['final_sinc_prob']
+
+        self.kernel_range = [2 * v + 1 for v in range(3, 11)]  # kernel size ranges from 7 to 21
+        # TODO: kernel range is now hard-coded, should be in the config file
+        self.pulse_tensor = torch.zeros(21, 21).float()  # convolving with pulse tensor brings no blurry effect
+        self.pulse_tensor[10, 10] = 1
+
+    def __getitem__(self, index):
+        if self.file_client is None:
+            self.file_client = FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt)
+
+        # -------------------------------- Load gt images -------------------------------- #
+        # Shape: (h, w, c); channel order: BGR; image range: [0, 1], float32.
+        gt_path = self.paths[index]
+        # avoid errors caused by high latency in reading files
+        retry = 3
+        while retry > 0:
+            try:
+                img_bytes = self.file_client.get(gt_path, 'gt')
+            except (IOError, OSError) as e:
+                logger = get_root_logger()
+                logger.warning(f'File client error: {e}, remaining retry times: {retry - 1}')
+                # change to another (randomly chosen) file to read
+                index = random.randint(0, self.__len__() - 1)
+                gt_path = self.paths[index]
+                time.sleep(1)  # sleep 1s for occasional server congestion
+            else:
+                break
+            finally:
+                retry -= 1
+        img_gt = imfrombytes(img_bytes, float32=True)
+
+        # -------------------- Do augmentation for training: flip, rotation -------------------- #
+        img_gt = augment(img_gt, self.opt['use_hflip'], self.opt['use_rot'])
+
+        # crop or pad to 400
+        # TODO: 400 is hard-coded. You may change it accordingly
+        h, w = img_gt.shape[0:2]
+        crop_pad_size = 400
+        # pad
+        if h < crop_pad_size or w < crop_pad_size:
+            pad_h = max(0, crop_pad_size - h)
+            pad_w = max(0, crop_pad_size - w)
+            img_gt = cv2.copyMakeBorder(img_gt, 0, pad_h, 0, pad_w, cv2.BORDER_REFLECT_101)
+        # crop
+        if img_gt.shape[0] > crop_pad_size or img_gt.shape[1] > crop_pad_size:
+            h, w = img_gt.shape[0:2]
+            # randomly choose top and left coordinates
+            top = random.randint(0, h - crop_pad_size)
+            left = random.randint(0, w - crop_pad_size)
+            img_gt = img_gt[top:top + crop_pad_size, left:left + crop_pad_size, ...]
+
+        # ------------------------ Generate kernels (used in the first degradation) ------------------------ #
+        kernel_size = random.choice(self.kernel_range)
+        if np.random.uniform() < self.opt['sinc_prob']:
+            # this sinc filter setting is for kernels ranging from [7, 21]
+            if kernel_size < 13:
+                omega_c = np.random.uniform(np.pi / 3, np.pi)
+            else:
+                omega_c = np.random.uniform(np.pi / 5, np.pi)
+            kernel = circular_lowpass_kernel(omega_c, kernel_size, pad_to=False)
+        else:
+            kernel = random_mixed_kernels(
+                self.kernel_list,
+                self.kernel_prob,
+                kernel_size,
+                self.blur_sigma,
+                self.blur_sigma, [-math.pi, math.pi],
+                self.betag_range,
+                self.betap_range,
+                noise_range=None)
+        # pad kernel
+        pad_size = (21 - kernel_size) // 2
+        kernel = np.pad(kernel, ((pad_size, pad_size), (pad_size, pad_size)))
+
+        # ------------------------ Generate kernels (used in the second degradation) ------------------------ #
+        kernel_size = random.choice(self.kernel_range)
+        if np.random.uniform() < self.opt['sinc_prob2']:
+            if kernel_size < 13:
+                omega_c = np.random.uniform(np.pi / 3, np.pi)
+            else:
+                omega_c = np.random.uniform(np.pi / 5, np.pi)
+            kernel2 = circular_lowpass_kernel(omega_c, kernel_size, pad_to=False)
+        else:
+            kernel2 = random_mixed_kernels(
+                self.kernel_list2,
+                self.kernel_prob2,
+                kernel_size,
+                self.blur_sigma2,
+                self.blur_sigma2, [-math.pi, math.pi],
+                self.betag_range2,
+                self.betap_range2,
+                noise_range=None)
+
+        # pad kernel
+        pad_size = (21 - kernel_size) // 2
+        kernel2 = np.pad(kernel2, ((pad_size, pad_size), (pad_size, pad_size)))
+
+        # ------------------------------------- the final sinc kernel ------------------------------------- #
+        if np.random.uniform() < self.opt['final_sinc_prob']:
+            kernel_size = random.choice(self.kernel_range)
+            omega_c = np.random.uniform(np.pi / 3, np.pi)
+            sinc_kernel = 
circular_lowpass_kernel(omega_c, kernel_size, pad_to=21) + sinc_kernel = torch.FloatTensor(sinc_kernel) + else: + sinc_kernel = self.pulse_tensor + + # BGR to RGB, HWC to CHW, numpy to tensor + img_gt = img2tensor([img_gt], bgr2rgb=True, float32=True)[0] + kernel = torch.FloatTensor(kernel) + kernel2 = torch.FloatTensor(kernel2) + + return_d = {'gt': img_gt, 'kernel1': kernel, 'kernel2': kernel2, 'sinc_kernel': sinc_kernel, 'gt_path': gt_path} + return return_d + + def __len__(self): + return len(self.paths) diff --git a/realesrgan/data/realesrgan_paired_dataset.py b/realesrgan/data/realesrgan_paired_dataset.py new file mode 100644 index 0000000000000000000000000000000000000000..386c8d72496245dae8df033c2ebbd76b41ff45f1 --- /dev/null +++ b/realesrgan/data/realesrgan_paired_dataset.py @@ -0,0 +1,108 @@ +import os +from basicsr.data.data_util import paired_paths_from_folder, paired_paths_from_lmdb +from basicsr.data.transforms import augment, paired_random_crop +from basicsr.utils import FileClient, imfrombytes, img2tensor +from basicsr.utils.registry import DATASET_REGISTRY +from torch.utils import data as data +from torchvision.transforms.functional import normalize + + +@DATASET_REGISTRY.register() +class RealESRGANPairedDataset(data.Dataset): + """Paired image dataset for image restoration. + + Read LQ (Low Quality, e.g. LR (Low Resolution), blurry, noisy, etc) and GT image pairs. + + There are three modes: + 1. 'lmdb': Use lmdb files. + If opt['io_backend'] == lmdb. + 2. 'meta_info': Use meta information file to generate paths. + If opt['io_backend'] != lmdb and opt['meta_info'] is not None. + 3. 'folder': Scan folders to generate paths. + The rest. + + Args: + opt (dict): Config for train datasets. It contains the following keys: + dataroot_gt (str): Data root path for gt. + dataroot_lq (str): Data root path for lq. + meta_info (str): Path for meta information file. + io_backend (dict): IO backend type and other kwarg. + filename_tmpl (str): Template for each filename. Note that the template excludes the file extension. + Default: '{}'. + gt_size (int): Cropped patched size for gt patches. + use_hflip (bool): Use horizontal flips. + use_rot (bool): Use rotation (use vertical flip and transposing h + and w for implementation). + + scale (bool): Scale, which will be added automatically. + phase (str): 'train' or 'val'. 
+ """ + + def __init__(self, opt): + super(RealESRGANPairedDataset, self).__init__() + self.opt = opt + self.file_client = None + self.io_backend_opt = opt['io_backend'] + # mean and std for normalizing the input images + self.mean = opt['mean'] if 'mean' in opt else None + self.std = opt['std'] if 'std' in opt else None + + self.gt_folder, self.lq_folder = opt['dataroot_gt'], opt['dataroot_lq'] + self.filename_tmpl = opt['filename_tmpl'] if 'filename_tmpl' in opt else '{}' + + # file client (lmdb io backend) + if self.io_backend_opt['type'] == 'lmdb': + self.io_backend_opt['db_paths'] = [self.lq_folder, self.gt_folder] + self.io_backend_opt['client_keys'] = ['lq', 'gt'] + self.paths = paired_paths_from_lmdb([self.lq_folder, self.gt_folder], ['lq', 'gt']) + elif 'meta_info' in self.opt and self.opt['meta_info'] is not None: + # disk backend with meta_info + # Each line in the meta_info describes the relative path to an image + with open(self.opt['meta_info']) as fin: + paths = [line.strip() for line in fin] + self.paths = [] + for path in paths: + gt_path, lq_path = path.split(', ') + gt_path = os.path.join(self.gt_folder, gt_path) + lq_path = os.path.join(self.lq_folder, lq_path) + self.paths.append(dict([('gt_path', gt_path), ('lq_path', lq_path)])) + else: + # disk backend + # it will scan the whole folder to get meta info + # it will be time-consuming for folders with too many files. It is recommended using an extra meta txt file + self.paths = paired_paths_from_folder([self.lq_folder, self.gt_folder], ['lq', 'gt'], self.filename_tmpl) + + def __getitem__(self, index): + if self.file_client is None: + self.file_client = FileClient(self.io_backend_opt.pop('type'), **self.io_backend_opt) + + scale = self.opt['scale'] + + # Load gt and lq images. Dimension order: HWC; channel order: BGR; + # image range: [0, 1], float32. 
+ gt_path = self.paths[index]['gt_path'] + img_bytes = self.file_client.get(gt_path, 'gt') + img_gt = imfrombytes(img_bytes, float32=True) + lq_path = self.paths[index]['lq_path'] + img_bytes = self.file_client.get(lq_path, 'lq') + img_lq = imfrombytes(img_bytes, float32=True) + + # augmentation for training + if self.opt['phase'] == 'train': + gt_size = self.opt['gt_size'] + # random crop + img_gt, img_lq = paired_random_crop(img_gt, img_lq, gt_size, scale, gt_path) + # flip, rotation + img_gt, img_lq = augment([img_gt, img_lq], self.opt['use_hflip'], self.opt['use_rot']) + + # BGR to RGB, HWC to CHW, numpy to tensor + img_gt, img_lq = img2tensor([img_gt, img_lq], bgr2rgb=True, float32=True) + # normalize + if self.mean is not None or self.std is not None: + normalize(img_lq, self.mean, self.std, inplace=True) + normalize(img_gt, self.mean, self.std, inplace=True) + + return {'lq': img_lq, 'gt': img_gt, 'lq_path': lq_path, 'gt_path': gt_path} + + def __len__(self): + return len(self.paths) diff --git a/realesrgan/models/__init__.py b/realesrgan/models/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..0be7105dc75d150c49976396724085f678dc0675 --- /dev/null +++ b/realesrgan/models/__init__.py @@ -0,0 +1,10 @@ +import importlib +from basicsr.utils import scandir +from os import path as osp + +# automatically scan and import model modules for registry +# scan all the files that end with '_model.py' under the model folder +model_folder = osp.dirname(osp.abspath(__file__)) +model_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(model_folder) if v.endswith('_model.py')] +# import all the model modules +_model_modules = [importlib.import_module(f'realesrgan.models.{file_name}') for file_name in model_filenames] diff --git a/realesrgan/models/__pycache__/__init__.cpython-38.pyc b/realesrgan/models/__pycache__/__init__.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7bbdc894c167afc64564d3baf6b42a37cbf2ed7f Binary files /dev/null and b/realesrgan/models/__pycache__/__init__.cpython-38.pyc differ diff --git a/realesrgan/models/__pycache__/realesrgan_model.cpython-38.pyc b/realesrgan/models/__pycache__/realesrgan_model.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..f963cfb60bfd4af2404ad5067f8c9e51deffb16a Binary files /dev/null and b/realesrgan/models/__pycache__/realesrgan_model.cpython-38.pyc differ diff --git a/realesrgan/models/__pycache__/realesrnet_model.cpython-38.pyc b/realesrgan/models/__pycache__/realesrnet_model.cpython-38.pyc new file mode 100644 index 0000000000000000000000000000000000000000..7bc64f77fe288753511fdf39ba242296893282a2 Binary files /dev/null and b/realesrgan/models/__pycache__/realesrnet_model.cpython-38.pyc differ diff --git a/realesrgan/models/realesrgan_model.py b/realesrgan/models/realesrgan_model.py new file mode 100644 index 0000000000000000000000000000000000000000..c298a09c42433177f90001a0a31d029576072ccd --- /dev/null +++ b/realesrgan/models/realesrgan_model.py @@ -0,0 +1,258 @@ +import numpy as np +import random +import torch +from basicsr.data.degradations import random_add_gaussian_noise_pt, random_add_poisson_noise_pt +from basicsr.data.transforms import paired_random_crop +from basicsr.models.srgan_model import SRGANModel +from basicsr.utils import DiffJPEG, USMSharp +from basicsr.utils.img_process_util import filter2D +from basicsr.utils.registry import MODEL_REGISTRY +from collections import OrderedDict +from torch.nn import functional as F + + 
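For orientation, here is a minimal sketch of how the paired dataset defined above can be constructed directly. The folder paths are hypothetical placeholders, the keys mirror the class docstring, and it assumes the LQ folder holds images exactly 4Ɨ smaller than their GT counterparts:

```python
# Hypothetical config for RealESRGANPairedDataset (paths are placeholders, not part of this repo).
from realesrgan.data.realesrgan_paired_dataset import RealESRGANPairedDataset

opt = {
    'dataroot_gt': 'datasets/DF2K/GT',   # aligned ground-truth patches (placeholder path)
    'dataroot_lq': 'datasets/DF2K/LQ',   # low-quality patches, 4x smaller (placeholder path)
    'io_backend': {'type': 'disk'},      # 'lmdb' also works, with .lmdb folders
    'filename_tmpl': '{}',
    'scale': 4,
    'phase': 'train',
    'gt_size': 256,                      # GT crop size; the LQ crop becomes 256 // 4 = 64
    'use_hflip': True,
    'use_rot': True,
}
dataset = RealESRGANPairedDataset(opt)
sample = dataset[0]
print(sample['lq'].shape, sample['gt'].shape)  # torch.Size([3, 64, 64]) torch.Size([3, 256, 256])
```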
+@MODEL_REGISTRY.register() +class RealESRGANModel(SRGANModel): + """RealESRGAN Model for Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. + + It mainly performs: + 1. randomly synthesize LQ images in GPU tensors + 2. optimize the networks with GAN training. + """ + + def __init__(self, opt): + super(RealESRGANModel, self).__init__(opt) + self.jpeger = DiffJPEG(differentiable=False).cuda() # simulate JPEG compression artifacts + self.usm_sharpener = USMSharp().cuda() # do usm sharpening + self.queue_size = opt.get('queue_size', 180) + + @torch.no_grad() + def _dequeue_and_enqueue(self): + """It is the training pair pool for increasing the diversity in a batch. + + Batch processing limits the diversity of synthetic degradations in a batch. For example, samples in a + batch could not have different resize scaling factors. Therefore, we employ this training pair pool + to increase the degradation diversity in a batch. + """ + # initialize + b, c, h, w = self.lq.size() + if not hasattr(self, 'queue_lr'): + assert self.queue_size % b == 0, f'queue size {self.queue_size} should be divisible by batch size {b}' + self.queue_lr = torch.zeros(self.queue_size, c, h, w).cuda() + _, c, h, w = self.gt.size() + self.queue_gt = torch.zeros(self.queue_size, c, h, w).cuda() + self.queue_ptr = 0 + if self.queue_ptr == self.queue_size: # the pool is full + # do dequeue and enqueue + # shuffle + idx = torch.randperm(self.queue_size) + self.queue_lr = self.queue_lr[idx] + self.queue_gt = self.queue_gt[idx] + # get first b samples + lq_dequeue = self.queue_lr[0:b, :, :, :].clone() + gt_dequeue = self.queue_gt[0:b, :, :, :].clone() + # update the queue + self.queue_lr[0:b, :, :, :] = self.lq.clone() + self.queue_gt[0:b, :, :, :] = self.gt.clone() + + self.lq = lq_dequeue + self.gt = gt_dequeue + else: + # only do enqueue + self.queue_lr[self.queue_ptr:self.queue_ptr + b, :, :, :] = self.lq.clone() + self.queue_gt[self.queue_ptr:self.queue_ptr + b, :, :, :] = self.gt.clone() + self.queue_ptr = self.queue_ptr + b + + @torch.no_grad() + def feed_data(self, data): + """Accept data from dataloader, and then add two-order degradations to obtain LQ images. 
+ """ + if self.is_train and self.opt.get('high_order_degradation', True): + # training data synthesis + self.gt = data['gt'].to(self.device) + self.gt_usm = self.usm_sharpener(self.gt) + + self.kernel1 = data['kernel1'].to(self.device) + self.kernel2 = data['kernel2'].to(self.device) + self.sinc_kernel = data['sinc_kernel'].to(self.device) + + ori_h, ori_w = self.gt.size()[2:4] + + # ----------------------- The first degradation process ----------------------- # + # blur + out = filter2D(self.gt_usm, self.kernel1) + # random resize + updown_type = random.choices(['up', 'down', 'keep'], self.opt['resize_prob'])[0] + if updown_type == 'up': + scale = np.random.uniform(1, self.opt['resize_range'][1]) + elif updown_type == 'down': + scale = np.random.uniform(self.opt['resize_range'][0], 1) + else: + scale = 1 + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, scale_factor=scale, mode=mode) + # add noise + gray_noise_prob = self.opt['gray_noise_prob'] + if np.random.uniform() < self.opt['gaussian_noise_prob']: + out = random_add_gaussian_noise_pt( + out, sigma_range=self.opt['noise_range'], clip=True, rounds=False, gray_prob=gray_noise_prob) + else: + out = random_add_poisson_noise_pt( + out, + scale_range=self.opt['poisson_scale_range'], + gray_prob=gray_noise_prob, + clip=True, + rounds=False) + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range']) + out = torch.clamp(out, 0, 1) # clamp to [0, 1], otherwise JPEGer will result in unpleasant artifacts + out = self.jpeger(out, quality=jpeg_p) + + # ----------------------- The second degradation process ----------------------- # + # blur + if np.random.uniform() < self.opt['second_blur_prob']: + out = filter2D(out, self.kernel2) + # random resize + updown_type = random.choices(['up', 'down', 'keep'], self.opt['resize_prob2'])[0] + if updown_type == 'up': + scale = np.random.uniform(1, self.opt['resize_range2'][1]) + elif updown_type == 'down': + scale = np.random.uniform(self.opt['resize_range2'][0], 1) + else: + scale = 1 + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate( + out, size=(int(ori_h / self.opt['scale'] * scale), int(ori_w / self.opt['scale'] * scale)), mode=mode) + # add noise + gray_noise_prob = self.opt['gray_noise_prob2'] + if np.random.uniform() < self.opt['gaussian_noise_prob2']: + out = random_add_gaussian_noise_pt( + out, sigma_range=self.opt['noise_range2'], clip=True, rounds=False, gray_prob=gray_noise_prob) + else: + out = random_add_poisson_noise_pt( + out, + scale_range=self.opt['poisson_scale_range2'], + gray_prob=gray_noise_prob, + clip=True, + rounds=False) + + # JPEG compression + the final sinc filter + # We also need to resize images to desired sizes. We group [resize back + sinc filter] together + # as one operation. + # We consider two orders: + # 1. [resize back + sinc filter] + JPEG compression + # 2. JPEG compression + [resize back + sinc filter] + # Empirically, we find other combinations (sinc + JPEG + Resize) will introduce twisted lines. 
+ if np.random.uniform() < 0.5: + # resize back + the final sinc filter + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, size=(ori_h // self.opt['scale'], ori_w // self.opt['scale']), mode=mode) + out = filter2D(out, self.sinc_kernel) + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range2']) + out = torch.clamp(out, 0, 1) + out = self.jpeger(out, quality=jpeg_p) + else: + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range2']) + out = torch.clamp(out, 0, 1) + out = self.jpeger(out, quality=jpeg_p) + # resize back + the final sinc filter + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, size=(ori_h // self.opt['scale'], ori_w // self.opt['scale']), mode=mode) + out = filter2D(out, self.sinc_kernel) + + # clamp and round + self.lq = torch.clamp((out * 255.0).round(), 0, 255) / 255. + + # random crop + gt_size = self.opt['gt_size'] + (self.gt, self.gt_usm), self.lq = paired_random_crop([self.gt, self.gt_usm], self.lq, gt_size, + self.opt['scale']) + + # training pair pool + self._dequeue_and_enqueue() + # sharpen self.gt again, as we have changed the self.gt with self._dequeue_and_enqueue + self.gt_usm = self.usm_sharpener(self.gt) + self.lq = self.lq.contiguous() # for the warning: grad and param do not obey the gradient layout contract + else: + # for paired training or validation + self.lq = data['lq'].to(self.device) + if 'gt' in data: + self.gt = data['gt'].to(self.device) + self.gt_usm = self.usm_sharpener(self.gt) + + def nondist_validation(self, dataloader, current_iter, tb_logger, save_img): + # do not use the synthetic process during validation + self.is_train = False + super(RealESRGANModel, self).nondist_validation(dataloader, current_iter, tb_logger, save_img) + self.is_train = True + + def optimize_parameters(self, current_iter): + # usm sharpening + l1_gt = self.gt_usm + percep_gt = self.gt_usm + gan_gt = self.gt_usm + if self.opt['l1_gt_usm'] is False: + l1_gt = self.gt + if self.opt['percep_gt_usm'] is False: + percep_gt = self.gt + if self.opt['gan_gt_usm'] is False: + gan_gt = self.gt + + # optimize net_g + for p in self.net_d.parameters(): + p.requires_grad = False + + self.optimizer_g.zero_grad() + self.output = self.net_g(self.lq) + + l_g_total = 0 + loss_dict = OrderedDict() + if (current_iter % self.net_d_iters == 0 and current_iter > self.net_d_init_iters): + # pixel loss + if self.cri_pix: + l_g_pix = self.cri_pix(self.output, l1_gt) + l_g_total += l_g_pix + loss_dict['l_g_pix'] = l_g_pix + # perceptual loss + if self.cri_perceptual: + l_g_percep, l_g_style = self.cri_perceptual(self.output, percep_gt) + if l_g_percep is not None: + l_g_total += l_g_percep + loss_dict['l_g_percep'] = l_g_percep + if l_g_style is not None: + l_g_total += l_g_style + loss_dict['l_g_style'] = l_g_style + # gan loss + fake_g_pred = self.net_d(self.output) + l_g_gan = self.cri_gan(fake_g_pred, True, is_disc=False) + l_g_total += l_g_gan + loss_dict['l_g_gan'] = l_g_gan + + l_g_total.backward() + self.optimizer_g.step() + + # optimize net_d + for p in self.net_d.parameters(): + p.requires_grad = True + + self.optimizer_d.zero_grad() + # real + real_d_pred = self.net_d(gan_gt) + l_d_real = self.cri_gan(real_d_pred, True, is_disc=True) + loss_dict['l_d_real'] = l_d_real + loss_dict['out_d_real'] = torch.mean(real_d_pred.detach()) + l_d_real.backward() + # fake + fake_d_pred = self.net_d(self.output.detach().clone()) # clone for pt1.9 + l_d_fake = 
self.cri_gan(fake_d_pred, False, is_disc=True) + loss_dict['l_d_fake'] = l_d_fake + loss_dict['out_d_fake'] = torch.mean(fake_d_pred.detach()) + l_d_fake.backward() + self.optimizer_d.step() + + if self.ema_decay > 0: + self.model_ema(decay=self.ema_decay) + + self.log_dict = self.reduce_loss_dict(loss_dict) diff --git a/realesrgan/models/realesrnet_model.py b/realesrgan/models/realesrnet_model.py new file mode 100644 index 0000000000000000000000000000000000000000..d11668f3712bffcd062c57db14d22ca3a0e1e59d --- /dev/null +++ b/realesrgan/models/realesrnet_model.py @@ -0,0 +1,188 @@ +import numpy as np +import random +import torch +from basicsr.data.degradations import random_add_gaussian_noise_pt, random_add_poisson_noise_pt +from basicsr.data.transforms import paired_random_crop +from basicsr.models.sr_model import SRModel +from basicsr.utils import DiffJPEG, USMSharp +from basicsr.utils.img_process_util import filter2D +from basicsr.utils.registry import MODEL_REGISTRY +from torch.nn import functional as F + + +@MODEL_REGISTRY.register() +class RealESRNetModel(SRModel): + """RealESRNet Model for Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. + + It is trained without GAN losses. + It mainly performs: + 1. randomly synthesize LQ images in GPU tensors + 2. optimize the networks with GAN training. + """ + + def __init__(self, opt): + super(RealESRNetModel, self).__init__(opt) + self.jpeger = DiffJPEG(differentiable=False).cuda() # simulate JPEG compression artifacts + self.usm_sharpener = USMSharp().cuda() # do usm sharpening + self.queue_size = opt.get('queue_size', 180) + + @torch.no_grad() + def _dequeue_and_enqueue(self): + """It is the training pair pool for increasing the diversity in a batch. + + Batch processing limits the diversity of synthetic degradations in a batch. For example, samples in a + batch could not have different resize scaling factors. Therefore, we employ this training pair pool + to increase the degradation diversity in a batch. + """ + # initialize + b, c, h, w = self.lq.size() + if not hasattr(self, 'queue_lr'): + assert self.queue_size % b == 0, f'queue size {self.queue_size} should be divisible by batch size {b}' + self.queue_lr = torch.zeros(self.queue_size, c, h, w).cuda() + _, c, h, w = self.gt.size() + self.queue_gt = torch.zeros(self.queue_size, c, h, w).cuda() + self.queue_ptr = 0 + if self.queue_ptr == self.queue_size: # the pool is full + # do dequeue and enqueue + # shuffle + idx = torch.randperm(self.queue_size) + self.queue_lr = self.queue_lr[idx] + self.queue_gt = self.queue_gt[idx] + # get first b samples + lq_dequeue = self.queue_lr[0:b, :, :, :].clone() + gt_dequeue = self.queue_gt[0:b, :, :, :].clone() + # update the queue + self.queue_lr[0:b, :, :, :] = self.lq.clone() + self.queue_gt[0:b, :, :, :] = self.gt.clone() + + self.lq = lq_dequeue + self.gt = gt_dequeue + else: + # only do enqueue + self.queue_lr[self.queue_ptr:self.queue_ptr + b, :, :, :] = self.lq.clone() + self.queue_gt[self.queue_ptr:self.queue_ptr + b, :, :, :] = self.gt.clone() + self.queue_ptr = self.queue_ptr + b + + @torch.no_grad() + def feed_data(self, data): + """Accept data from dataloader, and then add two-order degradations to obtain LQ images. 
+ """ + if self.is_train and self.opt.get('high_order_degradation', True): + # training data synthesis + self.gt = data['gt'].to(self.device) + # USM sharpen the GT images + if self.opt['gt_usm'] is True: + self.gt = self.usm_sharpener(self.gt) + + self.kernel1 = data['kernel1'].to(self.device) + self.kernel2 = data['kernel2'].to(self.device) + self.sinc_kernel = data['sinc_kernel'].to(self.device) + + ori_h, ori_w = self.gt.size()[2:4] + + # ----------------------- The first degradation process ----------------------- # + # blur + out = filter2D(self.gt, self.kernel1) + # random resize + updown_type = random.choices(['up', 'down', 'keep'], self.opt['resize_prob'])[0] + if updown_type == 'up': + scale = np.random.uniform(1, self.opt['resize_range'][1]) + elif updown_type == 'down': + scale = np.random.uniform(self.opt['resize_range'][0], 1) + else: + scale = 1 + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, scale_factor=scale, mode=mode) + # add noise + gray_noise_prob = self.opt['gray_noise_prob'] + if np.random.uniform() < self.opt['gaussian_noise_prob']: + out = random_add_gaussian_noise_pt( + out, sigma_range=self.opt['noise_range'], clip=True, rounds=False, gray_prob=gray_noise_prob) + else: + out = random_add_poisson_noise_pt( + out, + scale_range=self.opt['poisson_scale_range'], + gray_prob=gray_noise_prob, + clip=True, + rounds=False) + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range']) + out = torch.clamp(out, 0, 1) # clamp to [0, 1], otherwise JPEGer will result in unpleasant artifacts + out = self.jpeger(out, quality=jpeg_p) + + # ----------------------- The second degradation process ----------------------- # + # blur + if np.random.uniform() < self.opt['second_blur_prob']: + out = filter2D(out, self.kernel2) + # random resize + updown_type = random.choices(['up', 'down', 'keep'], self.opt['resize_prob2'])[0] + if updown_type == 'up': + scale = np.random.uniform(1, self.opt['resize_range2'][1]) + elif updown_type == 'down': + scale = np.random.uniform(self.opt['resize_range2'][0], 1) + else: + scale = 1 + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate( + out, size=(int(ori_h / self.opt['scale'] * scale), int(ori_w / self.opt['scale'] * scale)), mode=mode) + # add noise + gray_noise_prob = self.opt['gray_noise_prob2'] + if np.random.uniform() < self.opt['gaussian_noise_prob2']: + out = random_add_gaussian_noise_pt( + out, sigma_range=self.opt['noise_range2'], clip=True, rounds=False, gray_prob=gray_noise_prob) + else: + out = random_add_poisson_noise_pt( + out, + scale_range=self.opt['poisson_scale_range2'], + gray_prob=gray_noise_prob, + clip=True, + rounds=False) + + # JPEG compression + the final sinc filter + # We also need to resize images to desired sizes. We group [resize back + sinc filter] together + # as one operation. + # We consider two orders: + # 1. [resize back + sinc filter] + JPEG compression + # 2. JPEG compression + [resize back + sinc filter] + # Empirically, we find other combinations (sinc + JPEG + Resize) will introduce twisted lines. 
+ if np.random.uniform() < 0.5: + # resize back + the final sinc filter + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, size=(ori_h // self.opt['scale'], ori_w // self.opt['scale']), mode=mode) + out = filter2D(out, self.sinc_kernel) + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range2']) + out = torch.clamp(out, 0, 1) + out = self.jpeger(out, quality=jpeg_p) + else: + # JPEG compression + jpeg_p = out.new_zeros(out.size(0)).uniform_(*self.opt['jpeg_range2']) + out = torch.clamp(out, 0, 1) + out = self.jpeger(out, quality=jpeg_p) + # resize back + the final sinc filter + mode = random.choice(['area', 'bilinear', 'bicubic']) + out = F.interpolate(out, size=(ori_h // self.opt['scale'], ori_w // self.opt['scale']), mode=mode) + out = filter2D(out, self.sinc_kernel) + + # clamp and round + self.lq = torch.clamp((out * 255.0).round(), 0, 255) / 255. + + # random crop + gt_size = self.opt['gt_size'] + self.gt, self.lq = paired_random_crop(self.gt, self.lq, gt_size, self.opt['scale']) + + # training pair pool + self._dequeue_and_enqueue() + self.lq = self.lq.contiguous() # for the warning: grad and param do not obey the gradient layout contract + else: + # for paired training or validation + self.lq = data['lq'].to(self.device) + if 'gt' in data: + self.gt = data['gt'].to(self.device) + self.gt_usm = self.usm_sharpener(self.gt) + + def nondist_validation(self, dataloader, current_iter, tb_logger, save_img): + # do not use the synthetic process during validation + self.is_train = False + super(RealESRNetModel, self).nondist_validation(dataloader, current_iter, tb_logger, save_img) + self.is_train = True diff --git a/realesrgan/train.py b/realesrgan/train.py new file mode 100644 index 0000000000000000000000000000000000000000..8a9cec9ed80d9f362984779548dcec921a636a04 --- /dev/null +++ b/realesrgan/train.py @@ -0,0 +1,11 @@ +# flake8: noqa +import os.path as osp +from basicsr.train import train_pipeline + +import realesrgan.archs +import realesrgan.data +import realesrgan.models + +if __name__ == '__main__': + root_path = osp.abspath(osp.join(__file__, osp.pardir, osp.pardir)) + train_pipeline(root_path) diff --git a/realesrgan/utils.py b/realesrgan/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..67e5232d61e93807f22b052499a733cd348a61a0 --- /dev/null +++ b/realesrgan/utils.py @@ -0,0 +1,313 @@ +import cv2 +import math +import numpy as np +import os +import queue +import threading +import torch +from basicsr.utils.download_util import load_file_from_url +from torch.nn import functional as F + +ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) + + +class RealESRGANer(): + """A helper class for upsampling images with RealESRGAN. + + Args: + scale (int): Upsampling scale factor used in the networks. It is usually 2 or 4. + model_path (str): The path to the pretrained model. It can be urls (will first download it automatically). + model (nn.Module): The defined network. Default: None. + tile (int): As too large images result in the out of GPU memory issue, so this tile option will first crop + input images into tiles, and then process each of them. Finally, they will be merged into one image. + 0 denotes for do not use tile. Default: 0. + tile_pad (int): The pad size for each tile, to remove border artifacts. Default: 10. + pre_pad (int): Pad the input images to avoid border artifacts. Default: 10. + half (float): Whether to use half precision during inference. Default: False. 
+ """ + + def __init__(self, + scale, + model_path, + dni_weight=None, + model=None, + tile=0, + tile_pad=10, + pre_pad=10, + half=False, + device=None, + gpu_id=None): + self.scale = scale + self.tile_size = tile + self.tile_pad = tile_pad + self.pre_pad = pre_pad + self.mod_scale = None + self.half = half + + # initialize model + if gpu_id: + self.device = torch.device( + f'cuda:{gpu_id}' if torch.cuda.is_available() else 'cpu') if device is None else device + else: + self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') if device is None else device + + if isinstance(model_path, list): + # dni + assert len(model_path) == len(dni_weight), 'model_path and dni_weight should have the save length.' + loadnet = self.dni(model_path[0], model_path[1], dni_weight) + else: + # if the model_path starts with https, it will first download models to the folder: weights + if model_path.startswith('https://'): + model_path = load_file_from_url( + url=model_path, model_dir=os.path.join(ROOT_DIR, 'weights'), progress=True, file_name=None) + loadnet = torch.load(model_path, map_location=torch.device('cpu')) + + # prefer to use params_ema + if 'params_ema' in loadnet: + keyname = 'params_ema' + else: + keyname = 'params' + model.load_state_dict(loadnet[keyname], strict=True) + + model.eval() + self.model = model.to(self.device) + if self.half: + self.model = self.model.half() + + def dni(self, net_a, net_b, dni_weight, key='params', loc='cpu'): + """Deep network interpolation. + + ``Paper: Deep Network Interpolation for Continuous Imagery Effect Transition`` + """ + net_a = torch.load(net_a, map_location=torch.device(loc)) + net_b = torch.load(net_b, map_location=torch.device(loc)) + for k, v_a in net_a[key].items(): + net_a[key][k] = dni_weight[0] * v_a + dni_weight[1] * net_b[key][k] + return net_a + + def pre_process(self, img): + """Pre-process, such as pre-pad and mod pad, so that the images can be divisible + """ + img = torch.from_numpy(np.transpose(img, (2, 0, 1))).float() + self.img = img.unsqueeze(0).to(self.device) + if self.half: + self.img = self.img.half() + + # pre_pad + if self.pre_pad != 0: + self.img = F.pad(self.img, (0, self.pre_pad, 0, self.pre_pad), 'reflect') + # mod pad for divisible borders + if self.scale == 2: + self.mod_scale = 2 + elif self.scale == 1: + self.mod_scale = 4 + if self.mod_scale is not None: + self.mod_pad_h, self.mod_pad_w = 0, 0 + _, _, h, w = self.img.size() + if (h % self.mod_scale != 0): + self.mod_pad_h = (self.mod_scale - h % self.mod_scale) + if (w % self.mod_scale != 0): + self.mod_pad_w = (self.mod_scale - w % self.mod_scale) + self.img = F.pad(self.img, (0, self.mod_pad_w, 0, self.mod_pad_h), 'reflect') + + def process(self): + # model inference + self.output = self.model(self.img) + + def tile_process(self): + """It will first crop input images to tiles, and then process each tile. + Finally, all the processed tiles are merged into one images. 
+
+        Modified from: https://github.com/ata4/esrgan-launcher
+        """
+        batch, channel, height, width = self.img.shape
+        output_height = height * self.scale
+        output_width = width * self.scale
+        output_shape = (batch, channel, output_height, output_width)
+
+        # start with black image
+        self.output = self.img.new_zeros(output_shape)
+        tiles_x = math.ceil(width / self.tile_size)
+        tiles_y = math.ceil(height / self.tile_size)
+
+        # loop over all tiles
+        for y in range(tiles_y):
+            for x in range(tiles_x):
+                # extract tile from input image
+                ofs_x = x * self.tile_size
+                ofs_y = y * self.tile_size
+                # input tile area on total image
+                input_start_x = ofs_x
+                input_end_x = min(ofs_x + self.tile_size, width)
+                input_start_y = ofs_y
+                input_end_y = min(ofs_y + self.tile_size, height)
+
+                # input tile area on total image with padding
+                input_start_x_pad = max(input_start_x - self.tile_pad, 0)
+                input_end_x_pad = min(input_end_x + self.tile_pad, width)
+                input_start_y_pad = max(input_start_y - self.tile_pad, 0)
+                input_end_y_pad = min(input_end_y + self.tile_pad, height)
+
+                # input tile dimensions
+                input_tile_width = input_end_x - input_start_x
+                input_tile_height = input_end_y - input_start_y
+                tile_idx = y * tiles_x + x + 1
+                input_tile = self.img[:, :, input_start_y_pad:input_end_y_pad, input_start_x_pad:input_end_x_pad]
+
+                # upscale tile
+                try:
+                    with torch.no_grad():
+                        output_tile = self.model(input_tile)
+                except RuntimeError as error:
+                    print('Error', error)
+                    print(f'\tTile {tile_idx}/{tiles_x * tiles_y}')
+                    continue  # skip this tile (its region stays black) instead of using an undefined output_tile
+
+                # output tile area on total image
+                output_start_x = input_start_x * self.scale
+                output_end_x = input_end_x * self.scale
+                output_start_y = input_start_y * self.scale
+                output_end_y = input_end_y * self.scale
+
+                # output tile area without padding
+                output_start_x_tile = (input_start_x - input_start_x_pad) * self.scale
+                output_end_x_tile = output_start_x_tile + input_tile_width * self.scale
+                output_start_y_tile = (input_start_y - input_start_y_pad) * self.scale
+                output_end_y_tile = output_start_y_tile + input_tile_height * self.scale
+
+                # put tile into output image
+                self.output[:, :, output_start_y:output_end_y,
+                            output_start_x:output_end_x] = output_tile[:, :, output_start_y_tile:output_end_y_tile,
+                                                                       output_start_x_tile:output_end_x_tile]
+
+    def post_process(self):
+        # remove extra pad
+        if self.mod_scale is not None:
+            _, _, h, w = self.output.size()
+            self.output = self.output[:, :, 0:h - self.mod_pad_h * self.scale, 0:w - self.mod_pad_w * self.scale]
+        # remove prepad
+        if self.pre_pad != 0:
+            _, _, h, w = self.output.size()
+            self.output = self.output[:, :, 0:h - self.pre_pad * self.scale, 0:w - self.pre_pad * self.scale]
+        return self.output
+
+    @torch.no_grad()
+    def enhance(self, img, outscale=None, alpha_upsampler='realesrgan'):
+        h_input, w_input = img.shape[0:2]
+        # img: numpy
+        img = img.astype(np.float32)
+        if np.max(img) > 256:  # 16-bit image
+            max_range = 65535
+            print('\tInput is a 16-bit image')
+        else:
+            max_range = 255
+        img = img / max_range
+        if len(img.shape) == 2:  # gray image
+            img_mode = 'L'
+            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
+        elif img.shape[2] == 4:  # RGBA image with alpha channel
+            img_mode = 'RGBA'
+            alpha = img[:, :, 3]
+            img = img[:, :, 0:3]
+            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+            if alpha_upsampler == 'realesrgan':
+                alpha = cv2.cvtColor(alpha, cv2.COLOR_GRAY2RGB)
+        else:
+            img_mode = 'RGB'
+            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+
+        # ------------------- process image (without the alpha channel) ------------------- #
+
self.pre_process(img) + if self.tile_size > 0: + self.tile_process() + else: + self.process() + output_img = self.post_process() + output_img = output_img.data.squeeze().float().cpu().clamp_(0, 1).numpy() + output_img = np.transpose(output_img[[2, 1, 0], :, :], (1, 2, 0)) + if img_mode == 'L': + output_img = cv2.cvtColor(output_img, cv2.COLOR_BGR2GRAY) + + # ------------------- process the alpha channel if necessary ------------------- # + if img_mode == 'RGBA': + if alpha_upsampler == 'realesrgan': + self.pre_process(alpha) + if self.tile_size > 0: + self.tile_process() + else: + self.process() + output_alpha = self.post_process() + output_alpha = output_alpha.data.squeeze().float().cpu().clamp_(0, 1).numpy() + output_alpha = np.transpose(output_alpha[[2, 1, 0], :, :], (1, 2, 0)) + output_alpha = cv2.cvtColor(output_alpha, cv2.COLOR_BGR2GRAY) + else: # use the cv2 resize for alpha channel + h, w = alpha.shape[0:2] + output_alpha = cv2.resize(alpha, (w * self.scale, h * self.scale), interpolation=cv2.INTER_LINEAR) + + # merge the alpha channel + output_img = cv2.cvtColor(output_img, cv2.COLOR_BGR2BGRA) + output_img[:, :, 3] = output_alpha + + # ------------------------------ return ------------------------------ # + if max_range == 65535: # 16-bit image + output = (output_img * 65535.0).round().astype(np.uint16) + else: + output = (output_img * 255.0).round().astype(np.uint8) + + if outscale is not None and outscale != float(self.scale): + output = cv2.resize( + output, ( + int(w_input * outscale), + int(h_input * outscale), + ), interpolation=cv2.INTER_LANCZOS4) + + return output, img_mode + + +class PrefetchReader(threading.Thread): + """Prefetch images. + + Args: + img_list (list[str]): A image list of image paths to be read. + num_prefetch_queue (int): Number of prefetch queue. 
+ """ + + def __init__(self, img_list, num_prefetch_queue): + super().__init__() + self.que = queue.Queue(num_prefetch_queue) + self.img_list = img_list + + def run(self): + for img_path in self.img_list: + img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED) + self.que.put(img) + + self.que.put(None) + + def __next__(self): + next_item = self.que.get() + if next_item is None: + raise StopIteration + return next_item + + def __iter__(self): + return self + + +class IOConsumer(threading.Thread): + + def __init__(self, opt, que, qid): + super().__init__() + self._queue = que + self.qid = qid + self.opt = opt + + def run(self): + while True: + msg = self._queue.get() + if isinstance(msg, str) and msg == 'quit': + break + + output = msg['output'] + save_path = msg['save_path'] + cv2.imwrite(save_path, output) + print(f'IO worker {self.qid} is done.') diff --git a/realesrgan/version.py b/realesrgan/version.py new file mode 100644 index 0000000000000000000000000000000000000000..b175d1fd9c6b84eab0509d9facb60768e72a91b0 --- /dev/null +++ b/realesrgan/version.py @@ -0,0 +1,5 @@ +# GENERATED VERSION FILE +# TIME: Fri May 24 21:09:51 2024 +__version__ = '0.3.0' +__gitsha__ = 'a4abfb2' +version_info = (0, 3, 0) diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..24d3fff67b8feea53c9f1eef5ec0c8638da461ba --- /dev/null +++ b/requirements.txt @@ -0,0 +1,11 @@ +new-basicsr +facexlib>=0.2.5 +gfpgan>=1.3.5 +numpy +opencv-python +Pillow +torch>=1.7 +torchvision +tqdm +streamlit +opencv-python \ No newline at end of file diff --git a/run.bat b/run.bat new file mode 100644 index 0000000000000000000000000000000000000000..e1b7506e705acb86e48e9310121b5c4c4598ad68 --- /dev/null +++ b/run.bat @@ -0,0 +1,7 @@ +@echo off +REM Activate the cuda environment +call "%USERPROFILE%\anaconda3\Scripts\activate.bat" cuda +REM Change directory to Real-ESRGAN-Web-App +cd /d %USERPROFILE%\Real-ESRGAN-Web-App +REM Run Streamlit app +streamlit run app.py \ No newline at end of file diff --git a/setup.py b/setup.py new file mode 100644 index 0000000000000000000000000000000000000000..c2b92e31d2db1aba50767f4f844540cfd53c609d --- /dev/null +++ b/setup.py @@ -0,0 +1,107 @@ +#!/usr/bin/env python + +from setuptools import find_packages, setup + +import os +import subprocess +import time + +version_file = 'realesrgan/version.py' + + +def readme(): + with open('README.md', encoding='utf-8') as f: + content = f.read() + return content + + +def get_git_hash(): + + def _minimal_ext_cmd(cmd): + # construct minimal environment + env = {} + for k in ['SYSTEMROOT', 'PATH', 'HOME']: + v = os.environ.get(k) + if v is not None: + env[k] = v + # LANGUAGE is used on win32 + env['LANGUAGE'] = 'C' + env['LANG'] = 'C' + env['LC_ALL'] = 'C' + out = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env).communicate()[0] + return out + + try: + out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD']) + sha = out.strip().decode('ascii') + except OSError: + sha = 'unknown' + + return sha + + +def get_hash(): + if os.path.exists('.git'): + sha = get_git_hash()[:7] + else: + sha = 'unknown' + + return sha + + +def write_version_py(): + content = """# GENERATED VERSION FILE +# TIME: {} +__version__ = '{}' +__gitsha__ = '{}' +version_info = ({}) +""" + sha = get_hash() + with open('VERSION', 'r') as f: + SHORT_VERSION = f.read().strip() + VERSION_INFO = ', '.join([x if x.isdigit() else f'"{x}"' for x in SHORT_VERSION.split('.')]) + + version_file_str = content.format(time.asctime(), SHORT_VERSION, sha, 
VERSION_INFO) + with open(version_file, 'w') as f: + f.write(version_file_str) + + +def get_version(): + with open(version_file, 'r') as f: + exec(compile(f.read(), version_file, 'exec')) + return locals()['__version__'] + + +def get_requirements(filename='requirements.txt'): + here = os.path.dirname(os.path.realpath(__file__)) + with open(os.path.join(here, filename), 'r') as f: + requires = [line.replace('\n', '') for line in f.readlines()] + return requires + + +if __name__ == '__main__': + write_version_py() + setup( + name='realesrgan', + version=get_version(), + description='Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration', + long_description=readme(), + long_description_content_type='text/markdown', + author='Xintao Wang', + author_email='xintao.wang@outlook.com', + keywords='computer vision, pytorch, image restoration, super-resolution, esrgan, real-esrgan', + url='https://github.com/xinntao/Real-ESRGAN', + include_package_data=True, + packages=find_packages(exclude=('options', 'datasets', 'experiments', 'results', 'tb_logger', 'wandb')), + classifiers=[ + 'Development Status :: 4 - Beta', + 'License :: OSI Approved :: Apache Software License', + 'Operating System :: OS Independent', + 'Programming Language :: Python :: 3', + 'Programming Language :: Python :: 3.7', + 'Programming Language :: Python :: 3.8', + ], + license='BSD-3-Clause License', + setup_requires=['cython', 'numpy'], + install_requires=get_requirements(), + zip_safe=False) diff --git a/setup.txt b/setup.txt new file mode 100644 index 0000000000000000000000000000000000000000..9ca18925fc51af2111fa248d9deee6c434ba620e --- /dev/null +++ b/setup.txt @@ -0,0 +1,9 @@ +#clode repo real-esrgan and setup env +1. git clone https://github.com/xinntao/Real-ESRGAN.git +2. cd Real-ESRGAN +3. pip install basicsr +4. pip install facexlib +5. pip install gfpgan +6. pip install -r requirements.txt +7. python setup.py develop +8. pip install streamlit \ No newline at end of file diff --git a/temp/input_image.png b/temp/input_image.png new file mode 100644 index 0000000000000000000000000000000000000000..5d256506b7da23cd37538f6112b821dcfb0308ed Binary files /dev/null and b/temp/input_image.png differ diff --git a/temp/output_image.png b/temp/output_image.png new file mode 100644 index 0000000000000000000000000000000000000000..fccf88e7dc07c987c7795e566dc8e979c7883870 Binary files /dev/null and b/temp/output_image.png differ diff --git a/weights/README.md b/weights/README.md new file mode 100644 index 0000000000000000000000000000000000000000..4d7b7e642591ef88575d9e6c360a4d29e0cc1a4f --- /dev/null +++ b/weights/README.md @@ -0,0 +1,3 @@ +# Weights + +Put the downloaded weights to this folder. 
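Since the `.pth` files below are stored as Git LFS pointers, a fresh clone may not contain the actual weights. Here is a minimal sketch for fetching them with the same `load_file_from_url` helper that `RealESRGANer` uses internally; the URL is assumed to be the official v0.1.0 release asset:

```python
# Sketch: download RealESRGAN_x4plus.pth into weights/ (release URL is an assumption).
import os
from basicsr.utils.download_util import load_file_from_url

url = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth'
model_path = load_file_from_url(
    url=url, model_dir=os.path.join(os.getcwd(), 'weights'), progress=True, file_name=None)
print(model_path)  # .../weights/RealESRGAN_x4plus.pth
```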
diff --git a/weights/RealESRGAN_x2plus.pth b/weights/RealESRGAN_x2plus.pth new file mode 100644 index 0000000000000000000000000000000000000000..77cc0ef1e8d238fa5cfb409cda2e619a9459ddc9 --- /dev/null +++ b/weights/RealESRGAN_x2plus.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:49fafd45f8fd7aa8d31ab2a22d14d91b536c34494a5cfe31eb5d89c2fa266abb +size 67061725 diff --git a/weights/RealESRGAN_x4plus.pth b/weights/RealESRGAN_x4plus.pth new file mode 100644 index 0000000000000000000000000000000000000000..9ddced536d07803300536317fef662bb499bca71 --- /dev/null +++ b/weights/RealESRGAN_x4plus.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4fa0d38905f75ac06eb49a7951b426670021be3018265fd191d2125df9d682f1 +size 67040989 diff --git a/weights/RealESRGAN_x4plus_anime_6B.pth b/weights/RealESRGAN_x4plus_anime_6B.pth new file mode 100644 index 0000000000000000000000000000000000000000..1f04b81349b49d9c8ebd211d5baa9728d30ee798 --- /dev/null +++ b/weights/RealESRGAN_x4plus_anime_6B.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f872d837d3c90ed2e05227bed711af5671a6fd1c9f7d7e91c911a61f155e99da +size 17938799
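
Putting the pieces together, here is a minimal end-to-end sketch of upscaling one image with the `RealESRGANer` helper from `realesrgan/utils.py`. It assumes the x4 weights above have actually been pulled (in this diff they are only LFS pointers), uses the standard `RRDBNet` definition from `basicsr`, and reuses the `temp/input_image.png` committed in this repo:

```python
# Minimal inference sketch (assumes weights/RealESRGAN_x4plus.pth is a real weights file).
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan.utils import RealESRGANer

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(
    scale=4,
    model_path='weights/RealESRGAN_x4plus.pth',
    model=model,
    tile=256,      # tile to bound memory use; 0 processes the whole image at once
    tile_pad=10,
    pre_pad=0,
    half=False)    # keep fp32; half precision only makes sense on a CUDA device

img = cv2.imread('temp/input_image.png', cv2.IMREAD_UNCHANGED)
output, img_mode = upsampler.enhance(img, outscale=4)
cv2.imwrite('temp/output_image.png', output)
```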