From 8b67c262af3a1780cf4192a2f2d22e45537d8ace Mon Sep 17 00:00:00 2001
From: xAlpharax <42233094+xAlpharax@users.noreply.github.com>
Date: Wed, 3 Jan 2024 19:05:20 +0200
Subject: Aesthetic changes to the README and a comment on using -crf in ffmpeg.

Changes to be committed:
	modified:   README.md
	modified:   stylize.sh
---
 README.md  | 27 ++++++++++++++++++---------
 stylize.sh |  2 +-
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 50c7ec1..251eacd 100644
--- a/README.md
+++ b/README.md
@@ -4,13 +4,13 @@ Neural Style Transfer done from the CLI using a VGG backbone and presented as an
 
 Weights can be downloaded from [here](https://files.catbox.moe/wcao20.pth). The downloaded file (renamed to `vgg_conv_weights.pth`) should be placed in `./weights/` and it will be ignored when pushing, as seen in `./.gitignore`. **Update:** Alternatively, if the `./weights/` directory is empty, `./neuralart.py` will automatically download publicly available VGG19 weights for the user.
 
-More in depth information about Neural Style Transfer ( NST ) can be found in this great [paper](https://arxiv.org/abs/1705.04058). Make sure to check [Requirements](#requirements) and [Usage](#usage).
+More in-depth information about Neural Style Transfer (NST) can be found in this great [paper](https://arxiv.org/abs/1705.04058). Make sure to check [Requirements](#requirements) and [Usage](#usage), as well as the [Video Gallery](#results-after-running-neural-art-click-on-dropdown-for-video-gallery).
 
-### Why use this in 2023 ?
+### Why use this in 2024?
 
 Because Style Transfer hasn't changed drastically in terms of actual results in the past years. I personally find a certain beauty in inputting a style and content image rather than a well curated prompt with a dozen of switches. Consider this repo as a quick and simple ***just works*** solution that can run on both CPU and GPU effectively.
 
-I developed this tool as a means to obtain fancy images and visuals for me and my friends. It somehow grew into something bigger that is actually usable, so much so that I got to integrate it in a workflow in conjunction with [Stable Diffusion](https://github.com/CompVis/stable-diffusion) ( see also [here](https://github.com/AUTOMATIC1111/stable-diffusion-webui) ).
+I developed this tool as a means to obtain fancy images and visuals for me and my friends. It somehow grew into something bigger that is actually usable, so much so that I got to integrate it into a workflow in conjunction with [Stable Diffusion](https://github.com/CompVis/stable-diffusion) (see also [here](https://github.com/AUTOMATIC1111/stable-diffusion-webui)), for which I want to develop a plugin.
 
 ## Requirements
 
@@ -18,6 +18,9 @@ Clone the repository:
 
 ```bash
 git clone https://github.com/xAlpharax/neural-art
+
+# or via ssh
+git clone git@github.com:xAlpharax/neural-art.git
 ```
 
 Create a virtual environment to separate the required packages from system-wide packages:
@@ -42,7 +45,7 @@ pip install -r requirements.txt
 
 ## Usage
 
-The main script sits comfortably in `./stylize.sh` so run it from the project's directory:
+The main script sits comfortably in `./stylize.sh`; run it from the project's root directory:
 
 ```bash
 ./stylize.sh path/to/style_image path/to/content_image
 ```
 
@@ -54,21 +57,24 @@ A helper script is also available to run `./stylize.sh` for each distinct pair o
 ./all.sh
 ```
 
-Moreover, `./all.sh` is aware of the already rendered mp4 files and will skip stylizing the combinations that are already present. 
+Moreover, `./all.sh` is aware of already-rendered MP4 files and will skip stylizing the combinations that are already present. In contrast, `./stylize.sh` overwrites existing images and videos.
 
 ### Output videos / images and temporary files
 
-If, at any point, curious of the individual frames that comprise the generated `./content_in_style.mp4` check `./Output/` for PNG images with exactly that. Keep in mind that these files get removed and overwritten each time ./stylize.sh is called ( this is also why running multiple instances of `./stylize.sh` is advised against; if you need to batch/automate the process, try `./all.sh` )
+The stylization process outputs a video named `./content_in_style.mp4`, where `content` and `style` are the extension-stripped basenames of the 2nd and 1st command-line arguments of `./stylize.sh`, respectively (see the example just before the gallery below).
+
+If, at any point, you need the individual frames that make up the generated `./content_in_style.mp4`, check the `./Output/` directory for `.png` images, one frame per iteration.
+The `./neuralart.py` code that sits at the heart of this project writes raw NumPy array data to `./images.npy`, which in turn is processed by `./renderer.py` to output the frames as `.png` images (a quick way to inspect this intermediate file is shown below).
 
-The `./images.npy` file contains raw numpy array data generated by `./neuralart.py` and is manipulated by `./renderer.py` to achieve the `./Output` directory of PNG images.
+These intermediate outputs are stored temporarily and get removed each time `./stylize.sh` is run.
 
-Considering this workflow, `./clear_dir.sh` removes temporary files each time a new call to `./stylize.sh` is made.
+All the stylized combinations from the `./Images/` directory have been saved to [this archive](https://drive.google.com/file/d/1k_ECmiHe3l0uS0ps2faWk8PHAOaNYZPp). Check the video gallery below to browse the best-looking of them:
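+For example, with two hypothetical files from `./Images/` (the names here are only placeholders):
+
+```bash
+# 1st argument: style image, 2nd argument: content image
+./stylize.sh Images/wave.jpg Images/cat.jpg
+# -> renders ./cat_in_wave.mp4 in the project root
+```
+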
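+To peek at that intermediate data after a run, here is a quick sanity check from the project root (this assumes `./images.npy` holds a single stacked array of frames; adjust if your run differs):
+
+```bash
+# print the shape and dtype of the raw frame data that renderer.py consumes
+python -c "import numpy as np; a = np.load('images.npy'); print(a.shape, a.dtype)"
+```
+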
-<summary>Results after running neural-art (Click on dropdown for video gallery)</summary>
+<summary>Results - Click on dropdown menu for video gallery</summary>
Starry Night in various other styles 8
@@ -244,6 +250,9 @@ https://github.com/xAlpharax/neural-art/assets/42233094/36dd6f3f-aca6-4e0f-8be8-
 
 https://github.com/xAlpharax/neural-art/assets/42233094/3bd2433a-54d1-40e2-9d00-cb165f6a2985
 
+
+Tarantula reference :)
+
 https://github.com/xAlpharax/neural-art/assets/42233094/555a4675-da19-4fa6-9104-1ee2c63a7f8b
 
diff --git a/stylize.sh b/stylize.sh
index 6e377b2..1dcbad5 100755
--- a/stylize.sh
+++ b/stylize.sh
@@ -38,4 +38,4 @@ python renderer.py --fix
 
 #ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' -c:v libsvtav1 -pix_fmt yuv420p -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" $(basename ${2%.*})'_in_'$(basename ${1%.*})'.mp4' # AV1 weirddd
 ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' -c:v libvpx-vp9 -pix_fmt yuv420p -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" $(basename ${2%.*})'_in_'$(basename ${1%.*})'.mp4' # VP9 quality
-# try -crf 10 (sane for vp9)
+# -crf defaults to 32, which is OK for VP9; pass a lower value if you *really* need the higher fidelity and can accept the bigger file size. I found it unnecessary.
-- 
cgit v1.2.3
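
For reference, a concrete version of what that last comment suggests. The `-crf 24` value and the output name are illustrative only, and note that with libvpx-vp9 an explicit `-crf` wants `-b:v 0` alongside it to enable true constant-quality mode:

```bash
# same pipeline as stylize.sh, with an explicit quality target
# (lower CRF = higher fidelity and a bigger file)
ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' \
       -c:v libvpx-vp9 -crf 24 -b:v 0 -pix_fmt yuv420p \
       -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" content_in_style.mp4
```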