| author | xAlpharax <42233094+xAlpharax@users.noreply.github.com> | 2024-01-03 19:05:20 +0200 |
|---|---|---|
| committer | xAlpharax <42233094+xAlpharax@users.noreply.github.com> | 2024-01-03 19:05:20 +0200 |
| commit | 8b67c262af3a1780cf4192a2f2d22e45537d8ace (patch) | |
| tree | 6b2a0a743f1d8aa8f7c45d112c94481a95cdfb5d | |
| parent | 447b6c6c921f912e5323af2ed18160d63931533f (diff) | |
Aesthetic changes to the README and a comment on using -crf in ffmpeg.
Changes to be committed:
modified: README.md
modified: stylize.sh
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | README.md | 27 |
| -rwxr-xr-x | stylize.sh | 2 |

2 files changed, 19 insertions, 10 deletions
diff --git a/README.md b/README.md
@@ -4,13 +4,13 @@ Neural Style Transfer done from the CLI using a VGG backbone and presented as an

Weights can be downloaded from [here](https://files.catbox.moe/wcao20.pth). The downloaded file (renamed to `vgg_conv_weights.pth`) should be placed in `./weights/` and it will be ignored when pushing, as seen in `./.gitignore`. **Update:** Alternatively, if the `./weights/` directory is empty, `./neuralart.py` will automatically download publicly available VGG19 weights for the user.

-More in-depth information about Neural Style Transfer (NST) can be found in this great [paper](https://arxiv.org/abs/1705.04058). Make sure to check [Requirements](#requirements) and [Usage](#usage).
+More in-depth information about Neural Style Transfer (NST) can be found in this great [paper](https://arxiv.org/abs/1705.04058). Make sure to check [Requirements](#requirements) and [Usage](#usage) as well as the [Video Gallery](#results-after-running-neural-art-click-on-dropdown-for-video-gallery).

-### Why use this in 2023?
+### Why use this in 2024?

Because Style Transfer hasn't changed drastically in terms of actual results in recent years. I personally find a certain beauty in inputting a style and a content image rather than a well-curated prompt with a dozen switches. Consider this repo a quick and simple ***just works*** solution that runs effectively on both CPU and GPU.

-I developed this tool as a means to obtain fancy images and visuals for me and my friends. It somehow grew into something bigger that is actually usable, so much so that I got to integrate it into a workflow in conjunction with [Stable Diffusion](https://github.com/CompVis/stable-diffusion) (see also [here](https://github.com/AUTOMATIC1111/stable-diffusion-webui)).
+I developed this tool as a means to obtain fancy images and visuals for me and my friends. It somehow grew into something bigger that is actually usable, so much so that I got to integrate it into a workflow in conjunction with [Stable Diffusion](https://github.com/CompVis/stable-diffusion) (see also [here](https://github.com/AUTOMATIC1111/stable-diffusion-webui)), which I want to develop a plugin for.

## Requirements

@@ -18,6 +18,9 @@ Clone the repository:

```bash
git clone https://github.com/xAlpharax/neural-art
+
+# or via ssh
+git clone git@github.com:xAlpharax/neural-art.git
```

Create a virtual environment to separate the required packages from system-wide packages:

@@ -42,7 +45,7 @@ pip install -r requirements.txt

## Usage

-The main script sits comfortably in `./stylize.sh` so run it from the project's directory:
+The main script sits comfortably in `./stylize.sh`; run it from the project's root directory:

```bash
./stylize.sh path/to/style_image path/to/content_image

@@ -54,21 +57,24 @@ A helper script is also available to run `./stylize.sh` for each distinct pair o

./all.sh
```

-Moreover, `./all.sh` is aware of the already rendered mp4 files and will skip stylizing the combinations that are already present.
+Moreover, `./all.sh` is aware of the already rendered mp4 files and will skip stylizing the combinations that are already present. In contrast, `./stylize.sh` overwrites existing images and videos.

### Output videos / images and temporary files

-If, at any point, curious of the individual frames that comprise the generated `./content_in_style.mp4` check `./Output/` for PNG images with exactly that. Keep in mind that these files get removed and overwritten each time ./stylize.sh is called ( this is also why running multiple instances of `./stylize.sh` is advised against; if you need to batch/automate the process, try `./all.sh` )
+The stylization process outputs a video named `./content_in_style.mp4`, with `content` and `style` being the 2nd and 1st command-line arguments of the `./stylize.sh` script.
+
+If, at any point, you need the individual frames that comprise the generated `./content_in_style.mp4`, check the `./Output/` directory for `.png` images of the frame at each iteration.
+The `./neuralart.py` code that sits at the heart of this project writes raw numpy array data to `./images.npy`, which in turn is manipulated by `./renderer.py` to output the frames as `.png` images.
-The `./images.npy` file contains raw numpy array data generated by `./neuralart.py` and is manipulated by `./renderer.py` to achieve the `./Output` directory of PNG images.
+These intermediate outputs are stored temporarily and get removed each time the `./stylize.sh` script is run.
-Considering this workflow, `./clear_dir.sh` removes temporary files each time a new call to `./stylize.sh` is made.
+All the stylize combinations from the `./Images/` directory have been saved to [this archive](https://drive.google.com/file/d/1k_ECmiHe3l0uS0ps2faWk8PHAOaNYZPp). Check the video gallery below for some of the best-looking results:

<details>
-<summary><h2>Results after running neural-art (Click on dropdown for video gallery)</h2></summary>
+<summary><h3>Results - Click on dropdown menu for video gallery</h3></summary>

Starry Night in various other styles 8

@@ -244,6 +250,9 @@ https://github.com/xAlpharax/neural-art/assets/42233094/36dd6f3f-aca6-4e0f-8be8-

https://github.com/xAlpharax/neural-art/assets/42233094/3bd2433a-54d1-40e2-9d00-cb165f6a2985
+
+Tarantula reference :)
+
https://github.com/xAlpharax/neural-art/assets/42233094/555a4675-da19-4fa6-9104-1ee2c63a7f8b
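The intermediate-file workflow the new README paragraphs describe (`./neuralart.py` writing raw array data to `./images.npy`, `./renderer.py` turning it into `./Output/*.png` frames) can be pictured with a short sketch. This is only an illustration under stated assumptions, not the repository's actual `./renderer.py`: the array layout and dtype handling are guesses, and only the `Output/neural_art_*.png` naming is taken from the ffmpeg glob in `stylize.sh`.

```python
# Hypothetical sketch of the images.npy -> PNG step; not the actual renderer.py.
# Assumes images.npy holds a (num_frames, height, width, 3) array of frames.
import os

import numpy as np
from PIL import Image

os.makedirs("Output", exist_ok=True)
frames = np.load("images.npy")  # raw array data written by neuralart.py

for i, frame in enumerate(frames):
    # Assume float frames lie in [0, 1]; PNG output needs 8-bit values.
    if frame.dtype != np.uint8:
        frame = (np.clip(frame, 0.0, 1.0) * 255).astype(np.uint8)
    # A zero-padded index keeps ffmpeg's 'Output/neural_art_*.png' glob in order.
    Image.fromarray(frame).save(f"Output/neural_art_{i:04d}.png")
```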
diff --git a/stylize.sh b/stylize.sh
@@ -38,4 +38,4 @@ python renderer.py --fix

#ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' -c:v libsvtav1 -pix_fmt yuv420p -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" $(basename ${2%.*})'_in_'$(basename ${1%.*})'.mp4' # AV1 weirddd
ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' -c:v libvpx-vp9 -pix_fmt yuv420p -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" $(basename ${2%.*})'_in_'$(basename ${1%.*})'.mp4' # VP9 quality
-# try -crf 10 (sane for vp9)
+# -crf defaults to 32, which is OK for VP9; set it lower if you *really* need that higher fidelity and bigger file size. I found it to be unnecessary.
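Applied to the VP9 command above, the `-crf` advice from the new comment would look roughly like this. A sketch, not part of the commit: the value 10 is only an example, and adding `-b:v 0` is the usual libvpx-vp9 companion to `-crf` for constant-quality encoding.

```bash
# Example only: a lower CRF means higher fidelity and a bigger file.
# $1 = style image, $2 = content image, as inside stylize.sh.
ffmpeg -y -framerate 60 -pattern_type glob -i 'Output/neural_art_*.png' \
  -c:v libvpx-vp9 -crf 10 -b:v 0 -pix_fmt yuv420p \
  -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" \
  $(basename ${2%.*})'_in_'$(basename ${1%.*})'.mp4'
```

Either way the naming logic stays the same: a call like `./stylize.sh Images/starry_night.jpg Images/tarantula.png` (hypothetical file names) renders `tarantula_in_starry_night.mp4`, since the style image is `$1` and the content image is `$2`.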