Patreon exclusive posts index

Join the Discord and tell me your Discord username to get a special rank: SECourses Discord

27 November 2023 Huge Update

  • The dataset has been improved and expanded to 5200 images for both the woman and the man dataset
  • The cropping and resizing scripts have been further improved, and all images have been processed again
  • All images are now also sorted by face quality
  • As a result, training scripts will use the very best ones
  • File names start from man_10001 or woman_10001, so when a training script starts using reg images, the very best ones are used
  • Please re-download all of the new images for the best quality
  • Preparing all of the reg images took over 10 full days in total
  • Both the new woman and man datasets have been added to the resources below

20 September 2023 Massive Update

  • All of the images have been reprocessed with a newer face detection algorithm, RetinaFace
  • RetinaFace is much better at detecting and focusing on faces, but it is very slow
  • The newest processing scripts (YOLO V7 cropper and RetinaFace resizer) are shared here: https://www.patreon.com/posts/sota-subject-and-88391247
  • With this update the datasets are much better. Please re-download them before use
  • Processing all of the datasets took about 6 days with a 13900K CPU and a 3090 Ti

How To Download All On A RunPod Or A Unix System

  • The download_man_reg_imgs.sh file downloads and automatically extracts the 512x512, 768x768 and 1024x1024 man images. You can edit the file and add other resolutions if you need them.
  • The download_woman_reg_imgs.sh file downloads and automatically extracts the 512x512, 768x768 and 1024x1024 woman images. You can edit the file and add other resolutions if you need them.
  • These files can be used on Unix and possibly on macOS systems as well. Don't forget to comment out (put # at the beginning of the line) the links that you don't want to download, and change the folder paths if you wish.
  • Upload them into the workspace folder of RunPod and execute the commands below
  • cd /workspace
  • chmod +x download_man_reg_imgs.sh
  • ./download_man_reg_imgs.sh
  • cd /workspace
  • chmod +x download_woman_reg_imgs.sh
  • ./download_woman_reg_imgs.sh

How Datasets Are Prepared

I gathered 40k images each for the woman and man classes from unsplash.com, so the total number of gathered images is above 80k.

They are all real images. No AI-generated images are used.

I then post-processed them with several AI models to clean the dataset. Finally, I checked each one of the images manually. The whole process took about 70 (for woman) + 70 (for man) hours.

The final output is 5200 perfect images for woman and 5200 for man. Every image is larger than 1536 x 1536 pixels, and the largest are up to 14999 x 9999 pixels.

The raw images and the images resized to exact resolutions are shared below. If you need any other specific resolution, let me know and hopefully I will update this post.

To use them on Windows you only need to extract the zip files. If you can't, install WinRAR from https://www.rarlab.com/

Man Dataset

Woman Dataset

How To Use On RunPod Or Other Cloud or Linux

To use these files on RunPod:

First you need to install 7zip

  • yes | apt-get install p7zip-full

Then download them with wget. Right-click each file's link, choose copy link, and use it as below

wget

e.g. man:

Then use the command below to extract them

  • 7z x man_5200_imgs_512x512.zip
  • or another one
  • 7z x man_5200_imgs_1024x1024.zip

e.g. woman :

Then use the command below to extract them

  • 7z x woman_5200_imgs_512x512.zip
  • or another one
  • 7z x woman_5200_imgs_1024x1024.zip
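
If 7zip is not available, the same archives can probably be extracted with Python's standard library instead. The following is a minimal sketch, not part of the post's scripts; the function name `extract_zip` and the target folder in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_zip(archive_path, target_dir="."):
    """Extract every file in a zip archive and return how many entries it had."""
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return len(archive.namelist())
```

For example, `extract_zip("man_5200_imgs_512x512.zip", "/workspace/reg_man_512")` would unpack the 512x512 man archive into a folder of your choice. The returned entry count is a quick sanity check against the expected number of images.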

Comments

Dallin Mackay

Excellent, thank you very much sir

Alex Voigt

Thanks Furkan! I've used your man class images for several models, and I think they're the best yet.

Scrubbles

Is there anything special we need to do beyond just adding them as regularization images? I named the directory 1_woman. Should I use all of the thousands of them, or should it depend on how many training images I have? If I have 50 training images, for example, trained as 100_woman, should I use all of the regularization images or just pick a subset? Also, how many training steps do you recommend? If there's already a guide on this subject, I haven't seen anything specific about regularization.

Furkan Gözükara

1: It depends on your script. Let's say you are using Kohya: it will then use as many reg images as repeats * the number of training images you have. 1_woman means the classification images are repeated 1 time, so that is correct. 2: For 50 training images, set repeats to 20 and save every epoch, then do an epoch comparison. You can train up to 8 epochs, so the total will be 50 * 20 * 8 * 2 = 16000 steps. It may get overtrained, but you can compare checkpoints to find the sweet spot. If that is too long, reduce the repeat count to 10 and do the same; it will be 8000 steps.
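
The step arithmetic in this reply can be sketched as a small calculation. The function and parameter names are illustrative and not part of Kohya. The factor of 2 comes from the formula in the reply, which I read as one regularization image being trained alongside each training image, and a batch size of 1 is assumed.

```python
def total_training_steps(num_images, repeats, epochs, reg_multiplier=2):
    """Total steps at batch size 1: images * repeats * epochs * reg factor."""
    return num_images * repeats * epochs * reg_multiplier


# 50 training images, 20 repeats, 8 epochs, reg images enabled
print(total_training_steps(50, 20, 8))  # 16000
# Same setup with the repeat count reduced to 10
print(total_training_steps(50, 10, 8))  # 8000
```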

Scrubbles

Thanks, this makes sense. The count of reg. photos doesn't matter, it's more of a "1 photo per iteration", so it's relative. I'm getting there, slowly understanding it all. (And yes, Kohya, using accelerate directly, sorry should have mentioned this.) I was doing 100 repeating every epoch, but I'll try the 20, yeah sometimes it's over trained and sometimes under, this will give me more stops to compare. Thanks!

Furkan Gözükara

You are welcome. If you set repeats to 100, a single epoch actually does the work of 100 epochs. So if you train 3 epochs, that is actually 300 epochs. That is why.

Keith F

For training LoRA SDXL, if we want flexibility, are reg images necessary when training on a person?

Furkan Gözükara

Well it depends. Do you want realistic flexibility or stylized flexibility? If you want stylized flexibility perhaps try without class images

Tapiocapioca

Hello, I have some doubts about the 1024x1024 package. For example, picture 00026-0-woman (1048) is a book. Other pictures cut off part of the head, like 00105-0-woman (1276) or 00319-0-woman (1809). Also, I don't understand why it is full of women with a blanket over their head, like 00668-0-woman (2453), or with hats. Is this correct? Excuse me if I am wrong, but in my mind I need to see the head and the body of the person.

Furkan Gözükara

Hello. You are right. The cropping and resizing errors were a mistake of Automatic1111. If you look at the raw dataset you will see the originals. I left home the same day, so I will hopefully update the resized images when I am back home. You will also notice a black area at the very bottom of some pictures. That is also a mistake of Automatic1111. Hopefully I will update this Monday with much better face-cropped and resized images. Sorry for the delay; I had to rush and leave home, that is why.

Tapiocapioca

Please don't apologize, your work is really welcome and great anyway. I just wanted to understand whether I had caught something wrong or whether it is supposed to be the way I saw it. :)

Gerenier

Amazing work!

Furkan Gözükara

thank you so much. hopefully i will update resized files with even better cropping and resizing algorithm after monday. but raw dataset is perfect atm

Đạt Nguyễn

Do I have to use woman_3847_imgs_raw (org resolutions - no cropped).zip as the reg set, or do I have to use all of them, the original-resolution uncropped set plus the other resolutions? I mean, do I put them all in the reg folder for training?

Furkan Gözükara

Use the resolution that you are going to train your images at. Let's say you are training your wife and your training images are 1024x1024. Then use only that folder (woman_3826_imgs_1024x1024px.zip) for training. Raw images are only necessary if you want to crop to another specific resolution that you need.

Chester Ogilvie

FYI, the woman_3826_imgs_1024x1536px.zip link leads to woman_3826_imgs_1280x1536px.zip file

Furkan Gözükara

Thanks for letting me know will fix. I will also upload newer resized images in better quality. Hopefully today returning from my family trip. Sorry for the delay

Đạt Nguyễn

Can I ask whether the images I train on need to be cropped? For example, if I train my wife with images at a resolution of 2500x2500 and set the parameter to 768x768 in Kohya, it will handle the resizing for me, right? Then I just need to choose the 768x768 dataset for the reg images, right?

Furkan Gözükara

i just returned from family trip today. improved my scripts and processing again. i will update thread and notify everyone hopefully when it is done

MUNEER AHMED

Hi there i couldn't find man images, can I use woman images to train a man , its a dumb question :)

Furkan Gözükara

No, you shouldn't do that. Man images are here https://www.patreon.com/posts/4k-2700-real-84053021 and even better ones are coming very soon to this thread

epido

I am using a RTX4090 with 24GB VRAM, but I run into GPU Out of memory error when starting training. I am following your exact tutorial. It only works when I reduce the "Max image size" to 512, and not 1024. Any way to get this to work for 1024 as well?

Furkan Gözükara

Wow, that is terrible. The RTX 4090 is the best GPU. With the exact same settings as mine you should get at worst about 1.5 s/it at 1024x1024. Actually you should get much better, but currently the NVIDIA drivers are broken. I wonder what the issue is. If you can raise your support to $25 for this month only, I can happily connect to your PC and help you figure out the core issue. By the way, I have added man images to this post as well, fresh amazing ones.

Tim Forrest

This is great! Can you add the cropper script along with the versioned requirements.txt as a zip to the updates? I would love to run cropping on my own datasets and the cropper is a very cool tool!

Furkan Gözükara

yes i will share it. i was planning to share with a video but i will share right away hopefully today after several hours for you.

Tim Forrest

I do not know if regularization images would give better results for a lora aimed at flexibility. Do you have a recommendation for a regularization dataset that would provide flexibility for a lora trained on an initial set of photographs?

Tim Forrest

Thanks! I was also wondering about the resolutions for the script. These are lower than those that SDXL is trained on, would I need to upscale them to run XL Lora training correctly with kohya_ss?

Furkan Gözükara

these are higher than SDXL trained resolutions. you mean your own images are lower resolution? well if you upscale probably that will lower your quality.

Tim Forrest

Amazing! I'm having good luck with the subject cropper, some great results. I was using some images taken with a cellphone and it looks like this script ignores exif data. Added to resize2.py

```python
from PIL import ImageOps
# ...
# Resize image with best resampling filter (LANCZOS)
image = image.resize(resolution, Image.LANCZOS)
# transpose the image to the correct exif orientation
image = ImageOps.exif_transpose(image)
```

If that helps anyone. Also bash for any of us linux nuts

install.sh

```sh
#!/bin/bash
# If no permissions run chmod +x install.sh
set -e
# create and activate venv
python3 -m venv venv
source ./venv/bin/activate
# install requirements
pip install -r requirements.txt
```

run.sh

```sh
#!/bin/bash
# If no permissions run chmod +x install.sh
set -e
python3 resize2.py
```

Furkan Gözükara

if you can message me from discord i may help further. because i am not sure what you are trying to do :) it is really hard to communicate on Patreon

Meito

the woman and man ones, are the same links for the max res ones

Furkan Gözükara

no each one has different links. they are all shared in the post. you should download the specific size you are going to do training or you can download and resize yourself.

Furkan Gözükara

yes we use these images to improve flexibility of a person training. so sorry for late reply patreon didn't give me notification of your question

omv421

this link "man_4330_imgs_raw (org resolutions - no cropped).zip - 12.8 GB" is the same download as "woman_3847_imgs_raw (org resolutions - no cropped).zip - 9.6 GB"

Paul Fidika

Unfortunately after training I'm ending up with very distorted and blurred faces (and some of the 'rainbowy stretch' distortion that seems to be characteristic of a problematic latent space). Any idea what I might be doing wrong? I didn't crop or resize my training images. Is that critical?

Furkan Gözükara

i think it is about your training images dataset. also which settings did you use for the training? you can message me more details from discord

Paul Fidika

I used the settings from the recommended config.json file you included; I only reduced the number of epochs from 8 to 3 so it took less time. I have 20 images taken on an iPhone, not cropped or resized in any way. I was referencing the 1024 x 1024 female regularization folder. Should all training images be cropped / resized to a uniform size (like 1024 x 1024) so that they match the regularization set?

Paul Fidika

I also included CLIP captioning text files, like 'xxx a woman wearing a furry hat' for example, where 'xxx' is my girlfriend's name.

Furkan Gözükara

yes you also need to resize your training images. i suggest you to crop them perfectly and make them all 1024x1024

Furkan Gözükara

I would do 2 trainings after fixing the training images. do training with captions and without caption with only rare tokens

Tomas Docka

Wow! You did really good job here! The more I dig into Stable Diffusion, the more I find your resources useful. Thanks :)

JBO Solutions

i think you can recommend birme.net/ to resize images faster brother! thanks!

Naiad's Nest

amazing dataset! what about when I'm using several different image resolutions in one training and using bucketing? which set of regularization imgs should I use then? thank you in advance for your response.

Furkan Gözükara

I would use the closest one. I can also upload more specific resolutions for you. What do you need? Alternatively, you can use the raw images dataset and let the script auto-resize.

Naiad's Nest

I'm planning to generate imgs at 16:9 ar, so I guess I should set the training at 1824x1024 or 1368x768. if you have a man dataset available in any of these resolutions, that would be awesome. thanks a bunch in advance, appreciate your help!
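
Both widths mentioned in this comment can be reproduced by scaling the height to 16:9 and rounding up to the next multiple of 8. That rounding rule is my assumption about how the numbers were derived, not something stated in the thread, and the function name is hypothetical.

```python
def width_for_16_9(height, multiple=8):
    """Width for a 16:9 frame of the given height, rounded up to a multiple."""
    # Ceiling division in integer arithmetic avoids float rounding surprises.
    blocks = -(-height * 16 // (9 * multiple))
    return blocks * multiple


print(width_for_16_9(1024))  # 1824
print(width_for_16_9(768))   # 1368
```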

Naiad's Nest

I used your resizer and everything's working as it should. Thanks a lot for your work!

Pc 0005

I am getting this error: huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'C:\Users\Ester\Desktop\stable-diffusion-webui\models\dreambooth\Monica5\working\tokenizer'. 0%| | 0/3823 [00:00

Pc 0005

I've reinstalled and fixed the error, but now the following appears:

File "C:\Users\Ester\Desktop\SD DREAMBOOTH\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 717, in forward
    causal_attention_mask = self._build_causal_attention_mask(
File "C:\Users\Ester\Desktop\SD DREAMBOOTH\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 760, in _build_causal_attention_mask
    mask.triu_(1) # zero out the lower diagonal
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'

Javi dltr

how many reg images do i need to add?

Andre Molnar

Hello, and thank you for this wonderful resource. If I may offer an observation and pose a question. The images skew heavily towards close-up and medium shots of faces and faces and torso for square crops. And all images tend to be perfectly centered on the subject. If I were to guess, this is a bias from the cropping automation that likely identifies the subject and crops accordingly. Now a question (technically two questions): Is there a technical reason why such perfect subject cropping is beneficial to training? Would you recommend albumentations or similar techniques for either regularization or training to avoid perfect cropping/framing for 'better' results? Thank you again for all your wonderful resources and knowledge.

Furkan Gözükara

Good observation. Yes, we are focusing on the human in this cropping. This is because Stable Diffusion reduces the image dimensions to 1/8 when training, so we are already training at low resolution. Therefore we want to capture as much detail of the subject as possible, so a focused subject is better.
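
To make the 1/8 reduction concrete, here is a small sketch of the latent grid sizes for the square resolutions offered in this post. It reads "1/8" as a per-side downscale, which matches the factor used by Stable Diffusion's VAE. The names are illustrative.

```python
VAE_DOWNSCALE = 8  # each image side shrinks by this factor in latent space


def latent_size(width, height, factor=VAE_DOWNSCALE):
    """Return the (width, height) of the latent grid for a given image size."""
    return width // factor, height // factor


for side in (512, 768, 1024):
    print(f"{side}x{side} image -> latent {latent_size(side, side)}")
```

A 1024x1024 image therefore trains on a 128x128 latent grid, so a subject that fills the frame keeps far more latent cells than one occupying a small corner of it.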

Alan Johnson

Any chance we could get more pictures, I've been getting FANTASTIC results with 40 high resolution pictures (I dabble in photography) and I'd like to bump it up to 40+ but I'm hitting a limit on the dataset images, i.e need more.

AI Squad

Very hard work you did! Very good! Regarding the resolutions, I can pick one pack, like 1024x1024, and use it in my training, right? I don't need to use 2 or 3 packs with different resolutions in my training, right? Is there a resolution you think works best to have better results? PS: I'm still learning, I'm not an expert in fine-tuning or training SD models. :)

Furkan Gözükara

yes you can use 1024x1024. you dont need multiple ones just use the one same as your training images. i find it is best for sdxl.

Alan Johnson

Q1: When you say make them repeat 52 what do you mean? Do you mean create copies of the 5200 dataset images? So lets say I just copy them all and have 10400 dataset images? Q2: And then by making the epoch like 23 do you mean change the Save Model frequency (Epochs) from 10 to 23?

önal baş

Hello Mr. Furkan, first of all thank you for the photo archive. Related to the question the previous commenter asked: would applying data augmentation techniques to the existing photos help in training? I read in Avrupa Bilim ve Teknoloji Dergisi (European Journal of Science and Technology) about data augmentation performed with methods such as snow noise, shading, high gamma, increased contrast, increased brightness, decreased brightness, and mirroring. In training runs using the YOLOv4 algorithm, the data augmented with these methods was observed to have a positive effect on the object detection test results.

Nenad Kuzmanovic

Can you add reg images of boy and girl? I know u have daughter and you said one day you will do it :-) For start you don't have to make them in all resolutions, 1024x1024 will do the work...

Furkan Gözükara

The problem is collecting boy or girl images. It would take so much time to collect all even if possible. Maybe I can use a realistic model to generate artificial high quality ones

Nenad Kuzmanovic

Right.. That is good idea for creating with realistic model, with ADetailer and high res fix... I can help with generating, if you just find optimal settings, model etc.

kaosnews

I can't find the answer, but I manually extracted the images into a folder. Do I need to rename the folder (for Kohya) like this? woman_5200_imgs_768x768/1_images/..... or like woman_5200_imgs_768x768/1_woman/.....

Furkan Gözükara

please use dataset preparation feature to not make any mistakes. have you watched my tutorials? https://youtu.be/sBFGitIvD2A?si=x_6nDqIgHjvGUol5

kaosnews

I must admit I'm not a YouTube tutorial watcher, but your video is great! And the answer was there :)

John Dopamine

If I prefer to use the class of "Person" should I just make a folder w/ 1024x1024 of Man + Women, or maybe half/half? Thoughts?

daniel mendoza

Can you place the images in 768x768 in png format, since jpeg is a format that introduces compression artifacts in the images?

Franco Acosta Diaz

Hello, I think there is a mistake in "How To Download All On A RunPod Or A Unix System". It should be:

cd /workspace
chmod +x download_man_reg_imgs.sh
./download_man_reg_imgs.sh

GeeMan64

Hi, within the 768x768 women regularization images there is one corrupted image that derailed the initial caching phase before the training. There is also the fact that the images collected aren't exactly 5200, but rather 5196. Don't know how much difference that would make it, perhaps a slight miscalculation dividing the reg images by my training images. Lastly I was going to point out how the images are in JPEG format instead of PNG but a previous person just above me mentioned this one. Any chance of a possible fix to these things? You do great work.

Furkan Gözükara

thank you so much. ye i made them JPEG since original images were JPEG. but quality of JPEG is 100%. I am going to eliminate corrupted images and reupload with exact accurate number of images for 768x768 now

Furkan Gözükara

do you know which image file is broken? i keep testing all images with python libraries and it says all valid.
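
The reply does not say which Python libraries were used for testing, so the following is only one possible check and not the author's method. It is a crude, standard-library-only heuristic that counts the JPEG files in a folder and flags any that lack the JPEG start or end marker, which usually indicates a truncated download. A decoder-based check, for example with Pillow, would be stricter. All names here are hypothetical.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_truncated(path):
    """Heuristic: a complete JPEG starts with SOI and ends with EOI."""
    data = Path(path).read_bytes()
    # Some writers pad the file with zero bytes after the end marker.
    data = data.rstrip(b"\x00")
    return not (data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI))


def scan_folder(folder):
    """Return (number of JPEG files, sorted names of suspicious files)."""
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg"}
    )
    suspicious = [p.name for p in files if looks_truncated(p)]
    return len(files), suspicious
```

Running `scan_folder` on an extracted folder gives both the image count, which can be compared against the expected 5200, and a list of files worth inspecting by hand.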

GeeMan64

Sorry for the late response. I actually went ahead of the training and deleted the one corrupted image. I hope it didn't mess up a few things. I'll re-download the new version and let you know how it is next time I'll train a subject

reaper557

Thank you very much for taking the time to curate such an enormous data set for us! :D

The Strategist

Hi. Could I use it to train Asian face?

The Strategist

Yes, it works quite well after all. I have another question: could I use SDXL Turbo or SDXL Lightning as a base model while following your tutorial video?

Furkan Gözükara

Sadly they are not trainable. They will fail. You need the full model. Lightning and Turbo models are different.

Kasey Lx

Thank you sooo much for including women in the regularization images! It was so challenging to generate these before because of the nsfw results that come up. This saves me SO much time!!

eyal giron

I get weird noise in all the raw original-resolution images https://imgur.com/a/G7kxfnO

Furkan Gözükara

i just downloaded and extracted and no issues. did you download fully and extracted? they are over 10 gb each one

Furkan Gözükara

Well, it works for me, so something must be wrong with your operating system. What do you use to open them? Which software?

A.S.

When training on RunPod, should I make a new folder called "reg images" or whatever, upload all the reg images into it, and then put that folder's path into the web UI? And how many repeats should I give it? In the regularization images tab of the web UI the default is 1 repeat; should I leave that and proceed?

A.S.

(if i do everything like in the tutorial video)