stable-diffusion.cpp

Author	SHA1	Message	Date
Phu Tran	607e39489f	docs: add Jellybox as UI using sd.cpp (#214)	2024-04-02 12:31:54 +08:00
bssrdf	a469688e30	feat: add TencentARC PhotoMaker support (#179) * first efforts at implementing photomaker; lots more to do * added PhotoMakerIDEncoder model in SD * fixed soem bugs; now photomaker model weights can be loaded into their tensor buffers * added input id image loading * added preprocessing inpit id images * finished get_num_tensors * fixed a bug in remove_duplicates * add a get_learned_condition_with_trigger function to do photomaker stuff * add a convert_token_to_id function for photomaker to extract trigger word's token id * making progress; need to implement tokenizer decoder * making more progress; finishing vision model forward * debugging vision_model outputs * corrected clip vision model output * continue making progress in id fusion process * finished stacked id embedding; to be tested * remove garbage file * debuging graph compute * more progress; now alloc buffer failed * fixed wtype issue; input images can only be 1 because issue with transformer when batch size > 1 (to be investigated) * added delayed subject conditioning; now photomaker runs and generates images * fixed stat_merge_step * added photomaker lora model (to be tested) * reworked pmid lora * finished applying pmid lora; to be tested * finalized pmid lora * add a few print tensor; tweak in sample again * small tweak; still not getting ID faces * fixed a bug in FuseBlock forward; also remove diag_mask op in for vision transformer; getting better results * disable pmid lora apply for now; 1 input image seems working; > 1 not working * turn pmid lora apply back on * fixed a decode bug * fixed a bug in ggml's conv_2d, and now > 1 input images working * add style_ratio as a cli param; reworked encode with trigger for attention weights * merge commit fixing lora free param buffer error * change default style ratio to 10% * added an option to offload vae decoder to CPU for mem-limited gpus * removing image normalization step seems making ID fidelity much higher * revert default style ratio back ro 20% * added an option for normalizing input ID images; cleaned up debugging code * more clean up * fixed bugs; now failed with cuda error; likely out-of-mem on GPU * free pmid model params when required * photomaker working properly now after merging and adapting to GGMLBlock API * remove tensor renaming; fixing names in the photomaker model file * updated README.md to include instructions and notes for running PhotoMaker * a bit clean up * remove -DGGML_CUDA_FORCE_MMQ; more clean up and README update * add input image requirement in README * bring back freeing pmid lora params buffer; simply pooled output of CLIPvision * remove MultiheadAttention2; customized MultiheadAttention * added a WIN32 get_files_from_dir; turn off Photomakder if receiving no input images * update docs * fix ci error * make stable-diffusion.h a pure c header file This reverts commit 27887b630db6a92f269f0aef8de9bc9832ab50a9. * fix ci error * format code * reuse get_learned_condition * reuse pad_tokens * reuse CLIPVisionModel * reuse LoraModel * add --clip-on-cpu * fix lora name conversion for SDXL --------- Co-authored-by: bssrdf <bssrdf@gmail.com> Co-authored-by: leejet <leejet714@gmail.com>	2024-03-12 23:15:17 +08:00
Cyberhan123	583cc5bba2	docs: add binding (#189)	2024-03-03 13:27:07 +08:00
Sean Bailey	193fb620b1	feat: add capability to repeatedly run the upscaler in a row (#174) * Add in upscale repeater logic --------- Co-authored-by: leejet <leejet714@gmail.com>	2024-02-24 21:31:01 +08:00
leejet	b6368868d9	feat: introduce GGMLBlock and implement SVD(Broken) (#159) * introduce GGMLBlock and implement SVD(Broken) * add sdxl vae warning	2024-02-24 20:06:39 +08:00
Steward Garcia	36ec16ac99	feat: Control Net support + Textual Inversion (embeddings) (#131) * add controlnet to pipeline * add cli params * control strength cli param * cli param keep controlnet in cpu * add Textual Inversion * add canny preprocessor * refactor: change ggml_type_sizef to ggml_row_size * process hint once time * ignore the embedding name case --------- Co-authored-by: leejet <leejet714@gmail.com>	2024-01-29 22:38:51 +08:00
旺旺碎冰冰	c6071fa82f	feat: add hipBlas support (#94)	2024-01-14 11:53:42 +08:00
leejet	5c614e4bc2	feat: add convert api (#142)	2024-01-14 11:43:24 +08:00
leejet	b139434b57	docs: update README.md	2023-12-31 11:48:41 +08:00
leejet	78ad76f3f4	feat: add SDXL support (#117) * add SDXL support * fix the issue with generating large images	2023-12-29 00:16:10 +08:00
Steward Garcia	004dfbef27	feat: implement ESRGAN upscaler + Metal Backend (#104) * add esrgan upscaler * add sd_tiling * support metal backend * add clip_skip --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-12-28 23:46:48 +08:00
旺旺碎冰冰	0e64238e4c	feat: implement the complete bpe function (#119) * implement the complete bpe function --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-12-23 12:11:07 +08:00
leejet	ac8f5a044c	feat: add SD-Turbo support	2023-12-10 13:15:09 +08:00
leejet	968226abb2	docs: update v2-1_768-nonema-pruned.safetensors url	2023-12-05 22:52:19 +08:00
Steward Garcia	134883aec4	feat: add TAESD implementation - faster autoencoder (#88) * add taesd implementation * taesd gpu offloading * show seed when generating image with -s -1 * less restrictive with larger images * cuda: im2col speedup x2 * cuda: group norm speedup x90 * quantized models now works in cuda :) * fix cal mem size --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-12-05 22:40:03 +08:00
leejet	d7af2c2ba9	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
Steward Garcia	8124588cf1	feat: ggml-alloc integration and gpu acceleration (#75) * set ggml url to FSSRepo/ggml * ggml-alloc integration * offload all functions to gpu * gguf format + native converter * merge custom vae to a model * full offload to gpu * improve pretty progress --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-11-26 19:02:36 +08:00
leejet	64f6002457	docs: add contributors info to README.md	2023-11-19 18:35:19 +08:00
leejet	9a9f3daf8e	feat: add LoRA support	2023-11-19 17:43:49 +08:00
leejet	536f3af672	feat: add lcm sampler support This referenced an issue discussion of the stable-diffusion-webui at https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/13952, which may not be too perfect.	2023-11-17 22:53:46 +08:00
Robert Bledsaw	29a56f2e98	docs: update README.md (#71) fixed type in curl command.	2023-10-22 13:03:32 +08:00
Urs Ganse	afec5051cf	feat: write generation parameter exif data into output png (#57) * Write generation parameter exif data into output pngs. This adds prompt, negative prompt (if nonempty) and other generation parameters to the output file as a tEXt PNG block, in the same format as AUTOMATIC1111 webui does. In order to keep everything free of external library dependencies, I have somewhat dirtily hacked this into the stb_image_write implementation. * Mention png text data in README.md, include "karras" in sampler text * add Steps/Model/RNG to parameter string --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-09-18 21:09:15 +08:00
Urs Ganse	3a25179d52	feat: add DPM2 and DPM++(2s) a samplers (#56) * Add DPM2 sampler. * Add DPM++ (2s) a sampler. * Update README.md with added samplers --------- Co-authored-by: leejet <leejet714@gmail.com>	2023-09-12 23:02:09 +08:00
Urs Ganse	b6899e8fc2	feat: add Euler, Heun and DPM++ (2M) samplers (#50) * Add Euler sampler * Add Heun sampler * Add DPM++ (2M) sampler * Add modified DPM++ (2M) "v2" sampler. This was proposed in a issue discussion of the stable diffusion webui, at https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8457 and apparently works around overstepping of the DPM++ (2M) method with small step counts. The parameter is called dpmpp2mv2 here. * match code style --------- Co-authored-by: Urs Ganse <urs@nerd2nerd.org> Co-authored-by: leejet <leejet714@gmail.com>	2023-09-08 23:47:28 +08:00
leejet	e5a7aec252	feat: add CUDA RNG	2023-09-03 19:24:07 +08:00
leejet	31e77e1573	feat: add SD2.x support (#40)	2023-09-03 16:00:33 +08:00
leejet	008d80a0b1	docs: update README.md	2023-08-25 20:59:18 +08:00
leejet	721cb324af	chore: add sd Dockerfile	2023-08-22 22:14:20 +08:00
leejet	a393bebec8	docs: update README.md	2023-08-22 20:45:23 +08:00
Derek Anderson	76b9b2e9a2	docs: update README.md (#23)	2023-08-21 23:53:50 +08:00
leejet	17095dddea	feat: add token weighting support (#13)	2023-08-20 20:28:36 +08:00
leejet	7132027862	docs: update sd path	2023-08-17 23:44:56 +08:00
leejet	8f34dd7cc7	perf: free unused params immediately to reduce memory usage	2023-08-17 00:55:36 +08:00
leejet	cbee3c9a4f	docs: update README.md	2023-08-16 22:26:15 +08:00
leejet	58735a2813	feat: add img2img mode (#5)	2023-08-16 01:48:07 +08:00
leejet	3265464090	docs: update README.md	2023-08-14 21:27:29 +08:00
leejet	f5d174f9ab	docs: update README.md	2023-08-13 19:47:53 +08:00
leejet	3aca342e60	Initial commit	2023-08-13 16:00:22 +08:00

38 Commits