stable-diffusion.cpp

Author	SHA1	Message	Date
bssrdf	2b1bc06477	feat: add PhotoMaker Version 2 support (#358) * first attempt at updating to photomaker v2 * continue adding photomaker v2 modules * finishing the last few pieces for photomaker v2; id_embeds need to be done by a manual step and pass as an input file * added a name converter for Photomaker V2; build ok * more debugging underway * failing at cuda mat_mul * updated chunk_half to be more efficient; redo feedforward * fixed a bug: carefully using ggml_view_4d to get chunks of a tensor; strides need to be recalculated or set properly; still failing at soft_max cuda op * redo weight calculation and weightv fixed a bug now Photomaker V2 kinds of working * add python script for face detection (Photomaker V2 needs) * updated readme for photomaker * fixed a bug causing PMV1 crashing; both V1 and V2 work * fixed clean_input_ids for PMV2 * fixed a double counting bug in tokenize_with_trigger_token * updated photomaker readme * removed some commented code * improved reconstructing class word free prompt * changed reading id_embed to raw binary using existing load tensor function; this is more efficient than using model load and also makes it easier to work with sd server * minor clean up --------- Co-authored-by: bssrdf <bssrdf@gmail.com>	2024-11-23 11:50:14 +08:00
stduhpf	6ea812256e	feat: add flux 1 lite 8B (freepik) support (#474) * Flux Lite (Freepik) support * format code --------- Co-authored-by: leejet <leejet714@gmail.com>	2024-11-23 11:41:30 +08:00
stduhpf	9b1d90bc23	fix: improve clip text_projection support (#397)	2024-11-23 11:19:27 +08:00
leejet	ac54e00760	feat: add sd3.5 support (#445)	2024-10-24 21:58:03 +08:00
leejet	c837c5d9cc	style: format code	2024-08-25 00:19:37 +08:00
leejet	64d231f384	feat: add flux support (#356) * add flux support * avoid build failures in non-CUDA environments * fix schnell support * add k quants support * add support for applying lora to quantized tensors * add inplace conversion support for f8_e4m3 (#359) in the same way it is done for bf16; like how bf16 converts losslessly to fp32, f8_e4m3 converts losslessly to fp16 * add xlabs flux comfy converted lora support * update docs --------- Co-authored-by: Erik Scholz <Green-Sky@users.noreply.github.com>	2024-08-24 14:29:52 +08:00
leejet	73c2176648	feat: add sd3 support (#298)	2024-07-28 15:44:08 +08:00

7 Commits