stable-diffusion.cpp

Author	SHA1	Message	Date
vmobilis	655f8a5169	fix: clang complains about needless braces (#618)	2025-03-09 12:26:41 +08:00
idostyle	d7c7a34712	fix: ModelLoader::load_tensors duplicated check (#623) Introduced in `2b6ec97fe2`	2025-03-09 12:23:23 +08:00
vmobilis	81556f3136	chore: silence some warnings about precision loss (#620)	2025-03-09 12:22:39 +08:00
stduhpf	3fb275a67b	fix: suport sdxl embedddings (#621)	2025-03-09 12:21:23 +08:00
leejet	30b3ac8e62	fix: avoid potential dangling pointer problem	2025-03-01 16:58:26 +08:00
leejet	195d170136	sync: update ggml	2025-03-01 12:09:55 +08:00
stduhpf	f50a7f66aa	fix: fix race condition causing inconsistent value for `decoder_only` (#609)	2025-03-01 11:49:06 +08:00
stduhpf	85e9a12988	fix: preprocess tensor names in tensor types map (#607) Thank you for your contribution	2025-03-01 11:48:04 +08:00
stduhpf	fbd42b6fc1	fix: fix embeddings with quantized models (#601)	2025-03-01 11:45:39 +08:00
yslai	19d876ee30	feat: implement DDIM with the "trailing" timestep spacing and TCD (#568)	2025-02-22 21:34:22 +08:00
lalala	f27f2b2aa2	docs: add missing --mask and --guidance options to print_usage (#572)	2025-02-22 21:32:37 +08:00
piallai	99609761dc	docs: fix typo in readme (#574)	2025-02-22 21:30:28 +08:00
stduhpf	69c73789fe	fix: force binary mask for inpaint models (#589) Co-authored-by: leejet <leejet714@gmail.com>	2025-02-22 21:29:57 +08:00
Meng, Hengyu	838beb9b5e	chore: add global SYCL compile flags (#597)	2025-02-22 21:23:58 +08:00
stduhpf	f23b803a6b	fix:: unapply current loras properly (#590)	2025-02-22 21:22:22 +08:00
stduhpf	1be2491dcf	feat: partial LyCORIS support (tucker decomposition for LoCon + LoHa + LoKr) (#577)	2025-02-22 21:19:26 +08:00
Matti Pulkkinen	3753223982	fix: make get_files_from_dir works with absolute path (#598) Co-authored-by: Matti Pulkkinen <pulkkinen@ultimatium.com>	2025-02-22 21:16:50 +08:00
R0CKSTAR	59ca2b0f16	chore: bump MUSA SDK version to rc3.1.1 (#599) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-02-22 21:14:26 +08:00
vmobilis	d46ed5e184	feat: support JPEG compression (#583)	2025-02-05 16:18:02 +08:00
ag2s20150909	2535ad5a43	chore: fix cuda on github action (#580)	2025-02-05 16:15:41 +08:00
stduhpf	e500d95abd	fix: fix rank 1 loras (#575)	2025-02-05 16:13:17 +08:00
R0CKSTAR	a3cbdf6dcb	chore: SD_USE_CUBLAS => SD_USE_CUDA for MUSA backend (#578) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-02-05 16:11:26 +08:00
piallai	5eb15ef4d0	docs: add CLI-GUI to list (#546)	2025-01-18 13:16:54 +08:00
stduhpf	d9b5942d98	feat: add sdxl v-pred suppport (#536)	2025-01-18 13:15:54 +08:00
stduhpf	587a37b2e2	fix: avoid sd2((non inpaint) crash on v-pred check (#537)	2025-01-18 13:13:34 +08:00
ag2s20150909	4fe83d52cf	chore: fix CUDA on GitHub Action (#567)	2025-01-18 13:12:26 +08:00
null-define	b70aaa672a	chore: fix amd rocm build (#571)	2025-01-18 13:11:39 +08:00
idostyle	27edb765a5	chore: fix CI windows release artifacts (#532)	2025-01-18 13:09:22 +08:00
leejet	dcf91f9e0f	chore: change SD_CUBLAS/SD_USE_CUBLAS to SD_CUDA/SD_USE_CUDA	2024-12-28 13:27:51 +08:00
stduhpf	348a54e34a	feat: use pretty-progress for tensor loading (#516)	2024-12-28 13:14:52 +08:00
stduhpf	d50473dc49	feat: support 16 channel tae (taesd/taef1) (#527)	2024-12-28 13:13:48 +08:00
piallai	b5cc1422da	fix: fix typo for skip layers parameters (#492)	2024-12-28 13:12:08 +08:00
R0CKSTAR	5cc74d1f09	feat: support Moore Threads GPU (#529) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-12-28 13:08:36 +08:00
stduhpf	0d9d6659a7	fix: fix metal build (#513)	2024-12-28 13:06:17 +08:00
stduhpf	8f4ab9add3	feat: support Inpaint models (#511)	2024-12-28 13:04:49 +08:00
stduhpf	cc92a6a1b3	feat: support more LoRA models (#520)	2024-12-28 12:56:44 +08:00
leejet	9578fdcc46	chore: remove rocm5.5 build temporarily	2024-11-30 14:26:29 +08:00
stduhpf	9148b980be	feat: remove type restrictions (#489)	2024-11-30 14:22:15 +08:00
stduhpf	7ce63e740c	feat: flexible model architecture for dit models (Flux & SD3) (#490 ) * Refactor: wtype per tensor * Fix default args * refactor: fix flux * Refactor photmaker v2 support * unet: refactor the refactoring * Refactor: fix controlnet and tae * refactor: upscaler * Refactor: fix runtime type override * upscaler: use fp16 again * Refactor: Flexible sd3 arch * Refactor: Flexible Flux arch * format code --------- Co-authored-by: leejet <leejet714@gmail.com>	2024-11-30 14:18:53 +08:00
leejet	4570715727	fix: use ggml_nn_attention in vae	2024-11-24 18:21:31 +08:00
stduhpf	53b415f787	fix: remove default variables in c headers (#478)	2024-11-24 18:10:25 +08:00
leejet	c3eeb669cd	sync: update ggml	2024-11-23 13:29:32 +08:00
leejet	b5f4932696	refactor: add some sd vesion helper functions	2024-11-23 13:02:44 +08:00
Erik Scholz	1c168d98a5	fix: repair flash attention support (#386 ) * repair flash attention in _ext this does not fix the currently broken fa behind the define, which is only used by VAE Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com> * make flash attention in the diffusion model a runtime flag no support for sd3 or video * remove old flash attention option and switch vae over to attn_ext * update docs * format code --------- Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com> Co-authored-by: leejet <leejet714@gmail.com>	2024-11-23 12:39:08 +08:00
William Murray	ea9b647080	docs: update readme, add python bindings (#423)	2024-11-23 11:52:33 +08:00
bssrdf	2b1bc06477	feat: add PhotoMaker Version 2 support (#358 ) * first attempt at updating to photomaker v2 * continue adding photomaker v2 modules * finishing the last few pieces for photomaker v2; id_embeds need to be done by a manual step and pass as an input file * added a name converter for Photomaker V2; build ok * more debugging underway * failing at cuda mat_mul * updated chunk_half to be more efficient; redo feedforward * fixed a bug: carefully using ggml_view_4d to get chunks of a tensor; strides need to be recalculated or set properly; still failing at soft_max cuda op * redo weight calculation and weightv fixed a bug now Photomaker V2 kinds of working * add python script for face detection (Photomaker V2 needs) * updated readme for photomaker * fixed a bug causing PMV1 crashing; both V1 and V2 work * fixed clean_input_ids for PMV2 * fixed a double counting bug in tokenize_with_trigger_token * updated photomaker readme * removed some commented code * improved reconstructing class word free prompt * changed reading id_embed to raw binary using existing load tensor function; this is more efficient than using model load and also makes it easier to work with sd server * minor clean up --------- Co-authored-by: bssrdf <bssrdf@gmail.com>	2024-11-23 11:50:14 +08:00
Flavio Bizzarri	b99cbfe4dc	docs: update README.md (#452)	2024-11-23 11:46:50 +08:00
Plamen Minev	8c7719fe9a	fix: typo in clip-g encoder arg (#472)	2024-11-23 11:46:00 +08:00
LostRuins Concedo	8f94efafa3	feat: add support for loading F8_E5M2 weights (#460)	2024-11-23 11:45:11 +08:00
fszontagh	07585448ad	docs: update readme (#462)	2024-11-23 11:42:12 +08:00

209 Commits