* repair flash attention in _ext
this does not fix the currently broken flash attention behind the compile-time define, which is only used by the VAE
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
* make flash attention in the diffusion model a runtime flag
no support for sd3 or video
* remove old flash attention option and switch vae over to attn_ext
* update docs
* format code
---------
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
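The runtime flag above switches the diffusion model between the naive attention path and the fused flash-attention path. A minimal sketch of why the two paths are interchangeable (in Python for illustration; the project is C++, and these function names are not from the codebase): flash attention's single-pass online-softmax accumulation produces the same output as materializing the full score row and softmaxing it.

```python
import math

def naive_attention(q, ks, vs, scale):
    # materialize every score, softmax the row, then weight the values
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in ks]
    m = max(scores)
    ws = [math.exp(s - m) for s in scores]
    denom = sum(ws)
    dim = len(vs[0])
    return [sum(w * v[d] for w, v in zip(ws, vs)) / denom for d in range(dim)]

def flash_attention(q, ks, vs, scale):
    # one streaming pass over keys/values with an online softmax;
    # the full score row is never stored
    m = float("-inf")
    denom = 0.0
    acc = [0.0] * len(vs[0])
    for k, v in zip(ks, vs):
        s = scale * sum(a * b for a, b in zip(q, k))
        m_new = max(m, s)
        # rescale the running sums when a new maximum appears
        corr = math.exp(m - m_new) if m != float("-inf") else 0.0
        w = math.exp(s - m_new)
        denom = denom * corr + w
        acc = [a * corr + w * x for a, x in zip(acc, v)]
        m = m_new
    return [a / denom for a in acc]
```

Because both paths agree numerically, the choice can safely be a runtime flag rather than a compile-time define.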
* first attempt at updating to photomaker v2
* continue adding photomaker v2 modules
* finished the last few pieces for photomaker v2; id_embeds need to be computed in a manual step and passed as an input file
* added a name converter for Photomaker V2; build ok
* more debugging underway
* failing at cuda mat_mul
* updated chunk_half to be more efficient; redid the feedforward
* fixed a bug: when using ggml_view_4d to take chunks of a tensor, the strides need to be recalculated or set properly; still failing at the soft_max cuda op
* redo weight calculation and weight*v
* fixed a bug; Photomaker V2 is now mostly working
* added a python script for face detection (needed by Photomaker V2)
* updated readme for photomaker
* fixed a bug causing PMV1 to crash; both V1 and V2 now work
* fixed clean_input_ids for PMV2
* fixed a double counting bug in tokenize_with_trigger_token
* updated photomaker readme
* removed some commented code
* improved reconstruction of the class-word-free prompt
* changed id_embed reading to raw binary using the existing tensor-loading function; this is more efficient than going through the model loader and also makes it easier to work with the sd server
* minor clean up
---------
Co-authored-by: bssrdf <bssrdf@gmail.com>
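The ggml_view_4d fix above comes down to byte strides: a view over a chunk of a tensor must keep the parent's strides rather than recompute them as if the chunk were contiguous. A sketch of that arithmetic in Python (ggml itself is C; the helper names here are illustrative, modeling ggml's ne/nb extent/stride convention):

```python
def contiguous_strides(ne, itemsize=4):
    # ggml-style byte strides for a contiguous tensor:
    # nb[0] is the element size, nb[i] = nb[i-1] * ne[i-1]
    nb = [itemsize]
    for i in range(1, 4):
        nb.append(nb[-1] * ne[i - 1])
    return nb

def chunk_view(ne, nb, dim, n_chunks, idx):
    # view over chunk `idx` of `n_chunks` along `dim`: the extent shrinks
    # and the byte offset moves, but the strides MUST stay the parent's --
    # recomputing them as if the view were contiguous reads the wrong bytes
    new_ne = list(ne)
    new_ne[dim] = ne[dim] // n_chunks
    offset = idx * new_ne[dim] * nb[dim]
    return new_ne, list(nb), offset

def elem(data, nb, offset, i0, i1, i2, i3, itemsize=4):
    # address an element through byte strides, as ggml ops do
    byte = offset + i0 * nb[0] + i1 * nb[1] + i2 * nb[2] + i3 * nb[3]
    return data[byte // itemsize]
```

For a [4, 6, 1, 1] tensor split in half along dim 0, the second half keeps the parent's row stride of 4 elements; a freshly computed contiguous stride of 2 elements would silently address the wrong data.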
* mmdit-x
* add support for sd3.5 medium
* add skip layer guidance support (mmdit only)
* ignore slg if slg_scale is zero (optimization)
* init out_skip once
* slg support for flux (experimental)
* warn if version doesn't support slg
* refactor slg cli args
* set default slg_scale to 0 (oops)
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
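The slg bullets above can be summarized in one sketch (Python for illustration; the project is C++, and this assumes the usual SLG formulation, where a term proportional to the difference between the full conditional prediction and a prediction with some layers skipped is added on top of CFG). It also shows why the slg_scale == 0 check is a real optimization: the extra skip-layer forward pass is never run.

```python
def guided_prediction(cond, uncond, skip_fn, cfg_scale, slg_scale):
    # classifier-free guidance, optionally augmented with skip layer guidance
    out = [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]
    if slg_scale == 0:
        # optimization: with slg_scale == 0 the SLG term vanishes,
        # so the extra forward pass with skipped layers is skipped entirely
        return out
    skipped = skip_fn()  # extra forward pass with the chosen layers skipped
    return [o + slg_scale * (c - s) for o, c, s in zip(out, cond, skipped)]
```

With the default slg_scale of 0 this reduces to plain CFG, which is why the wrong non-zero default was worth its own fix.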