idostyle
d7c7a34712
fix: ModelLoader::load_tensors duplicated check ( #623 )
...
Introduced in 2b6ec97fe2
2025-03-09 12:23:23 +08:00
vmobilis
81556f3136
chore: silence some warnings about precision loss ( #620 )
2025-03-09 12:22:39 +08:00
stduhpf
3fb275a67b
fix: support sdxl embeddings ( #621 )
2025-03-09 12:21:23 +08:00
leejet
30b3ac8e62
fix: avoid potential dangling pointer problem
2025-03-01 16:58:26 +08:00
leejet
195d170136
sync: update ggml
2025-03-01 12:09:55 +08:00
stduhpf
f50a7f66aa
fix: fix race condition causing inconsistent value for decoder_only ( #609 )
2025-03-01 11:49:06 +08:00
stduhpf
85e9a12988
fix: preprocess tensor names in tensor types map ( #607 )
...
Thank you for your contribution
2025-03-01 11:48:04 +08:00
stduhpf
fbd42b6fc1
fix: fix embeddings with quantized models ( #601 )
2025-03-01 11:45:39 +08:00
yslai
19d876ee30
feat: implement DDIM with the "trailing" timestep spacing and TCD ( #568 )
2025-02-22 21:34:22 +08:00
lalala
f27f2b2aa2
docs: add missing --mask and --guidance options to print_usage ( #572 )
2025-02-22 21:32:37 +08:00
piallai
99609761dc
docs: fix typo in readme ( #574 )
2025-02-22 21:30:28 +08:00
stduhpf
69c73789fe
fix: force binary mask for inpaint models ( #589 )
...
Co-authored-by: leejet <leejet714@gmail.com>
2025-02-22 21:29:57 +08:00
Meng, Hengyu
838beb9b5e
chore: add global SYCL compile flags ( #597 )
2025-02-22 21:23:58 +08:00
stduhpf
f23b803a6b
fix: unapply current loras properly ( #590 )
2025-02-22 21:22:22 +08:00
stduhpf
1be2491dcf
feat: partial LyCORIS support (tucker decomposition for LoCon + LoHa + LoKr) ( #577 )
2025-02-22 21:19:26 +08:00
Matti Pulkkinen
3753223982
fix: make get_files_from_dir work with absolute path ( #598 )
...
Co-authored-by: Matti Pulkkinen <pulkkinen@ultimatium.com>
2025-02-22 21:16:50 +08:00
R0CKSTAR
59ca2b0f16
chore: bump MUSA SDK version to rc3.1.1 ( #599 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-02-22 21:14:26 +08:00
vmobilis
d46ed5e184
feat: support JPEG compression ( #583 )
2025-02-05 16:18:02 +08:00
ag2s20150909
2535ad5a43
chore: fix cuda on github action ( #580 )
2025-02-05 16:15:41 +08:00
stduhpf
e500d95abd
fix: fix rank 1 loras ( #575 )
2025-02-05 16:13:17 +08:00
R0CKSTAR
a3cbdf6dcb
chore: SD_USE_CUBLAS => SD_USE_CUDA for MUSA backend ( #578 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-02-05 16:11:26 +08:00
piallai
5eb15ef4d0
docs: add CLI-GUI to list ( #546 )
2025-01-18 13:16:54 +08:00
stduhpf
d9b5942d98
feat: add sdxl v-pred support ( #536 )
2025-01-18 13:15:54 +08:00
stduhpf
587a37b2e2
fix: avoid sd2 (non inpaint) crash on v-pred check ( #537 )
2025-01-18 13:13:34 +08:00
ag2s20150909
4fe83d52cf
chore: fix CUDA on GitHub Action ( #567 )
2025-01-18 13:12:26 +08:00
null-define
b70aaa672a
chore: fix amd rocm build ( #571 )
2025-01-18 13:11:39 +08:00
idostyle
27edb765a5
chore: fix CI windows release artifacts ( #532 )
2025-01-18 13:09:22 +08:00
leejet
dcf91f9e0f
chore: change SD_CUBLAS/SD_USE_CUBLAS to SD_CUDA/SD_USE_CUDA
2024-12-28 13:27:51 +08:00
stduhpf
348a54e34a
feat: use pretty-progress for tensor loading ( #516 )
2024-12-28 13:14:52 +08:00
stduhpf
d50473dc49
feat: support 16 channel tae (taesd/taef1) ( #527 )
2024-12-28 13:13:48 +08:00
piallai
b5cc1422da
fix: fix typo for skip layers parameters ( #492 )
2024-12-28 13:12:08 +08:00
R0CKSTAR
5cc74d1f09
feat: support Moore Threads GPU ( #529 )
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-12-28 13:08:36 +08:00
stduhpf
0d9d6659a7
fix: fix metal build ( #513 )
2024-12-28 13:06:17 +08:00
stduhpf
8f4ab9add3
feat: support Inpaint models ( #511 )
2024-12-28 13:04:49 +08:00
stduhpf
cc92a6a1b3
feat: support more LoRA models ( #520 )
2024-12-28 12:56:44 +08:00
leejet
9578fdcc46
chore: remove rocm5.5 build temporarily
2024-11-30 14:26:29 +08:00
stduhpf
9148b980be
feat: remove type restrictions ( #489 )
2024-11-30 14:22:15 +08:00
stduhpf
7ce63e740c
feat: flexible model architecture for dit models (Flux & SD3) ( #490 )
...
* Refactor: wtype per tensor
* Fix default args
* refactor: fix flux
* Refactor photomaker v2 support
* unet: refactor the refactoring
* Refactor: fix controlnet and tae
* refactor: upscaler
* Refactor: fix runtime type override
* upscaler: use fp16 again
* Refactor: Flexible sd3 arch
* Refactor: Flexible Flux arch
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-30 14:18:53 +08:00
leejet
4570715727
fix: use ggml_nn_attention in vae
2024-11-24 18:21:31 +08:00
stduhpf
53b415f787
fix: remove default variables in c headers ( #478 )
2024-11-24 18:10:25 +08:00
leejet
c3eeb669cd
sync: update ggml
2024-11-23 13:29:32 +08:00
leejet
b5f4932696
refactor: add some sd version helper functions
2024-11-23 13:02:44 +08:00
Erik Scholz
1c168d98a5
fix: repair flash attention support ( #386 )
...
* repair flash attention in _ext
this does not fix the currently broken fa behind the define, which is only used by VAE
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
* make flash attention in the diffusion model a runtime flag
no support for sd3 or video
* remove old flash attention option and switch vae over to attn_ext
* update docs
* format code
---------
Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 12:39:08 +08:00
William Murray
ea9b647080
docs: update readme, add python bindings ( #423 )
2024-11-23 11:52:33 +08:00
bssrdf
2b1bc06477
feat: add PhotoMaker Version 2 support ( #358 )
...
* first attempt at updating to photomaker v2
* continue adding photomaker v2 modules
* finishing the last few pieces for photomaker v2; id_embeds need to be produced by a manual step and passed as an input file
* added a name converter for Photomaker V2; build ok
* more debugging underway
* failing at cuda mat_mul
* updated chunk_half to be more efficient; redo feedforward
* fixed a bug: carefully using ggml_view_4d to get chunks of a tensor; strides need to be recalculated or set properly; still failing at soft_max cuda op
* redo weight calculation and weight*v
* fixed a bug; now Photomaker V2 kind of works
* add python script for face detection (Photomaker V2 needs)
* updated readme for photomaker
* fixed a bug causing PMV1 to crash; both V1 and V2 work
* fixed clean_input_ids for PMV2
* fixed a double counting bug in tokenize_with_trigger_token
* updated photomaker readme
* removed some commented code
* improved reconstructing class word free prompt
* changed reading id_embed to raw binary using existing load tensor function; this is more efficient than using model load and also makes it easier to work with sd server
* minor clean up
---------
Co-authored-by: bssrdf <bssrdf@gmail.com>
2024-11-23 11:50:14 +08:00
Flavio Bizzarri
b99cbfe4dc
docs: update README.md ( #452 )
2024-11-23 11:46:50 +08:00
Plamen Minev
8c7719fe9a
fix: typo in clip-g encoder arg ( #472 )
2024-11-23 11:46:00 +08:00
LostRuins Concedo
8f94efafa3
feat: add support for loading F8_E5M2 weights ( #460 )
2024-11-23 11:45:11 +08:00
fszontagh
07585448ad
docs: update readme ( #462 )
2024-11-23 11:42:12 +08:00
stduhpf
6ea812256e
feat: add flux 1 lite 8B (freepik) support ( #474 )
...
* Flux Lite (Freepik) support
* format code
---------
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 11:41:30 +08:00