27 Commits

Author SHA1 Message Date
Meng, Hengyu
838beb9b5e
chore: add global SYCL compile flags (#597) 2025-02-22 21:23:58 +08:00
R0CKSTAR
a3cbdf6dcb
chore: SD_USE_CUBLAS => SD_USE_CUDA for MUSA backend (#578)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-02-05 16:11:26 +08:00
null-define
b70aaa672a
chore: fix amd rocm build (#571) 2025-01-18 13:11:39 +08:00
leejet
dcf91f9e0f
chore: change SD_CUBLAS/SD_USE_CUBLAS to SD_CUDA/SD_USE_CUDA 2024-12-28 13:27:51 +08:00
R0CKSTAR
5cc74d1f09
feat: support Moore Threads GPU (#529)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-12-28 13:08:36 +08:00
Erik Scholz
1c168d98a5
fix: repair flash attention support (#386)
* repair flash attention in _ext
this does not fix the currently broken fa behind the define, which is only used by VAE

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>

* make flash attention in the diffusion model a runtime flag
no support for sd3 or video

* remove old flash attention option and switch vae over to attn_ext

* update docs

* format code

---------

Co-authored-by: FSSRepo <FSSRepo@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-11-23 12:39:08 +08:00
zhentaoyu
e410aeb534
sync: update ggml to fix large image generation with SYCL backend (#380)
* turn off fast-math on host in SYCL backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for sync some sycl ops

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update sycl readme and ggml

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
2024-09-02 22:29:35 +08:00
Yu Xing
6c88ad3fd6
fix: resolve naming conflict while llama.cpp and sd.cpp both build (#351) 2024-08-28 00:14:41 +08:00
soham
2027b16fda
feat: add vulkan backend support (#291)
* Fix includes and init vulkan the same as llama.cpp

* Add Windows Vulkan CI

* Updated ggml submodule

* support epsilon as a parameter for ggml_group_norm

---------

Co-authored-by: Cloudwalk <cloudwalk@icculus.org>
Co-authored-by: Oleg Skutte <00.00.oleg.00.00@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>
2024-08-27 23:56:09 +08:00
zhentaoyu
697d000f49
feat: add SYCL Backend Support for Intel GPUs (#330)
* update ggml and add SYCL CMake option

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* hacky CMakeLists.txt for updating ggml in cpu backend

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase and clean code

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* add sycl in README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* rebase ggml commit

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* refine README

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

* update ggml for supporting sycl tsembd op

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>

---------

Signed-off-by: zhentaoyu <zhentao.yu@intel.com>
2024-08-10 13:42:50 +08:00
leejet
be6cd1a4bf
sync: update ggml 2024-06-01 13:44:09 +08:00
Phu Tran
1ce9470f27
fix: fix building shared library (#188) 2024-03-03 13:24:59 +08:00
Cyberhan123
b7870a0f89
chore: improve ci (#150)
---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-02-26 22:01:34 +08:00
leejet
b6368868d9
feat: introduce GGMLBlock and implement SVD(Broken) (#159)
* introduce GGMLBlock and implement SVD(Broken)

* add sdxl vae warning
2024-02-24 20:06:39 +08:00
Steward Garcia
36ec16ac99
feat: Control Net support + Textual Inversion (embeddings) (#131)
* add controlnet to pipeline

* add cli params

* control strength cli param

* cli param keep controlnet in cpu

* add Textual Inversion

* add canny preprocessor

* refactor: change ggml_type_sizef to ggml_row_size

* process hint once time

* ignore the embedding name case

---------

Co-authored-by: leejet <leejet714@gmail.com>
2024-01-29 22:38:51 +08:00
旺旺碎冰冰
c6071fa82f
feat: add hipBlas support (#94) 2024-01-14 11:53:42 +08:00
leejet
7fb8a51318
chore: make SD_BUILD_DLL visible only to SD_LIB 2024-01-02 22:31:40 +08:00
leejet
2c5f3fc53a
chore: add support for building shared library 2024-01-02 21:05:44 +08:00
leejet
2e79a82f85
refactor: reorganize code and use c api (#133) 2024-01-01 16:22:18 +08:00
Steward Garcia
004dfbef27
feat: implement ESRGAN upscaler + Metal Backend (#104)
* add esrgan upscaler

* add sd_tiling

* support metal backend

* add clip_skip

---------

Co-authored-by: leejet <leejet714@gmail.com>
2023-12-28 23:46:48 +08:00
leejet
d7af2c2ba9
feat: load weights from safetensors and ckpt (#101) 2023-12-03 15:47:20 +08:00
Steward Garcia
8124588cf1
feat: ggml-alloc integration and gpu acceleration (#75)
* set ggml url to FSSRepo/ggml

* ggml-alloc integration

* offload all functions to gpu

* gguf format + native converter

* merge custom vae to a model

* full offload to gpu

* improve pretty progress

---------

Co-authored-by: leejet <leejet714@gmail.com>
2023-11-26 19:02:36 +08:00
leejet
09cab2a2ae
chore: set default BUILD_SHARED_LIBS to OFF 2023-10-22 14:59:03 +08:00
Erik Scholz
844351c417
feat: cmake improvements and simple ci (#9)
* move main and stb-libs to subfolder

* cmake : general additions

* ci : add simple building

---------

Co-authored-by: leejet <31925346+leejet@users.noreply.github.com>
2023-08-17 21:09:57 +08:00
leejet
58735a2813
feat: add img2img mode (#5) 2023-08-16 01:48:07 +08:00
Georgi Gerganov
a08cae6d95
fix: minor build fixes (#2)
* cmake : fix C++11 build

* gitignore : ignore .cache
2023-08-14 08:12:04 +08:00
leejet
3aca342e60
Initial commit 2023-08-13 16:00:22 +08:00