feat: add TAESD implementation - faster autoencoder (#88)
* add taesd implementation * taesd gpu offloading * show seed when generating image with -s -1 * less restrictive with larger images * cuda: im2col speedup x2 * cuda: group norm speedup x90 * quantized models now works in cuda :) * fix cal mem size --------- Co-authored-by: leejet <leejet714@gmail.com>
This commit is contained in:
parent
f99bcd1f76
commit
134883aec4
5
.gitignore
vendored
5
.gitignore
vendored
@ -8,6 +8,7 @@ test/
|
||||
*.bin
|
||||
*.exe
|
||||
*.gguf
|
||||
output*.png
|
||||
models*
|
||||
!taesd-model.gguf
|
||||
*.log
|
||||
output.png
|
||||
models/
|
35
README.md
35
README.md
@ -9,22 +9,23 @@ Inference of [Stable Diffusion](https://github.com/CompVis/stable-diffusion) in
|
||||
## Features
|
||||
|
||||
- Plain C/C++ implementation based on [ggml](https://github.com/ggerganov/ggml), working in the same way as [llama.cpp](https://github.com/ggerganov/llama.cpp)
|
||||
- Super lightweight and without external dependencies.
|
||||
- Super lightweight and without external dependencies
|
||||
- SD1.x and SD2.x support
|
||||
- 16-bit, 32-bit float support
|
||||
- 4-bit, 5-bit and 8-bit integer quantization support
|
||||
- Accelerated memory-efficient CPU inference
|
||||
- Only requires ~2.3GB when using txt2img with fp16 precision to generate a 512x512 image, enabling Flash Attention just requires ~1.8GB.
|
||||
- AVX, AVX2 and AVX512 support for x86 architectures
|
||||
- Full CUDA backend for GPU acceleration, for now just for float16 and float32 models. There are some issues with quantized models and CUDA; it will be fixed in the future.
|
||||
- Can load ckpt, safetensors and diffusers models/checkpoints. Standalone VAEs models.
|
||||
- Full CUDA backend for GPU acceleration.
|
||||
- Can load ckpt, safetensors and diffusers models/checkpoints. Standalone VAEs models
|
||||
- No need to convert to `.ggml` or `.gguf` anymore!
|
||||
- Flash Attention for memory usage optimization (only cpu for now).
|
||||
- Flash Attention for memory usage optimization (only cpu for now)
|
||||
- Original `txt2img` and `img2img` mode
|
||||
- Negative prompt
|
||||
- [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) style tokenizer (not all the features, only token weighting for now)
|
||||
- LoRA support, same as [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#lora)
|
||||
- Latent Consistency Models support (LCM/LCM-LoRA)
|
||||
- Faster and memory efficient latent decoding with [TAESD](https://github.com/madebyollin/taesd)
|
||||
- Sampling method
|
||||
- `Euler A`
|
||||
- `Euler`
|
||||
@ -47,9 +48,10 @@ Inference of [Stable Diffusion](https://github.com/CompVis/stable-diffusion) in
|
||||
- [ ] More sampling methods
|
||||
- [ ] Make inference faster
|
||||
- The current implementation of ggml_conv_2d is slow and has high memory usage
|
||||
- Implement Winograd Convolution 2D for 3x3 kernel filtering
|
||||
- [ ] Continuing to reduce memory usage (quantizing the weights of ggml_conv_2d)
|
||||
- [ ] Implement BPE Tokenizer
|
||||
- [ ] Add [TAESD](https://github.com/madebyollin/taesd) for faster VAE decoding
|
||||
- [ ] Implement [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN/tree/master) upscaler
|
||||
- [ ] k-quants support
|
||||
|
||||
## Usage
|
||||
@ -122,7 +124,7 @@ cmake --build . --config Release
|
||||
### Run
|
||||
|
||||
```
|
||||
usage: ./bin/sd [arguments]
|
||||
usage: sd [arguments]
|
||||
|
||||
arguments:
|
||||
-h, --help show this help message and exit
|
||||
@ -131,8 +133,10 @@ arguments:
|
||||
If threads <= 0, then threads will be set to the number of CPU physical cores
|
||||
-m, --model [MODEL] path to model
|
||||
--vae [VAE] path to vae
|
||||
--taesd [TAESD_PATH] path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)
|
||||
--type [TYPE] weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0)
|
||||
If not specified, the default is the type of the weight file. --lora-model-dir [DIR] lora model directory
|
||||
If not specified, the default is the type of the weight file.
|
||||
--lora-model-dir [DIR] lora model directory
|
||||
-i, --init-img [IMAGE] path to the input image, required by img2img
|
||||
-o, --output OUTPUT path to write result image to (default: ./output.png)
|
||||
-p, --prompt [PROMPT] the prompt to render
|
||||
@ -218,6 +222,23 @@ Here's a simple example:
|
||||
| ---- |---- |
|
||||
|  | |
|
||||
|
||||
## Using TAESD to faster decoding
|
||||
|
||||
You can use TAESD to accelerate the decoding of latent images by following these steps:
|
||||
|
||||
- Download the model [weights](https://huggingface.co/madebyollin/taesd/blob/main/diffusion_pytorch_model.safetensors).
|
||||
|
||||
Or curl
|
||||
|
||||
```bash
|
||||
curl -L -O https://huggingface.co/madebyollin/taesd/blob/main/diffusion_pytorch_model.safetensors
|
||||
```
|
||||
|
||||
- Specify the model path using the `--taesd PATH` parameter. example:
|
||||
|
||||
```bash
|
||||
sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors
|
||||
```
|
||||
|
||||
### Docker
|
||||
|
||||
|
24596
common/json.hpp
24596
common/json.hpp
File diff suppressed because it is too large
Load Diff
10130
common/miniz.h
10130
common/miniz.h
File diff suppressed because it is too large
Load Diff
7987
common/stb_image.h
7987
common/stb_image.h
File diff suppressed because it is too large
Load Diff
File diff suppressed because it is too large
Load Diff
1836
common/zip.c
1836
common/zip.c
File diff suppressed because it is too large
Load Diff
509
common/zip.h
509
common/zip.h
@ -1,509 +0,0 @@
|
||||
/*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
||||
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
||||
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
|
||||
* IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
|
||||
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
|
||||
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
|
||||
* OTHER DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
#pragma once
|
||||
#ifndef ZIP_H
|
||||
#define ZIP_H
|
||||
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
#include <sys/types.h>
|
||||
|
||||
#ifndef ZIP_SHARED
|
||||
#define ZIP_EXPORT
|
||||
#else
|
||||
#ifdef _WIN32
|
||||
#ifdef ZIP_BUILD_SHARED
|
||||
#define ZIP_EXPORT __declspec(dllexport)
|
||||
#else
|
||||
#define ZIP_EXPORT __declspec(dllimport)
|
||||
#endif
|
||||
#else
|
||||
#define ZIP_EXPORT __attribute__((visibility("default")))
|
||||
#endif
|
||||
#endif
|
||||
|
||||
#ifdef __cplusplus
|
||||
extern "C" {
|
||||
#endif
|
||||
|
||||
#if !defined(_POSIX_C_SOURCE) && defined(_MSC_VER)
|
||||
// 64-bit Windows is the only mainstream platform
|
||||
// where sizeof(long) != sizeof(void*)
|
||||
#ifdef _WIN64
|
||||
typedef long long ssize_t; /* byte count or error */
|
||||
#else
|
||||
typedef long ssize_t; /* byte count or error */
|
||||
#endif
|
||||
#endif
|
||||
|
||||
/**
|
||||
* @mainpage
|
||||
*
|
||||
* Documentation for @ref zip.
|
||||
*/
|
||||
|
||||
/**
|
||||
* @addtogroup zip
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* Default zip compression level.
|
||||
*/
|
||||
#define ZIP_DEFAULT_COMPRESSION_LEVEL 6
|
||||
|
||||
/**
|
||||
* Error codes
|
||||
*/
|
||||
#define ZIP_ENOINIT -1 // not initialized
|
||||
#define ZIP_EINVENTNAME -2 // invalid entry name
|
||||
#define ZIP_ENOENT -3 // entry not found
|
||||
#define ZIP_EINVMODE -4 // invalid zip mode
|
||||
#define ZIP_EINVLVL -5 // invalid compression level
|
||||
#define ZIP_ENOSUP64 -6 // no zip 64 support
|
||||
#define ZIP_EMEMSET -7 // memset error
|
||||
#define ZIP_EWRTENT -8 // cannot write data to entry
|
||||
#define ZIP_ETDEFLINIT -9 // cannot initialize tdefl compressor
|
||||
#define ZIP_EINVIDX -10 // invalid index
|
||||
#define ZIP_ENOHDR -11 // header not found
|
||||
#define ZIP_ETDEFLBUF -12 // cannot flush tdefl buffer
|
||||
#define ZIP_ECRTHDR -13 // cannot create entry header
|
||||
#define ZIP_EWRTHDR -14 // cannot write entry header
|
||||
#define ZIP_EWRTDIR -15 // cannot write to central dir
|
||||
#define ZIP_EOPNFILE -16 // cannot open file
|
||||
#define ZIP_EINVENTTYPE -17 // invalid entry type
|
||||
#define ZIP_EMEMNOALLOC -18 // extracting data using no memory allocation
|
||||
#define ZIP_ENOFILE -19 // file not found
|
||||
#define ZIP_ENOPERM -20 // no permission
|
||||
#define ZIP_EOOMEM -21 // out of memory
|
||||
#define ZIP_EINVZIPNAME -22 // invalid zip archive name
|
||||
#define ZIP_EMKDIR -23 // make dir error
|
||||
#define ZIP_ESYMLINK -24 // symlink error
|
||||
#define ZIP_ECLSZIP -25 // close archive error
|
||||
#define ZIP_ECAPSIZE -26 // capacity size too small
|
||||
#define ZIP_EFSEEK -27 // fseek error
|
||||
#define ZIP_EFREAD -28 // fread error
|
||||
#define ZIP_EFWRITE -29 // fwrite error
|
||||
#define ZIP_ERINIT -30 // cannot initialize reader
|
||||
#define ZIP_EWINIT -31 // cannot initialize writer
|
||||
#define ZIP_EWRINIT -32 // cannot initialize writer from reader
|
||||
|
||||
/**
|
||||
* Looks up the error message string corresponding to an error number.
|
||||
* @param errnum error number
|
||||
* @return error message string corresponding to errnum or NULL if error is not
|
||||
* found.
|
||||
*/
|
||||
extern ZIP_EXPORT const char *zip_strerror(int errnum);
|
||||
|
||||
/**
|
||||
* @struct zip_t
|
||||
*
|
||||
* This data structure is used throughout the library to represent zip archive -
|
||||
* forward declaration.
|
||||
*/
|
||||
struct zip_t;
|
||||
|
||||
/**
|
||||
* Opens zip archive with compression level using the given mode.
|
||||
*
|
||||
* @param zipname zip archive file name.
|
||||
* @param level compression level (0-9 are the standard zlib-style levels).
|
||||
* @param mode file access mode.
|
||||
* - 'r': opens a file for reading/extracting (the file must exists).
|
||||
* - 'w': creates an empty file for writing.
|
||||
* - 'a': appends to an existing archive.
|
||||
*
|
||||
* @return the zip archive handler or NULL on error
|
||||
*/
|
||||
extern ZIP_EXPORT struct zip_t *zip_open(const char *zipname, int level,
|
||||
char mode);
|
||||
|
||||
/**
|
||||
* Opens zip archive with compression level using the given mode.
|
||||
* The function additionally returns @param errnum -
|
||||
*
|
||||
* @param zipname zip archive file name.
|
||||
* @param level compression level (0-9 are the standard zlib-style levels).
|
||||
* @param mode file access mode.
|
||||
* - 'r': opens a file for reading/extracting (the file must exists).
|
||||
* - 'w': creates an empty file for writing.
|
||||
* - 'a': appends to an existing archive.
|
||||
* @param errnum 0 on success, negative number (< 0) on error.
|
||||
*
|
||||
* @return the zip archive handler or NULL on error
|
||||
*/
|
||||
extern ZIP_EXPORT struct zip_t *
|
||||
zip_openwitherror(const char *zipname, int level, char mode, int *errnum);
|
||||
|
||||
/**
|
||||
* Closes the zip archive, releases resources - always finalize.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*/
|
||||
extern ZIP_EXPORT void zip_close(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Determines if the archive has a zip64 end of central directory headers.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the return code - 1 (true), 0 (false), negative number (< 0) on
|
||||
* error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_is64(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Opens an entry by name in the zip archive.
|
||||
*
|
||||
* For zip archive opened in 'w' or 'a' mode the function will append
|
||||
* a new entry. In readonly mode the function tries to locate the entry
|
||||
* in global dictionary.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param entryname an entry name in local dictionary.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_open(struct zip_t *zip, const char *entryname);
|
||||
|
||||
/**
|
||||
* Opens an entry by name in the zip archive.
|
||||
*
|
||||
* For zip archive opened in 'w' or 'a' mode the function will append
|
||||
* a new entry. In readonly mode the function tries to locate the entry
|
||||
* in global dictionary (case sensitive).
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param entryname an entry name in local dictionary (case sensitive).
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_opencasesensitive(struct zip_t *zip,
|
||||
const char *entryname);
|
||||
|
||||
/**
|
||||
* Opens a new entry by index in the zip archive.
|
||||
*
|
||||
* This function is only valid if zip archive was opened in 'r' (readonly) mode.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param index index in local dictionary.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_openbyindex(struct zip_t *zip, size_t index);
|
||||
|
||||
/**
|
||||
* Closes a zip entry, flushes buffer and releases resources.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_close(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns a local name of the current zip entry.
|
||||
*
|
||||
* The main difference between user's entry name and local entry name
|
||||
* is optional relative path.
|
||||
* Following .ZIP File Format Specification - the path stored MUST not contain
|
||||
* a drive or device letter, or a leading slash.
|
||||
* All slashes MUST be forward slashes '/' as opposed to backwards slashes '\'
|
||||
* for compatibility with Amiga and UNIX file systems etc.
|
||||
*
|
||||
* @param zip: zip archive handler.
|
||||
*
|
||||
* @return the pointer to the current zip entry name, or NULL on error.
|
||||
*/
|
||||
extern ZIP_EXPORT const char *zip_entry_name(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns an index of the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the index on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_entry_index(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Determines if the current zip entry is a directory entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the return code - 1 (true), 0 (false), negative number (< 0) on
|
||||
* error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_isdir(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns the uncompressed size of the current zip entry.
|
||||
* Alias for zip_entry_uncomp_size (for backward compatibility).
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the uncompressed size in bytes.
|
||||
*/
|
||||
extern ZIP_EXPORT unsigned long long zip_entry_size(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns the uncompressed size of the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the uncompressed size in bytes.
|
||||
*/
|
||||
extern ZIP_EXPORT unsigned long long zip_entry_uncomp_size(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns the compressed size of the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the compressed size in bytes.
|
||||
*/
|
||||
extern ZIP_EXPORT unsigned long long zip_entry_comp_size(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Returns CRC-32 checksum of the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the CRC-32 checksum.
|
||||
*/
|
||||
extern ZIP_EXPORT unsigned int zip_entry_crc32(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Compresses an input buffer for the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param buf input buffer.
|
||||
* @param bufsize input buffer size (in bytes).
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_write(struct zip_t *zip, const void *buf,
|
||||
size_t bufsize);
|
||||
|
||||
/**
|
||||
* Compresses a file for the current zip entry.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param filename input file.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_fwrite(struct zip_t *zip, const char *filename);
|
||||
|
||||
/**
|
||||
* Extracts the current zip entry into output buffer.
|
||||
*
|
||||
* The function allocates sufficient memory for a output buffer.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param buf output buffer.
|
||||
* @param bufsize output buffer size (in bytes).
|
||||
*
|
||||
* @note remember to release memory allocated for a output buffer.
|
||||
* for large entries, please take a look at zip_entry_extract function.
|
||||
*
|
||||
* @return the return code - the number of bytes actually read on success.
|
||||
* Otherwise a negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_entry_read(struct zip_t *zip, void **buf,
|
||||
size_t *bufsize);
|
||||
|
||||
/**
|
||||
* Extracts the current zip entry into a memory buffer using no memory
|
||||
* allocation.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param buf preallocated output buffer.
|
||||
* @param bufsize output buffer size (in bytes).
|
||||
*
|
||||
* @note ensure supplied output buffer is large enough.
|
||||
* zip_entry_size function (returns uncompressed size for the current
|
||||
* entry) can be handy to estimate how big buffer is needed.
|
||||
* For large entries, please take a look at zip_entry_extract function.
|
||||
*
|
||||
* @return the return code - the number of bytes actually read on success.
|
||||
* Otherwise a negative number (< 0) on error (e.g. bufsize is not large
|
||||
* enough).
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_entry_noallocread(struct zip_t *zip, void *buf,
|
||||
size_t bufsize);
|
||||
|
||||
/**
|
||||
* Extracts the current zip entry into output file.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param filename output file.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_entry_fread(struct zip_t *zip, const char *filename);
|
||||
|
||||
/**
|
||||
* Extracts the current zip entry using a callback function (on_extract).
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param on_extract callback function.
|
||||
* @param arg opaque pointer (optional argument, which you can pass to the
|
||||
* on_extract callback)
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int
|
||||
zip_entry_extract(struct zip_t *zip,
|
||||
size_t (*on_extract)(void *arg, uint64_t offset,
|
||||
const void *data, size_t size),
|
||||
void *arg);
|
||||
|
||||
/**
|
||||
* Returns the number of all entries (files and directories) in the zip archive.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return the return code - the number of entries on success, negative number
|
||||
* (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_entries_total(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Deletes zip archive entries.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param entries array of zip archive entries to be deleted.
|
||||
* @param len the number of entries to be deleted.
|
||||
* @return the number of deleted entries, or negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_entries_delete(struct zip_t *zip,
|
||||
char *const entries[], size_t len);
|
||||
|
||||
/**
|
||||
* Extracts a zip archive stream into directory.
|
||||
*
|
||||
* If on_extract is not NULL, the callback will be called after
|
||||
* successfully extracted each zip entry.
|
||||
* Returning a negative value from the callback will cause abort and return an
|
||||
* error. The last argument (void *arg) is optional, which you can use to pass
|
||||
* data to the on_extract callback.
|
||||
*
|
||||
* @param stream zip archive stream.
|
||||
* @param size stream size.
|
||||
* @param dir output directory.
|
||||
* @param on_extract on extract callback.
|
||||
* @param arg opaque pointer.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int
|
||||
zip_stream_extract(const char *stream, size_t size, const char *dir,
|
||||
int (*on_extract)(const char *filename, void *arg),
|
||||
void *arg);
|
||||
|
||||
/**
|
||||
* Opens zip archive stream into memory.
|
||||
*
|
||||
* @param stream zip archive stream.
|
||||
* @param size stream size.
|
||||
* @param level compression level (0-9 are the standard zlib-style levels).
|
||||
* @param mode file access mode.
|
||||
* - 'r': opens a file for reading/extracting (the file must exists).
|
||||
* - 'w': creates an empty file for writing.
|
||||
* - 'a': appends to an existing archive.
|
||||
*
|
||||
* @return the zip archive handler or NULL on error
|
||||
*/
|
||||
extern ZIP_EXPORT struct zip_t *zip_stream_open(const char *stream, size_t size,
|
||||
int level, char mode);
|
||||
|
||||
/**
|
||||
* Opens zip archive stream into memory.
|
||||
* The function additionally returns @param errnum -
|
||||
*
|
||||
* @param stream zip archive stream.
|
||||
* @param size stream size.*
|
||||
* @param level compression level (0-9 are the standard zlib-style levels).
|
||||
* @param mode file access mode.
|
||||
* - 'r': opens a file for reading/extracting (the file must exists).
|
||||
* - 'w': creates an empty file for writing.
|
||||
* - 'a': appends to an existing archive.
|
||||
* @param errnum 0 on success, negative number (< 0) on error.
|
||||
*
|
||||
* @return the zip archive handler or NULL on error
|
||||
*/
|
||||
extern ZIP_EXPORT struct zip_t *zip_stream_openwitherror(const char *stream,
|
||||
size_t size, int level,
|
||||
char mode,
|
||||
int *errnum);
|
||||
|
||||
/**
|
||||
* Copy zip archive stream output buffer.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
* @param buf output buffer. User should free buf.
|
||||
* @param bufsize output buffer size (in bytes).
|
||||
*
|
||||
* @return copy size
|
||||
*/
|
||||
extern ZIP_EXPORT ssize_t zip_stream_copy(struct zip_t *zip, void **buf,
|
||||
size_t *bufsize);
|
||||
|
||||
/**
|
||||
* Close zip archive releases resources.
|
||||
*
|
||||
* @param zip zip archive handler.
|
||||
*
|
||||
* @return
|
||||
*/
|
||||
extern ZIP_EXPORT void zip_stream_close(struct zip_t *zip);
|
||||
|
||||
/**
|
||||
* Creates a new archive and puts files into a single zip archive.
|
||||
*
|
||||
* @param zipname zip archive file.
|
||||
* @param filenames input files.
|
||||
* @param len: number of input files.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_create(const char *zipname, const char *filenames[],
|
||||
size_t len);
|
||||
|
||||
/**
|
||||
* Extracts a zip archive file into directory.
|
||||
*
|
||||
* If on_extract_entry is not NULL, the callback will be called after
|
||||
* successfully extracted each zip entry.
|
||||
* Returning a negative value from the callback will cause abort and return an
|
||||
* error. The last argument (void *arg) is optional, which you can use to pass
|
||||
* data to the on_extract_entry callback.
|
||||
*
|
||||
* @param zipname zip archive file.
|
||||
* @param dir output directory.
|
||||
* @param on_extract_entry on extract callback.
|
||||
* @param arg opaque pointer.
|
||||
*
|
||||
* @return the return code - 0 on success, negative number (< 0) on error.
|
||||
*/
|
||||
extern ZIP_EXPORT int zip_extract(const char *zipname, const char *dir,
|
||||
int (*on_extract_entry)(const char *filename,
|
||||
void *arg),
|
||||
void *arg);
|
||||
/** @} */
|
||||
#ifdef __cplusplus
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif
|
@ -58,6 +58,7 @@ struct SDParams {
|
||||
|
||||
std::string model_path;
|
||||
std::string vae_path;
|
||||
std::string taesd_path;
|
||||
ggml_type wtype = GGML_TYPE_COUNT;
|
||||
std::string lora_model_dir;
|
||||
std::string output_path = "output.png";
|
||||
@ -86,6 +87,7 @@ void print_params(SDParams params) {
|
||||
printf(" model_path: %s\n", params.model_path.c_str());
|
||||
printf(" wtype: %s\n", params.wtype < GGML_TYPE_COUNT ? ggml_type_name(params.wtype) : "unspecified");
|
||||
printf(" vae_path: %s\n", params.vae_path.c_str());
|
||||
printf(" taesd_path: %s\n", params.taesd_path.c_str());
|
||||
printf(" output_path: %s\n", params.output_path.c_str());
|
||||
printf(" init_img: %s\n", params.input_path.c_str());
|
||||
printf(" prompt: %s\n", params.prompt.c_str());
|
||||
@ -112,8 +114,9 @@ void print_usage(int argc, const char* argv[]) {
|
||||
printf(" If threads <= 0, then threads will be set to the number of CPU physical cores\n");
|
||||
printf(" -m, --model [MODEL] path to model\n");
|
||||
printf(" --vae [VAE] path to vae\n");
|
||||
printf(" --taesd [TAESD_PATH] path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)\n");
|
||||
printf(" --type [TYPE] weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0)\n");
|
||||
printf(" If not specified, the default is the type of the weight file.");
|
||||
printf(" If not specified, the default is the type of the weight file.\n");
|
||||
printf(" --lora-model-dir [DIR] lora model directory\n");
|
||||
printf(" -i, --init-img [IMAGE] path to the input image, required by img2img\n");
|
||||
printf(" -o, --output OUTPUT path to write result image to (default: ./output.png)\n");
|
||||
@ -176,6 +179,12 @@ void parse_args(int argc, const char** argv, SDParams& params) {
|
||||
break;
|
||||
}
|
||||
params.vae_path = argv[i];
|
||||
} else if (arg == "--taesd") {
|
||||
if (++i >= argc) {
|
||||
invalid_arg = true;
|
||||
break;
|
||||
}
|
||||
params.taesd_path = argv[i];
|
||||
} else if (arg == "--type") {
|
||||
if (++i >= argc) {
|
||||
invalid_arg = true;
|
||||
@ -449,7 +458,8 @@ int main(int argc, const char* argv[]) {
|
||||
}
|
||||
}
|
||||
|
||||
StableDiffusion sd(params.n_threads, vae_decode_only, true, params.lora_model_dir, params.rng_type);
|
||||
StableDiffusion sd(params.n_threads, vae_decode_only, params.taesd_path, true, params.lora_model_dir, params.rng_type);
|
||||
|
||||
if (!sd.load_from_file(params.model_path, params.vae_path, params.wtype, params.schedule)) {
|
||||
return 1;
|
||||
}
|
||||
|
2
ggml
2
ggml
@ -1 +1 @@
|
||||
Subproject commit 03669ba9fdc5e0520e919e5c7e1b3a3359d28e59
|
||||
Subproject commit 70474c6890c015b53dc10a2300ae35246cc73589
|
19
model.cpp
19
model.cpp
@ -1296,7 +1296,7 @@ bool ModelLoader::load_tensors(on_new_tensor_cb_t on_new_tensor_cb) {
|
||||
if (backend == NULL || ggml_backend_is_cpu(backend)) {
|
||||
// for the CPU and Metal backend, we can copy directly into the tensor
|
||||
if (tensor_storage.type == dst_tensor->type) {
|
||||
GGML_ASSERT(ggml_nbytes(dst_tensor) == nbytes_to_read);
|
||||
GGML_ASSERT(ggml_nbytes(dst_tensor) == tensor_storage.nbytes());
|
||||
read_data(tensor_storage, (char*)dst_tensor->data, nbytes_to_read);
|
||||
|
||||
if (tensor_storage.is_bf16) {
|
||||
@ -1349,16 +1349,23 @@ bool ModelLoader::load_tensors(on_new_tensor_cb_t on_new_tensor_cb) {
|
||||
return success;
|
||||
}
|
||||
|
||||
int64_t ModelLoader::cal_mem_size() {
|
||||
int64_t ModelLoader::cal_mem_size(ggml_backend_t backend) {
|
||||
size_t alignment = 128;
|
||||
if (backend != NULL) {
|
||||
alignment = ggml_backend_get_alignment(backend);
|
||||
}
|
||||
int64_t mem_size = 0;
|
||||
std::vector<TensorStorage> processed_tensor_storages;
|
||||
for (auto& tensor_storage : tensor_storages) {
|
||||
if (is_unused_tensor(tensor_storage.name)) {
|
||||
continue;
|
||||
}
|
||||
|
||||
mem_size += tensor_storage.nbytes();
|
||||
mem_size += GGML_MEM_ALIGN * 2; // for lora alphas
|
||||
preprocess_tensor(tensor_storage, processed_tensor_storages);
|
||||
}
|
||||
|
||||
return mem_size + 10 * 1024 * 1024;
|
||||
for (auto& tensor_storage : processed_tensor_storages) {
|
||||
mem_size += tensor_storage.nbytes() + alignment;
|
||||
}
|
||||
|
||||
return mem_size;
|
||||
}
|
||||
|
3
model.h
3
model.h
@ -8,6 +8,7 @@
|
||||
#include <vector>
|
||||
|
||||
#include "ggml/ggml.h"
|
||||
#include "ggml/ggml-backend.h"
|
||||
#include "json.hpp"
|
||||
#include "zip.h"
|
||||
|
||||
@ -116,7 +117,7 @@ public:
|
||||
ggml_type get_sd_wtype();
|
||||
bool load_vocab(on_new_token_cb_t on_new_token_cb);
|
||||
bool load_tensors(on_new_tensor_cb_t on_new_tensor_cb);
|
||||
int64_t cal_mem_size();
|
||||
int64_t cal_mem_size(ggml_backend_t backend);
|
||||
~ModelLoader() = default;
|
||||
};
|
||||
#endif // __MODEL_H__
|
File diff suppressed because it is too large
Load Diff
@ -38,6 +38,7 @@ private:
|
||||
public:
|
||||
StableDiffusion(int n_threads = -1,
|
||||
bool vae_decode_only = false,
|
||||
std::string taesd_path = "",
|
||||
bool free_params_immediately = false,
|
||||
std::string lora_model_dir = "",
|
||||
RNGType rng_type = STD_DEFAULT_RNG);
|
||||
|
Loading…
Reference in New Issue
Block a user