0.46

2026-05-06 08:16:43 -03:00 · 2024-09-28 17:45:23 +02:00
parent 3d6014206f
commit c1cebdf1de
20 changed files with 561 additions and 79 deletions
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# 🔗 Comfyui : Bjornulf_custom_nodes v0.45 🔗
+# 🔗 Comfyui : Bjornulf_custom_nodes v0.46 🔗
 # Coffee : ☕☕☕☕☕ 5/5
@@ -8,6 +8,7 @@
 ## 👁 Display and Show 👁
 `1.` [👁 Show (Text, Int, Float)](#1----show-text-int-float)  
 `49.` [📹👁 Video Preview](#49)  
 ## ✒ Text ✒
 `2.` [✒ Write Text](#2----write-text)  
@@ -15,7 +16,7 @@
 `4.` [🔗 Combine Texts](#4----combine-texts)  
 `15.` [💾 Save Text](#15----save-text)  
 `26.` [🎲 Random line from input](#26----random-line-from-input)  
-`28.` [🔢 Text with random Seed](#28----text-with-random-seed)  
+`28.` [🔢🎲 Text with random Seed](#28----text-with-random-seed)  
 `32.` [🧑📝 Character Description Generator](#32----character-description-generator)  
 `48.` [🔀🎲 Text scrambler (🧑 Character)](#48----text-scrambler--character)  
@@ -37,6 +38,7 @@
 `3.` [✒🗔 Advanced Write Text (+ 🎲 random selection and 🅰️ variables)](#3----advanced-write-text---random-selection-and-🅰%EF%B8%8F-variables)  
 `5.` [🎲 Random (Texts)](#5----random-texts)  
 `26.` [🎲 Random line from input](#26----random-line-from-input)  
 `28.` [🔢🎲 Text with random Seed](#28----text-with-random-seed)  
 `37.` [🎲🖼 Random Image](#37----random-image)  
 `40.` [🎲 Random (Model+Clip+Vae) - aka Checkpoint / Model](#40----random-modelclipvae---aka-checkpoint--model)  
 `41.` [🎲 Random Load checkpoint (Model Selector)](#41----random-load-checkpoint-model-selector)  
@@ -69,7 +71,11 @@
 ## 📹 Video 📹
 `20.` [📹 Video Ping Pong](#20----video-ping-pong)  
-`21.` [📹 Images to Video](#21----images-to-video)  
+`21.` [📹 Images to Video (FFmpeg)](#21----images-to-video)  
 `49.` [📹👁 Video Preview](#49)  
 `50.` [🖼➜📹 Images to Video path (tmp video)](#50)  
 `51.` [📹➜🖼 Video Path to Images](#51)  
 `52.` [🔊📹 Audio Video Sync](#52)  
 ## 🦙 AI 🦙
 `19.` [🦙 Ollama](#19----ollama)  
@@ -77,13 +83,14 @@
 ## 🔊 Audio 🔊
 `31.` [🔊 TTS - Text to Speech](#31----tts---text-to-speech-100-local-any-voice-you-want-any-language)  
 `52.` [🔊📹 Audio Video Sync](#52)  
 ## 💻 System 💻
 `34.` [🧹 Free VRAM hack](#34----free-vram-hack)  
 ## 🧍 Manual user Control 🧍
-`35.` [⏸️ Paused. Resume or Stop ?](#35---%EF%B8%8F-paused-resume-or-stop-)  
+`35.` [⏸️ Paused. Resume or Stop, Pick 👇](#35---%EF%B8%8F-paused-resume-or-stop-)  
-`36.` [⏸️🔍 Paused. Select input, Pick one](#36---%EF%B8%8F-paused-select-input-pick-one)  
+`36.` [⏸️ Paused. Select input, Pick 👇](#36---%EF%B8%8F-paused-select-input-pick-one)  
 ## 🧠 Logic / Conditional Operations 🧠
 `45.` [🔀 If-Else (input / compare_with)](#45----if-else-input--compare_with)  
@@ -217,6 +224,7 @@ cd /where/you/installed/ComfyUI && python main.py
 - **v0.43**: Add control_after_generate to Ollama and allow to keep in VRAM for 1 minute if needed. (For chaining quick generations.) Add fallback to 0.0.0.0
 - **v0.44**: Allow ollama to have a custom url in the file `ollama_ip.txt` in the comfyui custom nodes folder. Minor changes, add details/updates to README.
 - **v0.45**: Add a new node : Text scrambler (Character), change text randomly using the file `scrambler/scrambler_character.json` in the comfyui custom nodes folder.
 - **v0.46**: ❗ A lot of changes to Video nodes. Save to video now uses FLOAT for fps, not INT. (A lot of other custom nodes do that as well...) Add a node to preview video, a node to convert a video path to a list of images, a node to convert a list of images to a temporary video + video_path, and a node to synchronize the duration of audio with video (useful for MuseTalk). Change the TTS node with many new outputs ("audio_path", "full_path", "duration") to reuse with other nodes like MuseTalk. Also rename the TTS input to "connect_to_workflow", to avoid mistakes sending text to it.
 # 📝 Nodes descriptions
@@ -521,13 +529,20 @@ Also, when you select a voice with this format `fr/fake_Bjornulf.wav`, it will c
 So... note that if you know you have an audio file ready to play, you can still use my node but you do NOT need my TTS server to be running.
 My node will just play the audio file if it can find it, and won't try to connect to the backend TTS server.  
-Let's say you already use this node to create an audio file saying `workflow is done` with the Attenborough voice  :
+Let's say you already use this node to create an audio file saying `workflow is done` with the Attenborough voice  :  
 ![TTS](screenshots/tts_end.png)  
-As long as you keep exactly the same settings, it will not use my server to play the audio file! You can safely turn in off, so it won't use your precious VRAM Duh. (TTS server should be using ~3GB of VRAM.)  
+As long as you keep exactly the same settings, it will not use my server to play the audio file! You can safely turn the TTS server off, so it won't use your precious VRAM Duh. (TTS server should be using ~3GB of VRAM.)  
 Also `connect_to_workflow` is optional, it means that you can make a workflow with ONLY my TTS node to pre-generate the audio files with the sentences you want to use later, example :  
 ![TTS](screenshots/tts_preload.png)  
 If you want to run my TTS nodes alongside image generation, I recommend using my PAUSE node so you can manually stop the TTS server after my TTS node. When the VRAM is freed, you can then click on the RESUME button to continue the workflow.  
 If you can afford to run both at the same time, good for you, but locally I can't run my TTS server and FLUX at the same time, so I use this trick :  
 ![TTS](screenshots/tts_preload_2.png)  
 Also input is optional, it means that you can make a workflow with ONLY my TTS node to pre-generate the audio files with the sentences you want to maybe use later, example :  
 ![TTS](screenshots/tts_generate.png)  
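
To make the "same settings = no server needed" behaviour a bit more concrete, here is a minimal sketch of how the saved file path is derived from the settings. The folder layout and the 50 character limit follow `text_to_speech.py`; the helper name `cached_audio_path` and the example values are only assumptions for illustration, they are not part of the node.

```python
import os
import re

def sanitize_text(text: str) -> str:
    # Drop punctuation, turn spaces into underscores, keep at most 50 characters.
    return re.sub(r'[^\w\s-]', '', text).replace(' ', '_')[:50]

def cached_audio_path(text: str, language: str, speaker_wav: str) -> str:
    # Hypothetical helper: Bjornulf_TTS/<language>/<speaker_wav>/<sanitized text>.wav
    return os.path.join("Bjornulf_TTS", language, speaker_wav, f"{sanitize_text(text)}.wav")

# Example values only: any text, language or voice changes the path.
path = cached_audio_path("workflow is done", "French", "fr/fake_Bjornulf.wav")

# If the file already exists (and overwrite is off), the TTS server is not needed.
needs_server = not os.path.exists(path)
```

Changing the text, the language or the voice gives a different path, which is why the settings must stay exactly the same to reuse an existing file.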
 ### 32 - 🧑📝 Character Description Generator
 ![characters](screenshots/characters.png)
@@ -756,3 +771,36 @@ Here another simple example taking a few selected images from a folder and combi
 **Description:**  
 Take text as input and scramble (randomize) the text by using the file `scrambler/character_scrambler.json` in the comfyui custom nodes folder.  
 ### 49 - 📹👁 Video Preview
 ![video preview](screenshots/video_preview.png)
 **Description:**  
 ### 50 - 🖼➜📹 Images to Video path (tmp video)
 ![image to video path](screenshots/image_to_video_path.png)
 **Description:**  
 ### 51 - 📹➜🖼 Video Path to Images
 ![video path to image](screenshots/video_path_to_image.png)
 **Description:**  
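
Going by `video_path_to_images.py`, the node keeps one frame every `frame_interval` frames, stops once `max_frames` frames are kept (0 means no limit), and reports the reduced fps. Here is a rough sketch of that selection logic without OpenCV; the function names are made up for this example.

```python
def select_frames(total_frames: int, frame_interval: int = 1, max_frames: int = 0) -> list:
    # Return the indexes of the frames that would be kept.
    picked = []
    for index in range(total_frames):
        if max_frames > 0 and len(picked) >= max_frames:
            break
        if index % frame_interval == 0:
            picked.append(index)
    return picked

def reduced_fps(initial_fps: float, frame_interval: int) -> float:
    # Keeping 1 frame out of N divides the frame rate by N.
    return initial_fps / frame_interval

kept = select_frames(10, frame_interval=3)
fps = reduced_fps(30.0, 3)
```

Sending `new_fps` (and not `initial_fps`) to a video node should keep the playback speed of the original video.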
 ### 52 - 🔊📹 Audio Video Sync
 **Description:**  
 This node synchronizes the duration of an audio file with a video file: it adds silence to the audio if it is too short, and repeats the video as many times as needed if the audio is longer than the video. (The video ideally needs to be a loop, check my ping pong video node.)  
 It is useful, for example, with MuseTalk <https://github.com/chaojie/ComfyUI-MuseTalk>. If you want to chain up videos (let's say sentence by sentence), it will always go back to the last frame, making the video transition smoother.  
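
The arithmetic behind this can be sketched in a few lines. This is a simplified version of what `sync_audio_video` computes, with no torch or ffmpeg involved; the function name `plan_sync` is only for this example.

```python
import math

def plan_sync(audio_duration: float, video_duration: float, sample_rate: int):
    # Return (repetitions, target_duration, silence_samples).
    if audio_duration <= video_duration:
        # Audio is short enough: play the video once and pad the audio.
        repetitions = 1
    else:
        # Audio is longer: repeat the video until it covers the audio.
        repetitions = math.ceil(audio_duration / video_duration)
    target_duration = video_duration * repetitions
    current_samples = int(round(audio_duration * sample_rate))
    target_samples = int(target_duration * sample_rate)
    # Silence to append so the audio ends exactly with the (repeated) video.
    silence_samples = max(0, target_samples - current_samples)
    return repetitions, target_duration, silence_samples

long_audio = plan_sync(5.0, 2.0, 1000)
short_audio = plan_sync(1.5, 2.0, 1000)
```

With a 5 second audio and a 2 second video, the video is repeated 3 times (6 seconds) and 1 second of silence is added to the audio.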
 Here is an example without the `Audio Video Sync` node. (The duration of the video is shorter than the audio, so after playing it will not go back to the last frame. Ideally I want a loop where the first frame is the same as the last frame, see my video ping pong node if needed.) :
 ![audio sync video](screenshots/audio_sync_video_without.png)
 Here is an example with the `Audio Video Sync` node. Notice that it is also convenient for recovering the frames per second of the video and sending that to other nodes. :
 ![audio sync video](screenshots/audio_sync_video_with.png)
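
About the `video_fps` output: ffprobe reports `r_frame_rate` as a fraction like `30000/1001`, so it has to be converted to a float before other nodes can use it. A small sketch of that conversion, mirroring `get_video_fps` in `audio_video_sync.py` (the function name here is only for this example):

```python
def parse_frame_rate(raw: str) -> float:
    # ffprobe prints r_frame_rate as "num/den", sometimes as a plain number.
    raw = raw.strip()
    if '/' in raw:
        num, den = map(float, raw.split('/'))
        return num / den
    return float(raw)

ntsc = parse_frame_rate("30000/1001\n")
pal = parse_frame_rate("25/1")
```

This is also why the video nodes now take fps as FLOAT: a rate like 29.97 can't be passed as an INT without drifting.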
--- a/init.py
+++ b/init.py
@@ -1,9 +1,9 @@
 from .images_to_video import imagesToVideo
 from .write_text import WriteText
-from .write_image_environment import WriteImageEnvironment
+# from .write_image_environment import WriteImageEnvironment
-from .write_image_characters import WriteImageCharacters
+# from .write_image_characters import WriteImageCharacters
-from .write_image_character import WriteImageCharacter
+# from .write_image_character import WriteImageCharacter
-from .write_image_allinone import WriteImageAllInOne
+# from .write_image_allinone import WriteImageAllInOne
 from .combine_texts import CombineTexts
 from .loop_texts import LoopTexts
 from .random_texts import RandomTexts
@@ -51,9 +51,17 @@ from .image_details import ImageDetails
 from .combine_images import CombineImages
 # from .pass_preview_image import PassPreviewImage
 from .text_scramble_character import ScramblerCharacter
 from .audio_video_sync import AudioVideoSync
 from .video_path_to_images import VideoToImagesList
 from .images_to_video_path import ImagesListToVideo
 from .video_preview import VideoPreview
 NODE_CLASS_MAPPINGS = {
    "Bjornulf_ollamaLoader": ollamaLoader,
    "Bjornulf_VideoPreview": VideoPreview,
    "Bjornulf_ImagesListToVideo": ImagesListToVideo,
    "Bjornulf_VideoToImagesList": VideoToImagesList,
    "Bjornulf_AudioVideoSync": AudioVideoSync,
    "Bjornulf_ScramblerCharacter": ScramblerCharacter,
    "Bjornulf_CombineImages": CombineImages,
    "Bjornulf_ImageDetails": ImageDetails,
@@ -106,6 +114,10 @@ NODE_CLASS_MAPPINGS = {
 NODE_DISPLAY_NAME_MAPPINGS = {
    "Bjornulf_WriteText": "✒ Write Text",
    "Bjornulf_VideoPreview": "📹👁 Video Preview",
    "Bjornulf_ImagesListToVideo": "🖼➜📹 Images to Video path (tmp video)",
    "Bjornulf_VideoToImagesList": "📹➜🖼 Video Path to Images",
    "Bjornulf_AudioVideoSync": "🔊📹 Audio Video Sync",
    "Bjornulf_ScramblerCharacter": "🔀🎲 Text scrambler (🧑 Character)",
    "Bjornulf_WriteTextAdvanced": "✒🗔 Advanced Write Text",
    "Bjornulf_LoopWriteText": "♻ Loop (✒🗔 Advanced Write Text)",
@@ -129,7 +141,7 @@ NODE_DISPLAY_NAME_MAPPINGS = {
    "Bjornulf_CharacterDescriptionGenerator": "🧑📝 Character Description Generator",
    "Bjornulf_GreenScreenToTransparency": "🟩➜▢ Green Screen to Transparency",
    "Bjornulf_SaveBjornulfLobeChat": "🖼💬 Save image for Bjornulf LobeChat",
-    "Bjornulf_TextToStringAndSeed": "🔢 Text with random Seed",
+    "Bjornulf_TextToStringAndSeed": "🔢🎲 Text with random Seed",
    "Bjornulf_ShowText": "👁 Show (Text, Int, Float)",
    "Bjornulf_ImageMaskCutter": "🖼✂ Cut Image with Mask",
    "Bjornulf_LoadImageWithTransparency": "📥🖼 Load Image with Transparency ▢",
--- a/audio_video_sync.py
+++ b/audio_video_sync.py
@@ -0,0 +1,156 @@
 import torch
 import torchaudio
 import os
 import subprocess
 from datetime import datetime
 import math
 class AudioVideoSync:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "audio": ("AUDIO",),
                "video_path": ("STRING", {"default": ""}),
            },
        }
    RETURN_TYPES = ("AUDIO", "STRING", "STRING", "FLOAT")
    RETURN_NAMES = ("synced_audio", "audio_path", "synced_video_path", "video_fps")
    FUNCTION = "sync_audio_video"
    CATEGORY = "audio"
    # def get_video_duration(self, video_path):
    #     cmd = ['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', video_path]
    #     result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    #     duration = float(result.stdout)
    #     return math.ceil(duration * 10) / 10
    def get_video_duration(self, video_path):
        cmd = ['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', video_path]
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        return float(result.stdout)
    def get_video_fps(self, video_path):
        cmd = ['ffprobe', '-v', 'error', '-select_streams', 'v:0', '-count_packets', '-show_entries', 'stream=r_frame_rate', '-of', 'csv=p=0', video_path]
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        fps = result.stdout.strip()
        if '/' in fps:
            num, den = map(float, fps.split('/'))
            return num / den
        return float(fps)
    def sync_audio_video(self, audio, video_path):
        if not isinstance(audio, dict) or 'waveform' not in audio or 'sample_rate' not in audio:
            raise ValueError("Expected audio input to be a dictionary with 'waveform' and 'sample_rate' keys")
        audio_data = audio['waveform']
        sample_rate = audio['sample_rate']
        print(f"Audio data shape: {audio_data.shape}")
        print(f"Sample rate: {sample_rate}")
        # Calculate video duration
        video_duration = self.get_video_duration(video_path)
        # Calculate audio duration
        audio_duration = audio_data.shape[-1] / sample_rate
        print(f"Video duration: {video_duration}")
        print(f"Audio duration: {audio_duration}")
        # Calculate the desired audio duration and number of video repetitions
        if audio_duration <= video_duration:
            target_duration = video_duration
            repetitions = 1
        else:
            repetitions = math.ceil(audio_duration / video_duration)
            target_duration = video_duration * repetitions
        # Calculate the number of samples to add
        current_samples = audio_data.shape[-1]
        target_samples = int(target_duration * sample_rate)
        samples_to_add = target_samples - current_samples
        print(f"Current samples: {current_samples}, Target samples: {target_samples}, Samples to add: {samples_to_add}")
        if samples_to_add > 0:
            # Create silence
            if audio_data.dim() == 3:
                silence_shape = (audio_data.shape[0], audio_data.shape[1], samples_to_add)
            else:  # audio_data.dim() == 2
                silence_shape = (audio_data.shape[0], samples_to_add)
            silence = torch.zeros(silence_shape, dtype=audio_data.dtype, device=audio_data.device)
            # Append silence to the audio
            synced_audio = torch.cat((audio_data, silence), dim=-1)
        else:
            synced_audio = audio_data
        print(f"Synced audio shape: {synced_audio.shape}")
        # Save the synced audio file and get the file path
        audio_path = self.save_audio(synced_audio, sample_rate)
        # Create and save the synced video
        synced_video_path = self.create_synced_video(video_path, repetitions)
        video_fps = self.get_video_fps(video_path)
        # Return the synced audio data, audio file path, and synced video path
        return ({"waveform": synced_audio, "sample_rate": sample_rate}, audio_path, synced_video_path, video_fps)   
    def save_audio(self, audio_tensor, sample_rate):
        # Create the sync_audio folder if it doesn't exist
        os.makedirs("Bjornulf/sync_audio", exist_ok=True)
        # Generate a unique filename using the current timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"Bjornulf/sync_audio/synced_audio_{timestamp}.wav"
        # Ensure audio_tensor is 2D
        if audio_tensor.dim() == 3:
            audio_tensor = audio_tensor.squeeze(0)  # Remove batch dimension
        elif audio_tensor.dim() == 1:
            audio_tensor = audio_tensor.unsqueeze(0)  # Add channel dimension
        # Save the audio file
        torchaudio.save(filename, audio_tensor, sample_rate)
        print(f"Synced audio saved to: {filename}")
        # Return the full path to the saved audio file
        return os.path.abspath(filename)
    def create_synced_video(self, video_path, repetitions):
        # Create the sync_video folder if it doesn't exist
        os.makedirs("Bjornulf/sync_video", exist_ok=True)
        # Generate a unique filename using the current timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        output_path = f"Bjornulf/sync_video/synced_video_{timestamp}.mp4"
        # Create a temporary file with the list of input video files
        with open("Bjornulf/temp_video_list.txt", "w") as f:
            for _ in range(repetitions):
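                # The concat demuxer resolves relative paths against the list file's folder, so write an absolute path
                f.write(f"file '{os.path.abspath(video_path)}'\n")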
        # Use ffmpeg to concatenate the video multiple times
        cmd = [
            'ffmpeg',
            '-y',  # Overwrite the output file instead of waiting on a prompt
            '-f', 'concat',
            '-safe', '0',
            '-i', 'Bjornulf/temp_video_list.txt',
            '-c', 'copy',
            output_path
        ]
        subprocess.run(cmd, check=True)
        # Remove the temporary file
        os.remove("Bjornulf/temp_video_list.txt")
        print(f"Synced video saved to: {output_path}")
        return os.path.abspath(output_path)
--- a/images_to_video.py
+++ b/images_to_video.py
@@ -12,7 +12,7 @@ class imagesToVideo:
        return {
            "required": {
                "images": ("IMAGE",),
-                "fps": ("INT", {"default": 24, "min": 1, "max": 60}),
+                "fps": ("FLOAT", {"default": 24, "min": 1, "max": 120}),
                "name_prefix": ("STRING", {"default": "output/imgs2video/me"}),
                "format": (["mp4", "webm"], {"default": "mp4"}),
                "mp4_encoder": (["libx264 (H.264)", "h264_nvenc (H.264 / NVIDIA GPU)", "libx265 (H.265)", "hevc_nvenc (H.265 / NVIDIA GPU)"], {"default": "h264_nvenc (H.264 / NVIDIA GPU)"}),
@@ -47,7 +47,7 @@ class imagesToVideo:
        # Create the new filename with the incremented number
        output_file = f"{name_prefix}_{next_num:04d}.{format}"
-        temp_dir = "temp_images_imgs2video"
+        temp_dir = "Bjornulf/temp_images_imgs2video"
        # Clean up temp dir
        if os.path.exists(temp_dir) and os.path.isdir(temp_dir):
            for file in os.listdir(temp_dir):
--- a/images_to_video_path.py
+++ b/images_to_video_path.py
@@ -0,0 +1,91 @@
 import os
 import uuid
 import subprocess
 import tempfile
 import torch
 import numpy as np
 from PIL import Image
 class ImagesListToVideo:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "images": ("IMAGE",),
                "frames_per_second": ("FLOAT", {"default": 30, "min": 1, "max": 120, "step": 1}),
            }
        }
    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("video_path",)
    FUNCTION = "images_to_video"
    CATEGORY = "Bjornulf"
    def images_to_video(self, images, frames_per_second=30):
        # Create the output directory if it doesn't exist
        output_dir = os.path.join("Bjornulf", "images_to_video")
        os.makedirs(output_dir, exist_ok=True)
        # Generate a unique filename for the video
        video_filename = f"video_{uuid.uuid4().hex}.mp4"
        video_path = os.path.join(output_dir, video_filename)
        # Create a temporary directory to store image files
        with tempfile.TemporaryDirectory() as temp_dir:
            # Save each image as a PNG file in the temporary directory
            for i, img in enumerate(images):
                # Convert the image to the correct format
                img_np = self.convert_to_numpy(img)
                # Ensure the image is in RGB format
                if img_np.shape[-1] != 3:
                    img_np = self.convert_to_rgb(img_np)
                # Convert to PIL Image
                img_pil = Image.fromarray(img_np)
                img_path = os.path.join(temp_dir, f"frame_{i:05d}.png")
                img_pil.save(img_path)
            # Use FFmpeg to create a video from the image sequence
            ffmpeg_cmd = [
                "ffmpeg",
                "-framerate", str(frames_per_second),
                "-i", os.path.join(temp_dir, "frame_%05d.png"),
                "-c:v", "libx264",
                "-pix_fmt", "yuv420p",
                "-crf", "23",
                "-y",  # Overwrite output file if it exists
                video_path
            ]
            try:
                subprocess.run(ffmpeg_cmd, check=True, capture_output=True, text=True)
            except subprocess.CalledProcessError as e:
                print(f"FFmpeg error: {e.stderr}")
                return ("",)  # Return empty string if video creation fails
        return (video_path,)
    def convert_to_numpy(self, img):
        if isinstance(img, torch.Tensor):
            img = img.cpu().numpy()
        if img.dtype == np.uint8:
            return img
        elif img.dtype == np.float32 or img.dtype == np.float64:
            # Clip first so values slightly outside [0, 1] don't wrap around in uint8
            return (np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)
        else:
            raise ValueError(f"Unsupported data type: {img.dtype}")
    def convert_to_rgb(self, img):
        if img.shape[-1] == 1:  # Grayscale
            return np.repeat(img, 3, axis=-1)
        elif img.shape[-1] == 768:  # Latent space representation
            # This is a placeholder. You might need a more sophisticated method to convert latent space to RGB
            img = img.reshape((-1, 3))  # Reshape to (H*W, 3)
            img = (img - img.min()) / (img.max() - img.min())  # Normalize to [0, 1]
            img = (img * 255).astype(np.uint8)
            return img.reshape((img.shape[0], -1, 3))  # Reshape back to (H, W, 3)
        elif len(img.shape) == 2:  # 2D array
            return np.stack([img, img, img], axis=-1)
        else:
            raise ValueError(f"Unsupported image shape: {img.shape}")
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,7 +1,7 @@
 [project]
 name = "bjornulf_custom_nodes"
 description = "Nodes: Ollama, Text to Speech, Combine Texts, Random Texts, Save image for Bjornulf LobeChat, Text with random Seed, Random line from input, Combine images, Image to grayscale (black & white), Remove image Transparency (alpha), Resize Image, ..."
-version = "0.45"
+version = "0.46"
 license = {file = "LICENSE"}
 [project.urls]
--- a/screenshots/audio_sync_video_with.png
+++ b/screenshots/audio_sync_video_with.png
--- a/screenshots/audio_sync_video_without.png
+++ b/screenshots/audio_sync_video_without.png
--- a/screenshots/image_to_video_path.png
+++ b/screenshots/image_to_video_path.png
--- a/screenshots/tts.png
+++ b/screenshots/tts.png
--- a/screenshots/tts_end.png
+++ b/screenshots/tts_end.png
--- a/screenshots/tts_generate.png
+++ b/screenshots/tts_generate.png
--- a/screenshots/tts_preload.png
+++ b/screenshots/tts_preload.png
--- a/screenshots/tts_preload_2.png
+++ b/screenshots/tts_preload_2.png
--- a/screenshots/video_path_to_image.png
+++ b/screenshots/video_path_to_image.png
--- a/screenshots/video_preview.png
+++ b/screenshots/video_preview.png
--- a/text_to_speech.py
+++ b/text_to_speech.py
@@ -9,53 +9,34 @@ import os
 import sys
 import random
 import re
 from typing import Dict, Any, List, Tuple
 class Everything(str):
    def __ne__(self, __value: object) -> bool:
        return False
 language_map = {
-    "ar": "Arabic",
-    "cs": "Czech",
-    "de": "German",
-    "en": "English",
-    "es": "Spanish",
-    "fr": "French",
-    "hi": "Hindi",
-    "hu": "Hungarian",
-    "it": "Italian",
-    "ja": "Japanese",
-    "ko": "Korean",
-    "nl": "Dutch",
-    "pl": "Polish",
-    "pt": "Portuguese",
-    "ru": "Russian",
-    "tr": "Turkish",
+    "ar": "Arabic", "cs": "Czech", "de": "German", "en": "English",
+    "es": "Spanish", "fr": "French", "hi": "Hindi", "hu": "Hungarian",
+    "it": "Italian", "ja": "Japanese", "ko": "Korean", "nl": "Dutch",
+    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "tr": "Turkish",
    "zh-cn": "Chinese"
 }
 class TextToSpeech:
    @classmethod
-    def INPUT_TYPES(cls):
+    def INPUT_TYPES(cls) -> Dict[str, Any]:
        speakers_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "speakers")
-        speaker_options = []
-
-        for root, dirs, files in os.walk(speakers_dir):
-            for file in files:
-                if file.endswith(".wav"):
-                    rel_path = os.path.relpath(os.path.join(root, file), speakers_dir)
-                    speaker_options.append(rel_path)
-        if not speaker_options:
-            speaker_options.append("No WAV files found")
-        language_options = list(language_map.values())
+        speaker_options = [os.path.relpath(os.path.join(root, file), speakers_dir)
+                           for root, _, files in os.walk(speakers_dir)
+                           for file in files if file.endswith(".wav")]
+        
+        speaker_options = speaker_options or ["No WAV files found"]
+        
        return {
            "required": {
                "text": ("STRING", {"multiline": True}),
-                "language": (language_options, {
+                "language": (list(language_map.values()), {
                    "default": language_map["en"],
                    "display": "dropdown"
                }),
@@ -69,44 +50,45 @@ class TextToSpeech:
                "seed": ("INT", {"default": 0}),
            },
            "optional": {
-                "input": (Everything("*"), {"forceInput": True}),
+                "connect_to_workflow": (Everything("*"), {"forceInput": True}),
            }
        }
-    RETURN_TYPES = ("AUDIO",)
+    RETURN_TYPES = ("AUDIO", "STRING", "STRING", "FLOAT")
    RETURN_NAMES = ("AUDIO", "audio_path", "full_path", "duration")
    FUNCTION = "generate_audio"
    CATEGORY = "Bjornulf"
    @staticmethod
-    def get_language_code(language_name):
-        for code, name in language_map.items():
-            if name == language_name:
-                return code
-        return "en"
+    def get_language_code(language_name: str) -> str:
+        return next((code for code, name in language_map.items() if name == language_name), "en")
    @staticmethod
-    def sanitize_text(text):
-        sanitized = re.sub(r'[^\w\s-]', '', text).replace(' ', '_')
-        return sanitized[:50]
+    def sanitize_text(text: str) -> str:
+        return re.sub(r'[^\w\s-]', '', text).replace(' ', '_')[:50]
-    def generate_audio(self, text, language, autoplay, seed, save_audio, overwrite, speaker_wav, input=None):
+    def generate_audio(self, text: str, language: str, autoplay: bool, seed: int,
                       save_audio: bool, overwrite: bool, speaker_wav: str,
                       connect_to_workflow: Any = None) -> Tuple[Dict[str, Any], str, str, float]:
        language_code = self.get_language_code(language)
        sanitized_text = self.sanitize_text(text)
        save_path = os.path.join("Bjornulf_TTS", language, speaker_wav, f"{sanitized_text}.wav")
-        os.makedirs(os.path.dirname(save_path), exist_ok=True)
+        full_path = os.path.abspath(save_path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
-        if os.path.exists(save_path) and not overwrite:
+        if os.path.exists(full_path) and not overwrite:
-            print(f"Using existing audio file: {save_path}")
+            print(f"Using existing audio file: {full_path}")
-            audio_data = self.load_audio_file(save_path)
+            audio_data = self.load_audio_file(full_path)
        else:
            audio_data = self.create_new_audio(text, language_code, speaker_wav, seed)
            if save_audio:
-                self.save_audio_file(audio_data, save_path)
+                self.save_audio_file(audio_data, full_path)
-        return self.process_audio_data(autoplay, audio_data)
+        audio_output, _, duration = self.process_audio_data(autoplay, audio_data, full_path if save_audio else None)
        return (audio_output, save_path, full_path, duration)
-    def create_new_audio(self, text, language_code, speaker_wav, seed):
+    def create_new_audio(self, text: str, language_code: str, speaker_wav: str, seed: int) -> io.BytesIO:
        random.seed(seed)
        if speaker_wav == "No WAV files found":
            print("Error: No WAV files available for text-to-speech.")
@@ -133,17 +115,17 @@ class TextToSpeech:
            print(f"Unexpected error: {e}")
            return io.BytesIO()
-    def play_audio(self, audio):
+    def play_audio(self, audio: AudioSegment) -> None:
        if sys.platform.startswith('win'):
            try:
                import winsound
-                winsound.PlaySound(audio, winsound.SND_MEMORY)
+                winsound.PlaySound(audio.raw_data, winsound.SND_MEMORY)
            except Exception as e:
                print(f"An error occurred: {e}")
        else:
            play(audio)
-    def process_audio_data(self, autoplay, audio_data):
+    def process_audio_data(self, autoplay: bool, audio_data: io.BytesIO, save_path: str) -> Tuple[Dict[str, Any], str, float]:
        try:
            audio = AudioSegment.from_mp3(audio_data)
            sample_rate = audio.frame_rate
@@ -151,23 +133,22 @@ class TextToSpeech:
            audio_np = np.array(audio.get_array_of_samples()).astype(np.float32)
            audio_np /= np.iinfo(np.int16).max
-            if num_channels == 1:
-                audio_np = audio_np.reshape(1, -1)
-            else:
-                audio_np = audio_np.reshape(-1, num_channels).T
+            audio_np = audio_np.reshape(-1, num_channels).T if num_channels > 1 else audio_np.reshape(1, -1)
            audio_tensor = torch.from_numpy(audio_np)
            if autoplay:
                self.play_audio(audio)
-            return ({"waveform": audio_tensor.unsqueeze(0), "sample_rate": sample_rate},)
+            duration = len(audio) / 1000.0  # Convert milliseconds to seconds
            return ({"waveform": audio_tensor.unsqueeze(0), "sample_rate": sample_rate}, save_path or "", duration)
        except Exception as e:
            print(f"Error processing audio data: {e}")
-            return ({"waveform": torch.zeros(1, 1, 1, dtype=torch.float32), "sample_rate": 22050},)
+            return ({"waveform": torch.zeros(1, 1, 1, dtype=torch.float32), "sample_rate": 22050}, "", 0.0)
-    def save_audio_file(self, audio_data, save_path):
+    def save_audio_file(self, audio_data: io.BytesIO, save_path: str) -> None:
        try:
            with open(save_path, 'wb') as f:
                f.write(audio_data.getvalue())
@@ -175,11 +156,11 @@ class TextToSpeech:
        except Exception as e:
            print(f"Error saving audio file: {e}")
-    def load_audio_file(self, file_path):
+    def load_audio_file(self, file_path: str) -> io.BytesIO:
        try:
            with open(file_path, 'rb') as f:
                audio_data = io.BytesIO(f.read())
            return audio_data
        except Exception as e:
            print(f"Error loading audio file: {e}")
-            return io.BytesIO()
+            return io.BytesIO()
--- a/video_path_to_images.py
+++ b/video_path_to_images.py
@@ -0,0 +1,62 @@
 import os
 import cv2
 import numpy as np
 import torch
 from PIL import Image
 class VideoToImagesList:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "video_path": ("STRING", {"forceInput": True}),
                "frame_interval": ("INT", {"default": 1, "min": 1, "max": 100}),
                "max_frames": ("INT", {"default": 0, "min": 0, "max": 10000})
            }
        }
    RETURN_TYPES = ("IMAGE", "FLOAT", "FLOAT", "INT")
    RETURN_NAMES = ("IMAGE", "initial_fps", "new_fps", "total_frames")
    FUNCTION = "video_to_images"
    CATEGORY = "Bjornulf"
    def video_to_images(self, video_path, frame_interval=1, max_frames=0):
        if not os.path.exists(video_path):
            raise FileNotFoundError(f"Video file not found: {video_path}")
        cap = cv2.VideoCapture(video_path)
        frame_count = 0
        images = []
        # Get the initial fps of the video
        initial_fps = cap.get(cv2.CAP_PROP_FPS)
        # Get the total number of frames in the video
        total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        while True:
            ret, frame = cap.read()
            if not ret or (max_frames > 0 and len(images) >= max_frames):
                break
            if frame_count % frame_interval == 0:
                # Convert BGR to RGB
                rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                pil_image = Image.fromarray(rgb_frame)
                # Convert PIL Image to tensor
                tensor_image = torch.from_numpy(np.array(pil_image).astype(np.float32) / 255.0).unsqueeze(0)
                images.append(tensor_image)
            frame_count += 1
        cap.release()
        if not images:
            raise ValueError("No frames were extracted from the video")
        # Calculate the new fps
        new_fps = initial_fps / frame_interval
        # Stack all images into a single tensor
        return (torch.cat(images, dim=0), initial_fps, new_fps, total_frames)
--- a/video_preview.py
+++ b/video_preview.py
@@ -0,0 +1,49 @@
 import os
 import shutil
 # import logging
 class VideoPreview:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "video_path": ("STRING", {"forceInput": True}),
            },
        }
    RETURN_TYPES = ()
    FUNCTION = "preview_video"
    CATEGORY = "Bjornulf"
    OUTPUT_NODE = True
    def preview_video(self, video_path):
        if not video_path:
            return {"ui": {"error": "No video path provided."}}
        # Keep the "output" folder structure for copying
        dest_dir = os.path.join("output", "Bjornulf", "preview_video")
        os.makedirs(dest_dir, exist_ok=True)
        video_name = os.path.basename(video_path)
        dest_path = os.path.join(dest_dir, video_name)
        if os.path.abspath(video_path) != os.path.abspath(dest_path):
            shutil.copy2(video_path, dest_path)
            print(f"Video copied successfully to {dest_path}")
        else:
            print(f"Video is already in the destination folder: {dest_path}")
        # Determine the video type based on file extension
        _, file_extension = os.path.splitext(dest_path)
        video_type = file_extension.lower()[1:]  # Remove the dot from extension
        # logging.info(f"Video type: {video_type}")
        # logging.info(f"Video path: {dest_path}")
        # logging.info(f"Destination directory: {dest_dir}")
        # logging.info(f"Video name: {video_name}")
        # Create a new variable for the return value without "output"
        return_dest_dir = os.path.join("Bjornulf", "preview_video")
        # Return the video name and the modified destination directory
        return {"ui": {"video": [video_name, return_dest_dir]}}
--- a/web/js/video_preview.js
+++ b/web/js/video_preview.js
@@ -0,0 +1,83 @@
 import { api } from '../../../scripts/api.js';
 import { app } from "../../../scripts/app.js";
 function displayVideoPreview(component, filename, category) {
    let videoWidget = component._videoWidget;
    if (!videoWidget) {
        // Create the widget if it doesn't exist
        const container = document.createElement("div");
        const currentNode = component;
        videoWidget = component.addDOMWidget("videopreview", "preview", container, {
            serialize: false,
            hideOnZoom: false,
            getValue() {
                return container.value;
            },
            setValue(v) {
                container.value = v;
            },
        });
        videoWidget.computeSize = function(width) {
            if (this.aspectRatio && !this.parentElement.hidden) {
                let height = (currentNode.size[0] - 20) / this.aspectRatio + 10;
                if (!(height > 0)) {
                    height = 0;
                }
                return [width, height];
            }
            return [width, -4];
        };
        videoWidget.value = { hidden: false, paused: false, params: {} };
        videoWidget.parentElement = document.createElement("div");
        videoWidget.parentElement.className = "video_preview";
        videoWidget.parentElement.style['width'] = "100%";
        container.appendChild(videoWidget.parentElement);
        videoWidget.videoElement = document.createElement("video");
        videoWidget.videoElement.controls = true;
        videoWidget.videoElement.loop = false;
        videoWidget.videoElement.muted = false;
        videoWidget.videoElement.style['width'] = "100%";
        videoWidget.videoElement.addEventListener("loadedmetadata", () => {
            videoWidget.aspectRatio = videoWidget.videoElement.videoWidth / videoWidget.videoElement.videoHeight;
            adjustSize(component);
        });
        videoWidget.videoElement.addEventListener("error", () => {
            videoWidget.parentElement.hidden = true;
            adjustSize(component);
        });
        videoWidget.parentElement.hidden = videoWidget.value.hidden;
        videoWidget.videoElement.autoplay = !videoWidget.value.paused && !videoWidget.value.hidden;
        videoWidget.videoElement.hidden = false;
        videoWidget.parentElement.appendChild(videoWidget.videoElement);
        component._videoWidget = videoWidget; // Store the widget for future reference
    }
    // Update the video source
    let params = {
        "filename": filename,
        "subfolder": category,
        "type": "output",
        "rand": Math.random().toString().slice(2, 12)
    };
    const urlParams = new URLSearchParams(params);
    // Build the URL through the imported api helper instead of hard-coding host and port
    videoWidget.videoElement.src = api.apiURL(`/view?${urlParams.toString()}`);
    adjustSize(component); // Adjust the component size
 }
 function adjustSize(component) {
    component.setSize([component.size[0], component.computeSize([component.size[0], component.size[1]])[1]]);
    component?.graph?.setDirtyCanvas(true);
 }
 app.registerExtension({
    name: "Bjornulf.VideoPreview",
    async beforeRegisterNodeDef(nodeType, nodeData, appInstance) {
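        if (nodeData?.name === "Bjornulf_VideoPreview") {
            nodeType.prototype.onExecuted = function (data) {
                // The Python node returns {"ui": {"error": ...}} when no path is given, so "video" may be absent
                if (data?.video) {
                    displayVideoPreview(this, data.video[0], data.video[1]);
                }
            };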
        }
    }
 });