This commit is contained in:
justumen
2024-09-28 17:45:23 +02:00
parent 3d6014206f
commit c1cebdf1de
20 changed files with 561 additions and 79 deletions

View File

@@ -1,4 +1,4 @@
# 🔗 Comfyui : Bjornulf_custom_nodes v0.45 🔗
# 🔗 Comfyui : Bjornulf_custom_nodes v0.46 🔗
# Coffee : ☕☕☕☕☕ 5/5
@@ -8,6 +8,7 @@
## 👁 Display and Show 👁
`1.` [👁 Show (Text, Int, Float)](#1----show-text-int-float)
`49.` [📹👁 Video Preview](#49)
## ✒ Text ✒
`2.` [✒ Write Text](#2----write-text)
@@ -15,7 +16,7 @@
`4.` [🔗 Combine Texts](#4----combine-texts)
`15.` [💾 Save Text](#15----save-text)
`26.` [🎲 Random line from input](#26----random-line-from-input)
`28.` [🔢 Text with random Seed](#28----text-with-random-seed)
`28.` [🔢🎲 Text with random Seed](#28----text-with-random-seed)
`32.` [🧑📝 Character Description Generator](#32----character-description-generator)
`48.` [🔀🎲 Text scrambler (🧑 Character)](#48----text-scrambler--character)
@@ -37,6 +38,7 @@
`3.` [✒🗔 Advanced Write Text (+ 🎲 random selection and 🅰️ variables)](#3----advanced-write-text---random-selection-and-🅰%EF%B8%8F-variables)
`5.` [🎲 Random (Texts)](#5----random-texts)
`26.` [🎲 Random line from input](#26----random-line-from-input)
`28.` [🔢🎲 Text with random Seed](#28----text-with-random-seed)
`37.` [🎲🖼 Random Image](#37----random-image)
`40.` [🎲 Random (Model+Clip+Vae) - aka Checkpoint / Model](#40----random-modelclipvae---aka-checkpoint--model)
`41.` [🎲 Random Load checkpoint (Model Selector)](#41----random-load-checkpoint-model-selector)
@@ -69,7 +71,11 @@
## 📹 Video 📹
`20.` [📹 Video Ping Pong](#20----video-ping-pong)
`21.` [📹 Images to Video](#21----images-to-video)
`21.` [📹 Images to Video (FFmpeg)](#21----images-to-video)
`49.` [📹👁 Video Preview](#49)
`50.` [🖼➜📹 Images to Video path (tmp video)](#50)
`51.` [📹➜🖼 Video Path to Images](#51)
`52.` [🔊📹 Audio Video Sync](#52)
## 🦙 AI 🦙
`19.` [🦙 Ollama](#19----ollama)
@@ -77,13 +83,14 @@
## 🔊 Audio 🔊
`31.` [🔊 TTS - Text to Speech](#31----tts---text-to-speech-100-local-any-voice-you-want-any-language)
`52.` [🔊📹 Audio Video Sync](#52)
## 💻 System 💻
`34.` [🧹 Free VRAM hack](#34----free-vram-hack)
## 🧍 Manual user Control 🧍
`35.` [⏸️ Paused. Resume or Stop ?](#35---%EF%B8%8F-paused-resume-or-stop-)
`36.` [⏸️🔍 Paused. Select input, Pick one](#36---%EF%B8%8F-paused-select-input-pick-one)
`35.` [⏸️ Paused. Resume or Stop, Pick 👇](#35---%EF%B8%8F-paused-resume-or-stop-)
`36.` [⏸️ Paused. Select input, Pick 👇](#36---%EF%B8%8F-paused-select-input-pick-one)
## 🧠 Logic / Conditional Operations 🧠
`45.` [🔀 If-Else (input / compare_with)](#45----if-else-input--compare_with)
@@ -217,6 +224,7 @@ cd /where/you/installed/ComfyUI && python main.py
- **v0.43**: Add control_after_generate to Ollama and allow keeping it in VRAM for 1 minute if needed. (For chaining quick generations.) Add fallback to 0.0.0.0
- **v0.44**: Allow Ollama to have a custom URL in the file `ollama_ip.txt` in the comfyui custom nodes folder. Minor changes, add details/updates to README.
- **v0.45**: Add a new node : Text scrambler (Character), change text randomly using the file `scrambler/scrambler_character.json` in the comfyui custom nodes folder.
- **v0.46**: ❗ A lot of changes to Video nodes. Save to video now uses FLOAT for fps instead of INT. (A lot of other custom nodes do that as well.) Add a node to preview video, a node to convert a video path to a list of images, a node to convert a list of images to a temporary video + video_path, and a node to synchronize the duration of audio with video (useful for MuseTalk). Change the TTS node with many new outputs ("audio_path", "full_path", "duration") to reuse with other nodes like MuseTalk; also rename the TTS input to "connect_to_workflow", to avoid mistakes sending text to it.
# 📝 Nodes descriptions
@@ -522,12 +530,19 @@ Also, when you select a voice with this format `fr/fake_Bjornulf.wav`, it will c
So... note that if you know you have an audio file ready to play, you can still use my node but you do NOT need my TTS server to be running.
My node will just play the audio file if it can find it, and won't try to connect to the backend TTS server.
Let's say you already use this node to create an audio file saying `workflow is done` with the Attenborough voice :
![TTS](screenshots/tts_end.png)
As long as you keep exactly the same settings, it will not use my server to play the audio file! You can safely turn in off, so it won't use your precious VRAM Duh. (TTS server should be using ~3GB of VRAM.)
As long as you keep exactly the same settings, it will not use my server to play the audio file! You can safely turn the TTS server off, so it won't use your precious VRAM. (The TTS server uses ~3GB of VRAM.)
Also `connect_to_workflow` is optional, it means that you can make a workflow with ONLY my TTS node to pre-generate the audio files with the sentences you want to use later, example :
![TTS](screenshots/tts_preload.png)
If you want to run my TTS nodes alongside image generation, I recommend using my PAUSE node so you can manually stop the TTS server after my TTS node. When the VRAM is freed, you can then click on the RESUME button to continue the workflow.
If you can afford to run both at the same time, good for you, but locally I can't run my TTS server and FLUX at the same time, so I use this trick :
![TTS](screenshots/tts_preload_2.png)
Also input is optional, it means that you can make a workflow with ONLY my TTS node to pre-generate the audio files with the sentences you want to maybe use later, example :
![TTS](screenshots/tts_generate.png)
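The reuse-existing-file behaviour above boils down to a path check; here is a minimal sketch (illustration only, not the node's exact code — the function names are mine, but the `Bjornulf_TTS/<language>/<voice>/<sanitized text>.wav` layout matches the node's save path):

```python
import os
import re

def audio_cache_path(text: str, language: str, speaker_wav: str) -> str:
    # Same layout the TTS node uses: Bjornulf_TTS/<language>/<voice>/<sanitized text>.wav
    sanitized = re.sub(r'[^\w\s-]', '', text).replace(' ', '_')[:50]
    return os.path.join("Bjornulf_TTS", language, speaker_wav, f"{sanitized}.wav")

def needs_tts_server(text: str, language: str, speaker_wav: str, overwrite: bool = False) -> bool:
    # The TTS server is only contacted when no cached file exists (or overwrite is requested).
    return overwrite or not os.path.exists(audio_cache_path(text, language, speaker_wav))
```

Changing the text, the language, or the voice changes the path — which is why "exactly the same settings" are required for the cached file to be reused.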
### 32 - 🧑📝 Character Description Generator
![characters](screenshots/characters.png)
@@ -756,3 +771,36 @@ Here another simple example taking a few selected images from a folder and combi
**Description:**
Take text as input and scramble (randomize) the text by using the file `scrambler/character_scrambler.json` in the comfyui custom nodes folder.
### 49 - 📹👁 Video Preview
![video preview](screenshots/video_preview.png)
**Description:**
Takes a `video_path` and displays the video inside ComfyUI. (The video is copied into `output/Bjornulf/preview_video/` so the web interface can play it.)
### 50 - 🖼➜📹 Images to Video path (tmp video)
![image to video path](screenshots/image_to_video_path.png)
**Description:**
Converts a list of images into a temporary video (FFmpeg) and returns its `video_path`, so it can be chained into other video nodes.
### 51 - 📹➜🖼 Video Path to Images
![video path to image](screenshots/video_path_to_image.png)
**Description:**
Converts a video (from a `video_path`) into a list of images, with optional `frame_interval` and `max_frames`. Also returns `initial_fps`, `new_fps` and `total_frames`.
### 52 - 🔊📹 Audio Video Sync
**Description:**
This node synchronizes the duration of an audio file with a video file by adding silence to the audio if it's too short, or by repeating the video if the audio is too long. (The video ideally needs to be a loop, check my ping pong video node.)
It is useful for example with MuseTalk <https://github.com/chaojie/ComfyUI-MuseTalk>: if you want to chain up videos (let's say sentence by sentence), it will always go back to the last frame. (Making the video transition smoother.)
Here is an example without the `Audio Video Sync` node (the duration of the video is shorter than the audio, so after playing it will not go back to the last frame; ideally I want a loop where the first frame is the same as the last frame -see my loop video ping pong node if needed-) :
![audio sync video](screenshots/audio_sync_video_without.png)
Here is an example with the `Audio Video Sync` node; notice that it is also convenient to recover the frames per second of the video and send that to other nodes :
![audio sync video](screenshots/audio_sync_video_with.png)
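The sync described above is just a bit of duration arithmetic; a minimal sketch of the computation (mirroring what the node does internally, with hypothetical function names):

```python
import math

def sync_plan(audio_duration: float, video_duration: float):
    """Return (video_repetitions, seconds_of_silence_to_append)."""
    if audio_duration <= video_duration:
        # Audio is shorter: play the video once and pad the audio with silence.
        return 1, video_duration - audio_duration
    # Audio is longer: loop the video enough times to cover it,
    # then pad the audio up to the total looped-video duration.
    repetitions = math.ceil(audio_duration / video_duration)
    return repetitions, repetitions * video_duration - audio_duration
```

For example, 7s of audio over a 3s loop means playing the video 3 times (9s) and appending 2s of silence, so both tracks end together on the loop's last frame.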

View File

@@ -1,9 +1,9 @@
from .images_to_video import imagesToVideo
from .write_text import WriteText
from .write_image_environment import WriteImageEnvironment
from .write_image_characters import WriteImageCharacters
from .write_image_character import WriteImageCharacter
from .write_image_allinone import WriteImageAllInOne
# from .write_image_environment import WriteImageEnvironment
# from .write_image_characters import WriteImageCharacters
# from .write_image_character import WriteImageCharacter
# from .write_image_allinone import WriteImageAllInOne
from .combine_texts import CombineTexts
from .loop_texts import LoopTexts
from .random_texts import RandomTexts
@@ -51,9 +51,17 @@ from .image_details import ImageDetails
from .combine_images import CombineImages
# from .pass_preview_image import PassPreviewImage
from .text_scramble_character import ScramblerCharacter
from .audio_video_sync import AudioVideoSync
from .video_path_to_images import VideoToImagesList
from .images_to_video_path import ImagesListToVideo
from .video_preview import VideoPreview
NODE_CLASS_MAPPINGS = {
"Bjornulf_ollamaLoader": ollamaLoader,
"Bjornulf_VideoPreview": VideoPreview,
"Bjornulf_ImagesListToVideo": ImagesListToVideo,
"Bjornulf_VideoToImagesList": VideoToImagesList,
"Bjornulf_AudioVideoSync": AudioVideoSync,
"Bjornulf_ScramblerCharacter": ScramblerCharacter,
"Bjornulf_CombineImages": CombineImages,
"Bjornulf_ImageDetails": ImageDetails,
@@ -106,6 +114,10 @@ NODE_CLASS_MAPPINGS = {
NODE_DISPLAY_NAME_MAPPINGS = {
"Bjornulf_WriteText": "✒ Write Text",
"Bjornulf_VideoPreview": "📹👁 Video Preview",
"Bjornulf_ImagesListToVideo": "🖼➜📹 Images to Video path (tmp video)",
"Bjornulf_VideoToImagesList": "📹➜🖼 Video Path to Images",
"Bjornulf_AudioVideoSync": "🔊📹 Audio Video Sync",
"Bjornulf_ScramblerCharacter": "🔀🎲 Text scrambler (🧑 Character)",
"Bjornulf_WriteTextAdvanced": "✒🗔 Advanced Write Text",
"Bjornulf_LoopWriteText": "♻ Loop (✒🗔 Advanced Write Text)",
@@ -129,7 +141,7 @@ NODE_DISPLAY_NAME_MAPPINGS = {
"Bjornulf_CharacterDescriptionGenerator": "🧑📝 Character Description Generator",
"Bjornulf_GreenScreenToTransparency": "🟩➜▢ Green Screen to Transparency",
"Bjornulf_SaveBjornulfLobeChat": "🖼💬 Save image for Bjornulf LobeChat",
"Bjornulf_TextToStringAndSeed": "🔢 Text with random Seed",
"Bjornulf_TextToStringAndSeed": "🔢🎲 Text with random Seed",
"Bjornulf_ShowText": "👁 Show (Text, Int, Float)",
"Bjornulf_ImageMaskCutter": "🖼✂ Cut Image with Mask",
"Bjornulf_LoadImageWithTransparency": "📥🖼 Load Image with Transparency ▢",

audio_video_sync.py Normal file
View File

@@ -0,0 +1,156 @@
import torch
import torchaudio
import os
import subprocess
from datetime import datetime
import math
class AudioVideoSync:
def __init__(self):
pass
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"audio": ("AUDIO",),
"video_path": ("STRING", {"default": ""}),
},
}
RETURN_TYPES = ("AUDIO", "STRING", "STRING", "FLOAT")
RETURN_NAMES = ("synced_audio", "audio_path", "synced_video_path", "video_fps")
FUNCTION = "sync_audio_video"
CATEGORY = "audio"
# def get_video_duration(self, video_path):
# cmd = ['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', video_path]
# result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
# duration = float(result.stdout)
# return math.ceil(duration * 10) / 10
def get_video_duration(self, video_path):
cmd = ['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', video_path]
result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
return float(result.stdout)
def get_video_fps(self, video_path):
cmd = ['ffprobe', '-v', 'error', '-select_streams', 'v:0', '-count_packets', '-show_entries', 'stream=r_frame_rate', '-of', 'csv=p=0', video_path]
result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
fps = result.stdout.strip()
if '/' in fps:
num, den = map(float, fps.split('/'))
return num / den
return float(fps)
def sync_audio_video(self, audio, video_path):
if not isinstance(audio, dict) or 'waveform' not in audio or 'sample_rate' not in audio:
raise ValueError("Expected audio input to be a dictionary with 'waveform' and 'sample_rate' keys")
audio_data = audio['waveform']
sample_rate = audio['sample_rate']
print(f"Audio data shape: {audio_data.shape}")
print(f"Sample rate: {sample_rate}")
# Calculate video duration
video_duration = self.get_video_duration(video_path)
# Calculate audio duration
audio_duration = audio_data.shape[-1] / sample_rate
print(f"Video duration: {video_duration}")
print(f"Audio duration: {audio_duration}")
# Calculate the desired audio duration and number of video repetitions
if audio_duration <= video_duration:
target_duration = video_duration
repetitions = 1
else:
repetitions = math.ceil(audio_duration / video_duration)
target_duration = video_duration * repetitions
# Calculate the number of samples to add
current_samples = audio_data.shape[-1]
target_samples = int(target_duration * sample_rate)
samples_to_add = target_samples - current_samples
print(f"Current samples: {current_samples}, Target samples: {target_samples}, Samples to add: {samples_to_add}")
if samples_to_add > 0:
# Create silence
if audio_data.dim() == 3:
silence_shape = (audio_data.shape[0], audio_data.shape[1], samples_to_add)
else: # audio_data.dim() == 2
silence_shape = (audio_data.shape[0], samples_to_add)
silence = torch.zeros(silence_shape, dtype=audio_data.dtype, device=audio_data.device)
# Append silence to the audio
synced_audio = torch.cat((audio_data, silence), dim=-1)
else:
synced_audio = audio_data
print(f"Synced audio shape: {synced_audio.shape}")
# Save the synced audio file and get the file path
audio_path = self.save_audio(synced_audio, sample_rate)
# Create and save the synced video
synced_video_path = self.create_synced_video(video_path, repetitions)
video_fps = self.get_video_fps(video_path)
# Return the synced audio data, audio file path, and synced video path
return ({"waveform": synced_audio, "sample_rate": sample_rate}, audio_path, synced_video_path, video_fps)
def save_audio(self, audio_tensor, sample_rate):
# Create the sync_audio folder if it doesn't exist
os.makedirs("Bjornulf/sync_audio", exist_ok=True)
# Generate a unique filename using the current timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"Bjornulf/sync_audio/synced_audio_{timestamp}.wav"
# Ensure audio_tensor is 2D
if audio_tensor.dim() == 3:
audio_tensor = audio_tensor.squeeze(0) # Remove batch dimension
elif audio_tensor.dim() == 1:
audio_tensor = audio_tensor.unsqueeze(0) # Add channel dimension
# Save the audio file
torchaudio.save(filename, audio_tensor, sample_rate)
print(f"Synced audio saved to: {filename}")
# Return the full path to the saved audio file
return os.path.abspath(filename)
def create_synced_video(self, video_path, repetitions):
# Create the sync_video folder if it doesn't exist
os.makedirs("Bjornulf/sync_video", exist_ok=True)
# Generate a unique filename using the current timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_path = f"Bjornulf/sync_video/synced_video_{timestamp}.mp4"
# Create a temporary file with the list of input video files
with open("Bjornulf/temp_video_list.txt", "w") as f:
for _ in range(repetitions):
f.write(f"file '{video_path}'\n")
# Use ffmpeg to concatenate the video multiple times
cmd = [
'ffmpeg',
'-f', 'concat',
'-safe', '0',
'-i', 'Bjornulf/temp_video_list.txt',
'-c', 'copy',
output_path
]
subprocess.run(cmd, check=True)
# Remove the temporary file
os.remove("Bjornulf/temp_video_list.txt")
print(f"Synced video saved to: {output_path}")
return os.path.abspath(output_path)

View File

@@ -12,7 +12,7 @@ class imagesToVideo:
return {
"required": {
"images": ("IMAGE",),
"fps": ("INT", {"default": 24, "min": 1, "max": 60}),
"fps": ("FLOAT", {"default": 24, "min": 1, "max": 120}),
"name_prefix": ("STRING", {"default": "output/imgs2video/me"}),
"format": (["mp4", "webm"], {"default": "mp4"}),
"mp4_encoder": (["libx264 (H.264)", "h264_nvenc (H.264 / NVIDIA GPU)", "libx265 (H.265)", "hevc_nvenc (H.265 / NVIDIA GPU)"], {"default": "h264_nvenc (H.264 / NVIDIA GPU)"}),
@@ -47,7 +47,7 @@ class imagesToVideo:
# Create the new filename with the incremented number
output_file = f"{name_prefix}_{next_num:04d}.{format}"
temp_dir = "temp_images_imgs2video"
temp_dir = "Bjornulf/temp_images_imgs2video"
# Clean up temp dir
if os.path.exists(temp_dir) and os.path.isdir(temp_dir):
for file in os.listdir(temp_dir):

images_to_video_path.py Normal file
View File

@@ -0,0 +1,91 @@
import os
import uuid
import subprocess
import tempfile
import torch
import numpy as np
from PIL import Image
class ImagesListToVideo:
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"images": ("IMAGE",),
"frames_per_second": ("FLOAT", {"default": 30, "min": 1, "max": 120, "step": 1}),
}
}
RETURN_TYPES = ("STRING",)
RETURN_NAMES = ("video_path",)
FUNCTION = "images_to_video"
CATEGORY = "Bjornulf"
def images_to_video(self, images, frames_per_second=30):
# Create the output directory if it doesn't exist
output_dir = os.path.join("Bjornulf", "images_to_video")
os.makedirs(output_dir, exist_ok=True)
# Generate a unique filename for the video
video_filename = f"video_{uuid.uuid4().hex}.mp4"
video_path = os.path.join(output_dir, video_filename)
# Create a temporary directory to store image files
with tempfile.TemporaryDirectory() as temp_dir:
# Save each image as a PNG file in the temporary directory
for i, img in enumerate(images):
# Convert the image to the correct format
img_np = self.convert_to_numpy(img)
# Ensure the image is in RGB format
if img_np.shape[-1] != 3:
img_np = self.convert_to_rgb(img_np)
# Convert to PIL Image
img_pil = Image.fromarray(img_np)
img_path = os.path.join(temp_dir, f"frame_{i:05d}.png")
img_pil.save(img_path)
# Use FFmpeg to create a video from the image sequence
ffmpeg_cmd = [
"ffmpeg",
"-framerate", str(frames_per_second),
"-i", os.path.join(temp_dir, "frame_%05d.png"),
"-c:v", "libx264",
"-pix_fmt", "yuv420p",
"-crf", "23",
"-y", # Overwrite output file if it exists
video_path
]
try:
subprocess.run(ffmpeg_cmd, check=True, capture_output=True, text=True)
except subprocess.CalledProcessError as e:
print(f"FFmpeg error: {e.stderr}")
return ("",) # Return empty string if video creation fails
return (video_path,)
def convert_to_numpy(self, img):
if isinstance(img, torch.Tensor):
img = img.cpu().numpy()
if img.dtype == np.uint8:
return img
elif img.dtype == np.float32 or img.dtype == np.float64:
return (img * 255).astype(np.uint8)
else:
raise ValueError(f"Unsupported data type: {img.dtype}")
def convert_to_rgb(self, img):
if img.shape[-1] == 1: # Grayscale
return np.repeat(img, 3, axis=-1)
elif img.shape[-1] == 768: # Latent space representation
# This is a placeholder. You might need a more sophisticated method to convert latent space to RGB
img = img.reshape((-1, 3)) # Reshape to (H*W, 3)
img = (img - img.min()) / (img.max() - img.min()) # Normalize to [0, 1]
img = (img * 255).astype(np.uint8)
return img.reshape((img.shape[0], -1, 3)) # Reshape back to (H, W, 3)
elif len(img.shape) == 2: # 2D array
return np.stack([img, img, img], axis=-1)
else:
raise ValueError(f"Unsupported image shape: {img.shape}")

View File

@@ -1,7 +1,7 @@
[project]
name = "bjornulf_custom_nodes"
description = "Nodes: Ollama, Text to Speech, Combine Texts, Random Texts, Save image for Bjornulf LobeChat, Text with random Seed, Random line from input, Combine images, Image to grayscale (black & white), Remove image Transparency (alpha), Resize Image, ..."
version = "0.45"
version = "0.46"
license = {file = "LICENSE"}
[project.urls]

screenshots/tts_preload.png Normal file


View File

@@ -9,53 +9,34 @@ import os
import sys
import random
import re
from typing import Dict, Any, List, Tuple
class Everything(str):
def __ne__(self, __value: object) -> bool:
return False
language_map = {
"ar": "Arabic",
"cs": "Czech",
"de": "German",
"en": "English",
"es": "Spanish",
"fr": "French",
"hi": "Hindi",
"hu": "Hungarian",
"it": "Italian",
"ja": "Japanese",
"ko": "Korean",
"nl": "Dutch",
"pl": "Polish",
"pt": "Portuguese",
"ru": "Russian",
"tr": "Turkish",
"ar": "Arabic", "cs": "Czech", "de": "German", "en": "English",
"es": "Spanish", "fr": "French", "hi": "Hindi", "hu": "Hungarian",
"it": "Italian", "ja": "Japanese", "ko": "Korean", "nl": "Dutch",
"pl": "Polish", "pt": "Portuguese", "ru": "Russian", "tr": "Turkish",
"zh-cn": "Chinese"
}
class TextToSpeech:
@classmethod
def INPUT_TYPES(cls):
def INPUT_TYPES(cls) -> Dict[str, Any]:
speakers_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "speakers")
speaker_options = []
speaker_options = [os.path.relpath(os.path.join(root, file), speakers_dir)
for root, _, files in os.walk(speakers_dir)
for file in files if file.endswith(".wav")]
for root, dirs, files in os.walk(speakers_dir):
for file in files:
if file.endswith(".wav"):
rel_path = os.path.relpath(os.path.join(root, file), speakers_dir)
speaker_options.append(rel_path)
if not speaker_options:
speaker_options.append("No WAV files found")
language_options = list(language_map.values())
speaker_options = speaker_options or ["No WAV files found"]
return {
"required": {
"text": ("STRING", {"multiline": True}),
"language": (language_options, {
"language": (list(language_map.values()), {
"default": language_map["en"],
"display": "dropdown"
}),
@@ -69,44 +50,45 @@ class TextToSpeech:
"seed": ("INT", {"default": 0}),
},
"optional": {
"input": (Everything("*"), {"forceInput": True}),
"connect_to_workflow": (Everything("*"), {"forceInput": True}),
}
}
RETURN_TYPES = ("AUDIO",)
RETURN_TYPES = ("AUDIO", "STRING", "STRING", "FLOAT")
RETURN_NAMES = ("AUDIO", "audio_path", "full_path", "duration")
FUNCTION = "generate_audio"
CATEGORY = "Bjornulf"
@staticmethod
def get_language_code(language_name):
for code, name in language_map.items():
if name == language_name:
return code
return "en"
def get_language_code(language_name: str) -> str:
return next((code for code, name in language_map.items() if name == language_name), "en")
@staticmethod
def sanitize_text(text):
sanitized = re.sub(r'[^\w\s-]', '', text).replace(' ', '_')
return sanitized[:50]
def sanitize_text(text: str) -> str:
return re.sub(r'[^\w\s-]', '', text).replace(' ', '_')[:50]
def generate_audio(self, text, language, autoplay, seed, save_audio, overwrite, speaker_wav, input=None):
def generate_audio(self, text: str, language: str, autoplay: bool, seed: int,
save_audio: bool, overwrite: bool, speaker_wav: str,
connect_to_workflow: Any = None) -> Tuple[Dict[str, Any], str, str, float]:
language_code = self.get_language_code(language)
sanitized_text = self.sanitize_text(text)
save_path = os.path.join("Bjornulf_TTS", language, speaker_wav, f"{sanitized_text}.wav")
os.makedirs(os.path.dirname(save_path), exist_ok=True)
full_path = os.path.abspath(save_path)
os.makedirs(os.path.dirname(full_path), exist_ok=True)
if os.path.exists(save_path) and not overwrite:
print(f"Using existing audio file: {save_path}")
audio_data = self.load_audio_file(save_path)
if os.path.exists(full_path) and not overwrite:
print(f"Using existing audio file: {full_path}")
audio_data = self.load_audio_file(full_path)
else:
audio_data = self.create_new_audio(text, language_code, speaker_wav, seed)
if save_audio:
self.save_audio_file(audio_data, save_path)
self.save_audio_file(audio_data, full_path)
return self.process_audio_data(autoplay, audio_data)
audio_output, _, duration = self.process_audio_data(autoplay, audio_data, full_path if save_audio else None)
return (audio_output, save_path, full_path, duration)
def create_new_audio(self, text, language_code, speaker_wav, seed):
def create_new_audio(self, text: str, language_code: str, speaker_wav: str, seed: int) -> io.BytesIO:
random.seed(seed)
if speaker_wav == "No WAV files found":
print("Error: No WAV files available for text-to-speech.")
@@ -133,17 +115,17 @@ class TextToSpeech:
print(f"Unexpected error: {e}")
return io.BytesIO()
def play_audio(self, audio):
def play_audio(self, audio: AudioSegment) -> None:
if sys.platform.startswith('win'):
try:
import winsound
winsound.PlaySound(audio, winsound.SND_MEMORY)
winsound.PlaySound(audio.raw_data, winsound.SND_MEMORY)
except Exception as e:
print(f"An error occurred: {e}")
else:
play(audio)
def process_audio_data(self, autoplay, audio_data):
def process_audio_data(self, autoplay: bool, audio_data: io.BytesIO, save_path: str) -> Tuple[Dict[str, Any], str, float]:
try:
audio = AudioSegment.from_mp3(audio_data)
sample_rate = audio.frame_rate
@@ -151,23 +133,22 @@ class TextToSpeech:
audio_np = np.array(audio.get_array_of_samples()).astype(np.float32)
audio_np /= np.iinfo(np.int16).max
if num_channels == 1:
audio_np = audio_np.reshape(1, -1)
else:
audio_np = audio_np.reshape(-1, num_channels).T
audio_np = audio_np.reshape(-1, num_channels).T if num_channels > 1 else audio_np.reshape(1, -1)
audio_tensor = torch.from_numpy(audio_np)
if autoplay:
self.play_audio(audio)
return ({"waveform": audio_tensor.unsqueeze(0), "sample_rate": sample_rate},)
duration = len(audio) / 1000.0 # Convert milliseconds to seconds
return ({"waveform": audio_tensor.unsqueeze(0), "sample_rate": sample_rate}, save_path or "", duration)
except Exception as e:
print(f"Error processing audio data: {e}")
return ({"waveform": torch.zeros(1, 1, 1, dtype=torch.float32), "sample_rate": 22050},)
return ({"waveform": torch.zeros(1, 1, 1, dtype=torch.float32), "sample_rate": 22050}, "", 0.0)
def save_audio_file(self, audio_data, save_path):
def save_audio_file(self, audio_data: io.BytesIO, save_path: str) -> None:
try:
with open(save_path, 'wb') as f:
f.write(audio_data.getvalue())
@@ -175,7 +156,7 @@ class TextToSpeech:
except Exception as e:
print(f"Error saving audio file: {e}")
def load_audio_file(self, file_path):
def load_audio_file(self, file_path: str) -> io.BytesIO:
try:
with open(file_path, 'rb') as f:
audio_data = io.BytesIO(f.read())

video_path_to_images.py Normal file
View File

@@ -0,0 +1,62 @@
import os
import cv2
import numpy as np
import torch
from PIL import Image
class VideoToImagesList:
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"video_path": ("STRING", {"forceInput": True}),
"frame_interval": ("INT", {"default": 1, "min": 1, "max": 100}),
"max_frames": ("INT", {"default": 0, "min": 0, "max": 10000})
}
}
RETURN_TYPES = ("IMAGE", "FLOAT", "FLOAT", "INT")
RETURN_NAMES = ("IMAGE", "initial_fps", "new_fps", "total_frames")
FUNCTION = "video_to_images"
CATEGORY = "Bjornulf"
def video_to_images(self, video_path, frame_interval=1, max_frames=0):
if not os.path.exists(video_path):
raise FileNotFoundError(f"Video file not found: {video_path}")
cap = cv2.VideoCapture(video_path)
frame_count = 0
images = []
# Get the initial fps of the video
initial_fps = cap.get(cv2.CAP_PROP_FPS)
# Get the total number of frames in the video
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
while True:
ret, frame = cap.read()
if not ret or (max_frames > 0 and len(images) >= max_frames):
break
if frame_count % frame_interval == 0:
# Convert BGR to RGB
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
pil_image = Image.fromarray(rgb_frame)
# Convert PIL Image to tensor
tensor_image = torch.from_numpy(np.array(pil_image).astype(np.float32) / 255.0).unsqueeze(0)
images.append(tensor_image)
frame_count += 1
cap.release()
if not images:
raise ValueError("No frames were extracted from the video")
# Calculate the new fps
new_fps = initial_fps / frame_interval
# Stack all images into a single tensor
return (torch.cat(images, dim=0), initial_fps, new_fps, total_frames)

video_preview.py Normal file
View File

@@ -0,0 +1,49 @@
import os
import shutil
# import logging
class VideoPreview:
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"video_path": ("STRING", {"forceInput": True}),
},
}
RETURN_TYPES = ()
FUNCTION = "preview_video"
CATEGORY = "Bjornulf"
OUTPUT_NODE = True
def preview_video(self, video_path):
if not video_path:
return {"ui": {"error": "No video path provided."}}
# Keep the "output" folder structure for copying
dest_dir = os.path.join("output", "Bjornulf", "preview_video")
os.makedirs(dest_dir, exist_ok=True)
video_name = os.path.basename(video_path)
dest_path = os.path.join(dest_dir, video_name)
if os.path.abspath(video_path) != os.path.abspath(dest_path):
shutil.copy2(video_path, dest_path)
print(f"Video copied successfully to {dest_path}")
else:
print(f"Video is already in the destination folder: {dest_path}")
# Determine the video type based on file extension
_, file_extension = os.path.splitext(dest_path)
video_type = file_extension.lower()[1:] # Remove the dot from extension
# logging.info(f"Video type: {video_type}")
# logging.info(f"Video path: {dest_path}")
# logging.info(f"Destination directory: {dest_dir}")
# logging.info(f"Video name: {video_name}")
# Create a new variable for the return value without "output"
return_dest_dir = os.path.join("Bjornulf", "preview_video")
# Return the video name and the modified destination directory
return {"ui": {"video": [video_name, return_dest_dir]}}

web/js/video_preview.js Normal file
View File

@@ -0,0 +1,83 @@
import { api } from '../../../scripts/api.js';
import { app } from "../../../scripts/app.js";
function displayVideoPreview(component, filename, category) {
let videoWidget = component._videoWidget;
if (!videoWidget) {
// Create the widget if it doesn't exist
var container = document.createElement("div");
const currentNode = component;
videoWidget = component.addDOMWidget("videopreview", "preview", container, {
serialize: false,
hideOnZoom: false,
getValue() {
return container.value;
},
setValue(v) {
container.value = v;
},
});
videoWidget.computeSize = function(width) {
if (this.aspectRatio && !this.parentElement.hidden) {
let height = (currentNode.size[0] - 20) / this.aspectRatio + 10;
if (!(height > 0)) {
height = 0;
}
return [width, height];
}
return [width, -4];
};
videoWidget.value = { hidden: false, paused: false, params: {} };
videoWidget.parentElement = document.createElement("div");
videoWidget.parentElement.className = "video_preview";
videoWidget.parentElement.style['width'] = "100%";
container.appendChild(videoWidget.parentElement);
videoWidget.videoElement = document.createElement("video");
videoWidget.videoElement.controls = true;
videoWidget.videoElement.loop = false;
videoWidget.videoElement.muted = false;
videoWidget.videoElement.style['width'] = "100%";
videoWidget.videoElement.addEventListener("loadedmetadata", () => {
videoWidget.aspectRatio = videoWidget.videoElement.videoWidth / videoWidget.videoElement.videoHeight;
adjustSize(component);
});
videoWidget.videoElement.addEventListener("error", () => {
videoWidget.parentElement.hidden = true;
adjustSize(component);
});
videoWidget.parentElement.hidden = videoWidget.value.hidden;
videoWidget.videoElement.autoplay = !videoWidget.value.paused && !videoWidget.value.hidden;
videoWidget.videoElement.hidden = false;
videoWidget.parentElement.appendChild(videoWidget.videoElement);
component._videoWidget = videoWidget; // Store the widget for future reference
}
// Update the video source
let params = {
"filename": filename,
"subfolder": category,
"type": "output",
"rand": Math.random().toString().slice(2, 12)
};
const urlParams = new URLSearchParams(params);
videoWidget.videoElement.src = `http://localhost:8188/api/view?${urlParams.toString()}`;
adjustSize(component); // Adjust the component size
}
function adjustSize(component) {
component.setSize([component.size[0], component.computeSize([component.size[0], component.size[1]])[1]]);
component?.graph?.setDirtyCanvas(true);
}
app.registerExtension({
name: "Bjornulf.VideoPreview",
async beforeRegisterNodeDef(nodeType, nodeData, appInstance) {
if (nodeData?.name == "Bjornulf_VideoPreview") {
nodeType.prototype.onExecuted = function (data) {
displayVideoPreview(this, data.video[0], data.video[1]);
};
}
}
});