fix(recipe): use resources type field to identify checkpoint instead of modelVersionIds[0]

When importing a CivitAI image as a recipe, modelVersionIds[0] was blindly used as the checkpoint version ID. This array mixes checkpoints and LoRAs without ordering guarantees, causing LoRAs to be saved as the recipe checkpoint.

Fix by:
1. Removing the modelVersionIds[0] fallback in _download_remote_media
2. Parsing resources entries with type:"model" as the checkpoint
3. Adding model type validation in populate_checkpoint_from_civitai

Also add 2 tests for the new behavior and fix 3 tests whose mocks lacked the required model.type field.
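
The selection rule described above can be sketched in isolation. This is a minimal illustration, not the parser itself: `split_resources` is a hypothetical helper, and the sample entries only imitate the shape of the `resources` array used in the tests. It assumes, as the parser does, that an entry without a `type` is treated as a LoRA.

```python
from typing import Any, Dict, List, Optional, Tuple

Resource = Dict[str, Any]


def split_resources(resources: List[Resource]) -> Tuple[Optional[Resource], List[Resource]]:
    """Separate the checkpoint from the LoRAs using each entry's explicit type."""
    checkpoint: Optional[Resource] = None
    loras: List[Resource] = []
    for resource in resources:
        # Entries without a type are treated as LoRAs, matching the parser's default.
        resource_type = resource.get("type", "lora")
        if resource_type == "model":
            # Keep only the first checkpoint; later ones are ignored.
            if checkpoint is None:
                checkpoint = resource
        elif resource_type == "lora":
            loras.append(resource)
    return checkpoint, loras


# The LoRA is listed first on purpose: position in the list must not matter.
resources = [
    {"hash": "lora1hash", "name": "Style LoRA", "type": "lora", "weight": 0.5},
    {"hash": "cp123hash", "name": "My Checkpoint", "type": "model"},
]
checkpoint, loras = split_resources(resources)
```

Because the choice depends on the `type` field rather than on list position, reordering the entries cannot turn a LoRA into the checkpoint, which is the failure mode of indexing `modelVersionIds[0]`.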
fix(example-images): exclude failed_models from check_pending_models pending count
13 changed files with 290 additions and 29 deletions
--- a/README.md
+++ b/README.md
--- a/py/recipes/base.py
+++ b/py/recipes/base.py
@@ -7,7 +7,7 @@ import re
 from typing import Dict, List, Any, Optional, Tuple
 from abc import ABC, abstractmethod
 from ..config import config
-from ..utils.constants import VALID_LORA_TYPES
+from ..utils.constants import VALID_LORA_TYPES, VALID_CHECKPOINT_SUB_TYPES
 from ..utils.civitai_utils import rewrite_preview_url

 logger = logging.getLogger(__name__)
@@ -173,6 +173,20 @@ class RecipeMetadataParser(ABC):
                checkpoint['isDeleted'] = True
                return checkpoint

+            # Validate that the model type is actually a checkpoint.
+            # Unlike populate_lora_from_civitai which has this check,
+            # this function was missing type validation — allowing LoRA
+            # version data to be saved as the recipe's checkpoint when the
+            # wrong version ID was passed downstream (fixed in v2.7+).
+            model_type = civitai_data.get('model', {}).get('type', '').lower()
+            if model_type not in VALID_CHECKPOINT_SUB_TYPES:
+                logger.warning(
+                    f"Cannot populate checkpoint: model version {civitai_data.get('id')} "
+                    f"has type '{model_type}', expected one of {VALID_CHECKPOINT_SUB_TYPES}. "
+                    f"Skipping checkpoint enrichment."
+                )
+                return checkpoint
+
            if 'model' in civitai_data and 'name' in civitai_data['model']:
                checkpoint['name'] = civitai_data['model']['name']

--- a/py/recipes/parsers/civitai_image.py
+++ b/py/recipes/parsers/civitai_image.py
@@ -185,8 +185,67 @@ class CivitaiApiMetadataParser(RecipeMetadataParser):
            # Process standard resources array
            if "resources" in metadata and isinstance(metadata["resources"], list):
                for resource in metadata["resources"]:
+                    resource_type = resource.get("type", "lora")
+
+                    # Track resources with type "model" — these are checkpoint models.
+                    # The resources array is the most reliable source for checkpoint
+                    # identification because it has an explicit type field and hash,
+                    # unlike modelVersionIds which is a flat list with no type info.
+                    if resource_type == "model":
+                        checkpoint_entry = {
+                            "id": 0,
+                            "modelId": 0,
+                            "name": resource.get("name", "Unknown Model"),
+                            "version": "",
+                            "type": resource.get("type", "model"),
+                            "existsLocally": False,
+                            "localPath": None,
+                            "file_name": resource.get("name", ""),
+                            "hash": resource.get("hash", "") or "",
+                            "thumbnailUrl": "/loras_static/images/no-preview.png",
+                            "baseModel": "",
+                            "size": 0,
+                            "downloadUrl": "",
+                            "isDeleted": False,
+                        }
+
+                        # Try to look up base model from the checkpoint hash
+                        if checkpoint_entry["hash"] and metadata_provider:
+                            try:
+                                civitai_info = (
+                                    await metadata_provider.get_model_by_hash(
+                                        checkpoint_entry["hash"]
+                                    )
+                                )
+                                civitai_data, error_msg = (
+                                    (civitai_info, None)
+                                    if not isinstance(civitai_info, tuple)
+                                    else civitai_info
+                                )
+                                if civitai_data and error_msg != "Model not found":
+                                    if 'model' in civitai_data and 'name' in civitai_data['model']:
+                                        checkpoint_entry['name'] = civitai_data['model']['name']
+                                    checkpoint_entry['id'] = civitai_data.get('id', 0)
+                                    checkpoint_entry['modelId'] = civitai_data.get('modelId', 0)
+                                    if 'name' in civitai_data:
+                                        checkpoint_entry['version'] = civitai_data['name']
+                                    base_model = civitai_data.get('baseModel', '')
+                                    if base_model:
+                                        checkpoint_entry['baseModel'] = base_model
+                                        if not result['base_model']:
+                                            result['base_model'] = base_model
+                            except Exception as e:
+                                logger.error(
+                                    f"Error fetching checkpoint info for hash "
+                                    f"{checkpoint_entry['hash']}: {e}"
+                                )
+
+                        if result["model"] is None:
+                            result["model"] = checkpoint_entry
+                        continue
+
                    # Modified to process resources without a type field as potential LoRAs
-                    if resource.get("type", "lora") == "lora":
+                    if resource_type == "lora":
                        lora_hash = resource.get("hash", "")

                        # Try to get hash from the hashes field if not present in resource
--- a/py/routes/handlers/recipe_handlers.py
+++ b/py/routes/handlers/recipe_handlers.py
@@ -1293,11 +1293,18 @@ class RecipeManagementHandler:
                    image_info.get("meta") if civitai_image_id and image_info else None
                )
                if civitai_image_id and image_info:
+                    # modelVersionId (singular) — the primary version for this
+                    # image on CivitAI.  May be absent, or may *not* be the
+                    # checkpoint (e.g. when the image was generated with a LoRA
+                    # as the primary subject).  When absent, DO NOT fall back to
+                    # modelVersionIds[0] — that array mixes checkpoints, LoRAs,
+                    # and other model version IDs without ordering guarantees.
+                    # The downstream enrichment flow will find the real
+                    # checkpoint via meta.resources (type:"model" hash) or
+                    # meta.civitaiResources (type:"checkpoint" version ID), so
+                    # leaving model_ver_id as None is safe and avoids the bug
+                    # where a LoRA version ID was treated as the checkpoint.
                    model_ver_id = image_info.get("modelVersionId")
-                    if not model_ver_id:
-                        ids = image_info.get("modelVersionIds")
-                        if isinstance(ids, list) and ids:
-                            model_ver_id = ids[0]

                    # Inject root-level modelVersionIds into meta so downstream
                    # parsers (CivitaiApiMetadataParser) can discover ALL resources
--- a/py/services/civitai_client.py
+++ b/py/services/civitai_client.py
@@ -410,6 +410,25 @@ class CivitaiClient:
            return None

        target_version = self._select_target_version(model_data, model_id, version_id)
+
+        # If modelVersions is empty (e.g. CivitAI cache lag for newly published
+        # models) but a specific version_id is known, fall back to fetching the
+        # version directly via the individual model-versions endpoint, then
+        # enrich it with the model-level data we already have.
+        if target_version is None and version_id is not None:
+            logger.info(
+                "modelVersions empty for model %s; falling back to direct "
+                "version lookup for %s",
+                model_id,
+                version_id,
+            )
+            version = await self._fetch_version_by_id(version_id)
+            if version:
+                self._enrich_version_with_model_data(version, model_data)
+                self._remove_comfy_metadata(version)
+                return version
+            return None
+
        if target_version is None:
            return None

--- a/py/services/model_hash_index.py
+++ b/py/services/model_hash_index.py
@@ -7,6 +7,7 @@ class ModelHashIndex:
    def __init__(self):
        self._hash_to_path: Dict[str, str] = {}
        self._filename_to_hash: Dict[str, str] = {}
+        self._autov2_to_path: Dict[str, str] = {}
        # New data structures for tracking duplicates
        self._duplicate_hashes: Dict[str, List[str]] = {}  # sha256 -> list of paths
        self._duplicate_filenames: Dict[str, List[str]] = {}  # filename -> list of paths
@@ -63,6 +64,9 @@ class ModelHashIndex:
        # Add new mappings
        self._hash_to_path[sha256] = file_path
        self._filename_to_hash[filename] = sha256
+        # AutoV2 = first 10 chars of SHA256
+        if len(sha256) >= 10:
+            self._autov2_to_path[sha256[:10]] = file_path
    
    def _get_filename_from_path(self, file_path: str) -> str:
        """Extract filename without extension from path"""
@@ -157,7 +161,12 @@ class ModelHashIndex:
                del self._duplicate_filenames[filename]
                if filename in self._filename_to_hash:
                    del self._filename_to_hash[filename]
-    
+
+        # Remove from AutoV2 index
+        autov2_keys_to_remove = [k for k, v in self._autov2_to_path.items() if v == file_path]
+        for k in autov2_keys_to_remove:
+            del self._autov2_to_path[k]
+
    def remove_by_hash(self, sha256: str) -> None:
        """Remove entry by hash"""
        sha256 = sha256.lower()
@@ -177,6 +186,10 @@ class ModelHashIndex:
        # Remove hash-to-path mapping
        del self._hash_to_path[sha256]
        
+        autov2_key = sha256[:10]
+        if autov2_key in self._autov2_to_path:
+            del self._autov2_to_path[autov2_key]
+        
        # Update filename-to-hash and duplicate filenames for all paths
        for path_to_remove in paths_to_remove:
            fname = self._get_filename_from_path(path_to_remove)
@@ -195,13 +208,24 @@ class ModelHashIndex:
                    # If only one entry remains, it's no longer a duplicate
                    del self._duplicate_filenames[fname]
    
-    def has_hash(self, sha256: str) -> bool:
-        """Check if hash exists in index"""
-        return sha256.lower() in self._hash_to_path
-    
-    def get_path(self, sha256: str) -> Optional[str]:
-        """Get file path for a hash"""
-        return self._hash_to_path.get(sha256.lower())
+    def has_hash(self, hash_value: str) -> bool:
+        """Check if hash exists in index (SHA256 or AutoV2)"""
+        normalized = hash_value.lower()
+        if normalized in self._hash_to_path:
+            return True
+        if len(normalized) == 10:
+            return normalized in self._autov2_to_path
+        return False
+
+    def get_path(self, hash_value: str) -> Optional[str]:
+        """Get file path for a hash (SHA256 or AutoV2)"""
+        normalized = hash_value.lower()
+        path = self._hash_to_path.get(normalized)
+        if path is not None:
+            return path
+        if len(normalized) == 10:
+            return self._autov2_to_path.get(normalized)
+        return None
    
    def get_hash(self, file_path: str) -> Optional[str]:
        """Get hash for a file path"""
@@ -218,6 +242,7 @@ class ModelHashIndex:
        """Clear all entries"""
        self._hash_to_path.clear()
        self._filename_to_hash.clear()
+        self._autov2_to_path.clear()
        self._duplicate_hashes.clear()
        self._duplicate_filenames.clear()
    
--- a/py/services/model_metadata_provider.py
+++ b/py/services/model_metadata_provider.py
@@ -5,7 +5,7 @@ import logging
 import random
 from typing import Optional, Dict, Tuple, Any, List, Sequence
 from .downloader import get_downloader
-from .errors import RateLimitError
+from .errors import RateLimitError, ResourceNotFoundError

 try:
    from bs4 import BeautifulSoup
@@ -482,6 +482,7 @@ class FallbackMetadataProvider(ModelMetadataProvider):
        return None, "Model not found"

    async def get_model_versions(self, model_id: str) -> Optional[Dict]:
+        not_found_confirmed = False
        for provider, label in self._iter_providers():
            try:
                result = await self._call_with_rate_limit(
@@ -492,8 +493,24 @@ class FallbackMetadataProvider(ModelMetadataProvider):
                if result:
                    return result
            except RateLimitError as exc:
+                if not_found_confirmed:
+                    logger.debug(
+                        "Suppressing rate limit from %s for model %s: "
+                        "already confirmed as not found by another provider",
+                        label,
+                        model_id,
+                    )
+                    return None
                exc.provider = exc.provider or label
                raise exc
+            except ResourceNotFoundError:
+                not_found_confirmed = True
+                logger.debug(
+                    "Provider %s reports model %s as not found",
+                    label,
+                    model_id,
+                )
+                continue
            except Exception as e:
                logger.debug("Provider %s failed for get_model_versions: %s", label, e)
                continue
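
The control flow added in this hunk can be exercised on its own. The sketch below is a simplified stand-in under stated assumptions: the two exception classes are local placeholders for the project's `RateLimitError` and `ResourceNotFoundError`, providers are plain async callables rather than provider objects, and the real method's rate-limit wrapper and logging are omitted.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Tuple


class RateLimitError(Exception):
    """Local placeholder for the project's RateLimitError."""


class ResourceNotFoundError(Exception):
    """Local placeholder for the project's ResourceNotFoundError."""


Fetch = Callable[[str], Awaitable[Optional[Dict]]]


async def get_model_versions(
    providers: List[Tuple[str, Fetch]], model_id: str
) -> Optional[Dict]:
    not_found_confirmed = False
    for _label, fetch in providers:
        try:
            result = await fetch(model_id)
            if result:
                return result
        except RateLimitError:
            # An earlier provider already confirmed "not found", so the
            # rate limit from a fallback is swallowed instead of raised.
            if not_found_confirmed:
                return None
            raise
        except ResourceNotFoundError:
            not_found_confirmed = True
        except Exception:
            # Any other provider failure: try the next provider.
            continue
    return None


async def primary_not_found(model_id: str) -> Optional[Dict]:
    raise ResourceNotFoundError(model_id)


async def fallback_rate_limited(model_id: str) -> Optional[Dict]:
    raise RateLimitError(model_id)


suppressed = asyncio.run(
    get_model_versions(
        [("primary", primary_not_found), ("fallback", fallback_rate_limited)], "123"
    )
)
```

Here `suppressed` is `None`. If the rate-limited provider were consulted without a preceding not-found result, the `RateLimitError` would still propagate to the caller.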
--- a/py/utils/example_images_download_manager.py
+++ b/py/utils/example_images_download_manager.py
@@ -397,13 +397,12 @@ class DownloadManager:

            models_with_hash = len(all_models_with_hash)

-            # Calculate pending count: check which models actually need processing
-            # A model is pending if it has a hash, is not in processed_models,
-            # and its folder doesn't exist or is empty
+            # Calculate pending count: check which models actually need processing.
+            # A model is pending if it has a hash, is not already processed or known-failed,
+            # and its folder doesn't exist or is empty.
            pending_hashes = set()
            for model_hash, model_name in all_models_with_hash:
-                if model_hash not in processed_models:
-                    # Check if model folder exists with files
+                if model_hash not in processed_models and model_hash not in failed_models:
                    model_dir = ExampleImagePathResolver.get_model_folder(
                        model_hash, active_library
                    )
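
The pending rule stated in the comment above can be sketched as a small standalone function. This is an illustration under assumptions: `pending_hashes` and `folder_has_files` are hypothetical names, and the `root / model_hash` layout stands in for whatever `ExampleImagePathResolver.get_model_folder` actually returns.

```python
from pathlib import Path
from typing import Iterable, Set, Tuple


def folder_has_files(folder: Path) -> bool:
    """True when the folder exists and contains at least one entry."""
    return folder.is_dir() and any(folder.iterdir())


def pending_hashes(
    models: Iterable[Tuple[str, str]],
    processed: Set[str],
    failed: Set[str],
    root: Path,
) -> Set[str]:
    """Hashes that still need work: not processed, not known-failed,
    and with a missing or empty example-image folder."""
    pending: Set[str] = set()
    for model_hash, _model_name in models:
        if model_hash in processed or model_hash in failed:
            continue
        # Assumed layout: one folder per hash directly under root.
        if not folder_has_files(root / model_hash):
            pending.add(model_hash)
    return pending
```

Excluding `failed` is what keeps a permanently failing model from being counted as pending on every check.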
--- a/settings.json.example
+++ b/settings.json.example
@@ -10,13 +10,14 @@
      "C:/path/to/your/checkpoints_folder",
      "C:/path/to/another/checkpoints_folder"
    ],
+    "unet": [
+      "C:/path/to/your/diffusion_models_folder",
+      "C:/path/to/another/diffusion_models_folder"
+    ],
    "embeddings": [
      "C:/path/to/your/embeddings_folder",
      "C:/path/to/another/embeddings_folder"
    ]
  },
-  "example_images_open_mode": "system",
-  "example_images_local_root": "",
-  "example_images_open_uri_template": "",
  "auto_organize_exclusions": []
 }
--- a/static/js/managers/UpdateService.js
+++ b/static/js/managers/UpdateService.js
@@ -731,9 +731,16 @@ export class UpdateService {
    }
    
    // Simple markdown parser for changelog items
+    // Simple markdown parser for changelog items
+    // Escape HTML entities first so angle brackets in content (e.g. `<lora:x>`)
+    // aren't swallowed by innerHTML's HTML parser as invalid tags
    parseMarkdown(text) {
        if (!text) return '';
        
+        text = text.replace(/&/g, '&amp;');
+        text = text.replace(/</g, '&lt;');
+        text = text.replace(/>/g, '&gt;');
+        
        // Handle bold text (**text**)
        text = text.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>');
        
--- a/tests/routes/test_recipe_routes.py
+++ b/tests/routes/test_recipe_routes.py
@@ -467,7 +467,10 @@ async def test_import_remote_recipe(monkeypatch, tmp_path: Path) -> None:
    class Provider:
        async def get_model_version_info(self, model_version_id):
            provider_calls.append(model_version_id)
-            return {"baseModel": "Flux Provider"}, None
+            return {
+                "baseModel": "Flux Provider",
+                "model": {"type": "Checkpoint", "name": "Flux"},
+            }, None

    async def fake_get_default_metadata_provider():
        return Provider()
--- a/tests/services/test_civitai_image_parser.py
+++ b/tests/services/test_civitai_image_parser.py
@@ -298,3 +298,113 @@ async def test_parse_metadata_handles_modelVersionIds(monkeypatch):
    assert lora2["type"] == "lora"
    assert lora2["hash"] == "aabbccdd0022"
    assert lora2["baseModel"] == "SDXL"
+
+
+@pytest.mark.asyncio
+async def test_parse_metadata_extracts_checkpoint_from_resources_model_type(monkeypatch):
+    """resources entries with type:"model" should be captured as the checkpoint,
+    not skipped (which was the old buggy behavior), and not mixed into loras."""
+    captured_hashes = []
+
+    async def fake_metadata_provider():
+        class Provider:
+            async def get_model_by_hash(self, model_hash):
+                captured_hashes.append(model_hash)
+                if model_hash == "a1b2c3d4e5":
+                    return ({
+                        "id": 999,
+                        "modelId": 888,
+                        "name": "v1.0",
+                        "model": {"name": "Real Checkpoint", "type": "Checkpoint"},
+                        "baseModel": "SDXL 1.0",
+                        "images": [{"url": "https://image.civitai.com/cp/original=true"}],
+                        "files": [{"type": "Model", "primary": True, "sizeKB": 1024, "name": "cp.safetensors"}]
+                    }, None)
+                return None, "Model not found"
+
+        return Provider()
+
+    monkeypatch.setattr(
+        "py.recipes.parsers.civitai_image.get_default_metadata_provider",
+        fake_metadata_provider,
+    )
+
+    parser = CivitaiApiMetadataParser()
+
+    metadata = {
+        "prompt": "test",
+        "resources": [
+            {"hash": "a1b2c3d4e5", "name": "Real Checkpoint", "type": "model"},
+            {"hash": "f6g7h8i9j0", "name": "Some LoRA", "type": "lora", "weight": 0.8},
+        ],
+        "Model hash": "a1b2c3d4e5",
+    }
+
+    result = await parser.parse_metadata(metadata)
+
+    # The type:"model" resource should be in result["model"], not in result["loras"]
+    assert result["model"] is not None, "checkpoint model should be extracted"
+    assert result["model"]["name"] == "Real Checkpoint"
+    assert result["model"]["hash"] == "a1b2c3d4e5"
+    assert result["model"]["type"] == "model"
+
+    # The LoRA resource should be in result["loras"]
+    assert len(result["loras"]) == 1
+    assert result["loras"][0]["name"] == "Some LoRA"
+
+    # The checkpoint hash should have triggered a lookup
+    assert "a1b2c3d4e5" in captured_hashes
+
+
+@pytest.mark.asyncio
+async def test_parse_metadata_resources_model_type_does_not_duplicate_checkpoint_in_loras(monkeypatch):
+    """When a resources entry has type:"model", it should NOT also appear in loras.
+    Regression test for the bug where the checkpoint model appeared in both places."""
+    async def fake_metadata_provider():
+        class Provider:
+            async def get_model_by_hash(self, model_hash):
+                if model_hash == "cp123hash":
+                    return ({
+                        "id": 100,
+                        "modelId": 200,
+                        "name": "v2",
+                        "model": {"name": "My Checkpoint", "type": "Checkpoint"},
+                        "baseModel": "SDXL",
+                        "files": [{"type": "Model", "primary": True, "sizeKB": 1024, "name": "cp.safetensors"}]
+                    }, None)
+                if model_hash == "lora1hash":
+                    return ({
+                        "id": 300,
+                        "modelId": 400,
+                        "name": "v1",
+                        "model": {"name": "Style LoRA", "type": "LORA"},
+                        "baseModel": "SDXL",
+                        "files": [{"type": "Model", "primary": True, "sizeKB": 512, "name": "style.safetensors"}]
+                    }, None)
+                return None, "Model not found"
+
+        return Provider()
+
+    monkeypatch.setattr(
+        "py.recipes.parsers.civitai_image.get_default_metadata_provider",
+        fake_metadata_provider,
+    )
+
+    parser = CivitaiApiMetadataParser()
+    metadata = {
+        "resources": [
+            {"hash": "cp123hash", "name": "My Checkpoint", "type": "model"},
+            {"hash": "lora1hash", "name": "Style LoRA", "type": "lora", "weight": 0.5},
+        ],
+    }
+
+    result = await parser.parse_metadata(metadata)
+
+    # Checkpoint must NOT appear in loras
+    lora_names = {l["name"] for l in result["loras"]}
+    assert "My Checkpoint" not in lora_names
+    assert "Style LoRA" in lora_names
+
+    # Checkpoint must be in result["model"]
+    assert result["model"] is not None
+    assert result["model"]["name"] == "My Checkpoint"
--- a/tests/services/test_recipe_repair.py
+++ b/tests/services/test_recipe_repair.py
@@ -94,7 +94,7 @@ async def test_repair_all_recipes_with_enriched_checkpoint_id(setup_scanner):
        "id": 5678,
        "modelId": 1234,
        "name": "v1.0",
-        "model": {"name": "Full Model Name"},
+        "model": {"name": "Full Model Name", "type": "Checkpoint"},
        "baseModel": "SDXL 1.0",
        "images": [{"url": "https://image.url/thumb.jpg"}],
        "files": [{"type": "Model", "hashes": {"SHA256": "ABCDEF"}, "name": "full_filename.safetensors"}]
@@ -142,7 +142,7 @@ async def test_repair_all_recipes_supports_civitai_red_source_url(setup_scanner)
            "id": 5678,
            "modelId": 1234,
            "name": "v1.0",
-            "model": {"name": "Full Model Name"},
+            "model": {"name": "Full Model Name", "type": "Checkpoint"},
            "baseModel": "SDXL 1.0",
            "images": [{"url": "https://image.url/thumb.jpg"}],
            "files": [
@@ -183,7 +183,7 @@ async def test_repair_all_recipes_with_enriched_checkpoint_hash(setup_scanner):
        "id": 999,
        "modelId": 888,
        "name": "v2.0",
-        "model": {"name": "Hashed Model"},
+        "model": {"name": "Hashed Model", "type": "Checkpoint"},
        "baseModel": "SD 1.5",
        "files": [{"type": "Model", "hashes": {"SHA256": "hash123"}, "name": "hashed.safetensors"}]
    }, None)
Author	SHA1	Message	Date
Will Miao	34791c2ad7	fix(recipe): use resources type field to identify checkpoint instead of modelVersionIds[0] When importing a CivitAI image as a recipe, modelVersionIds[0] was blindly used as the checkpoint version ID. This array mixes checkpoints and LoRAs without ordering guarantees, causing LoRAs to be saved as the recipe checkpoint. Fix by: 1. Removing the modelVersionIds[0] fallback in _download_remote_media 2. Parsing resources entries with type:"model" as the checkpoint 3. Adding model type validation in populate_checkpoint_from_civitai Also add 2 tests for the new behavior and fix 3 tests whose mocks lacked the required model.type field.	2026-05-28 15:46:38 +08:00
Will Miao	3f6824eef6	fix(example-images): exclude failed_models from check_pending_models pending count Previously check_pending_models() only skipped models already in processed_models, so models that had permanently failed (no CivitAI images available, download errors) were forever reported as "pending". This caused repeated auto-download cycles with no actual work to do. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-28 12:00:25 +08:00
Will Miao	3919dfa3f4	fix(metadata): suppress rate-limit propagation when model already confirmed deleted When CivitAI returns 404 (ResourceNotFoundError) and a fallback provider like CivArchive subsequently rate-limits, the ChainedMetadataProvider now suppresses the RateLimitError instead of propagating it. Previously, the rate-limit error would bubble up through _refresh_single_model and cause the outer retry loop to re-process the same model repeatedly, producing dozens of duplicate "Model X is no longer available" log messages and wasting API quota. The model is NOT permanently marked as ignored — its last_checked_at timestamp is preserved, so it will be retried on the next refresh cycle when the rate limit has cleared and CivArchive may still have the data. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-28 11:56:22 +08:00
Will Miao	7124b5293f	chore(settings): remove unused example_images config, add unet folder_paths example	2026-05-27 19:58:56 +08:00
Will Miao	d2a04f8993	fix(model-hash-index): clean up AutoV2 entry in remove_by_hash	2026-05-27 19:38:08 +08:00
pixelpaws	7027a7c270	Merge pull request #946 from 1756141021/fix/autov2-hash-matching fix: match local LoRAs by AutoV2 hash when Civitai model is deleted	2026-05-27 19:20:31 +08:00
hein	0a1d7dfd4c	fix: match local LoRAs by AutoV2 hash when Civitai model is deleted When recipe metadata contains AutoV2 hashes (10-char short hash from image metadata) and the Civitai API cannot resolve them to SHA256 (model deleted, API offline), the local hash index failed to match because it only stored full SHA256 hashes. AutoV2 is simply SHA256[:10], so we derive it automatically in add_entry() — no extra file I/O or schema changes needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-27 14:15:01 +08:00
Will Miao	3962b1a96d	fix(civitai): fall back to direct version fetch when modelVersions is empty for newly published models	2026-05-27 06:40:13 +08:00
Will Miao	8b856276bf	fix(ui): escape HTML entities in parseMarkdown to prevent swallowed angle brackets	2026-05-27 06:40:13 +08:00
willmiao	c97c802956	docs: auto-update supporters list in README	2026-05-26 13:27:45 +00:00
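
Commit 0a1d7dfd4c describes AutoV2 as the first 10 characters of the SHA256, and d2a04f8993 adds the matching cleanup in remove_by_hash. A minimal sketch of that idea follows, assuming a much smaller index than the real `ModelHashIndex`: `TinyHashIndex` is a hypothetical class that omits duplicate tracking and filename lookups, and the file path is made up for the example.

```python
import hashlib
from typing import Dict, Optional


class TinyHashIndex:
    """Toy index keyed by full SHA256, with a derived AutoV2 side index."""

    def __init__(self) -> None:
        self._hash_to_path: Dict[str, str] = {}
        self._autov2_to_path: Dict[str, str] = {}

    def add_entry(self, sha256: str, file_path: str) -> None:
        sha256 = sha256.lower()
        self._hash_to_path[sha256] = file_path
        # AutoV2 is derived from the SHA256, so no extra hashing is needed.
        if len(sha256) >= 10:
            self._autov2_to_path[sha256[:10]] = file_path

    def get_path(self, hash_value: str) -> Optional[str]:
        normalized = hash_value.lower()
        path = self._hash_to_path.get(normalized)
        if path is not None:
            return path
        # Only a 10-character value is treated as an AutoV2 hash.
        if len(normalized) == 10:
            return self._autov2_to_path.get(normalized)
        return None

    def remove_by_hash(self, sha256: str) -> None:
        sha256 = sha256.lower()
        self._hash_to_path.pop(sha256, None)
        # Drop the derived key too, otherwise a stale short hash would still match.
        self._autov2_to_path.pop(sha256[:10], None)


index = TinyHashIndex()
full_hash = hashlib.sha256(b"demo weights").hexdigest()
index.add_entry(full_hash, "/models/demo.safetensors")
short_lookup = index.get_path(full_hash[:10].upper())
```

Two different files whose SHA256 values share the same first 10 characters would overwrite each other in the side index; this sketch does not handle that case.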