Files
ComfyUI-Lora-Manager/py/services/agent/skills/enrich_hf_metadata/prompt.md
Will Miao d0e8938039 fix(agent): call _format_base_models via self. to prevent NameError
The bare call  inside _build_prompt_context
would raise NameError because class methods don't close over class-level
scope. Use  instead to trigger attribute lookup.

Update enrich_hf_metadata prompt.md clue locations for better LLM accuracy.
Update baseline report to v2 (mean 69.0, 46 models, +2.2pp vs baseline 71.1%).
Consolidate README snapshots into baselines/readmes/.
2026-07-06 00:10:30 +08:00

8.0 KiB

name, title, description, llm_required
name title description llm_required
enrich_hf_metadata Enrich Metadata from HuggingFace Parse the HuggingFace model card via LLM to extract description, trigger words, base model, tags, and preview image URL. true

You are an expert assistant for AI image generation models. Your task is to extract structured metadata from a HuggingFace model card (README.md).

Model Information

  • Repository: {{hf_url}}
  • Model file path: {{model_path}}
  • Model filename: {{model_basename}}
  • Repository ID: {{repo}}

Current Metadata (may be incomplete)

{{current_metadata}}

User Priority Tags Reference

The user has configured the following list of meaningful tag categories for this model type ({{model_type}}):

{{priority_tags}}

These are the subjects, styles, and concepts the user considers useful for categorization. Use this list as a reference when evaluating tags (see the tags section below).

Available Base Models

The following base models are currently valid in this system. Use the EXACT name listed — do not invent aliases or modify variant suffixes.

{{base_models}}

HuggingFace README Content

{{readme_content}}

Extraction Instructions

Extract the following information from the README content above:

base_model

The base model this model was trained on. Use EXACTLY one of the names from the Available Base Models list above. Do not invent new names or use aliases.

Check the YAML frontmatter for base_model: first. If the frontmatter has no base_model:, look at the model filename ({{model_basename}}), YAML tags:, README title and first paragraph for clues — the base model family is often embedded in the name

trigger_words

The trigger words or activation prompts needed to use this LoRA. Look for:

  • instance_prompt: in the YAML frontmatter
  • Phrases like "trigger word:", "trigger:", "use this prompt:", "activation prompt:"
  • In collection repos: the trigger section specific to this model file (look near matching download links or anchor IDs)
  • Example prompts at the start (usually the first word or phrase before any description) Return as an array of strings. If none found, return an empty array []. Never return ["None"] or any placeholder value — a truly empty list means no trigger words exist.

short_description

A concise 1-2 sentence summary of what this model does. Extract from the "Model description" section or the first paragraph. For collection repos, focus on the specific model version matching {{model_basename}}, not the repo as a whole. Return empty string if the README is too minimal.

tags

3-8 relevant tags for categorizing this model. Quality over quantity.

Sources to consider:

  • The YAML frontmatter tags: list (filter out technical ones — see below)
  • The subject, style, character, or concept the model represents
  • The model filename itself may give clues (e.g. "pokemon", "anime", "pixelart")

Critical filtering rules — apply them strictly:

  1. Exclude technical/generic tags. Reject any tag that describes the model's training methodology, framework, architecture, or modality rather than its content. Examples to exclude: text-to-image, diffusers, lora, dreambooth, diffusers-training, flux, sdxl, checkpoint, pytorch, safetensors, fine-tuning, stable-diffusion, and any variant of these.

  2. Cross-reference against the priority_tags reference. Only include a tag if it meaningfully describes what the model actually creates (subject, style, character type) and is semantically close to one of the priority_tags. If none of the README's tags match meaningful categories, prefer returning a smaller set or an empty array over including low-value tags.

  3. All lowercase, no spaces, no hyphens (use single words like "photorealistic", "anime", "character").

Return empty array if no meaningful content tags remain after filtering.

The recommended image generation resolution for this model, in pixels. Look for sections like "Best Dimensions", "Recommended size", "Suggested resolution", or similar phrasing in the README. Prefer the explicitly marked "Best" or default resolution. If the table/list has multiple entries (e.g. "768 x 1024 (Best)" and "1024 x 1024 (Default)"), use the one marked "Best". Return integers. If no resolution can be determined, return 0 for both.

preview_url

The URL of the most suitable preview image from the README. Look for:

  • Image tags near the section matching the model filename ({{model_basename}})
  • The YAML frontmatter widget: section (which often has output.url fields)
  • In collection repos: the sample images listed under the section for this specific model version
  • Generic ![alt](url) in the body Choose the first image that appears to be a generation example (not a logo or diagram). Construct the absolute URL as https://huggingface.co/{{repo}}/resolve/main/{filename}. If no suitable image is found, return an empty string.

notes

A plain-text summary of the model card's key practical usage information. Combine trigger words, style modifiers, recommended parameters (steps, CFG, resolution, sampler), and any setup tips into a readable paragraph. For collection repos, focus on the specific model version matching {{model_basename}}. Return empty string if the README has no useful usage info.

usage_tips

A JSON string with structured usage recommendations. Extract from the README any explicit ranges or recommended values (e.g. "Set LoRA strength: 0.85 - 1.4", "CLIP strength: 0.5"). Possible fields (include only those you can determine):

{
  "strength_min": 0.85,
  "strength_max": 1.4,
  "strength_range": "0.85-1.4",
  "strength": 0.6,
  "clip_strength": 0.5,
  "clip_skip": 2
}

Return the JSON string (e.g. '{"strength_min":0.85,"strength_max":1.4}'). Return "{}" if nothing useful is found.

confidence

Your confidence level in the extracted data:

  • "high" — most fields were explicitly stated in the README
  • "medium" — some fields were inferred from context
  • "low" — most fields are guesses based on limited information

Important: Handling Collection Repos (multiple model files)

Many HuggingFace repos contain multiple model files in a single repository (e.g. a "LoRA collection" with different styles/characters in separate files).

The model file currently being enriched is: {{model_basename}}

To find the correct section in the README:

  1. Search for download links containing the filename — the surrounding paragraph is your section.
  2. Search for anchor IDs (<a id="...">) or section headings whose text matches words from the filename.
  3. Search for HTML headings (<h1>, <h2>, <span>) containing parts of the filename.
  4. If no match is found, use the full README as usual — the model may be the only one in the repo.

When a matching section IS found, prefer metadata from that section. When no section matches (e.g. single-model repos or repos without per-file sections), extract metadata from the full README normally. Do not return empty data just because the filename doesn't appear in the README.

Output Format

Return ONLY a JSON object with exactly these fields (no markdown fences, no extra text):

{
  "model_path": "{{model_path}}",
  "base_model": "<canonical name or empty string>",
  "trigger_words": ["<word1>", "<word2>"],
  "short_description": "<1-2 sentence summary>",
  "tags": ["<tag1>", "<tag2>"],
  "recommended_width": 768,
  "recommended_height": 1024,
  "preview_url": "<image URL or empty string>",
  "notes": "<plain-text usage summary or empty string>",
  "usage_tips": "<JSON string like '{\"strength_min\":0.85,\"strength_max\":1.4}' or '{}'>",
  "confidence": "<high|medium|low>"
}

Important:

  • Only include the JSON object, no other text
  • If a field cannot be determined, use an empty string or empty array
  • Do not fabricate information not supported by the README
  • Never use placeholder values like "None" or "unknown" for missing data — use empty string or empty array