Introduce an agent skill framework for LLM-driven metadata enrichment:
- AgentCLI (py/agent_cli/): in-process wrappers around internal services using standard relative imports, eliminating the need for sys.path hacks
- LLMService: centralized BYOK (bring-your-own-key) LLM client supporting OpenAI, Ollama, and custom OpenAI-compatible endpoints
- PostProcessor: deterministic engine that applies LLM output via AgentCLI (replaces old handler.py + _BASE_MODEL_ALIASES approach)
- SkillRegistry: filesystem-based skill discovery (skill.yaml + prompt.md)
- AgentService: orchestrates skill execution with WebSocket progress
- Frontend AgentManager: WebSocket listeners, skill execution, config UI
- Context menu entries (single + bulk) for "Enrich Metadata (Agent)"
- Settings UI for AI Provider configuration (BYOK)
- Full i18n support across 9 locales

Bug fixes found during review:
- aiohttp.web.json_response: status_code= -> status=
- settings_modal cancelEditApiKey: wrong argument position
- AgentManager.isLlmConfigured: allow Ollama without API key
- PostProcessor._merge_tags: lowercase all tags to match TagUpdateService
Agent Skills System
The LoRA Manager agent skills system enables LLM-powered metadata enrichment and other AI-driven tasks. Users configure their own LLM provider (BYOK), and skills are executed through right-click context menu actions.
Architecture
┌──────────────────────────────────────────────┐
│             LoRA Manager Backend             │
│                                              │
│  ┌───────────────┐    ┌────────────────┐     │
│  │ LLMService    │───▶│ LLM Provider   │     │
│  │ (BYOK config, │◀───│ (OpenAI/Ollama │     │
│  │  API calls)   │    │  /custom)      │     │
│  └───────┬───────┘    └────────────────┘     │
│          │                                   │
│  ┌───────▼───────────────────────┐           │
│  │ AgentService                  │           │
│  │ (orchestration: validate      │           │
│  │  → LLM call → post-process    │           │
│  │  → WebSocket broadcast)       │           │
│  └───────┬───────────────────────┘           │
│          │                                   │
│  ┌───────▼───────────────────────┐           │
│  │ SkillRegistry                 │           │
│  │  ┌─────────────────────────┐  │           │
│  │  │ enrich_hf_metadata:     │  │           │
│  │  │  - skill.yaml           │  │           │
│  │  │  - prompt.md            │  │           │
│  │  │  - handler.py           │  │           │
│  │  └─────────────────────────┘  │           │
│  └───────────────────────────────┘           │
└──────────────────────────────────────────────┘
Key Design Principle
Skills define what to do (prompt + post-processing). The AgentService handles how (LLM calls, validation, progress).
Skills never call the LLM directly. This keeps BYOK configuration centralized and provider-agnostic.
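The split between "what" and "how" can be sketched roughly as follows. This is a minimal illustration, not the actual AgentService code: `run_skill`, the callable parameters, and the `started`/`completed` status values are all hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical types: none of these names come from the LoRA Manager codebase.
LLMCall = Callable[[str], Awaitable[dict]]
PostProcess = Callable[[dict], Awaitable[dict]]
Broadcast = Callable[[dict], Awaitable[None]]


async def run_skill(
    skill_name: str,
    prompt: str,
    call_llm: LLMCall,
    post_process: PostProcess,
    broadcast: Broadcast,
) -> dict:
    """Validate -> LLM call -> post-process -> broadcast, all owned by the orchestrator."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    await broadcast({"type": "agent_progress", "skill": skill_name, "status": "started"})
    llm_response = await call_llm(prompt)      # the skill never makes this call itself
    result = await post_process(llm_response)  # the skill only decides what to write
    await broadcast({"type": "agent_progress", "skill": skill_name, "status": "completed"})
    return result


async def _demo():
    events = []

    async def fake_llm(prompt: str) -> dict:
        # Stand-in for the real provider call.
        return {"tags": ["style"]}

    async def fake_post_process(response: dict) -> dict:
        return {"success": True, "updated_fields": sorted(response)}

    async def record(event: dict) -> None:
        events.append(event)

    result = await run_skill("my_skill", "Describe the model", fake_llm, fake_post_process, record)
    return result, events


demo_result, demo_events = asyncio.run(_demo())
```

Because the LLM call is injected by the orchestrator, swapping providers does not require touching any skill.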
BYOK Configuration
Users configure their LLM provider in Settings → AI Provider:
| Setting | Description | Example |
|---|---|---|
| llm_provider | Provider type | openai, ollama, or custom |
| llm_api_key | API key (not needed for local Ollama) | sk-... |
| llm_api_base | Custom API base URL (empty = provider default) | https://api.openai.com/v1 |
| llm_model | Model name | gpt-4o-mini |
Environment variable overrides: LLM_API_KEY, LLM_MODEL, LLM_API_BASE, LLM_PROVIDER.
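One way the override order could work is sketched below. The setting keys and environment variable names come from the table above; the function name `resolve_llm_settings` and the rule that an empty environment variable does not override a stored value are assumptions made for this example.

```python
import os
from typing import Mapping, Optional

# Setting keys and environment variable names are taken from the table above.
ENV_OVERRIDES = {
    "llm_provider": "LLM_PROVIDER",
    "llm_api_key": "LLM_API_KEY",
    "llm_api_base": "LLM_API_BASE",
    "llm_model": "LLM_MODEL",
}


def resolve_llm_settings(
    stored: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return stored settings with any non-empty environment override applied.

    Assumption: an empty environment variable does not override a stored value.
    """
    env = os.environ if env is None else env
    resolved = dict(stored)
    for key, env_name in ENV_OVERRIDES.items():
        value = env.get(env_name, "")
        if value:
            resolved[key] = value
    return resolved
```

Passing `env` explicitly keeps the function easy to test without mutating the process environment.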
Supported Providers
- OpenAI: Uses https://api.openai.com/v1 by default
- Ollama (local): Uses http://localhost:11434/v1, no API key required
- Custom: Any OpenAI-compatible endpoint (vLLM, LM Studio, etc.); set llm_api_base explicitly
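The provider defaults can be expressed as a small lookup. This is a sketch only: `resolve_api_base` is a hypothetical helper, and stripping a trailing slash from an explicit URL is an assumption rather than documented behaviour.

```python
# Default base URLs as listed above; "custom" has no default on purpose.
DEFAULT_API_BASES = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",
}


def resolve_api_base(provider: str, llm_api_base: str = "") -> str:
    """Pick the explicit base URL if set, otherwise the provider default."""
    if llm_api_base:
        # Assumption: a trailing slash is dropped so paths can be appended uniformly.
        return llm_api_base.rstrip("/")
    if provider in DEFAULT_API_BASES:
        return DEFAULT_API_BASES[provider]
    raise ValueError(f"provider {provider!r} needs llm_api_base set explicitly")
```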
Available Skills
enrich_hf_metadata
Enriches HuggingFace-downloaded models with metadata extracted by an LLM from the HF model card.
Entry point: Right-click context menu → "Enrich Metadata (Agent)"
What it does:
- Reads the model's .metadata.json to get the hf_url
- Fetches the README.md from the HuggingFace repository
- Sends the README + local metadata to the LLM for structured extraction
- Writes extracted fields to .metadata.json:
  - base_model: only if the current value is empty
  - trainedWords: trigger words (LoRA only, if none exist)
  - modelDescription: concise summary (if none exists)
  - tags: merged with existing tags, deduplicated
  - metadata_source: audit trail (agent:enrich_hf_metadata)
  - llm_enriched_at: ISO timestamp
- Downloads and optimizes a preview image (if the LLM found one in the README)
- Updates the scanner cache
- Broadcasts WebSocket progress events
Model types: LoRA, Checkpoint, Embedding
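The field-writing rules listed above might look like this in code. It is a sketch under stated assumptions, not the project's post-processor: `merge_tags` and `apply_enrichment` are hypothetical names, metadata is treated as a flat dict, tags are lowercased before deduplication, and the audit fields are only written when something actually changed.

```python
from datetime import datetime, timezone


def merge_tags(existing: list, extracted: list) -> list:
    """Merge tags case-insensitively, keeping first-seen order (lowercasing is an assumption)."""
    merged = []
    seen = set()
    for tag in [*existing, *extracted]:
        normalized = tag.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            merged.append(normalized)
    return merged


def apply_enrichment(metadata: dict, extracted: dict, is_lora: bool = True) -> list:
    """Apply LLM output in place and return the names of the fields that changed."""
    updated = []
    # Fill-only fields: never overwrite a value the user already has.
    fill_only = ["base_model", "modelDescription"]
    if is_lora:
        fill_only.insert(1, "trainedWords")  # trigger words apply to LoRA only
    for field in fill_only:
        if extracted.get(field) and not metadata.get(field):
            metadata[field] = extracted[field]
            updated.append(field)
    # Tags are merged rather than replaced.
    tags = merge_tags(metadata.get("tags", []), extracted.get("tags", []))
    if tags != metadata.get("tags", []):
        metadata["tags"] = tags
        updated.append("tags")
    # Audit trail (assumption: only stamped when at least one field changed).
    if updated:
        metadata["metadata_source"] = "agent:enrich_hf_metadata"
        metadata["llm_enriched_at"] = datetime.now(timezone.utc).isoformat()
    return updated
```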
Adding a New Skill
1. Create the skill directory
py/services/agent/skills/<skill_name>/
├── skill.yaml # Skill metadata and schemas
├── prompt.md # LLM prompt template
└── handler.py # Pre-processing and post-processing
2. Write skill.yaml
name: my_skill
title: "My Skill"
description: "What this skill does"
llm_required: true
model_type_filter: ["lora"]  # or null for all types
input_schema:
  type: object
  properties:
    model_paths:
      type: array
      items:
        type: string
  required:
    - model_paths
output_schema:
  type: object
  properties:
    # ... JSON schema for LLM output
permissions:
  write_metadata: true
  write_previews: false
  network_domains:
    - "example.com"
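To make the input_schema concrete, here is a hand-written check equivalent to the schema shown above (a required `model_paths` array of strings). It is only an illustration; `validate_input` is a hypothetical helper, and a real implementation may well use a JSON Schema library instead.

```python
def validate_input(input_data: dict) -> list:
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    model_paths = input_data.get("model_paths")
    if model_paths is None:
        errors.append("model_paths is required")
    elif not isinstance(model_paths, list) or not all(
        isinstance(path, str) for path in model_paths
    ):
        errors.append("model_paths must be an array of strings")
    return errors
```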
3. Write prompt.md
Use {{variable}} placeholders that will be replaced with data from the prepare function:
You are an expert assistant...
Model URL: {{hf_url}}
README content:
{{readme_content}}
Current metadata:
{{current_metadata}}
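Placeholder substitution of this kind can be sketched with a regular expression. The `render_prompt` name and the choice to leave unknown placeholders untouched are assumptions for the example, not a description of the actual renderer.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} with str(variables[name]); unknown names are left as-is."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)

    # A function replacement avoids backslash escapes in README text being interpreted.
    return PLACEHOLDER.sub(substitute, template)
```

For example, `render_prompt("Model URL: {{hf_url}}", {"hf_url": "https://huggingface.co/org/model"})` yields the line with the URL filled in.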
4. Write handler.py
async def prepare(model_path: str, input_data: dict) -> dict:
    """Gather context for the LLM prompt. Returns variables for template rendering."""
    return {
        "model_path": model_path,
        # ... other variables used in prompt.md
    }


async def post_process(context) -> dict:
    """Apply the LLM-extracted data to the model."""
    llm_response = context.llm_response
    # ... write metadata, download previews, update cache
    return {
        "success": True,
        "updated_fields": ["base_model", "tags"],
        "errors": [],
    }
Important: Use absolute imports (from py.utils.metadata_manager import MetadataManager) in handler.py. Skills are loaded via importlib.util.spec_from_file_location, which here loads each handler as a standalone module with no parent package, so relative imports cannot be resolved.
5. Test
The skill is automatically discovered by SkillRegistry on startup. Test with:
pytest tests/services/test_agent_service.py
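Filesystem-based discovery could be approximated as below. This is a hedged sketch rather than the SkillRegistry implementation: `discover_skills` is a hypothetical function, and treating skill.yaml and prompt.md as the required files is an assumption.

```python
from pathlib import Path

# Assumption: a directory counts as a skill only if both files are present.
REQUIRED_FILES = ("skill.yaml", "prompt.md")


def discover_skills(skills_root: Path) -> list:
    """Return sorted names of subdirectories that contain every required file."""
    if not skills_root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in skills_root.iterdir()
        if entry.is_dir() and all((entry / name).is_file() for name in REQUIRED_FILES)
    )
```

Calling `discover_skills(Path("py/services/agent/skills"))` from the repository root would list the skill directories that pass this check.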
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /api/lm/agent/skills | List available skills |
| POST | /api/lm/agent/execute/{skill_name} | Execute a skill (body: {"model_paths": [...]}) |
| POST | /api/lm/agent/cancel | Cancel running skill (stub) |
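A client call to the execute endpoint might be built like this. The path and body shape come from the table above; the base URL `http://127.0.0.1:8188` and the helper name `build_execute_request` are assumptions, so adjust them to your own setup.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the server listens on this address; change it to match your install.
BASE_URL = "http://127.0.0.1:8188"


def build_execute_request(
    skill_name: str, model_paths: list, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST request that executes a skill."""
    body = json.dumps({"model_paths": model_paths}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/lm/agent/execute/{quote(skill_name, safe='')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(build_execute_request("enrich_hf_metadata", paths)) as resp:
#     print(json.load(resp))
```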
WebSocket Events
| Type | When | Key fields |
|---|---|---|
| agent_progress | Skill started/processing | skill, status, total, processed, success, current_path |
| agent_progress | Skill completed | skill, status, updated_models, errors, summary |
| agent_progress | Skill error | skill, status, error |
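A consumer of these events could branch on the status field roughly as follows. The field names come from the table above, but the JSON envelope with a `type` key and the concrete status strings (`completed`, `error`) are assumptions, and `describe_agent_event` is a hypothetical helper.

```python
import json


def describe_agent_event(raw_message: str):
    """Turn an agent_progress message into a short status line, or None for other types.

    Assumption: `status` is "completed" or "error" for the final events and
    anything else while the skill is still running.
    """
    event = json.loads(raw_message)
    if event.get("type") != "agent_progress":
        return None
    skill = event.get("skill", "?")
    status = event.get("status")
    if status == "error":
        return f"{skill}: failed ({event.get('error', 'unknown error')})"
    if status == "completed":
        return f"{skill}: completed"
    return f"{skill}: {event.get('processed', 0)}/{event.get('total', 0)} processed"
```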
Security Model
Skills declare permissions in skill.yaml:
- write_metadata: can write .metadata.json files
- write_previews: can download/replace preview images
- network_domains: allowed domains for HTTP requests
These are declarative constraints checked by AgentService. They are defense-in-depth, not a sandbox — the Python process can technically do anything, but the contract is clear and auditable.
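A check against network_domains might look like the sketch below. It is illustrative only: `is_domain_allowed` is a hypothetical name, and allowing subdomains of a listed domain is an assumption about the intended matching rule.

```python
from urllib.parse import urlsplit


def is_domain_allowed(url: str, network_domains: list) -> bool:
    """True if the URL's host equals an allowed domain or is a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == domain.lower() or host.endswith("." + domain.lower())
        for domain in network_domains
    )
```

Matching on the parsed hostname rather than on the raw string avoids accepting look-alike hosts such as example.com.evil.test.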
File Locations
| Component | Path |
|---|---|
| LLMService | py/services/llm_service.py |
| AgentService | py/services/agent/agent_service.py |
| SkillRegistry | py/services/agent/skill_registry.py |
| SkillDefinition | py/services/agent/skill_definition.py |
| Skills directory | py/services/agent/skills/ |
| Route handlers | py/routes/handlers/agent_handlers.py |
| Frontend manager | static/js/managers/AgentManager.js |
| Settings UI | templates/components/modals/settings_modal.html |
| Context menu | templates/components/context_menu.html |