Will Miao d324b57274 perf(check-model-exists): eliminate SQLite connection-per-query overhead and skip redundant history checks
Root cause: 231 concurrent /check-model-exists requests against a 175K-LoRA
library took ~9.4s of wall-clock time. The bottleneck was twofold:

1. DownloadedVersionHistoryService opened a new sqlite3.connect() for every
   query under asyncio.Lock. With a large WAL from 175K entries, each
   connect() took ~8ms. Serialized by the lock across 231 requests, the
   230th request waited ~1848ms just for lock acquisition.

2. check_model_exists always queried download history even when the model
   was found locally. The history result (hasBeenDownloaded /
   downloadedVersionIds) is only used by the UI when the model is NOT
   found locally; when found, the 'in library' indicator takes priority.
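
The lock-wait figure in point 1 can be sanity-checked with a back-of-the-envelope
model. This is only a sketch: it assumes the ~8ms connect() is the only work done
under the lock and that all 231 requests arrive at once. The variable names are
illustrative and not taken from the codebase.

```python
CONNECT_MS = 8   # approximate cost of one sqlite3.connect(), as quoted above
REQUESTS = 231   # concurrent /check-model-exists requests

# Request i (0-based) must wait for the i connects queued ahead of it.
lock_wait_ms = [i * CONNECT_MS for i in range(REQUESTS)]

last_wait_ms = lock_wait_ms[-1]              # 230 * 8
total_serialized_ms = REQUESTS * CONNECT_MS  # 231 * 8

print(last_wait_ms, total_serialized_ms)
```

Under these assumptions the final request queues for about 1.8s before it can
run its own query, which is roughly in line with the ~1848ms quoted in point 1.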

Changes:
- downloaded_version_history_service.py: added persistent _get_conn() that
  creates the SQLite connection once and reuses it across all queries
- misc_handlers.py: early-return from check_model_exists when the model
  exists locally, bypassing the history service entirely (lock skipped)
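
A minimal sketch of both changes, for illustration only. HistoryStore, the
history table schema, the local_ids argument and the response dict are
hypothetical stand-ins and not the project's actual code. Only the _get_conn()
name, the asyncio.Lock and the hasBeenDownloaded field come from the
description above.

```python
import asyncio
import sqlite3
from typing import Optional


class HistoryStore:
    """Hypothetical stand-in for DownloadedVersionHistoryService."""

    def __init__(self, db_path: str) -> None:
        self._db_path = db_path
        self._conn: Optional[sqlite3.Connection] = None
        self._lock = asyncio.Lock()
        self.connects = 0  # how many times sqlite3.connect() actually ran
        self.queries = 0   # how many history lookups were executed

    def _get_conn(self) -> sqlite3.Connection:
        # Open the connection lazily on first use, then reuse it for every query.
        if self._conn is None:
            self._conn = sqlite3.connect(self._db_path)
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS history (version_id INTEGER PRIMARY KEY)"
            )
            self.connects += 1
        return self._conn

    async def has_been_downloaded(self, version_id: int) -> bool:
        # The lock still serializes access, but no connect() cost is paid here
        # after the first call.
        async with self._lock:
            self.queries += 1
            row = self._get_conn().execute(
                "SELECT 1 FROM history WHERE version_id = ?", (version_id,)
            ).fetchone()
        return row is not None


async def check_model_exists(version_id: int, local_ids: set, history: HistoryStore) -> dict:
    # Found locally: return immediately without touching the lock or SQLite.
    if version_id in local_ids:
        return {"exists": True}
    downloaded = await history.has_been_downloaded(version_id)
    return {"exists": False, "hasBeenDownloaded": downloaded}


async def demo():
    history = HistoryStore(":memory:")  # in-memory DB keeps the sketch self-contained
    history._get_conn().execute("INSERT INTO history VALUES (42)")
    local_ids = {1, 2, 3}
    results = await asyncio.gather(
        *(check_model_exists(v, local_ids, history) for v in (1, 42, 7, 2))
    )
    return history, results


history, results = asyncio.run(demo())
```

In this sketch the two locally found IDs never reach the history store, so only
two lookups run and a single connection serves both.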

Expected: per-request wait time drops from ~1912ms to <3ms, wall clock
from ~9.4s to <0.3s for the 175K-lora user's 231-card page.
2026-05-02 13:31:20 +08:00
2025-02-24 20:41:16 +08:00