
yolo-detection-2026

State-of-the-art real-time object detection using YOLO

Updated: March 2, 2026
SKILL.md Frontmatter

name: yolo-detection-2026
description: YOLO 2026 — state-of-the-art real-time object detection
version: 2.0.0
icon: assets/icon.png
entry: scripts/detect.py
deploy: deploy.sh
requirements: [object Object]
parameters: [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object], [object Object]
capabilities: [object Object]

YOLO 2026 Object Detection

Real-time object detection using the latest YOLO 2026 models. Detects the 80 COCO object classes, including people, vehicles, animals, and everyday objects, and outputs bounding boxes with class labels and confidence scores.

Model Sizes

| Size   | Speed    | Accuracy | Best For                       |
|--------|----------|----------|--------------------------------|
| nano   | Fastest  | Good     | Real-time on CPU, edge devices |
| small  | Fast     | Better   | Balanced speed/accuracy        |
| medium | Moderate | High     | Accuracy-focused deployments   |
| large  | Slower   | Highest  | Maximum detection quality      |

Hardware Acceleration

The skill uses env_config.py to automatically detect hardware and convert the model to the fastest format for your platform. Conversion happens once during deployment and is cached.

| Platform            | Backend  | Optimized Format   | Compute Units       | Expected Speedup |
|---------------------|----------|--------------------|---------------------|------------------|
| NVIDIA GPU          | CUDA     | TensorRT .engine   | GPU                 | ~3-5x            |
| Apple Silicon (M1+) | MPS      | CoreML .mlpackage  | Neural Engine (NPU) | ~2x              |
| Intel CPU/GPU/NPU   | OpenVINO | OpenVINO IR .xml   | CPU/GPU/NPU         | ~2-3x            |
| AMD GPU             | ROCm     | ONNX Runtime       | GPU                 | ~1.5-2x          |
| CPU (any)           | CPU      | ONNX Runtime       | CPU                 | ~1.5x            |

Apple Silicon Note: Detection defaults to cpu_and_ne (CPU + Neural Engine), keeping the GPU free for LLM/VLM inference. Set compute_units: all to also use the GPU when no local LLM is running.
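The probing logic in env_config.py is not shown in this file. A minimal sketch of equivalent backend detection using PyTorch's standard capability checks (the detect_backend name is hypothetical, and OpenVINO probing is omitted):

```python
def detect_backend() -> str:
    """Best-effort hardware probe; a stand-in for env_config.HardwareEnv.detect()."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all: plain CPU inference
    if torch.cuda.is_available():
        # ROCm builds of PyTorch also report cuda.is_available();
        # torch.version.hip is set only on ROCm builds.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The guarded `getattr` calls keep the sketch working on older PyTorch builds that predate the MPS backend.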

How It Works

  1. deploy.sh detects your hardware via env_config.HardwareEnv.detect()
  2. Installs the matching requirements_{backend}.txt (e.g. CUDA → includes tensorrt)
  3. Pre-converts the default model to the optimal format
  4. At runtime, detect.py loads the cached optimized model automatically
  5. Falls back to PyTorch if optimization fails

Set use_optimized: false to disable auto-conversion and use raw PyTorch.
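The load-then-fall-back behavior of steps 4–5 can be sketched as a generic helper (names are hypothetical; the real detect.py resolves the cached model path from env_config):

```python
def load_with_fallback(load_optimized, load_pytorch):
    """Try the cached optimized model first; degrade to raw PyTorch on failure.

    Both arguments are loader callables; in the real skill these would load
    e.g. a TensorRT .engine or the original .pt weights.
    """
    try:
        return load_optimized(), "optimized"
    except Exception:
        # Any conversion or load failure degrades gracefully to PyTorch.
        return load_pytorch(), "pytorch"
```

Setting use_optimized: false corresponds to skipping the first branch and loading the PyTorch weights directly.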

Auto Start

Set auto_start: true in the skill config to start detection automatically when Aegis launches. The skill will begin processing frames from the selected camera immediately.

auto_start: true
model_size: nano
fps: 5

Performance Monitoring

The skill emits perf_stats events every 50 frames with aggregate timing:

{"event": "perf_stats", "total_frames": 50, "timings_ms": {
  "inference": {"avg": 3.4, "p50": 3.2, "p95": 5.1},
  "postprocess": {"avg": 0.15, "p50": 0.12, "p95": 0.31},
  "total": {"avg": 3.6, "p50": 3.4, "p95": 5.5}
}}
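The avg/p50/p95 figures can be aggregated with a simple nearest-rank percentile over the per-frame timing samples. A self-contained sketch, not the skill's actual implementation:

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile over a sorted copy of the samples."""
    s = sorted(samples)
    idx = min(len(s) - 1, round(p / 100 * (len(s) - 1)))
    return s[idx]

def summarize(samples_ms):
    """Aggregate per-frame timings (ms) into the perf_stats shape."""
    return {
        "avg": round(statistics.fmean(samples_ms), 2),
        "p50": round(percentile(samples_ms, 50), 2),
        "p95": round(percentile(samples_ms, 95), 2),
    }
```

In practice the skill would call something like `summarize` once per 50-frame window for each timing bucket (inference, postprocess, total) and reset the sample lists afterwards.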

Protocol

Communicates via JSON lines over stdin/stdout.

Aegis → Skill (stdin)

{"event": "frame", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "frame_path": "/tmp/aegis_detection/frame_front_door.jpg", "width": 1920, "height": 1080}

Skill → Aegis (stdout)

{"event": "ready", "model": "yolo2026n", "device": "mps", "backend": "mps", "format": "coreml", "gpu": "Apple M3", "classes": 80, "fps": 5}
{"event": "detections", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "objects": [
  {"class": "person", "confidence": 0.92, "bbox": [100, 50, 300, 400]}
]}
{"event": "perf_stats", "total_frames": 50, "timings_ms": {"inference": {"avg": 3.4}}}
{"event": "error", "message": "...", "retriable": true}

Bounding Box Format

[x_min, y_min, x_max, y_max] — pixel coordinates (xyxy).
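For consumers that expect other bbox conventions, converting from xyxy is a one-liner. These helpers are illustrative, not part of the protocol:

```python
def xyxy_to_xywh(bbox):
    """[x_min, y_min, x_max, y_max] -> [x, y, width, height]."""
    x_min, y_min, x_max, y_max = bbox
    return [x_min, y_min, x_max - x_min, y_max - y_min]

def normalize_xyxy(bbox, width, height):
    """Pixel xyxy -> fractional coordinates in [0, 1] for the given frame size."""
    x_min, y_min, x_max, y_max = bbox
    return [x_min / width, y_min / height, x_max / width, y_max / height]
```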

Stop Command

{"command": "stop"}
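The whole protocol fits in a small dispatch function. A sketch of the skill-side handling, where `detect` is a hypothetical callback mapping a frame path to a list of detected objects:

```python
import json

def process_line(line, detect):
    """Handle one JSON line from Aegis.

    Returns the response dict to emit, or None on {"command": "stop"}.
    """
    msg = json.loads(line)
    if msg.get("command") == "stop":
        return None
    if msg.get("event") == "frame":
        return {
            "event": "detections",
            "frame_id": msg["frame_id"],
            "camera_id": msg["camera_id"],
            "timestamp": msg.get("timestamp"),
            "objects": detect(msg["frame_path"]),
        }
    # An unrecognized message will not succeed on retry.
    return {"event": "error", "message": "unrecognized message", "retriable": False}
```

A real skill would wrap this in `for line in sys.stdin:` and emit responses with `print(json.dumps(resp), flush=True)`; flushing matters because stdout is block-buffered when piped.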

Installation

The deploy.sh bootstrapper handles everything — Python environment, GPU backend detection, dependency installation, and model optimization. No manual setup required.

./deploy.sh

Requirements Files

| File                   | Backend | Key Deps                          |
|------------------------|---------|-----------------------------------|
| requirements_cuda.txt  | NVIDIA  | torch (cu124), tensorrt           |
| requirements_mps.txt   | Apple   | torch, coremltools                |
| requirements_intel.txt | Intel   | torch, openvino                   |
| requirements_rocm.txt  | AMD     | torch (rocm6.2), onnxruntime-rocm |
| requirements_cpu.txt   | CPU     | torch (cpu), onnxruntime          |
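The backend-to-file mapping deploy.sh applies can be expressed in a few lines. A sketch assuming the backend identifiers from the hardware table above:

```python
def requirements_file(backend: str) -> str:
    """Map a detected backend id to its requirements file.

    Unknown backends fall back to the CPU file, mirroring the skill's
    general degrade-to-CPU behavior.
    """
    mapping = {
        "cuda": "requirements_cuda.txt",
        "mps": "requirements_mps.txt",
        "openvino": "requirements_intel.txt",
        "rocm": "requirements_rocm.txt",
    }
    return mapping.get(backend, "requirements_cpu.txt")
```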