Writing a Plugin¶

MindSight supports four plugin types, each extending a different stage of the processing pipeline:

Type	Base Class	Purpose
Gaze	`GazePlugin`	Custom gaze estimation backends
Object Detection	`ObjectDetectionPlugin`	Augment or replace YOLO detections
Phenomena	`PhenomenaPlugin`	Track gaze-based social phenomena
Data Collection	`DataCollectionPlugin`	Custom data output and charting

All four types share the same discovery mechanism: the plugin registry scans subdirectories under Plugins/, finds any *.py module that exposes a PLUGIN_CLASS module-level variable, and registers the class. All four types also share the same CLI protocol (add_arguments + from_args) for activation via command-line flags.

Common Steps (All Plugin Types)¶

Step 1: Create Your Plugin Directory¶

Create a named subfolder under the appropriate type directory. Each folder needs an __init__.py (can be empty) and one or more *.py modules:

Plugins/GazeTracking/MyBackend/
    __init__.py
    my_backend.py

Plugins/ObjectDetection/MyDetector/
    __init__.py
    my_detector.py

Plugins/Phenomena/MyTracker/
    __init__.py
    my_tracker.py

Plugins/DataCollection/MyOutput/
    __init__.py
    my_output.py

The filename does not matter. Only the class and PLUGIN_CLASS variable are used by the registry. Folders whose names start with _ are skipped during discovery.

Step 2: Set Up Imports¶

Plugins live in subdirectories and are loaded dynamically, so Python's default import resolution will not find top-level packages like Plugins or DataCollection from that depth. Add this boilerplate at the top of your module:

from __future__ import annotations

import sys
from pathlib import Path

# Walk up to the repository root so sibling imports resolve correctly.
_REPO_ROOT = Path(__file__).parent.parent.parent
if str(_REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(_REPO_ROOT))

The number of .parent calls depends on your plugin's depth relative to the repository root. For a standard plugin at Plugins/<Type>/<Name>/module.py, three levels is correct.

Then import the base class for your plugin type:

from Plugins import GazePlugin           # for gaze plugins
from Plugins import ObjectDetectionPlugin # for object detection plugins
from Plugins import PhenomenaPlugin       # for phenomena plugins
from Plugins import DataCollectionPlugin  # for data collection plugins

Step 3: Implement the CLI Protocol¶

Every plugin type uses the same two classmethods for CLI integration:

@classmethod
def add_arguments(cls, parser) -> None:
    """Called once at startup. Add plugin-specific flags to argparse."""
    g = parser.add_argument_group("My Plugin")
    g.add_argument("--my-plugin", action="store_true",
                   help="Enable the my-plugin plugin.")
    g.add_argument("--my-param", type=float, default=0.5,
                   help="Example parameter (default: 0.5).")

@classmethod
def from_args(cls, args):
    """Called after argument parsing. Return an instance if activated, else None."""
    if not getattr(args, "my_plugin", False):
        return None
    return cls(param=getattr(args, "my_param", 0.5))

add_arguments -- Create an argument group with parser.add_argument_group() to keep your flags visually grouped in --help output.
from_args -- Return None if the activation flag was not set; the registry will skip your plugin. Use getattr(args, ..., default) rather than direct attribute access to be resilient against missing attributes.

Step 4: Expose PLUGIN_CLASS¶

At the bottom of your module file:

PLUGIN_CLASS = MyClass

This is the sentinel that PluginRegistry.discover() looks for when it scans plugin directories. If it is missing, your plugin will be silently skipped.

Writing a Gaze Plugin¶

Gaze plugins provide custom gaze estimation backends. The first plugin whose from_args returns a non-None instance is used as the gaze backend for the entire run. Plugins with is_fallback = True are tried last.

Class Skeleton¶

class MyGaze(GazePlugin):
    """One-line description of your gaze backend."""

    name = "my_gaze"
    mode = "per_face"   # or "scene"
    is_fallback = False

    def __init__(self, model_path: str = "default"):
        self._model_path = model_path
        # Load your model here

Attribute	Purpose
`name`	Unique string identifier for the registry.
`mode`	`"per_face"` or `"scene"` -- controls which estimation method the pipeline calls.
`is_fallback`	When `True`, this plugin is tried only after all non-fallback plugins.

Per-Face Mode¶

Set mode = "per_face" and implement estimate:

def estimate(self, face_bgr):
    """Estimate gaze from a cropped face image.

    Parameters
    ----------
    face_bgr : ndarray
        Cropped face region as a BGR numpy array.

    Returns
    -------
    tuple of (pitch_rad, yaw_rad, confidence)
        Pitch and yaw in radians, confidence in [0, 1].
    """
    pitch, yaw = self._model.predict(face_bgr)
    return (pitch, yaw, 0.9)

The pipeline crops each detected face and calls estimate once per face per frame.

Scene-Level Mode¶

Set mode = "scene" and implement estimate_frame:

def estimate_frame(self, frame_bgr, face_bboxes_px: list) -> list:
    """Estimate gaze for all faces in a full frame.

    Parameters
    ----------
    frame_bgr : ndarray
        Full frame as a BGR numpy array.
    face_bboxes_px : list
        List of face bounding boxes in pixel coordinates.

    Returns
    -------
    list of (gaze_xy_px, confidence)
        One entry per bounding box. gaze_xy_px is the predicted
        gaze target point in pixel coordinates.
    """
    results = []
    for bbox in face_bboxes_px:
        gaze_point = self._model.predict_scene(frame_bgr, bbox)
        results.append((gaze_point, 0.85))
    return results

Custom Pipeline (Advanced)¶

Override run_pipeline() for full control over face cropping, estimation, temporal smoothing, and ray construction. When implemented, the coordinator in GazeTracking/gaze_pipeline.py calls this instead of the default per-face or scene handler.

def run_pipeline(self, **kwargs):
    """Self-contained gaze estimation pipeline.

    Common kwargs
    -------------
    frame           : BGR numpy array at display resolution.
    faces           : List of detected face dicts (from RetinaFace).
    objects         : Non-person detection list.
    gaze_cfg        : GazeConfig with ray parameters.
    smoother        : Optional GazeSmootherReID instance.
    snap_hysteresis : Optional SnapHysteresisTracker instance.
    aux_frames      : dict[(pid_label, stream_type), ndarray | None] --
                      per-participant auxiliary video frames.

    Returns
    -------
    tuple of (persons_gaze, face_confs, face_bboxes, face_track_ids,
              face_objs, ray_snapped, ray_extended)
    """
    frame = kwargs['frame']
    faces = kwargs.get('faces', [])
    # ... full pipeline logic ...
    return (persons_gaze, face_confs, face_bboxes,
            face_track_ids, face_objs, ray_snapped, ray_extended)

Selection Behavior¶

At startup the registry iterates all discovered gaze plugins and calls from_args on each. The first plugin that returns a non-None instance wins and is used for the entire run. Plugins with is_fallback = True are deferred to the end of the iteration order, so they only activate when no other gaze plugin was selected.

For a complete worked example, see Gaze Plugin Tutorial.

Writing an Object Detection Plugin¶

Object detection plugins augment or replace the default YOLO detection pass. Multiple plugins can be active simultaneously and are chained: each receives the output of the previous plugin.

Class Skeleton¶

class MyDetector(ObjectDetectionPlugin):
    """One-line description of your detector."""

    name = "my_detector"

    def __init__(self, conf_threshold: float = 0.3):
        self._conf = conf_threshold

The detect() Method¶

def detect(self, *, frame, detection_frame, all_dets, det_cfg, **kwargs):
    """Post-process or replace the detection list for one frame.

    Called after YOLO each frame, BEFORE the person/object split
    and BEFORE gaze estimation. Multiple plugins chain: each
    receives the output of the previous plugin.

    Parameters
    ----------
    frame           : ndarray -- full resolution BGR frame.
    detection_frame : ndarray -- possibly downscaled frame fed to YOLO.
    all_dets        : list[Detection] -- current detections (persons + objects).
    det_cfg         : DetectionConfig -- current detection configuration.

    Returns
    -------
    list[Detection] to replace the detection list, or None to keep unchanged.
    """
    # Example: filter out low-confidence detections
    filtered = [d for d in all_dets if d.conf >= self._conf]
    return filtered

You can modify the list in-place and return None, or return a new list to replace it entirely.

Parameter Reference¶

Parameter	Type	Description
`frame`	`ndarray`	Full resolution BGR frame
`detection_frame`	`ndarray`	Possibly downscaled frame fed to YOLO
`all_dets`	`list[Detection]`	Current detections (persons + objects)
`det_cfg`	`DetectionConfig`	Current detection config (conf thresholds, class IDs, etc.)

For a complete worked example, see Object Detection Plugin Tutorial.

Writing a Phenomena Plugin¶

Phenomena plugins track gaze-based social phenomena (mutual gaze, joint attention, gaze following, etc.) and integrate with the dashboard, CSV output, and live charting.

Class Skeleton¶

class MyTracker(PhenomenaPlugin):
    """One-line description of your phenomena tracker."""

    name = "my_tracker"
    dashboard_panel = "right"   # "left" or "right"

    def __init__(self, threshold: float = 0.5):
        self._threshold = threshold
        self._events = []

Attribute	Purpose
`name`	Unique string identifier. Appears in CSV headers, dashboard titles, and the internal registry.
`dashboard_panel`	Which side panel to draw into: `"left"` or `"right"`.

The update() Method¶

Called once per video frame with all available pipeline data as keyword arguments. Pull only what you need.

def update(self, **kwargs) -> dict:
    frame_no     = kwargs['frame_no']
    persons_gaze = kwargs.get('persons_gaze', [])
    hits         = kwargs.get('hits', set())
    # ... your tracking logic ...
    return {}

Full kwargs Reference¶

Key	Type	Description
`frame_no`	`int`	Current frame index
`persons_gaze`	`list[(origin, ray_end, angles)]`	Per-face gaze data
`face_bboxes`	`list[(x1, y1, x2, y2)]`	Face bounding boxes
`hit_events`	`list[dict]`	Per-hit records with `face_idx`, `object`, `object_conf`, `bbox`
`joint_objs`	`set[int]`	Object indices currently under joint attention
`dets`	`list[Detection]`	Non-person YOLO detections
`n_faces`	`int`	Number of visible faces this frame
`face_track_ids`	`list[int]`	Stable re-ID track IDs (one per face)
`hits`	`set[(face_idx, obj_idx)]`	Gaze-object intersection pairs
`aux_frames`	`dict`	Auxiliary video frames keyed by `(pid, stream_type)`

The return value is a dict of plugin-specific live state. Other parts of the system (e.g. the GUI) may inspect it. Return {} if you have nothing to report.

Dashboard Display¶

Implement dashboard_data() to provide structured data for the dashboard renderer:

def dashboard_data(self, *, pid_map=None) -> dict:
    rows = []
    if self._events:
        rows.append({'label': 'Total events', 'value': str(len(self._events))})
        rows.append({'label': 'Last event', 'value': f'frame {self._events[-1]}'})
    return {
        'title':      'MY TRACKER',
        'colour':     (200, 200, 200),   # BGR tuple
        'rows':       rows,
        'empty_text': '--',
    }

Key	Type	Description
`title`	`str`	Section heading in the panel
`colour`	`tuple`	BGR colour for the title
`rows`	`list[dict]`	Each dict has `'label'` and optionally `'value'`, `'pct'`
`empty_text`	`str`	Placeholder when rows is empty

CSV Output¶

Implement csv_rows() to append rows to the post-run summary CSV:

def csv_rows(self, total_frames: int, *, pid_map=None) -> list:
    if not self._events:
        return []
    return [
        [],                                     # blank separator line
        ["my_tracker_events"],                   # section header
        ["category", "frame_no", "value"],       # column header
        *[["event", e['frame'], e['val']] for e in self._events],
    ]

The pattern is: blank row, section-name row, column-header row, then data rows. Return an empty list if there is nothing to write.

Optional Methods¶

These methods have default (no-op) implementations in PhenomenaPlugin. Override as needed:

Method	Purpose
`draw_frame(frame)`	Annotate the video frame in-place (called after `update`).
`console_summary(total_frames, *, pid_map)`	Return a string for post-run stdout output.
`time_series_data()`	Return time-series data for post-run chart generation.
`latest_metric()`	Return the current-frame scalar metric value for live charting.
`latest_metrics()`	Return a dict of per-series metric values for the live dashboard.
`dashboard_widget()`	Return a custom QWidget for the GUI dashboard, or `None`.

For a complete worked example, see Phenomena Plugin Tutorial.

Writing a Data Collection Plugin¶

Data collection plugins provide custom output hooks that run every frame and after the run completes. Use them for custom file formats, database writes, streaming output, or chart generation.

Class Skeleton¶

class MyOutput(DataCollectionPlugin):
    """One-line description of your data output."""

    name = "my_output"

    def __init__(self, output_path: str = "output.json"):
        self._path = output_path
        self._records = []

Per-Frame Hook¶

def on_frame(self, **kwargs) -> None:
    """Called once per frame after all pipeline stages and display updates.

    Common kwargs: frame_no, persons_gaze, face_bboxes, hit_events,
    face_track_ids, hits, objects, confirmed_objs, pid_map
    """
    frame_no = kwargs['frame_no']
    hits = kwargs.get('hits', set())
    self._records.append({'frame': frame_no, 'n_hits': len(hits)})

Post-Run Hook¶

def on_run_complete(self, **kwargs) -> None:
    """Called after the video loop ends with summary data.

    Common kwargs: total_frames, total_hits, look_counts,
    source, all_trackers
    """
    import json
    with open(self._path, 'w') as f:
        json.dump(self._records, f)

Chart Generation (Optional)¶

def generate_charts(self, output_dir: str, **kwargs) -> list[str]:
    """Generate custom post-run charts. Called when --charts is enabled.

    Parameters
    ----------
    output_dir : str
        Directory where chart files should be saved.
    **kwargs
        Same summary data as on_run_complete.

    Returns
    -------
    list[str] -- file paths of created chart images.
    """
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.plot([r['frame'] for r in self._records],
            [r['n_hits'] for r in self._records])
    path = f"{output_dir}/my_output_chart.png"
    fig.savefig(path)
    plt.close(fig)
    return [path]

For a complete worked example, see Data Collection Plugin Tutorial.

Testing Your Plugin¶

Run MindSight with your activation flag:

python MindSight.py --source video.mp4 --my-plugin --my-param 0.7

Check that:

Your plugin prints any startup confirmation (if you added one in from_args).
No import errors appear at startup -- if your plugin fails to load, the registry emits a RuntimeWarning with the traceback.
Your output appears in the expected place (dashboard panel, CSV section, chart file, or custom output).
No errors appear when new kwargs are added by other pipeline components -- always use kwargs.get() with defaults for optional keys.