IDE4EEG v0.6 - User Manual

Environment for design and debugging of EEG analysis pipelines

Integrated Development Environment for EEG


IDE4EEG reads raw EEG recordings in several formats, applies a configurable preprocessing pipeline (filtering, artifact rejection, ICA, Matching Pursuit), and produces time-domain, frequency-domain, time-frequency, connectivity, and source-localisation analyses with publication-ready plots. The two main analysis paths are event-locked (stimulus-related epochs) and rest (continuous); the choice is made on the Preprocess tab’s Segmentation setup panel and propagates through every subsequent step.

The intended workflow is interactive: you load a recording, configure preprocessing and analysis steps in the GUI, run the pipeline, browse the results, and iterate. The same configuration can also be saved as a config.toml file and run from the command line for batch processing or cluster jobs, or exported as a self-contained Python script — all three paths share the same code, so results match (up to small floating-point differences under parallelism).

This manual mirrors the GUI tab layout. Chapters 1–7 each cover one tab in the order you would encounter it on first use: Config, Input, Preprocess, Analysis, Run, Output, Help. Chapter 8 documents the bundled helper applications (Svarog, ConnectiVIS, empi, MP Book Viewer). The appendices cover installation, supported file formats, the complete config.toml reference, an FAQ, and advanced/hidden parameters.


Conventions and MNE integration

IDE4EEG is built on top of MNE-Python: every loaded recording lives as an mne.io.Raw object, every cut signal as mne.Epochs, and the canonical save format is MNE-FIF (*-raw.fif for continuous data, *-epo.fif for epochs). Channel types use MNE’s vocabulary (eeg, eog, emg, ecg, bio, stim, misc, …); montages are taken from MNE’s built-in library (standard_1020, standard_1005, biosemi64, …). When in doubt about a parameter’s meaning, the corresponding MNE function’s docstring is the authoritative reference — IDE4EEG forwards most parameters unchanged.

SI units and the V vs µV convention

MNE stores voltages in volts (V) — not microvolts. Times are in seconds, frequencies in hertz. IDE4EEG follows this convention wherever a parameter is forwarded to MNE:

Legacy IDE4EEG-developed code paths — predating the move onto MNE — kept their thresholds in µV directly for human readability:

Every parameter table in this manual names the unit it expects; the GUI tooltips also show the friendly unit (µV, ms) regardless of how the value is stored on disk.
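As a quick illustration of the unit convention, the sketch below converts a threshold typed in µV into the volts that MNE stores. The helper names uv_to_v and v_to_uv are hypothetical and not part of IDE4EEG; the dictionary shape is the one mne.Epochs accepts for its reject argument.

```python
def uv_to_v(value_uv: float) -> float:
    """Convert microvolts (GUI-friendly unit) to volts (the unit MNE stores)."""
    return value_uv * 1e-6


def v_to_uv(value_v: float) -> float:
    """Convert volts back to microvolts for display."""
    return value_v * 1e6


# A 150 µV rejection threshold, expressed the way MNE expects it (in volts).
reject = {"eeg": uv_to_v(150.0)}
print(reject)
```

Passing 150 directly where volts are expected would set a threshold a million times too large, so the conversion is worth doing explicitly whenever a value is hand-edited.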

Number entry — decimal separator and sign

IDE4EEG uses . (period) as the decimal separator everywhere, regardless of the system locale. This matches:

A German / Polish / French OS may format numbers with , (comma) elsewhere — that comma is not accepted by IDE4EEG’s input fields. Type 0.5, not 0,5.

Numeric input fields that semantically can’t be negative (frequency bounds, amplitudes, scales, iteration counts) are clamped at typing time on the GUI side: the - key is rejected outright by the field’s validator. The pre-launch consistency rules also flag negative values that arrive via TOML edits or clipboard paste, so a typo never reaches the pipeline silently.
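The two rules above (period as decimal separator, no negative values where they make no sense) can be sketched as a small validator. This is an illustration only: parse_number and its allow_negative flag are hypothetical names, not IDE4EEG's actual validator.

```python
def parse_number(text: str, allow_negative: bool = False) -> float:
    """Parse a numeric field value using '.' as the only decimal separator."""
    text = text.strip()
    if "," in text:
        # A locale-style comma is rejected with a clear message.
        raise ValueError("use '.' as the decimal separator, e.g. 0.5")
    value = float(text)
    if value < 0 and not allow_negative:
        raise ValueError("negative values are not allowed for this field")
    return value


print(parse_number("0.5"))
```

The same check can be run over values read from a hand-edited TOML file, which is the case the pre-launch consistency rules are described as covering.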

Where MNE code is invoked directly

Preprocessing. The following steps wrap MNE transparently — IDE4EEG supplies the parameters, MNE does the work:

Analysis. Chapter 4 is split on purpose. §4.1 UW-developed analyses (Matching Pursuit decomposition, MP→dipole, MVAR / DTF / PDC connectivity, FASP EEG profiles) is original IDE4EEG code. §4.2 MNE-wrapped analyses — ERP butterfly / joint / topomap / GFP, cluster permutation, PSD multitaper / Welch / per-channel, TFR Morlet/multitaper, ERD/S topomap, evoked comparison, drop-log, channel locations, source estimation — are thin wrappers over the corresponding MNE functions, organised on the Analysis tab into six collapsible category panels.

Source estimation. All four source-estimation entries on the Analysis tab call MNE inverse-solution routines: classical ERP dipole fit (mne.fit_dipole), minimum-norm family (mne.minimum_norm.apply_inverse with method ∈ MNE / dSPM / sLORETA / eLORETA), LCMV beamformer (mne.beamformer.make_lcmv + apply_lcmv), and MxNE / iRMxNE (mne.inverse_sparse.mixed_norm). This includes the MP→dipole pipeline in §4.1.1: the time-frequency MP decomposition is IDE4EEG-native (empi), but the inverse problem that turns each MP atom into a current source is solved by MNE against an MNE-derived BEM and forward solution.

Independent of MNE. Matching Pursuit decomposition (empi binary), the MVAR / DTF / PDC connectivity solver, the FASP EEG-profile algorithm, and the L2CS-Net + InsightFace gaze pipeline are IDE4EEG-native code — no MNE dependency on the algorithmic side, although their outputs are still wrapped into MNE-compatible structures where useful.


1. Config tab

The Config tab holds settings that apply to the whole pipeline rather than to any individual step. From here you point IDE4EEG at the bundled helper applications, control how many CPU cores it uses, switch a few global behaviours on or off, and load/save TOML configuration files.

A new install needs nothing here to run — the only mandatory inputs are on the Input tab. Visit the Config tab once to download the helper apps, then leave it alone.

1.1 Loading and saving config files

The bottom row of the tab has four buttons (also reachable via the File menu):

If a config.toml exists in the working directory when IDE4EEG launches, it is loaded automatically. Otherwise the GUI starts with sensible defaults; no config file is required.

1.2 External tool paths

These paths are auto-detected at startup. On first launch of a fresh install, IDE4EEG auto-downloads the missing helpers (Svarog with bundled empi, ConnectiVIS, and Adoptium Temurin JDK 17); they are treated as crucial parts of the package, not optional add-ons. A modal progress dialog appears while the ~80–150 MB total downloads in the background; click Cancel to defer and use the Config tab’s per-row Download recent button later. CLI / batch mode (ide4eeg --run config.toml) auto-installs without a dialog and logs progress to stdout.

When the auto-install completes (or if you already had a helper installed) the corresponding path field on the Config tab is filled in automatically. Only the missing tools are fetched — a Svarog version you’ve pinned by hand at ~/.obci/svarog/ will NOT be overwritten by the first-launch flow.

The per-row Download recent button on the Config tab forces a refresh — it downloads the latest GitLab CI artifact and overwrites the existing install. Use this when you want to update to a newer build; the prompt explicitly warns when other helpers will also be re-fetched.

See Chapter 8 for what each helper does.

| Parameter | Auto-install path | Description |
|---|---|---|
| svarog_jar | ~/.obci/svarog/svarog-standalone-*.jar | Svarog standalone JAR (signal viewing, MP book display, interactive review). |
| connectivis_jar | ~/.obci/connectivis/connectivis.jar | ConnectiVIS JAR (3D viewer for dipole sources and connectivity arrows). Shown as visualisation under the Brain geometry (3D) subsection. |
| empi_path | ~/.obci/svarog/empi-*-<platform> | empi binary for Matching Pursuit decomposition. |
| fsaverage (Browse/Download) | ~/.obci/ide4eeg/fsaverage/ | FreeSurfer’s average brain model: a curated ~250 MB subset, downloaded on demand for dipole/source localization and ConnectiVIS 3D views. Shown under Brain geometry (3D). See §4.1.1. |

Download recent drops files at the paths above (the * is a version stamp; <platform> is mac-x86_64 / mac-arm64 / linux-x86_64 / windows-x86_64). When auto-detection at startup finds a matching file, the corresponding Parameter field is filled in automatically. Configuring an explicit path in the field is only needed when you want a custom build outside ~/.obci/.
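The startup auto-detection described above amounts to a glob over the install directory. The sketch below shows one way to do it; find_helper is a hypothetical helper, and picking the most recently modified match when several versions are present is an assumption, not documented IDE4EEG behaviour.

```python
from pathlib import Path


def find_helper(directory, pattern):
    """Return the most recently modified file matching `pattern`, or None."""
    matches = sorted(
        Path(directory).expanduser().glob(pattern),
        key=lambda p: p.stat().st_mtime,  # assumption: newest file wins
    )
    return matches[-1] if matches else None
```

For example, find_helper("~/.obci/svarog", "svarog-standalone-*.jar") returns the detected JAR path, or None when nothing has been installed yet.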

Video backends for Svarog video sync. Svarog plays video frames in lockstep with EEG signal scrolling using one of three backends, in priority order:

  1. mpv (default since Svarog 4.20) — bundled with the standalone ZIPs and declared as Depends: in the .deb. The Download recent button extracts portable mpv into ~/.obci/svarog/mpv/ on Linux and Windows from the official mpv release archives. Plays MKV / MP4 / AVI / WebM directly via libavcodec.
  2. JavaFX 17.0.15 — embedded fallback. Built into the full Svarog standalone JAR; the default light JAR excludes the native JavaFX libraries to keep the download small. Codec coverage is narrow: MP4/H.264/AAC, FLV, FXM, M3U8 only — JavaFX cannot play MKV, WebM, AVI, or MOV containers, so this fallback only covers studies recorded as MP4. Recordings from OBS, ffmpeg-piped webcams, or any pipeline producing MKV need mpv or VLC.
  3. VLC — opt-in for users with an existing system VLC install. No download needed; Svarog detects and uses it automatically when neither mpv nor JavaFX is present. Plays the same broad container set as mpv (MKV/MP4/AVI/WebM/MOV), so it’s the practical workaround for MKV recordings on macOS without Homebrew.

On macOS, portable mpv is impractical for the pip-installed app (it would require bundling ~30 dylibs from /opt/homebrew/lib and rewriting their rpaths). Instead, IDE4EEG checks whether Homebrew is installed; if so, the Download recent button asks for explicit consent and runs brew install mpv on your behalf. If you decline or Homebrew is missing, the button still installs Svarog/empi/ConnectiVIS successfully and reports that mpv is unavailable — install manually with brew install mpv to enable video sync. Note that without mpv on macOS, JavaFX becomes the only fallback, which means video sync works only for MP4/H.264 recordings; MKV/WebM/AVI files won’t play. For non-MP4 study recordings, install either mpv or VLC before launching Svarog.

The macOS full .dmg sidesteps all of this: it bundles a self-contained arm64 mpv (from the stolendata.net portable build) and a static arm64 ffmpeg (from osxexperts.net) inside IDE4EEG-full.app/Contents/Resources/, and a PATH prepend in ide4eeg/__init__.py makes them discoverable to SVAROG’s Runtime.exec("mpv") and Runtime.exec("ffmpeg") calls. No Homebrew required.

Planned improvement. A future release will add a one-click VLC.app download as an explicit macOS fallback for users without Homebrew, populating Svarog’s <vlcPath> config key automatically. VLC.app is self-contained (no dylib bundling needed), plays the same broad container set as mpv, and would close the “macOS without Homebrew + MKV recordings” gap. mpv stays primary on every platform; VLC is added only as a fallback path, not a replacement.

L2CS-Net (default eye-gaze backend). A separate Download button at the bottom of the Tool Paths group installs PyTorch + the l2cs Python package + the Gaze360 weights (~96 MB checkpoint, MIT-mirrored from Ahmednull/L2CS-Net) in one shot. Total install ~500 MB on macOS / Linux-CPU, up to ~2 GB on Linux with CUDA wheels. L2CS provides true per-eye gaze direction; without it, the only available backend is the InsightFace head-pose fallback, which measures where the head is pointing rather than where the eyes are looking (see §3.14 Gaze for the full comparison).

Video processing panel. A persistent panel under Tool Paths reports the live status of the five core Python packages the facetag pipeline needs: OpenCV, PyAV, imageio, InsightFace, ONNX Runtime. (PyTorch + L2CS-Net live in the dedicated L2CS-Net row above — they have their own one-shot installer that also fetches the Gaze360 weights file.) Each package row shows one of:

The panel refreshes automatically when the Config tab gets focus, so packages installed in another terminal flip from [--] to [ok] without restarting IDE4EEG. When any package is missing, an Install missing button appears next to the summary; click it to confirm and run pip install for the core packages, with progress streamed live into a monospace log dialog. After install, packages that require a process restart (cv2, av, insightface, onnxruntime: anything with a C extension) trigger a “Restart IDE4EEG?” prompt that re-execs the Python process so the new modules load cleanly.
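A status check of this kind can be approximated with the standard library alone. The sketch below is not the IDE4EEG implementation; it only tests whether each import name can be located, using the five packages named above, and prints the same [ok] / [--] markers.

```python
import importlib.util

# Display label -> import name for the five core video packages.
VIDEO_MODULES = {
    "OpenCV": "cv2",
    "PyAV": "av",
    "imageio": "imageio",
    "InsightFace": "insightface",
    "ONNX Runtime": "onnxruntime",
}


def package_status(modules=VIDEO_MODULES):
    """Map each label to True when its module can be found, without importing it."""
    return {
        label: importlib.util.find_spec(name) is not None
        for label, name in modules.items()
    }


for label, ok in package_status().items():
    print(f"[{'ok' if ok else '--'}] {label}")
```

Because find_spec only locates a module, a package that is installed but broken (for example a platform-blocked wheel) would still show [ok] here; a fuller diagnostic has to attempt the import.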

The same diagnostic is available without launching the GUI:

python -m ide4eeg.install_diagnostics    # print status only
python -m ide4eeg.install_runner -y      # install everything missing
python -m ide4eeg.install_runner -y --no-l2cs   # skip L2CS extras

The first invocation only prints status (read-only). The second runs the same pip install flow as the GUI’s Install missing button, with progress streamed to stdout and a final per-package summary. Useful on headless machines, in CI, and for scripted setup.

The CLI runner (ide4eeg --run config.toml) runs the same preflight automatically — if a config has prepare_video_artifacts = true and any required package is missing or platform-blocked, the run aborts with the diagnostic text before any preprocessing starts, so the user sees actionable remediation up front instead of a Python traceback ten minutes into a long pipeline.

First-time download troubleshooting (TLS certificate errors). If Download recent fails with a TLS error, IDE4EEG handles it automatically in two steps:

  1. First attempt — validate against certifi’s bundled root CA list. Works for almost everyone; certifi is in requirements.txt.
  2. Fallback attempt — if certifi fails (e.g. corporate networks with TLS-inspecting proxies), retry against the platform’s default CA store.
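The two-step certificate strategy can be sketched as follows. This is a simplified illustration under stated assumptions: candidate_contexts and download are hypothetical names, and the open_url parameter exists only so the logic can be exercised without a network connection.

```python
import ssl
import urllib.error
import urllib.request


def candidate_contexts():
    """Yield (label, SSLContext): certifi's CA bundle first, then the platform store."""
    try:
        import certifi  # third-party package, skipped here if not installed
        yield "certifi", ssl.create_default_context(cafile=certifi.where())
    except ImportError:
        pass
    yield "system", ssl.create_default_context()


def download(url, open_url=urllib.request.urlopen):
    """Fetch `url`, retrying with the next CA source only on TLS errors."""
    failures = []
    for label, context in candidate_contexts():
        try:
            with open_url(url, context=context) as response:
                return response.read()
        except urllib.error.URLError as exc:
            if not isinstance(exc.reason, ssl.SSLError):
                raise  # not a certificate problem: do not mask it
            failures.append(f"{label}: {exc.reason}")
    raise RuntimeError("TLS verification failed (" + "; ".join(failures) + ")")
```

Only certificate-related failures trigger the fallback; HTTP errors and connection problems propagate unchanged, so a missing artifact is not misreported as a TLS issue.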

If both fail, the error dialog shows platform-specific remediation:

1.2.1 Where IDE4EEG stores files

A handful of top-level filesystem locations cover everything IDE4EEG installs, downloads, or saves at runtime:

| Path | Contents | Typical size |
|---|---|---|
| ~/.obci/svarog/svarog-standalone-*.jar | Svarog standalone JAR (signal + book + tag review). | ~50 MB |
| ~/.obci/svarog/empi-*-<platform> | empi binary (MP decomposition). | ~2 MB |
| ~/.obci/svarog/mpv/mpv-*-<platform>/ | Portable mpv (Linux / Windows only; macOS uses brew install mpv). | ~30 MB |
| ~/.obci/connectivis/connectivis.jar | ConnectiVIS JAR (3D dipole / connectivity viewer). | ~35 MB |
| ~/.obci/ide4eeg/fsaverage/ | FreeSurfer’s average brain model, curated subset (BEM solution + transform, cortical/head surfaces, atlas), downloaded on demand for dipole/source localization & ConnectiVIS 3D (§4.1.1). | ~250 MB |
| ~/.obci/ide4eeg/models/L2CSNet_gaze360.pkl | L2CS-Net gaze checkpoint (downloaded on first L2CS use). | 91 MB |
| ~/.obci/ide4eeg/insightface/models/buffalo_l/ | InsightFace face detection / identity models (downloaded on first use). | ~200 MB |
| ~/mne_data/MNE-fsaverage-data/ | Full fsaverage from MNE’s OSF mirror, used only as the last-resort fallback when the managed ~/.obci/ide4eeg/fsaverage/ subset and any existing copy are absent (§4.1.1). MNE-managed cache, not bundled. | ~0.7 GB |
| ~/.obci/ide4eeg/stage5_dismissed.json | Per-rule “don’t show again” dismissals (created on first dismissal). | < 1 KB |
| ~/IDE4EEG_examples/<name>/ | Example datasets + their pipeline outputs (each example is self-contained). | ~50–200 MB per example |
| <venv>/lib/python3.X/site-packages/{cv2,av,imageio,insightface,onnxruntime,torch,torchvision,l2cs}/ | Python video stack, installed via pip into the active venv. | ~250 MB core; up to ~2 GB with PyTorch CUDA |

These principles drive the layout:

Migration of legacy paths. Existing installs may have files in three old locations:

On first launch after upgrading, IDE4EEG runs migrate_legacy_paths() from ide4eeg/__init__.py which performs same-disk renames into the new tree. The migration is idempotent and fast (single rename per file, no re-download). Cross-disk or permission failures are logged as warnings and the legacy path stays in place; on next use, InsightFace re-downloads to the new location while L2CS retries reading from the legacy file as a fallback.
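The behaviour described above (same-disk rename, idempotent, failures downgraded to warnings) can be sketched like this. It is not the actual migrate_legacy_paths() code; the function name migrate and the (legacy, new) pair format are assumptions made for the example.

```python
import logging
import os
from pathlib import Path

log = logging.getLogger("migration_sketch")


def migrate(moves):
    """Rename each (legacy_path, new_path) pair; safe to call repeatedly."""
    for old, new in moves:
        old, new = Path(old), Path(new)
        if not old.exists() or new.exists():
            continue  # nothing to move, or already migrated
        try:
            new.parent.mkdir(parents=True, exist_ok=True)
            os.rename(old, new)  # raises OSError across filesystems; no copy is attempted
        except OSError as exc:
            # Leave the legacy file in place and report the problem.
            log.warning("could not migrate %s: %s", old, exc)
```

Using a plain rename keeps the operation fast and avoids a half-copied file if the process is interrupted, at the cost of not handling cross-disk moves.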

To remove IDE4EEG’s runtime data:

1.3 Parallel jobs

The Parallel jobs field controls how many parallel worker processes IDE4EEG uses for CPU-heavy parts of the pipeline; the workers are managed by joblib (https://joblib.readthedocs.io/). This is the same knob MNE and scikit-learn already expose internally, unified behind one entry point.

GUI default: auto-fallback. The GUI pre-fills the field with the auto-resolved value min(physical_cores - 2, available_ram_gb // 2), floored at 1; that is, it uses all cores except two, capped by RAM at roughly 2 GB per worker. On an 8-physical-core / 16 GB machine the field starts at 6. The system-info readout below the field shows the same number plus the hardware breakdown (live-updated as you edit), with warnings (yellow) for risky values like “more than half of physical cores” or “RAM footprint exceeds one third of detected RAM.”

Headless / CLI default: same auto value as the GUI. “Headless” means running without the GUI, typically ide4eeg --run config.toml over SSH on a remote server, a cluster job, or a CI pipeline. When parallelism.n_jobs is unset (or 0) in the TOML, ide4eeg --run resolves it to the same auto value the GUI computes (min(physical_cores - 2, available_ram_gb // 2), floored at 1), so a batch run uses multiple cores by default. Two exceptions force sequential execution regardless: frozen/packaged builds, and non-frozen Windows with no explicit n_jobs (loky workers can crash there). Set parallelism.n_jobs = 1 in the TOML for a fully sequential run.
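The auto value is simple enough to check by hand. The sketch below reproduces the formula and the "unset or 0 means auto" rule; the function names are illustrative, and the frozen-build and Windows exceptions mentioned above are deliberately left out.

```python
def auto_n_jobs(physical_cores: int, available_ram_gb: float) -> int:
    """min(physical_cores - 2, available_ram_gb // 2), floored at 1."""
    return max(1, min(physical_cores - 2, int(available_ram_gb // 2)))


def resolve_n_jobs(configured, physical_cores, available_ram_gb):
    """An unset (None) or 0 value falls back to the auto value."""
    if not configured:
        return auto_n_jobs(physical_cores, available_ram_gb)
    return int(configured)


print(auto_n_jobs(8, 16))        # 6, the example machine from the text
print(auto_n_jobs(2, 4))         # 1, because 2 - 2 = 0 is floored at 1
print(resolve_n_jobs(3, 8, 16))  # 3, an explicit value is used as-is
```

On a machine with many cores but little RAM the second term wins: 16 cores with 8 GB resolves to 4 workers.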

Every parallelism-aware step uses the same n_jobs value: MNE preprocessing (filter, notch_filter, resample), MNE catalog analyses (PSD, TFR, cluster permutation tests), connectivity bootstraps, and MP decomposition. MP decomposition has its own override field (matching_pursuit.cpu_workers on the Preprocess tab) for power users on RAM-constrained machines, since empi’s C++ workers share memory and cost less RAM than joblib’s full-Python workers — but by default the MP field is empty and the inherited value is shown as grey italic placeholder text.

Packaged desktop builds — catalog analyses run single-threaded. On the standalone installers and portable bundles (the “Pythonless” Windows / macOS downloads), the MNE catalog analyses — PSD, TFR, ERD/S, the ERP cluster-permutation test, and source estimation — always run sequentially regardless of this field. A frozen application has no Python interpreter to re-launch, so when it tries to spawn parallel workers it relaunches its own executable; on Windows that handoff crashes or hangs the app (reported against the cluster-permutation test). Forcing those analyses to one worker on frozen builds sidesteps it. Everything else (preprocessing, connectivity, MP) still honours the field, and a normal pip / conda install is unaffected and uses the full value everywhere. If you ever see a packaged build hang or crash at an analysis step, set Parallel jobs = 1 as a safe fallback. For practical guidance on choosing a value and BLAS thread pinning details see Appendix E.5.
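The frozen-build rule can be pictured as a small guard in front of the catalog analyses. The sketch assumes the common convention that packaged Python applications set sys.frozen (as PyInstaller-style bundles do); whether IDE4EEG detects frozen builds this way is not stated in this manual, and catalog_n_jobs is a hypothetical name.

```python
import sys


def is_frozen() -> bool:
    """True inside a packaged executable, assuming the sys.frozen convention."""
    return bool(getattr(sys, "frozen", False))


def catalog_n_jobs(n_jobs: int, frozen=None) -> int:
    """Worker count for MNE catalog analyses: forced to 1 on frozen builds."""
    if frozen is None:
        frozen = is_frozen()
    return 1 if frozen else n_jobs


print(catalog_n_jobs(6, frozen=False))  # 6 on a pip / conda install
print(catalog_n_jobs(6, frozen=True))   # 1 on a packaged build
```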

1.4 Output directory and naming (overview)

Results are written to a folder named IDE4EEG_OUT_<input_filename>/ with preprocessing/ and analysis/ subdirectories. By default the folder lives next to the input signal; the output_path field on the Input tab overrides this. Overwrite output is on by default, so each full Run overwrites the previous results in place (one folder per recording). If you uncheck it, each full Run instead creates a timestamped subfolder (IDE4EEG_OUT_<input_filename>/<date_time>_<mode>/) so earlier results are kept. This timestamping applies to the full Run only — the per-step 👁 eye-button previews always write to the in-place folder regardless of the toggle, because they reuse stable paths to locate each step’s snapshot (a timestamped preview folder per click would defeat that lookup). Note: if you set a custom Preprocessing- or Analysis-output path on the Output tab, that path is used verbatim and Overwrite output has no timestamping effect (you chose the exact location).
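The folder rules above can be summarised in a few lines. The sketch makes several assumptions that the manual does not specify: <input_filename> is taken to be the file name without its extension, the timestamp format is %Y%m%d_%H%M%S, and mode is a free-form string. The function output_dir is hypothetical, and custom Preprocessing/Analysis output paths are not modelled.

```python
from datetime import datetime
from pathlib import Path


def output_dir(input_file, output_root=None, overwrite=True, mode="rest", now=None):
    """Return the folder a full Run would write to under the rules above."""
    input_file = Path(input_file)
    # Default location: next to the input signal.
    root = Path(output_root) if output_root else input_file.parent
    base = root / f"IDE4EEG_OUT_{input_file.stem}"  # assumption: extension dropped
    if overwrite:
        return base  # in-place folder, one per recording
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")  # assumed format
    return base / f"{stamp}_{mode}"


print(output_dir("/data/subject01.edf"))
print(output_dir("/data/subject01.edf", overwrite=False))
```

Step previews would always use the first form, since they depend on stable paths.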

Full table of output subfolders, file naming conventions, and what triggers each artefact: see Chapter 6 (Output tab).

1.5 Other Options

| Option | Default | Description |
|---|---|---|
| Overwrite output | on | Overwrite previous results in place (one folder per recording). Uncheck to create a timestamped subfolder per full Run instead. Applies to the full Run only; eye-button step previews always write in place (they need stable paths to find step snapshots). No effect when a custom Preprocessing/Analysis output path is set. |
| Verbose console logging | off | Print startup progress and Svarog launch commands to the terminal. |
| Allow changing the order of preprocessing steps and adding new filters | off | Reveal ↑/↓ arrows on every reorderable step and the Constraints editor (see §3.2). Off by default: the standard pipeline order is correct for most workflows. |
| Use MNE viewer instead of Svarog | off | Force every interactive review window (eye buttons, Output-tab double-click) to use MNE’s plot rather than Svarog. Useful if you don’t have Svarog installed or prefer MNE’s interactive plot. Independent of the explicit “Open in MNE” / “Open in SVAROG” buttons on the Input tab. |
| Display functions and config names in expanded panels | off | Show a small grey row at the top of each preprocessing panel listing the actual Python function it calls and the TOML config sections it reads. The row lives inside the panel body, so it appears when you expand a step (a collapsed Preprocessing tab won’t show it). Useful for debugging and matching GUI steps to CLI equivalents. |

1.6 Constraints editor (advanced)

Visible only when Allow changing the order… is checked. Lets you edit the step-order constraints used by the reorderable preprocessing steps. See §3.2.


2. Input tab

The Input tab is where you tell IDE4EEG which recording to analyse and inspect the signal you just loaded. Every preprocessing and analysis decision in the rest of the GUI assumes a valid recording is loaded here. The segmentation mode (event-locked epochs vs continuous) is set later on the Preprocess tab — see §3.1 Segmentation setup.

2.1 Choosing input files

Type or browse for the path in the Input field. It can point to a single EEG file or a directory; when given a directory, IDE4EEG walks it recursively and treats every supported file as a batch member. The supported file formats and their companion-file requirements are listed in Appendix B.
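Batch discovery over a directory can be sketched with pathlib. The extension set below is an assumption inferred from the formats named elsewhere in this manual (FIF, EEGLAB, BrainVision, EDF, BDF, BrainTech .raw); Appendix B remains the authoritative list, and collect_inputs is a hypothetical name.

```python
from pathlib import Path

# Assumed extensions; see Appendix B for the formats IDE4EEG really accepts.
SUPPORTED = {".fif", ".set", ".vhdr", ".edf", ".bdf", ".raw"}


def collect_inputs(path):
    """Return [path] for a single file, or every supported file under a directory."""
    path = Path(path)
    if path.is_file():
        return [path]
    return sorted(
        p for p in path.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

Sorting the result keeps batch order reproducible between runs, which helps when comparing logs.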

Recent files. Click Recent files to pick from the last few inputs. Use Clear to wipe the list.

Output root path. The Output field controls where IDE4EEG_OUT_<filename>/ is created. Empty = save next to the input file.

Video path. For §3.14 Gaze the video is auto-detected from the input filename (recording.raw → recording.mp4). Override with the Video field if needed.

MNE sample dataset. The Download MNE example (1.5 GB) button downloads the MNE sample dataset — a single-subject auditory/visual EEG+MEG recording (59 EEG channels, 600 Hz, 278 s, 4 conditions) with full FreeSurfer reconstruction. It auto-configures the classical ERP analysis: filter 0.1–40 Hz → epoch -0.2 to 0.5 s → baseline → reject 150 µV → ERP butterfly / joint / topomap / GFP / cluster permutation / ERP dipole fit with subject-specific 3D visualisation.

2.2 Signal info panel

Once a file is selected, IDE4EEG reads its header and shows duration, sampling rate, channel count, channel-type histogram, and any event markers it found. A collapsible Detail view lists every channel name and event tag.

View buttons. Two buttons launch a viewer on the raw input (no preprocessing): Open in SVAROG and Open in MNE. These are explicit file-viewer buttons — they bypass the Use MNE viewer instead of Svarog option in §1.5. Disabled until a file is loaded.

2.3 Channel types

Every channel in a loaded recording is tagged with an MNE channel type (eeg, eog, emg, ecg, bio, stim, misc, …). The Input tab is where this typing is reviewed and, when needed, overridden — every later panel on the Preprocess and Analysis tabs reads these type assignments to populate the adaptive EEG only / EOG only / … filter buttons and to power MNE’s pick_types(eeg=True) calls.

Channel selection (which channels to actually keep, drop, or hand to bad-channel detection) is configured later, on the Preprocess tab — see §3.5 Bad channels and §3.10 Select channels.

2.3.1 Channel type inference

This subsection covers where each channel’s type comes from (§2.3 above covers what the type is used for).

Where types come from, in order of authority:

  1. File-format metadata. For FIF, EEGLAB, and BrainVision files, channel types are read directly from the native file metadata (explicit per-channel type tags, chanlocs structures, header fields). Authoritative — IDE4EEG respects whatever the reader assigned.

  2. Readmanager name heuristic. IDE4EEG’s overlay (step 3) delegates every ambiguous (eeg/blank) channel name to readmanager.chtype_heuristic(name). The heuristic recognises (in priority order):

    | Rule | Example names | Type |
    |---|---|---|
    | Substring eog | EOG Left Horiz, VEOG, EOG Fp1-M2 | eog |
    | Substring emg | EMG Chin1, EMG Ant Tibia-0 | emg |
    | Substring ecg / ekg | ECG ECGI, EKG_lead1 | ecg |
    | Substring resp / sao2 / spo2 | Resp Thermistor, SaO2 SaO2 | bio |
    | Substring stim / trig / marker / status / sync, prefix sti | STIM, Trigger, STI 014 | stim |
    | Tokenised 10-05 position lookup | Fp1, C3, EEG F3-CLE, Fp1-M2 | eeg |
    | Fall-through | Aux1, Photo, Channel_42 | misc |

    Non-EEG substring rules take priority over the position lookup, so EOG Fp1-M2 (an EOG reference channel using Fp1/M2 references) is correctly typed as eog rather than eeg. The 10-05 position check tokenises on whitespace/punctuation before matching, so short position names (A1, C3, O1) don’t accidentally match inside unrelated words (audio1, Data1, misc3). Requires readmanager ≥ 1.4.0.

  3. IDE4EEG overlay (universal). After every file reader runs (FIF, EEGLAB, BrainVision, EDF, BDF, BrainTech .raw), IDE4EEG applies ide4eeg.input.input._refine_channel_types to the loaded signal. It (a) respects any specific non-eeg type the reader already assigned (FIF/BrainVision/EEGLAB native types, including a deliberately-misc channel) and (b) delegates only the ambiguous default-eeg/blank channels to the readmanager heuristic of step 2. This is why EDF/BDF files — which MNE blanket-defaults to eeg because the format has no channel-type field — get the same name-based refinement as a BrainTech recording.

    Unknown channels in standard montages. When most electrodes have recognisable 10-05 names (a “standard-named” recording, recognised EEG channels in the majority), any reader-eeg channel the heuristic still cannot place is demoted to misc so it does not silently join EEG-only operations (CAR, pick('eeg'), PSD, ICA, ICLabel, REST). For numbered / non-standard montages (e.g. BioSemi A1..A32, Ch1..ChN) such channels are instead kept as eeg. Either way the demotion/keep is announced in the Run log so you can confirm or correct it on the Input tab (§2.3.2).

  4. User overrides — see §2.3.2.
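The name heuristic of step 2 can be approximated in a few lines. This is a sketch written from the rule table above, not readmanager's chtype_heuristic: the function name guess_type is hypothetical and POSITIONS holds only a small illustrative subset of 10-05 electrode names.

```python
import re

# Substring rules in priority order; non-EEG rules come before the position lookup.
NON_EEG_RULES = [
    (("eog",), "eog"),
    (("emg",), "emg"),
    (("ecg", "ekg"), "ecg"),
    (("resp", "sao2", "spo2"), "bio"),
    (("stim", "trig", "marker", "status", "sync"), "stim"),
]

# Illustrative subset only; the real lookup covers the full 10-05 system.
POSITIONS = {"fp1", "fp2", "f3", "f4", "fz", "c3", "c4", "cz", "p3", "p4",
             "pz", "o1", "o2", "a1", "a2", "m1", "m2"}


def guess_type(name: str) -> str:
    lower = name.lower()
    for needles, chtype in NON_EEG_RULES:
        if any(needle in lower for needle in needles):
            return chtype
    if lower.startswith("sti"):
        return "stim"
    # Tokenise on anything that is not a letter or digit, then match whole tokens,
    # so "a1" does not match inside "audio1".
    tokens = [t for t in re.split(r"[^a-z0-9]+", lower) if t]
    if any(token in POSITIONS for token in tokens):
        return "eeg"
    return "misc"


for name in ("EOG Fp1-M2", "EEG F3-CLE", "STI 014", "audio1"):
    print(name, "->", guess_type(name))
```

Matching whole tokens rather than substrings for electrode positions is what keeps short names such as A1 or C3 from producing false positives.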

2.3.2 User overrides

When the automatic heuristic gets a channel wrong — typically for lab-specific names the substring rules don’t recognise (Heart_sensor, Chest_strap, Channel_42) — you can override it via the GUI or the TOML config.

In the GUI: expand the Review channel types collapsible section in the Signal Info area. This is an inline panel that lists every channel with its current MNE type. Right-click any channel → Set type → submenu to change it. Changes apply live: the override dict updates, every channel panel on Preprocessing/Analysis tabs refreshes, the per-type filter buttons (EEG only, EOG only, …) rebuild to reflect the new counts, and the Signal Info type-count summary updates. Picking auto removes an override and restores the auto-inferred type.

In TOML:

[choosing_channels.type_overrides]
"Heart_sensor" = "ecg"
"Chest_strap"  = "bio"
"Aux_L"        = "eog"
"BadChannel"   = "misc"     # exclude from EEG analysis by name

Keys are exact channel names (as they appear in the loaded file); values are any valid MNE channel type (eeg, eog, emg, ecg, bio, stim, misc, ref_meg, seeg, ecog, dbs, fnirs). Overrides always win: they take effect after both the file reader and the name-based heuristic have run. Entries whose channel name isn’t in the loaded file are logged and skipped.
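The matching rule (exact names, unknown entries skipped) can be sketched as a filter over the override table. The function usable_overrides is a hypothetical helper, not IDE4EEG code; with MNE, the resulting mapping is the kind of dictionary that Raw.set_channel_types accepts.

```python
def usable_overrides(ch_names, overrides):
    """Split overrides into those matching a loaded channel and those to skip."""
    present = set(ch_names)
    mapping, skipped = {}, []
    for name, chtype in overrides.items():
        if name in present:  # exact, case-sensitive match
            mapping[name] = chtype
        else:
            skipped.append(name)  # would be logged, then ignored
    return mapping, skipped


overrides = {"Heart_sensor": "ecg", "Chest_strap": "bio", "Aux_L": "eog"}
mapping, skipped = usable_overrides(["Fp1", "Heart_sensor", "Aux_L"], overrides)
print(mapping)   # {'Heart_sensor': 'ecg', 'Aux_L': 'eog'}
print(skipped)   # ['Chest_strap']
```

Because matching is exact, a key such as "heart_sensor" would be skipped for a channel named "Heart_sensor"; checking the skipped list catches that kind of typo.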

The Review panel is the single entry point for editing channel types. It lives on the Input tab because channel typing is a file-level concern (a property of “what’s in this recording?”) rather than a preprocessing step. Per-tab channel panels are read-only for types — they USE the type assignments (to power the adaptive filter buttons) but don’t let you edit them. This keeps the editing surface in one place and prevents GUI↔︎pipeline disagreements by construction.

2.3.3 Adaptive per-type filter buttons

Every channel panel with eeg_filter=True on the Preprocessing and Analysis tabs shows a row of per-type filter buttons next to All and None. The buttons are adaptive: only types actually present in the loaded signal get a button, with the current count shown in parentheses. For a MASS PSG example:

[All] [None]  [EEG only (18)] [EOG only (2)] [EMG only (5)] [ECG only (1)] [BIO only (5)]

Clicking ECG only (1) unchecks everything except the single ECG channel — useful for configuring ICA EOG correlation references, channel-subset spectral analyses, or any pipeline step that needs a specific modality. If you right-click a channel in the Review panel and change its type, the filter button row rebuilds automatically.

Channels whose current type isn’t eeg are dimmed in the channel panels and get a trailing type badge (e.g. EOG Left Horiz [eog]).

Override precedence:

  1. User override from choosing_channels.type_overrides (authoritative)
  2. File-format reader’s native metadata (FIF, BrainVision, EEGLAB)
  3. Readmanager’s chtype_heuristic (applied via the universal overlay to every reader’s ambiguous eeg/blank channels — see §2.3.1 step 3)
  4. For a channel still unmatched after the above, the outcome depends on the rest of the montage — demoted to misc when the recording is otherwise standard-named (recognised electrodes are the majority), or kept as eeg for numbered / non-standard montages. These channels are flagged in the Run log so you can confirm them on the Input tab.
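The precedence list can be read as a single resolution function. The sketch below is an interpretation of the four rules, not the code of _refine_channel_types: resolve_type and its parameters are hypothetical, the heuristic is passed in as a callable, and treating a "misc" result from the heuristic as "unmatched" is an assumption.

```python
def resolve_type(name, reader_type, overrides, heuristic, standard_named):
    """Return the final channel type following the precedence list above."""
    if name in overrides:                        # 1. user override
        return overrides[name]
    if reader_type not in (None, "", "eeg"):     # 2. specific native type from the reader
        return reader_type
    guess = heuristic(name)                      # 3. name-based heuristic
    if guess != "misc":
        return guess
    # 4. still unmatched: depends on the rest of the montage
    return "misc" if standard_named else "eeg"


def toy_heuristic(name):
    """Stand-in for the real name heuristic, for demonstration only."""
    return {"Fp1": "eeg", "VEOG": "eog"}.get(name, "misc")


print(resolve_type("VEOG", "eeg", {}, toy_heuristic, standard_named=True))   # eog
print(resolve_type("Ch7", "eeg", {}, toy_heuristic, standard_named=True))    # misc
print(resolve_type("Ch7", "eeg", {}, toy_heuristic, standard_named=False))   # eeg
```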

If you need channel-level selection rather than retyping, use dropped_channels or selected_channels — those act after typing and exclude channels entirely regardless of their type.


3. Preprocess tab

The Preprocess tab is where you configure the signal-cleaning pipeline. Each step has its own collapsible panel with an enable checkbox, parameter fields, and a 🗄️ Save toggle controlling whether the step’s intermediate output is written to disk.

The pipeline has three sections:

  1. Segmentation setup — fixed and always runs first. One panel, where you choose between event-locked epochs and timed-window (rest) mode and pick which events to use.
  2. Reorderable steps — twelve steps you can enable individually and reorder via ↑/↓ arrows on each panel header. Trim, Resampling, Detect bad channels, Montage, Reference, Filtering, MP filter, MP decomposition, ICA, Select channels (manual exclusion), EEG artifacts (p2p + slope + muscle/EMG + flat, on by default), and Gaze. (In the default order MP filter and MP decomposition sit right after Filtering, before ICA — so MP operates on the filtered, pre-ICA signal; both are off by default. MP decomposition does not modify the signal — it produces the .db atom book the Analysis-tab analyses consume; see §3.11 / §3.12.)
  3. Final segment handling — three fixed panels that always run after the reorderable steps complete: Cut Segments, Mark detected artifacts, and Save Segments. They cut the continuous signal into epochs, condition the artifact marks (and optionally drop bad segments, with optional manual review), and save the final cleaned epochs.

The reorder controls are hidden by default. Tick Allow changing the order of preprocessing steps in the Config tab (see §1.5) to reveal them. The default order is correct for most workflows.

Internally the pipeline has three phases: Segmentation setup (always runs first), the reorderable steps (↑/↓ arrows, user-controlled order), and three final fixed panels (Cut Segments, Mark detected artifacts, Save Segments). In the code and maintainer docs these are called the “preamble”, “reorderable steps”, and “postamble”; the manual uses the panel names directly.

3.1 Segmentation setup

GUI panel: Segmentation setup (always visible, non-checkable) Pipeline phase: runs first, before any reorderable step

3.1.1 Purpose & non-destruction invariant

Segmentation setup prepares the coordinate system used by the rest of the preprocessing chain. It does two things:

  1. Builds the event-ID dictionary — a pure config operation that maps the user’s selected event names (GUI event panel, or [epochs.tags] selected = [...] in TOML) to integer IDs, consumed later by mne.Epochs / _cut_segments.
  2. Attaches synthetic rest markers — only in continuous mode. REST annotations are added to mark synthetic window onsets via signal.set_annotations(...). The original STIM channel is never touched.

Non-destruction invariant. Every preprocessing step preserves the event markers the file reader produced. Rest-mode analyses do not overwrite the original STIM channel — synthetic window markers go to REST annotations instead, leaving file-derived events intact. A recording with hardware triggers analysed in continuous mode keeps both coordinate systems available in the saved -epo.fif outputs.

The rule is enforced by construction: no preprocessing step writes to STIM. Rest-marker generation is purely metadata (set_annotations is lazy — no load_data() is forced) and downstream consumers pick which event source to read based on the active segmentation mode (annotations in rest mode via events_from_annotations, the STIM channel in event-locked mode via find_events). Stim channels are filtered by type (mne.channel_type(info, i) == "stim"), not by exact name.

3.1.2 Mode: continuous (rest)

Used when segmentation.mode = "rest".

Parameter Default Description
rest_duration [0, ""] [start, end] time interval (seconds) of the continuous signal to extract for rest analysis. Empty-string or omitted end = full signal.
window_length 20 Length (seconds) of each analysis window. The rest segment is divided into non-overlapping windows of this duration.
check_rest false Interactively review rest segments after automatic rejection. Blocks batch runs — Stage 5 refuses the combination at preflight on --run without --review-allowed.

Rest windows are computed in the trimmed signal’s coordinate system — rest.rest_duration means seconds relative to t=0 of the post-trim data, matching MNE’s convention that raw.times restarts at 0 after crop.
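
The window arithmetic described above can be sketched in a few lines of plain Python. This is an illustration only, not IDE4EEG's code: the function name rest_window_onsets is hypothetical, and dropping a trailing remainder shorter than window_length is an assumption the manual does not confirm.

```python
def rest_window_onsets(rest_duration, window_length, signal_duration):
    """Return (onset, duration) pairs for non-overlapping rest windows.

    All times are seconds relative to t=0 of the post-trim signal.
    """
    start = rest_duration[0]
    end = rest_duration[1] if len(rest_duration) > 1 else ""
    # An empty-string or omitted end means "use the full signal".
    if end in ("", None):
        end = signal_duration
    start = max(0.0, float(start))
    end = min(float(end), float(signal_duration))

    windows = []
    onset = start
    # ASSUMPTION: a trailing remainder shorter than window_length is dropped.
    while onset + window_length <= end + 1e-9:
        windows.append((onset, float(window_length)))
        onset += window_length
    return windows


# Defaults from the table: rest_duration = [0, ""], window_length = 20.
windows = rest_window_onsets([0, ""], window_length=20, signal_duration=65.0)
```

With a 65 s post-trim signal this yields three windows starting at 0, 20 and 40 s; the last 5 s do not fill a window.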

In rest mode the spectral analyses are estimated on the continuous signal and skip artifact-marked spans instead of cutting fixed windows; see §4.2.2 ([segmentation].reject_by_annotation, default true).

3.1.3 Mode: event-locked epochs

Used when segmentation.mode = "epochs".

Parameter Default Description
start_offset -0.3 Epoch start time relative to the event trigger (seconds). Negative = before the event.
stop_offset 0.7 Epoch end time relative to the event trigger (seconds).
epochs_baseline "None" Baseline correction: "None" to skip, [start, end] in seconds (e.g. [-0.3, 0]), or ["None", "None"] for baseline across the entire epoch. Inside the list, individual "None" entries mean “use the epoch boundary” (so ["None", 0] = from epoch start to 0 s).
check_epochs false Interactively review epochs after automatic rejection. Blocks batch runs — Stage 5 refuses the combination at preflight on --run without --review-allowed.
reject_dict (legacy — ignored) No longer applied. The epoch cut uses reject=None; single-sample rejection was replaced by the proportion/run-based artifact check (issue #11). Bad segments are now derived from the upstream artifact marks (§3.16).
flat_dict (legacy — ignored) No longer applied — flat spans are detected upstream by the EEG-artifacts flat detector and aggregated in §3.16.

Example with a custom epoch window and baseline:

[epochs]
start_offset = -0.2
stop_offset = 0.8
epochs_baseline = [-0.2, 0]
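
The accepted epochs_baseline forms map naturally onto the baseline argument of mne.Epochs, where None means no correction and a None entry inside the pair means the epoch boundary. The helper below is a hypothetical sketch of that translation (parse_baseline is not an IDE4EEG function name).

```python
def parse_baseline(value):
    """Translate an epochs_baseline setting into an MNE-style baseline.

    Returns None (skip correction) or a (start, end) tuple in which a
    None entry stands for the corresponding epoch boundary.
    """
    if value is None or value == "None":
        return None  # the string "None" disables baseline correction

    def edge(x):
        # Inside the list, "None" means "use the epoch boundary".
        return None if x in (None, "None") else float(x)

    start, end = value
    return (edge(start), edge(end))


skip = parse_baseline("None")
pre_stimulus = parse_baseline([-0.3, 0])
whole_epoch = parse_baseline(["None", "None"])
from_epoch_start = parse_baseline(["None", 0])
```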

Tags — selecting which events to include. In the GUI, events discovered from the file appear as checkboxes. In TOML:

[epochs.tags]
selected = ["picture_1", "picture_2", "picture_A"]

Legacy formats (still supported):

# Auto-discover target / nontarget by keyword:
[epochs.tags]
AUTO = true

# Manual tag groups:
[epochs.tags]
AUTO = false
visual = ["picture_1", "picture_2"]
auditory = ["beep"]

3.1.4 👁 segments preview

The Segmentation setup panel header carries a 👁 segments button that opens Svarog with the raw input signal overlaid by .tag markers for each segment the current config would produce — a preview of where segments will fall, before running the pipeline.

Both point markers (segment starts / event anchors) and duration blocks (full windows) are written, so Svarog shows both the anchor positions and the actual signal ranges that will become segments. The button requires an input signal to be loaded on the Input tab; it opens the raw signal (no preprocessing applied yet).

3.2 Step order and constraints

The following twelve panels are the reorderable steps. Each has an enable checkbox in its panel header. When the order controls are visible (Config tab → Allow changing the order…), each step also gets ↑/↓ arrows for repositioning. Two kinds of constraints govern valid orderings: hard constraints (hard_constraints) and soft constraints (soft_constraints).

Default constraints:

[preprocessing]
# MP filter carries its own internal decomposition
# (§3.12), so there is no hard ordering between mp_decomposition
# and mp_filter.  Default hard set is empty; add custom rules here.
hard_constraints = []
soft_constraints = [
    ["filtering", "ica"],     # Winkler et al. 2015 — HP filter ICA input
    ["bad_channels", "ica"],  # bad channels first for clean decomposition
]

The constraints editor on the Config tab lets you add or remove rules. The current step order is stored as preprocessing.step_order = [...]. If you omit step_order in TOML, the canonical default order is used.
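
A constraint list of this shape can be checked against a step order with a short function. The sketch below assumes that each pair [a, b] means "a should run before b", which is how the comments in the default soft_constraints read. The function name find_violations is hypothetical, and how IDE4EEG reacts to a violated hard or soft rule is not modelled here.

```python
def find_violations(step_order, constraints):
    """Return the [before, after] pairs that step_order violates.

    A pair is only checked when both steps appear in step_order.
    """
    position = {step: i for i, step in enumerate(step_order)}
    violations = []
    for before, after in constraints:
        if before in position and after in position:
            if position[before] > position[after]:
                violations.append([before, after])
    return violations


soft_constraints = [["filtering", "ica"], ["bad_channels", "ica"]]

ok_order = ["bad_channels", "montage", "reference", "filtering", "ica"]
bad_order = ["ica", "bad_channels", "montage", "reference", "filtering"]

ok_result = find_violations(ok_order, soft_constraints)    # no violations
bad_result = find_violations(bad_order, soft_constraints)  # both rules violated
```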

3.3 Trim signal

GUI panel: Trim signal (checkable, per-step Save)

Crops the recording to a region of interest, removing irrelevant segments far from any event. Reduces memory usage and prevents boundary artifacts from contaminating later processing.

Algorithm.

  1. Read trim_start (default 0, the start of the recording) and trim_end (default unset, the end of the recording).
  2. Clamp boundaries to [0, signal_duration].
  3. Call signal.crop(tmin=trim_start, tmax=trim_end) if the resulting window differs from the full recording.

Parameter Default Description
trim_start 0.0 Start time in seconds. 0 = beginning.
trim_end "" End time in seconds. The TOML default is the empty string (unset); leaving it empty means “end of file”.
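
The three algorithm steps can be sketched without MNE. The function name resolve_trim is hypothetical; it only computes the clamped boundaries and whether a real crop is needed, and it does not handle an inverted window (trim_start after trim_end).

```python
def resolve_trim(trim_start, trim_end, signal_duration):
    """Return (tmin, tmax, needs_crop) for the trim settings."""
    # Step 1: read the settings; "" (or None) means "unset".
    tmin = 0.0 if trim_start in ("", None) else float(trim_start)
    tmax = float(signal_duration) if trim_end in ("", None) else float(trim_end)
    # Step 2: clamp both boundaries to [0, signal_duration].
    tmin = min(max(tmin, 0.0), float(signal_duration))
    tmax = min(max(tmax, 0.0), float(signal_duration))
    # Step 3: only a window that differs from the full recording is a crop.
    needs_crop = (tmin, tmax) != (0.0, float(signal_duration))
    return tmin, tmax, needs_crop


default_trim = resolve_trim(0.0, "", 120.0)   # defaults: a no-op trim
clamped_trim = resolve_trim(10, 500, 120.0)   # end clamped to the duration
```

When needs_crop is true, the real step would go on to call signal.crop(tmin=tmin, tmax=tmax).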

Memory-preserving (lazy) cropping. signal.copy().crop(...) works on MNE’s file-backed Raw object — .copy() is a metadata-only shallow copy and .crop() just adjusts the internal sample-range markers. The full recording is never materialised in RAM. Only the kept sub-range is read from disk on demand, when a downstream step that actually needs data (filtering, ICA fit, etc.) consumes it. The same lazy property holds for set_montage, set_eeg_reference, and set_annotations, which are all pure metadata. As a result, recordings substantially larger than available RAM — multi-hour overnight polysomnography, long resting-state sessions, gigabyte-scale BrainTech .raw files — can be trimmed and processed without ever loading the original full file. When [trim] save = true, only the cropped sub-range is loaded for the FIF write.

Save default: [trim] save = false. For Trim the 🗄️ drawer is a derived, locked indicator, not a manual toggle: it turns on automatically only when the Trim step is enabled and the trim is non-trivial (a real crop), and stays off for a no-op trim, so a default run writes no redundant <base>-trimmed-raw.fif. When it is on, that snapshot is the canonical cropped signal reused for Svarog launches from the Preprocess tab (see below).

Eye button. Opens the Preview & Trim window directly (full-recording overview with the trim region shaded; click on the canvas to set start/end, or type values; +/- amplitude controls and a DC-removal checkbox). It does not open a Svarog split view because Svarog’s split-sync aligns by sample index, which left the cropped FIF’s time axis labelled 00:00 and the original at trim_start — visually misleading when the alignment was actually correct.

Trim is the source of truth for “what the pipeline sees.” Every “open in Svarog” button on the Preprocess tab (Select channels → Mark in Svarog, Detect bad channels → Run and check, and the manual ICA / bad-channels / segments review hooks driven from inside the running pipeline) feeds Svarog the cropped segment, not the full input. If [trim] save = true and the on-disk snapshot’s [cfg:<hash>] matches the live config, the snapshot is reused; otherwise a temporary FIF (named via tempfile) is written for the launch and cleaned up afterwards. Channel names are preserved by crop, so any .tag file Svarog writes back translates unchanged.

Hidden defaults for resample_freq, cover_time, threshold_time (the last two deprecated): see Appendix E.1.

3.4 Resampling

GUI panel: Resample (checkable, reorderable, per-step Save)

Downsamples the signal to a lower sampling frequency, reducing data size and computation time without losing information below the target Nyquist frequency. Uses MNE’s signal.resample(sfreq=target), which internally applies a low-pass anti-aliasing FIR filter at the new Nyquist frequency, then resamples via MNE’s default FFT-based method. Signals already at or below the target rate are left unchanged.

Parameter Default Description
resample_freq 0 Target sampling frequency (Hz). 0 = keep original rate.

Reference: Gramfort A et al. (2014) “MNE software for processing MEG and EEG data.” NeuroImage 86:446–460.

3.5 Bad channel detection

GUI panel: Detect bad channels (checkable, per-step Save)

Automatically identifies noisy, flat, or uncorrelated EEG channels that would degrade later processing (filtering, ICA, epoch averaging). Detected channels are then dropped from the signal via signal.pick(picks=None, exclude="bads") in _step_bad_channels — they are removed, not interpolated from neighbours.

Algorithm: five robust detectors + iterate-once cleanup. IDE4EEG implements five independent detectors (flat (SD-based), flatline duration, robust deviation, windowed correlation, HF noise) plus a one-pass cleanup that re-runs the cross-channel detectors after the obvious bads are dropped. All five criteria are toggleable independently and ship enabled by default. They run one after another on a single shared pre-conditioning of the signal (notch + 1 Hz high-pass) so the thresholds are comparable across recordings. Algorithm and default values follow the PREP pipeline (Bigdely-Shamlo et al. 2015, doi:10.3389/fninf.2015.00016) for criteria 1, 3, 4, 5; Criterion 2 (flatline duration) mirrors EEGLAB clean_flatlines.m (Mullen et al. 2015). See § What IDE4EEG implements vs. PREP below for the precise correspondence and docs/EEG_bad_channels.pdf for the cross-package + literature survey.

Per-channel rejection reasons. Each flagged channel keeps the criterion/criteria that flagged it (a channel can trip several — e.g. a flat channel also de-correlates). These are: (1) printed in the Run log — Bad channels final list (2 channels): O2 (correlation, hf_noise); Fp1 (deviation); and (2) written into the Svarog preselect .tag as a desc.reason field per channel (display only — never the channel name, so the editable review’s name round-trip is unaffected), so the bad-channel review can show why each channel was flagged. Channels added by the iterate-to-convergence loop are attributed to the criterion that caught them. (Whether the installed Svarog jar renders desc.reason depends on its version — the IDE4EEG side writes it; a small Svarog render is the remaining piece.)

Preconditioning (shared by all five detectors)

  1. Notch at the line frequency (50 Hz default for Europe / most of Asia / Africa / Australia; 60 Hz for the Americas / Japan; inherited from filters.notch_freq when configured). With harmonics enabled (default ON), the notch removes the fundamental AND every integer harmonic up to Nyquist (50, 100, 150, … or 60, 120, 180, …).
  2. High-pass at hp_freq_hz (default 1.0 Hz). DC drift dominates raw amplitude and destroys correlation; the 1 Hz HP is load-bearing for the cross-channel statistics.

Robust statistics — what MAD means. Every detector below uses the median absolute deviation instead of the ordinary standard deviation: MAD(x) = medianᵢ(|xᵢ − median(x)|) — the median distance of the samples from their median. Scaled by 1.4826 = 1/Φ⁻¹(0.75), it becomes a consistent estimator of the Gaussian σ:

σ_robust = 1.4826 · MAD

The MAD-based form is used because a single bad channel cannot poison the reference used to judge the others — that property is what makes the five detectors below survive a recording with many concurrent bads.
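
A minimal standard-library sketch of the robust σ, with made-up sample values, shows why a single extreme value barely moves the MAD-based estimate while it inflates the ordinary standard deviation. The function name robust_sigma is illustrative only.

```python
from statistics import median, stdev


def robust_sigma(samples):
    """1.4826 * MAD: a robust estimate of the Gaussian standard deviation."""
    centre = median(samples)
    mad = median([abs(x - centre) for x in samples])
    return 1.4826 * mad


clean = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
spiky = [-2, -1, -1, 0, 0, 0, 1, 1, 1000]   # one sample replaced by a spike

robust_clean = robust_sigma(clean)
robust_spiky = robust_sigma(spiky)   # unchanged by the single spike
plain_clean = stdev(clean)
plain_spiky = stdev(spiky)           # dominated by the single spike
```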

Detector 1 — Absolute amplitude checks

Two µV-threshold checks bundled into one conditional statement in the GUI panel:

If median SD for all channels ∈ [recording_amp_check_min_uV, recording_amp_check_max_uV] µV then check for each channel SD < flat_sd_threshold_uV µV.

The bundled form mirrors the orchestrator’s actual logic: the sanity check is the precondition for the per-channel SD floor, not an independent setting.

The if-clause (median-SD plausibility). At the top of the orchestrator, the median across channels of the robust SD (1.4826 · MAD) is compared against the configured envelope (default [0.5, 200] µV). Out-of-range strongly suggests a calibration error — a wrong calibrationGain factor, units recorded in V instead of µV, or a saturated amp — rather than every channel being simultaneously dead. The orchestrator logs a WARNING with the actual median and the envelope, and the per-channel SD check is skipped for that run only. The other criteria (Flatline / Amplitude outlier / Correlation / HF noise) are calibration-insensitive and keep running.

The then-clause (per-channel SD floor). When the median is in range, each channel’s robust SD is compared against flat_sd_threshold_uV (default 1e-3 µV = 10⁻⁹ V, matching PREP MATLAB’s findNoisyChannels.m:289). Channels below the floor are flagged. Catches numerically-flat / silent-ADC channels; cheap insurance with no failure mode of its own as long as the calibration is correct.

The two checks share one master toggle in the GUI because neither makes sense in isolation: the per-channel SD floor is brittle against calibration errors, and the median-SD sanity check exists specifically to gate it. The legacy enable_flat and enable_recording_amp_check TOML keys still round-trip independently for back-compat, but the GUI surfaces them as one. Power users who want the per-channel SD check to run unconditionally can set the bounds wide (e.g. [0, 1e9] µV) — the if-clause then always passes.
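
The gate-then-floor logic can be sketched as follows. This is a standard-library illustration with hypothetical names (amplitude_check, robust_sd) and tiny made-up signals; the real orchestrator also logs a WARNING when the gate fails, which is omitted here.

```python
from statistics import median


def robust_sd(samples):
    """Robust SD of one channel: 1.4826 * MAD."""
    centre = median(samples)
    return 1.4826 * median([abs(x - centre) for x in samples])


def amplitude_check(channels_uv, amp_min_uv=0.5, amp_max_uv=200.0,
                    flat_sd_threshold_uv=1e-3):
    """channels_uv maps channel name -> samples in µV.

    Returns (flagged_names, gate_passed).
    """
    sd = {name: robust_sd(x) for name, x in channels_uv.items()}
    recording_median = median(sd.values())
    # If-clause: is the median SD across channels plausible at all?
    if not (amp_min_uv <= recording_median <= amp_max_uv):
        return [], False   # likely calibration problem: skip the SD floor
    # Then-clause: flag channels whose robust SD is below the floor.
    flagged = [name for name, s in sd.items() if s < flat_sd_threshold_uv]
    return flagged, True


channels = {
    "Cz": [-10, -5, 0, 5, 10],
    "Pz": [-8, -4, 0, 4, 8],
    "Oz": [0, 0, 0, 0, 0],      # numerically flat channel
}
flagged, gate_passed = amplitude_check(channels)

# Same data stored in volts by mistake: the gate fails, nothing is flagged.
miscalibrated = {n: [v * 1e-6 for v in x] for n, x in channels.items()}
flagged_bad_cal, gate_bad_cal = amplitude_check(miscalibrated)
```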

Detector 2 — Flatline duration (eps-relative)

Mirrors EEGLAB clean_flatlines.m (Mullen et al. 2015): for each channel, find the longest contiguous run of samples where |x[i+1] − x[i]| < max_jitter × machine_precision (i.e. samples are essentially identical at the data’s numerical precision). A channel is flagged if its longest such run meets or exceeds flatline_max_duration_s (default 5 s; jitter default 20).

What “machine precision” means. machine_precision = np.finfo(dtype).eps — the smallest distinguishable difference in the data’s number format. On float64 (MNE default) it’s ~2.2×10⁻¹⁶ V, so at max_jitter = 20 the test asks “do these two samples differ by less than ~4.4×10⁻¹⁵ V?”. At that scale this is essentially “are they bitwise identical?” — true only for silent ADCs, railed channels, and post-rereference collisions, not for any real EEG. The threshold scales with the data’s dtype precision rather than its µV magnitude, so the detector is calibration-insensitive: a recording with a wrong calibrationGain factor (the wakeEEG.raw case) doesn’t trip it. Caveat: float32 data shrinks the safety margin to ~2.4 µV; rare in MNE workflows but worth knowing if your file reader produces float32.

Complements Detector 1: same defect class (channel is flat) but a fundamentally different threshold family (duration vs. amplitude). Catches silent ADC, railed ADC, post-rereference collision, and any literally-constant stretches.
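
The eps-relative test can be illustrated with the standard library alone. sys.float_info.epsilon is the float64 machine precision (the same value as np.finfo(np.float64).eps). The function name and the convention of measuring a run as the number of consecutive near-zero differences divided by the sampling rate are assumptions of this sketch, not a statement about clean_flatlines.m or IDE4EEG.

```python
import sys


def longest_flat_run_seconds(samples, sfreq, max_jitter=20.0,
                             eps=sys.float_info.epsilon):
    """Longest run of consecutive near-identical samples, in seconds."""
    tolerance = max_jitter * eps   # about 4.4e-15 for float64
    longest = current = 0
    for previous, following in zip(samples, samples[1:]):
        if abs(following - previous) < tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest / sfreq


sfreq = 100.0
live = [1e-6 * (1 + i % 7) for i in range(200)]   # always changing, never 0
dead = [0.0] * 600                                # 6 s of identical samples

run_s = longest_flat_run_seconds(live + dead + live, sfreq)
flagged = run_s >= 5.0   # flatline_max_duration_s default
```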

Detector 3 — Amplitude outlier (PREP: “Robust deviation”)

For every EEG channel, compute the robust SD. Then compute a robust z-score across channels using the median + MAD of the per-channel SDs:

              σ_robust(c) − median_c( σ_robust )
z(c) = ───────────────────────────────────────────────────────────
        1.4826 · median_c( | σ_robust − median_c( σ_robust ) | )

Flag any channel with |z(c)| > deviation_z_threshold (default 5.0). Catches gross under- and over-amplitude channels. The MAD-based cross-channel statistic ensures a single bad channel cannot poison the reference used to judge the others.
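
The z(c) formula above translates directly into code. The sketch below uses made-up per-channel robust SDs; robust_z_scores is a hypothetical name, and returning all zeros when the MAD is zero is an assumption about a case the manual does not discuss.

```python
from statistics import median


def robust_z_scores(values):
    """Robust z-score of each value against the median and MAD of all values."""
    centre = median(values)
    scale = 1.4826 * median([abs(v - centre) for v in values])
    if scale == 0:
        # ASSUMPTION: with zero spread no channel is treated as an outlier.
        return [0.0 for _ in values]
    return [(v - centre) / scale for v in values]


# Per-channel robust SD in µV (illustrative numbers only).
sigma_robust = {"Fz": 9.0, "Fp1": 9.5, "Cz": 10.0, "Oz": 10.5, "Pz": 11.0, "T7": 60.0}

deviation_z_threshold = 5.0
z = dict(zip(sigma_robust, robust_z_scores(list(sigma_robust.values()))))
flagged = [name for name, score in z.items() if abs(score) > deviation_z_threshold]
```

Here only T7 exceeds the threshold; the other channels stay within about one robust σ of the median.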

Detector 4 — Windowed correlation (max over other channels)

Split the recording into correlation_window_seconds-long (default 1.0 s) non-overlapping windows. For every window w and every channel c, compute the maximum absolute Pearson correlation between c and any other EEG channel:

ρ(c, w) = max  | corr(x_c, x_c') |_w           (c' ≠ c)
            c'

A window is “bad” for c when ρ(c, w) < correlation_threshold (default 0.3, lowered from PREP’s 0.4 — see Appendix C). Channel c is flagged when the fraction of bad windows exceeds bad_window_fraction_threshold (default 0.05, i.e. 5 %). The fraction-of-bad-windows form — instead of a single full-record correlation — means a transient artifact does not doom a channel for the whole record.

Not montage-aware. Despite the intuition that volume-conducted EEG channels should correlate most strongly with their spatial neighbours, this criterion uses every other EEG channel as a candidate, not a k-nearest-neighbour subset. The best-correlated channel will physically usually be a spatial neighbour, but the algorithm doesn’t enforce it. The original PREP MATLAB has a kNN-with-electrode-positions variant; pyprep and IDE4EEG both drop the kNN to avoid the montage requirement.

Two safety bailouts. The criterion is skipped when there are fewer than 3 EEG channels, because the max-correlation-against-others statistic becomes degenerate (with exactly 2 channels, both share the same value and pass or fail together). It also bails out and flags nothing when every channel exceeds the bad-window fraction, a sign that the detector has failed degenerately (no good reference left). Both bailouts log a WARNING explaining the cause and suggesting remediation (lower correlation_threshold, disable the criterion, or investigate the recording). The other four detectors still run and contribute their findings.
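
The whole criterion, including both bailouts, fits in a short standard-library sketch. Names are hypothetical and the signals are synthetic: three channels share a 5 Hz sine, while a fourth carries a 7 Hz sine that is orthogonal to it within every 1 s window. Treating a zero-variance window as uncorrelated is an assumption of this sketch.

```python
from math import pi, sin, sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0   # ASSUMPTION: zero-variance window counts as uncorrelated
    return sxy / sqrt(sxx * syy)


def correlation_bad_channels(channels, sfreq, window_s=1.0,
                             corr_threshold=0.3, bad_fraction=0.05):
    names = list(channels)
    if len(names) < 3:
        return []   # bailout 1: statistic is degenerate
    win = int(round(window_s * sfreq))
    n_windows = len(channels[names[0]]) // win
    if n_windows == 0:
        return []
    bad_counts = dict.fromkeys(names, 0)
    for w in range(n_windows):
        seg = {n: channels[n][w * win:(w + 1) * win] for n in names}
        for c in names:
            # Best absolute correlation with any OTHER channel in this window.
            best = max(abs(pearson(seg[c], seg[o])) for o in names if o != c)
            if best < corr_threshold:
                bad_counts[c] += 1
    flagged = [c for c in names if bad_counts[c] / n_windows > bad_fraction]
    if len(flagged) == len(names):
        return []   # bailout 2: no good reference left
    return flagged


sfreq = 50
base = [sin(2 * pi * 5 * i / sfreq) for i in range(200)]   # 4 s, 5 Hz
odd = [sin(2 * pi * 7 * i / sfreq) for i in range(200)]    # 4 s, 7 Hz
channels = {"A": base, "B": [0.8 * v for v in base],
            "C": [1.2 * v for v in base], "D": odd}
flagged = correlation_bad_channels(channels, sfreq)
```

A real implementation would vectorise the per-window correlation matrix; the nested loops here are only meant to mirror the definition of ρ(c, w).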

Detector 5 — High-frequency noise ratio

Compute the robust SD on the HF-filtered signal (above hf_lower_hz, default 50 Hz) and on the broadband signal; take their ratio per channel:

        σ_robust_HF(c)
η(c) = ──────────────────────
       σ_robust_broadband(c)

Then a robust z-score across channels of η(c), and flag any channel whose z > hf_noise_z_threshold (default 5.0) — only positive deviations (high HF is bad; low HF is normal). Catches bridged electrodes and EMG-contaminated channels that may pass the amplitude and correlation checks. The amplitude-normalised form (a ratio, not absolute HF power) is the reason this is orthogonal to the deviation detector: a loud-but-clean channel passes; a quiet-but-EMG-contaminated channel doesn’t.
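
The one-sided test on the ratio η(c) can be sketched given per-channel robust SDs that have already been computed on the HF-filtered and broadband signals (the filtering itself is left out). All names and numbers are illustrative.

```python
from statistics import median


def hf_noise_bad_channels(sigma_hf, sigma_broadband, z_threshold=5.0):
    """Flag channels whose HF/broadband ratio is a positive robust outlier."""
    names = list(sigma_hf)
    ratio = [sigma_hf[n] / sigma_broadband[n] for n in names]
    centre = median(ratio)
    scale = 1.4826 * median([abs(r - centre) for r in ratio])
    if scale == 0:
        return []   # ASSUMPTION: no spread means nothing to flag
    # One-sided: only unusually HIGH ratios are flagged.
    return [n for n, r in zip(names, ratio) if (r - centre) / scale > z_threshold]


# Robust SDs in µV (made-up values).
sigma_broadband = {"Fz": 10.0, "Cz": 12.0, "Pz": 11.0, "Oz": 40.0, "T7": 8.0, "T8": 9.0}
sigma_hf = {"Fz": 1.0, "Cz": 1.26, "Pz": 1.1, "Oz": 4.2, "T7": 4.0, "T8": 0.27}

flagged = hf_noise_bad_channels(sigma_hf, sigma_broadband)
```

Oz is loud but has an ordinary ratio and passes; T7 is quiet but HF-heavy and is flagged; T8 has an unusually low ratio and passes because the test is one-sided.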

Refinement passes — “Iterate up to N times” (default 1)

After the first-pass union of flagged channels is taken, drop them and re-run the three robust cross-channel detectors (deviation, correlation, HF) on the cleaner reference, repeating until a pass finds nothing new or the cap is hit. This catches successive layers of borderline channels masked by the obvious outliers — the cross-channel MAD shrinks once the worst offenders are out, surfacing channels that were just under the z = 5 threshold the previous time around (the original PREP paper demonstrates this iteration is the single biggest robustness gain in their pipeline; PREP itself iterates to convergence, capped at 4). The “Iterate up to [N] times” pulldown sets the cap via max_iterations (0–4): 0 = single pass (no refinement), 1 = the recommended default, 4 = PREP’s cap. The loop only ever adds channels, so it always terminates and stops early on convergence. (An older iterate_once = true/false bool is still accepted in old configs and maps to 1 / 0.)
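
The refinement loop can be sketched around any detector callable. Everything below is illustrative: refine_bad_channels and deviation_detector are hypothetical names, and the per-channel SDs are invented so that one borderline channel (C4) is only caught once the three gross outliers are removed.

```python
from statistics import median


def refine_bad_channels(all_channels, first_pass_bads, detect, max_iterations=1):
    """Grow the bad-channel set by re-running detect on the cleaner remainder."""
    bads = set(first_pass_bads)
    for _ in range(max_iterations):
        remaining = [c for c in all_channels if c not in bads]
        new = set(detect(remaining)) - bads
        if not new:
            break        # converged: this pass found nothing new
        bads |= new      # the loop only ever adds channels
    return bads


# Toy per-channel robust SDs in µV.
sigma = {"Fz": 9.0, "Cz": 9.5, "Pz": 10.0, "Oz": 10.5, "C3": 11.0,
         "C4": 17.0, "T7": 90.0, "T8": 100.0, "Fp1": 120.0}


def deviation_detector(names, threshold=5.0):
    """Robust z-score outliers among the given channels."""
    values = [sigma[n] for n in names]
    centre = median(values)
    scale = 1.4826 * median([abs(v - centre) for v in values])
    return {n for n in names if scale and abs(sigma[n] - centre) / scale > threshold}


names = list(sigma)
first_pass = deviation_detector(names)
single_pass = refine_bad_channels(names, first_pass, deviation_detector, max_iterations=0)
one_refinement = refine_bad_channels(names, first_pass, deviation_detector, max_iterations=1)
capped_at_four = refine_bad_channels(names, first_pass, deviation_detector, max_iterations=4)
```

In this toy case the second refinement pass finds nothing new, so a cap of 4 returns the same set as a cap of 1.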

Cross-coverage by design

The five criteria are complementary, not orthogonal: a clearly-flat channel (literal zeros) also fails the correlation detector (zero-variance ⇒ zero correlation with anyone), and a severely over-amplitude channel may trip both deviation and HF noise. That overlap is a feature — the union of the five catches every defect class robustly. Per-criterion synthetic-signal tests in tests/test_bad_channels_prep.py assert “the targeted detector fires on its planted defect”; they do not assert “only that one detector fires”, because the cross-coverage is intentional.

Manual review (interactive)

When review_bad_channels = true, an interactive signal browser opens after the auto-detection step (if any) so the user can confirm or change the flagged list.

Batch-break footgun. Setting review_bad_channels = true is incompatible with an unattended ide4eeg --run batch — the review window opens and the pipeline halts until the user closes it. Stage 5 refuses the combination at preflight unless --review-allowed is passed (see §5.1).

Top-level parameters

Parameter Default Description
choose_bad_channels "auto" Auto-detection algorithm: "auto" (algorithmic) or "none" (skip auto-detection). The GUI panel-header enable checkbox is the master switch; this TOML key is read by CLI batch runs. Loading a TOML with "none" surfaces in the GUI as the Detect bad channels panel-header checkbox being unchecked — so the disabled state is visible at a glance instead of silently bypassed.
review_bad_channels false Open the interactive Svarog / MNE picker after auto-detection. Default off (batch-safe).

TOML back-compat. Old-style enum values choose_bad_channels = "manual" / "both" are auto-migrated at load: "manual" → "none" + review_bad_channels = true; "both" → "auto" + review_bad_channels = true. Each rewrite logs a one-shot WARNING quoting the source path so you can update the TOML to the new orthogonal shape.
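
The migration rule amounts to a two-entry lookup table. The sketch below is hypothetical throughout (function name, logger name, message wording, flat dict layout); it only shows the mapping described above.

```python
import logging

log = logging.getLogger("ide4eeg.sketch")   # hypothetical logger name

# legacy value -> (new choose_bad_channels, review_bad_channels)
LEGACY_VALUES = {
    "manual": ("none", True),
    "both": ("auto", True),
}


def migrate_bad_channel_keys(cfg, source_path="config.toml"):
    """Return a copy of cfg with legacy choose_bad_channels values rewritten."""
    out = dict(cfg)
    legacy = out.get("choose_bad_channels")
    if legacy in LEGACY_VALUES:
        new_value, review = LEGACY_VALUES[legacy]
        out["choose_bad_channels"] = new_value
        out["review_bad_channels"] = review
        log.warning("%s: choose_bad_channels = %r migrated to %r + "
                    "review_bad_channels = true", source_path, legacy, new_value)
    return out


migrated_manual = migrate_bad_channel_keys({"choose_bad_channels": "manual"})
migrated_both = migrate_bad_channel_keys({"choose_bad_channels": "both"})
untouched = migrate_bad_channel_keys({"choose_bad_channels": "auto",
                                      "review_bad_channels": False})
```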

Output. When save = true, the bad-channels step writes the cleaned signal as <base>-bad-marked-raw.fif. The config is updated with the flagged channel list; the signal’s info["bads"] is set accordingly; then signal.pick(picks=None, exclude="bads") drops the channels from the working signal. (Diagnostic per-criterion plots are a planned follow-up — the legacy detector wrote bad_channels_detection/*.png per-channel band-power plots; the current detector does not yet write equivalents. The Run-log diagnostic table — one line per channel per criterion with the determining value and [FLAGGED] marker — partially fills the same role.)

👁 eye button. Clicking the eye button on the Detect bad channels panel header drives a truncated pipeline through the bad-channels step (no analyses run) and then opens Svarog with the detected channels pre-marked (--select-mode bad_channels), mirroring the Select channels eye pattern. The detected channel names also appear in the panel’s body label. The view is read-only — any edits the user makes inside Svarog are logged but not synced back to the cfg; to persist edits, tick Review bad channels manually and re-run the pipeline so the existing review-mode flow takes over. The truncated run auto-saves the bad-channels snapshot regardless of the 🗄 panel-header toggle, so subsequent eye clicks on earlier steps reuse the fresh snapshot.

What IDE4EEG implements vs. PREP

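IDE4EEG is substantially PREP-aligned but not literally complete. Four of the five detectors follow PREP’s published algorithms (criterion 2, Flatline duration, is from EEGLAB clean_flatlines.m, not PREP); the one deliberate threshold change is the correlation threshold, lowered from PREP’s 0.4 to 0.3 (see Detector 4). The last four rows of the table list further differences from the canonical PREP algorithm: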

Aspect PREP (Bigdely-Shamlo 2015) IDE4EEG
Flat detection robust SD below floor same (_detect_flat_channels)
Robust deviation z-score abs(z) > 5 across channels same (_detect_deviation_channels)
Windowed correlation 1-s windows, abs(r) < 0.4, >5 % bad same algorithm, threshold lowered to 0.3 (_detect_correlation_channels)
HF/broadband SD ratio z > 5 above 50 Hz same (_detect_hf_noise_channels)
Pre-conditioning HP 1 Hz same (prep.hp_freq_hz)
Iterate-once cleanup iterate until convergence (capped at 4) one extra pass by default (max_iterations = 1, configurable 0–4), which captures most of the benefit
RANSAC interpolation criterion included; strongest single criterion not implemented — requires 3-D montage, expensive; planned as opt-in advanced detector
Derived SNR criterion (corr × HF) included not implemented — redundant given we have both components individually
Robust-average reference during assess. rereferences inside the loop not done — Reference is a separate user-controlled reorderable step

The fair phrasing in any external paper is “the four most-cited PREP detectors with PREP-published thresholds plus EEGLAB’s flatline-duration check and a one-pass iterate-once cleanup”, not “the PREP pipeline” verbatim.

Advanced parameters ([choosing_channels.prep] sub-block): see Appendix E.2.

3.6 Montage

GUI panel: Montage (checkable, reorderable) Step key: montage · Backend: _set_montage (channels_and_signal.py)

Assigns physical electrode positions to every EEG channel. Coordinates are required downstream for topographic plots, source estimation, and any spatial filtering.

Algorithm. Loads a standard montage (e.g. standard_1020, standard_1005, biosemi64) from MNE’s built-in library and applies it via signal.set_montage(montage, on_missing="warn"). Channel names are matched to montage positions; unmatched channels keep their existing position (or none) and a warning is logged.

Native-position keep. When the loaded file already provides 3D digitised positions for every EEG channel (e.g. an MNE sample FIF with EEG 001… naming + digitised coords), the standard-montage step is skipped to preserve the file’s own coordinates — applying standard_1020 would silently blank them out by name-matching. The config records this with the sentinel value electrodes_layout = "native (from file)" (the literal string is exposed as ide4eeg.NATIVE_POSITIONS_SENTINEL).

TOML back-compat. Step-orders that list montage but not reference are auto-migrated by the pipeline driver to insert reference right after montage.

Parameter Default Description
electrodes_layout "native (from file)" MNE montage name ("standard_1020", "standard_1005", "10-20", "biosemi64", …) or the literal sentinel "native (from file)" to keep the file’s own positions. The default is the sentinel; a fresh install with a file that has native positions will use them automatically. See the MNE montage docs.

This step has no Save toggle: set_montage writes only metadata into signal.info, no signal data is altered. The 👁 button on the panel header opens the montage in 3D / topographic view.

Reference: Gramfort A et al. (2014) — MNE software (NeuroImage 86:446–460).

3.7 Reference

GUI panel: Reference (checkable, reorderable, per-step Save) Step key: reference · Backend: _set_reference (channels_and_signal.py)

Re-references the signal — i.e. subtracts a reference signal from every channel — using signal.set_eeg_reference(...). Independent of §3.6 Montage; montage and reference are separate reorderable steps so you can apply them in either order.

Six options:

Preset Formula Description
Raw signal (no re-referencing) x_i(t) unchanged Keep the recording’s native reference. Default.
Common Average (CAR) x_i'(t) = x_i(t) − (1/N) · Σ x_j(t) Subtract the mean across all good EEG channels.
Linked mastoids x_i'(t) = x_i(t) − mean(M1, M2) Average of mastoid channels.
Linked ears x_i'(t) = x_i(t) − mean(A1, A2) Average of ear channels.
REST (infinity) Yao (2001) standardisation Reconstruct a reference-free potential via the lead-field matrix.
Custom x_i'(t) = x_i(t) − mean(ref_channels) Comma-separated list of channels: their mean becomes the reference, and they are added to info["bads"] so subsequent steps and channel filters ignore them.

Parameter Default Description
re_reference [] [] / "" / "None" (raw — all three accepted), "average" / "CAR" (common average), "REST" (infinity), or a list / comma-string of channel names (custom).

Default is [] (no re-referencing) — picking a reference is an explicit opt-in decision the user makes based on the recording’s hardware reference and the downstream analysis (e.g. CAR before source localisation; mastoids for ERP).
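
The CAR and Custom formulas from the options table can be sketched on plain lists. This is an illustration of the arithmetic only: IDE4EEG itself delegates to signal.set_eeg_reference(...), and the function name rereference is hypothetical. The sketch assumes that every channel passed in is a good EEG channel.

```python
def rereference(channels, ref_channels=None):
    """Subtract a reference signal from every channel.

    ref_channels=None  -> common average over all channels (CAR)
    ref_channels=[...] -> mean of the named channels (custom, linked mastoids, ...)
    """
    names = list(channels)
    ref_names = names if ref_channels is None else list(ref_channels)
    n_samples = len(channels[names[0]])
    # Reference signal: sample-by-sample mean over the reference channels.
    ref = [sum(channels[r][t] for r in ref_names) / len(ref_names)
           for t in range(n_samples)]
    return {n: [x - r for x, r in zip(channels[n], ref)] for n in names}


car = rereference({"Fz": [1, 2, 3], "Cz": [2, 4, 6], "Pz": [3, 6, 9]})
mastoids = rereference({"Cz": [10, 12], "M1": [1, 2], "M2": [3, 4]},
                       ref_channels=["M1", "M2"])
```

After CAR the channels sum to zero at every sample, which is a quick sanity check on any common-average implementation.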

When save = true, the rereferenced signal is written to <base>-rereferenced-raw.fif. The 👁 button shows the before/after comparison.

Reference: Yao D (2001) “A method to standardize a reference of scalp EEG recordings to a point at infinity.” Physiol. Meas. 22:693.

3.8 Filtering

GUI panel: Filtering and Resampling (checkable, per-step Save)

Removes unwanted frequency components: slow electrode drift (highpass), high-frequency noise and EMG (lowpass), and power-line interference (notch). In the default IIR path only EEG channels are filtered (STIM and auxiliary channels are bypassed); the FIR path delegates to MNE, which filters all data channels (EEG plus EOG/EMG/ECG) but leaves STIM untouched.

IIR (Butterworth / Chebyshev II) — default. Filters are designed automatically from the cutoff frequency and the signal’s sampling rate using scipy.signal.iirdesign. Stored in second-order sections (SOS) format for numerical stability and applied zero-phase via sosfiltfilt (bidirectional, no phase distortion). Note that the forward+backward pass squares the filter’s magnitude response — a -3 dB design becomes -6 dB at cutoff — and doubles the effective filter order. The filter-response plots produced by plot_filt = true show the single-pass design, not the realised doubled response.
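
The doubling effect of zero-phase filtering can be checked numerically. The sketch below uses the textbook magnitude response of an analogue Butterworth low-pass as a stand-in; the order of 4 and the 40 Hz cutoff are arbitrary choices, and the digital filters that IDE4EEG designs via scipy.signal.iirdesign will differ in detail.

```python
from math import log10, sqrt


def butter_lowpass_gain(f, cutoff, order):
    """Magnitude response of an analogue Butterworth low-pass filter."""
    return 1.0 / sqrt(1.0 + (f / cutoff) ** (2 * order))


def to_db(gain):
    return 20.0 * log10(gain)


single = butter_lowpass_gain(40.0, cutoff=40.0, order=4)   # gain at the cutoff
double = single ** 2   # forward + backward pass multiplies the magnitudes

single_db = to_db(single)   # about -3.01 dB
double_db = to_db(double)   # about -6.02 dB
```

Squaring the magnitude doubles every value on a dB scale, so the whole roll-off is twice as steep, not only the point at the cutoff.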

Filter Passband edge Stopband edge Design
Highpass f_hp Hz f_hp / 2 Hz Butterworth, gstop=10 dB, gpass=3 dB
Lowpass f_lp Hz min(2*f_lp, 0.95*Nyquist) Hz Butterworth, gstop=12 dB, gpass=3 dB
Notch f_notch ± 2.5 Hz f_notch ± 0.1 Hz Chebyshev II, gstop=25 dB

FIR (MNE windowed sinc). Uses MNE’s Raw.filter(..., method="fir") / Raw.notch_filter(..., method="fir") with a Hamming-windowed sinc design and automatic filter length. Always stable, linear phase, zero group delay. Applied via FFT overlap-add. Slower than IIR but has no stability concerns.

SOS vs ba format. SOS decomposes the filter into cascaded biquad stages, numerically stable for high-order filters. The older ba (numerator/denominator polynomial) format can produce unstable filters due to floating-point errors, especially at high sampling rates.

Parameter Default Description
show_filt false Display filter response plots interactively.
plot_filt false Save filter response plots to disk.
method "iir" "iir" (Butterworth/Chebyshev) or "fir" (MNE).
highpass_freq 0 Highpass cutoff Hz. 0 = off (default). Common: 0.1, 0.5, 1.0 Hz.
lowpass_freq 0 Lowpass cutoff Hz. 0 = off (default). Common: 30, 40, 100 Hz.
notch_freq 0 Notch centre Hz. 0 = off (default); presets 50 (Europe/Asia) or 60 (Americas).
notch_harmonics false The harmonics checkbox next to the notch field. When ON, the notch removes the fundamental AND every integer harmonic up to Nyquist (50/100/150/… or 60/120/180/…). Off by default (single notch); no effect when notch_freq = 0.

Output. When plot_filt = true, the frequency-response PNGs are written to the filters_plot/ folder.

3.9 ICA — artifact subspace removal

GUI panel: ICA (checkable, per-step Save, 👁 View Step Result) Backend: ide4eeg/preprocessing/ica.py::fit_and_apply_ica

Independent Component Analysis separates the EEG into statistically independent sources. Components flagged by the selector as artifacts (eye blinks, eye movements, muscle, cardiac, line noise, channel noise) are projected out, preserving the brain activity in the remaining subspace.

The TOML section is still named [ICA_EOG] for config-file compatibility with older configs; the historical “EOG” naming is misleading now that the default selector is ICLabel (which flags eye, muscle, heart, line noise, channel noise, and “other” in addition to EOG).

Precision note on attribution. The bullets and table below are deliberately specific about who implements what. mne.preprocessing.ICA is a dispatcher and state container; the numerical kernels for the three method= choices live in three different places (one in MNE in-tree, two in separately-released upstream packages — python-picard and scikit-learn’s FastICA), and the ICLabel selector lives in yet another separate package (mne_icalabel) that the MNE project maintains but does not ship as core. IDE4EEG’s ica.py is a workflow wrapper around all of these — it ships no original numerical ICA code. We try to over-credit nothing.

Algorithm.

  1. Fit-time filter copy. A raw.copy() is taken and filtered with three independent knobs: fit_highpass_hz, fit_lowpass_hz, and fit_notch_hz (with fit_notch_harmonics). All three are applied to the COPY only; the original signal, and therefore the cleaned signal returned by ica.apply, never sees these filters. Code source: raw.copy().filter(...) / raw.copy().notch_filter(...) in MNE-Python in-tree (Gramfort 2014). Methodological source for the 1 Hz default: Winkler 2015 (high-pass filtering an ICA fit copy near 1 Hz substantially improves ocular-component identification, and because only the copy is filtered, slow ERPs in the analysis signal are not lost).

  2. Pre-fit reject gate. Segments whose peak-to-peak amplitude exceeds reject_uv (default 500 µV) are dropped from the fit via ica.fit(..., reject=dict(eeg=...), tstep=reject_tstep), so rare big transients (electrode pops, subject coughs) don’t dominate the decomposition. reject_tstep (default 2.0 s) is the length of the consecutive segments into which MNE cuts continuous data before comparing each one against the threshold; shorter segments localise the rejection more tightly around each transient, so less clean data is discarded along with it. Code source: mne.preprocessing.ICA.fit (MNE-Python in-tree, Gramfort 2014). Methodological background: Delorme & Makeig 2004 (EEGLAB) and Onton et al. 2006; peak-to-peak rejection of high-amplitude transients before ICA is the community-standard pre-ICA hygiene step. IDE4EEG-side contribution: the reject_uv scalar, an ergonomic shim that takes a value in µV and converts it to the volts MNE expects in reject=dict(eeg=...).

  3. Pre-whitening (PCA). Before the chosen decomposition algorithm runs, mne.preprocessing.ICA.fit internally:

    1. picks channels (eeg / mag / grad as configured),
    2. applies the reject=dict(eeg=...) peak-to-peak gate above to build the training segments,
    3. applies decim if > 1,
    4. mean-centres,
    5. runs a PCA decomposition and whitens by the principal components scaled to unit variance, projecting to the chosen number of components (the user’s int, the float-fraction-of-variance, or mne.compute_rank()’s rank estimate when n_components="rank").

    The whitened, dimension-reduced data is what gets handed to Picard / Infomax / FastICA in step 4. ICA’s mathematical contract requires whitened input (zero mean, unit variance, decorrelated); without pre-whitening, none of the three algorithms converges correctly. Code source: the PCA whitening and dimension-reduction step inside mne.preprocessing.ICA.fit (MNE-Python in-tree, Gramfort 2014), with the rank estimator mne.compute_rank() in the same package. Methodological source for the whitening requirement: Hyvärinen & Oja 2000, the canonical exposition of ICA’s algorithmic family, including why whitening (decorrelation) must come before the step from decorrelation to independence.

    PCA whitening with the right n_components also doubles as the rank-defense step: average-reference EEG has rank n − 1 (the average constraint removes one degree of freedom), so feeding 32 components into ICA on average-referenced 32-channel data would build a singular mixing matrix. The IDE4EEG default n_components="rank" calls mne.compute_rank(fit_copy) (SVD-based) to detect this automatically and pass the correct value. Integer and float-fraction overrides are also accepted.

  4. Decomposition. The whitened data is then handed to one of three numerical kernels, selected by method. The default is "picard" with fit_params=dict(ortho=False, extended=True), which optimises the same extended-Infomax objective as Lee et al. 1999 but 3–10× faster via preconditioning.

    IDE4EEG-side contribution to this step: BLAS threads are temporarily uncapped via threadpoolctl.threadpool_limits(limits=None) for the duration of the fit so Picard’s inner loop (and the BLAS calls inside Infomax / FastICA) can use all cores, regardless of any earlier thread cap. After the fit, the previous limit is restored.

  5. Auto-detection (selector). Once the decomposition is fitted, components are scored against one of two automatic labelling routes (or their union):

    GUI presentation. The selector is two checkable groupboxes — ICLabel (iclabel) and CORRMAP (find_bads); ticking both yields the union (both), noted inline (greyed) in the “Selection of components.” header above them (full keep-vs-remove rationale in that header’s tooltip). The CORRMAP box holds the three detector ticks (muscle / ECG / EOG). ECG greys out when the signal has no ECG/MEG channel. EOG is signal-adaptive: with a real EOG channel the tick uses it directly; with none, the tick greys and a “choose proxy EOG:” multi-select appears, listing the frontal EEG channels present in the loaded file (Fp*/AF*/F*) — picking ≥1 enables EOG-via-proxy (it is the on/off control in that case). No channel names are assumed a-priori; the picker is empty until you choose. TOML configs keep the canonical lowercase selector values used throughout this section.

    Keep-list vs remove-list — read this before reasoning about both. The two boxes are framed in opposite directions, which is easy to misread. ICLabel is a keep-list: it classifies every component into one of seven classes, so its control is labelled “Keep:” — anything not in a kept class is removed. CORRMAP is a remove-list: it only ever flags specific artifact types, so its control is labelled “Remove:” — each ticked detector removes what it flags. The asymmetry is intrinsic to the two methods (exhaustive classifier vs targeted artifact detectors), not a UI quirk. Despite the opposite framing, both boxes ultimately produce a set of components to remove, and selector="both" takes the union of those two removal sets (MNE’s accumulate-into-ica.exclude idiom): a component is dropped if either box marks it. Union (∪ = “or” / sum), not intersection (∩ = “and” / overlap) — the keep-vs-remove framing is precisely what tempts the intersection misreading.

  6. Interactive review (review_components, optional). When true, opens an interactive picker after the auto-detection step (if any) with the auto picks pre-marked. The user’s selection replaces (does not union with) the auto picks, so un-marks in the picker correctly drop components from the bads. Code source: IDE4EEG-side dispatcher in ica.py that decides between Svarog (if the installed Svarog jar advertises --select-mode ica_components, launches it as a subprocess showing the component sources plus pre-rendered topomap PNGs, one per component) and mne.preprocessing.ICA.plot_sources(...) (MNE in-tree, opened on the GUI main thread) as a fallback.

    selector × review_components matrix. The two knobs are orthogonal — every combination is a valid workflow:

    selector | review_components | Workflow
    iclabel | false | Auto ICLabel, no review (production, fast).
    iclabel | true | Auto ICLabel, then review (vet auto picks).
    both | false | Auto union of ICLabel + CORRMAP, no review (more aggressive automatic cleanup).
    both | true | Auto union, then review (vet the combined picks).
    find_bads | false/true | EOG/ECG/muscle reference-channel detection, optional review.
    none | true | Manual-only workflow (picker opens with empty preselect).
    none | false | No ICA cleanup applied (only the decomposition runs; useful when only ICA’s visualisation is wanted).

    TOML back-compat. Old enum values selector = "manual" / "both" that carried implicit review are migrated at load time — see ide4eeg/config_schema.py:migrate_legacy_keys.

    Batch-break footgun. Setting review_components = true is incompatible with an unattended ide4eeg --run batch — the picker opens and the pipeline halts until the user closes it. Stage 5 refuses the combination at preflight unless --review-allowed is passed (see §5.1).

  7. Apply. mne.preprocessing.ICA.apply(signal, exclude=bads) reconstructs the full signal from the kept components only. Code source: MNE-Python in-tree (Gramfort 2014). The step returns the cleaned signal directly; ica.py’s job at this point is just plumbing the exclude= list.
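Steps 2 and 3 can be pictured numerically with plain NumPy. The sketch below is an illustration under stated assumptions, not MNE's implementation: both function names are hypothetical, the data are synthetic noise with one injected electrode pop, and the ICA decomposition itself is left out. It shows the µV-to-V conversion of the reject gate, the rank loss caused by average referencing, and PCA whitening down to the detected rank.

```python
import numpy as np


def ptp_keep_mask(data_v, sfreq, reject_uv=500.0, tstep=2.0):
    """Mark samples that survive a segment-wise peak-to-peak gate.

    data_v is (n_channels, n_samples) in volts; reject_uv is in microvolts.
    """
    keep = np.ones(data_v.shape[1], dtype=bool)
    if reject_uv <= 0:                      # 0 = gate disabled
        return keep
    threshold_v = reject_uv * 1e-6          # microvolts -> volts
    step = max(1, int(round(tstep * sfreq)))
    for start in range(0, data_v.shape[1], step):
        seg = data_v[:, start:start + step]
        ptp = seg.max(axis=1) - seg.min(axis=1)
        if np.any(ptp > threshold_v):       # any channel over threshold
            keep[start:start + step] = False
    return keep


def pca_whiten(data, n_components):
    """Mean-centre, then project onto unit-variance principal components."""
    centred = data - data.mean(axis=1, keepdims=True)
    cov = centred @ centred.T / (centred.shape[1] - 1)
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1][:n_components]   # largest first
    whitener = eigvec[:, order].T / np.sqrt(eigval[order])[:, None]
    return whitener @ centred


rng = np.random.default_rng(42)
sfreq = 250.0
raw_v = 20e-6 * rng.standard_normal((32, 5000))   # about 20 µV of noise
raw_v[3, 1200] += 1e-3                            # 1000 µV electrode pop
raw_v -= raw_v.mean(axis=0, keepdims=True)        # average reference

keep = ptp_keep_mask(raw_v, sfreq)                # drops the 2 s segment with the pop
fit_data = raw_v[:, keep]
rank = np.linalg.matrix_rank(fit_data)            # n - 1 after average reference
z = pca_whiten(fit_data, rank)                    # input for the ICA kernel

print(keep.sum(), rank, z.shape)
```

Requesting 32 components here instead of the detected rank would divide by a near-zero eigenvalue in the whitening step, which is the singularity the "rank" default is meant to avoid.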

Why “other” is in the default keep-list. The other class covers low-confidence components the classifier isn’t sure about. Removing them by default would strip real brain activity the classifier happened to miss. Conservative default: keep, let the user opt into aggressive cleanup.
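The keep-list / remove-list logic and the union taken by selector = "both" can be sketched in a few lines of pure Python. Function names are hypothetical, the class strings follow the label names used by mne_icalabel, and the example labels and probabilities are made up; this is a reading of the rules described in this section, not the code in ica.py.

```python
def iclabel_remove_set(labels, probs, keep=("brain", "other"), min_prob=0.0):
    """Indices ICLabel would remove: everything not in a kept class.

    A component whose top-1 probability is below min_prob is treated
    as "other", as described for iclabel_min_prob.
    """
    remove = set()
    for idx, (label, prob) in enumerate(zip(labels, probs)):
        effective = label if prob >= min_prob else "other"
        if effective not in keep:
            remove.add(idx)
    return remove


def combine(selector, iclabel_set, corrmap_set):
    """Final removal set for each canonical selector value."""
    if selector == "iclabel":
        return set(iclabel_set)
    if selector == "find_bads":
        return set(corrmap_set)
    if selector == "both":
        return set(iclabel_set) | set(corrmap_set)   # union, not intersection
    if selector == "none":
        return set()
    raise ValueError(f"unknown selector: {selector!r}")


labels = ["brain", "eye blink", "muscle artifact", "other", "brain"]
probs = [0.98, 0.95, 0.60, 0.70, 0.55]
corrmap_flagged = {2, 4}                 # e.g. muscle and ECG detector hits

from_iclabel = iclabel_remove_set(labels, probs)
print(sorted(from_iclabel))                                     # [1, 2]
print(sorted(combine("both", from_iclabel, corrmap_flagged)))   # [1, 2, 4]
```

With iclabel_keep = ["brain"] and iclabel_min_prob = 0.9, the same inputs would remove components 1 to 4, because every low-confidence component falls into "other" and "other" is no longer kept.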

Parameter Default Description
method "picard" Decomposition algorithm: "picard" (upstream python-picard), "infomax" (MNE in-tree), or "fastica" (sklearn.decomposition.FastICA).
selector backend-aware¹ Auto-detection: "iclabel" (upstream mne_icalabel), "find_bads" (MNE in-tree EOG/ECG/muscle detectors), "both" (union), or "none".
review_components false Open the interactive Svarog / MNE picker after auto-detection. Default off (batch-safe).
fit_highpass_hz 1.0 Fit-time high-pass Hz (Winkler 2015). 0 = off. Applied to a raw.copy() only — analysis signal is untouched.
fit_lowpass_hz 0.0 Fit-time low-pass Hz. 0 = off (default). Applied to a raw.copy() only — drops muscle / HF harmonics from the ICA fit while leaving the analysis band wider.
fit_notch_hz 0.0 Fit-time notch frequency Hz. 0 = off (default). Applied to a raw.copy() only.
fit_notch_harmonics true When ON, the fit-time notch removes the fundamental + every integer harmonic up to Nyquist (50/100/150/… or 60/120/180/…). No effect when fit_notch_hz = 0.
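reject_uv 500 Peak-to-peak threshold in µV for the pre-fit reject gate; segments exceeding it are dropped from the ICA fit. 0 = off.
reject_tstep 2.0 Segment length (s) used by the pre-fit reject gate: MNE cuts continuous data into segments of this length and compares each segment’s peak-to-peak amplitude against reject_uv. Forwarded to ICA.fit(tstep=...).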
n_components "rank" "rank" calls mne.compute_rank(fit_copy) (SVD-based) to defend against average-reference rank loss; int or float fraction of variance also accepted.
max_iter "auto" Maximum Picard/ICA fit iterations. "auto" lets MNE pick; raise it (e.g. 1000) if the Run log warns the fit hit the iteration ceiling without converging.
iclabel_keep ["brain", "other"] Classes to keep when selector="iclabel".
iclabel_min_prob 0.0 Min classifier confidence to trust the top-1 label. A component whose top-1 probability is below this threshold falls into "other" (regardless of its actual top pick). With the default iclabel_keep containing "other", low-confidence components are kept; with iclabel_keep = ["brain"] (no "other"), they are removed. ICLabel’s CNN softmax tends to be peaky (max-class prob often ≥ 0.9), so meaningful values are typically 0.90–0.99. The Run-tab log line ICLabel: removing N of M components; ...; max_p range [a .. b] (median c) shows the actual distribution, so you can calibrate.
find_bads_eog true Run the EOG (find_bads_eog) detector. When on but find_bads_eog_ch is empty, EOG detection is skipped with a warning.
find_bads_eog_ch [] EOG reference channels for the CORRMAP find_bads_eog detector. Empty by default (opt-in): with a real EOG channel the GUI’s EOG tick uses it directly; without one, the GUI populates a “choose proxy EOG:” picker from the frontal EEG channels present in the file (Fp*/AF*/F*) and your picks land here. No a-priori channel names are assumed.
find_bads_ecg true Also run find_bads_ecg.
find_bads_muscle true Also run find_bads_muscle.
decim 1 Subsampling factor for the fit (passed straight to mne.preprocessing.ICA.fit).
random_state 42 RNG seed.
save false 🗄️ drawer — writes <base>-ica-cleaned-raw.fif.
save_components_audit false “Save components plot and table” — writes audit tree under artifacts_detection/ICA_EOG/.

¹ Backend-aware default. The out-of-box selector is "iclabel" when an ICLabel backend (onnxruntime or pytorch) is importable, else "find_bads" — so a vanilla (backend-free) install runs without crashing. Install a backend (pip install onnxruntime) to get ICLabel as the default.
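The backend-aware default in the footnote amounts to an import probe. A minimal sketch, assuming the two backends are importable as onnxruntime and torch (the import name of PyTorch); the function name is hypothetical and the real check may also verify that mne_icalabel itself is installed.

```python
import importlib.util


def default_selector(backends=("onnxruntime", "torch")):
    """Return "iclabel" if any ICLabel backend can be imported, else "find_bads"."""
    for name in backends:
        # find_spec returns None for a missing top-level module
        # without importing it.
        if importlib.util.find_spec(name) is not None:
            return "iclabel"
    return "find_bads"


print(default_selector())
```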

Failure modes.

Output (when save = true). Under <preproc_dir>/saved_steps/:

When save_components_audit = true, under <preproc_dir>/artifacts_detection/ICA_EOG/:

Implementation status. IDE4EEG’s ica.py is a workflow wrapper; it ships no original numerical ICA code. The dispatcher and state container (mne.preprocessing.ICA), the PCA pre-whitening, the rank estimator mne.compute_rank, the Infomax kernel, the find_bads_* detectors, ICA.apply, and ICA.plot_* are all MNE-Python in-tree (Gramfort 2014). The Picard and FastICA kernels and the ICLabel classifier live in separately released upstream packages: python-picard (Picard), scikit-learn (sklearn.decomposition.FastICA), and mne_icalabel (ICLabel).

What IDE4EEG’s ica.py itself contributes:

References:

Ablin P, Cardoso JF, Gramfort A (2018) “Faster Independent Component Analysis by Preconditioning With Hessian Approximations.” IEEE Trans Signal Process 66(15):4040–4049. DOI: 10.1109/TSP.2018.2844203. — Picard, the default method= algorithm; lives in upstream python-picard.

Bell AJ, Sejnowski TJ (1995) “An Information-Maximization Approach to Blind Separation and Blind Deconvolution.” Neural Comput 7(6):1129–1159. DOI: 10.1162/neco.1995.7.6.1129. — Original Infomax; the algorithm behind method="infomax" (MNE in-tree).

Lee TW, Girolami M, Sejnowski TJ (1999) “Independent Component Analysis Using an Extended Infomax Algorithm for Mixed Subgaussian and Supergaussian Sources.” Neural Comput 11(2):417–441. DOI: 10.1162/089976699300016719. — Extended Infomax; the objective Picard accelerates and what method="infomax" produces with fit_params=dict(extended=True).

Hyvärinen A (1999) “Fast and Robust Fixed-Point Algorithms for Independent Component Analysis.” IEEE Trans Neural Netw 10(3):626–634. DOI: 10.1109/72.761722. — FastICA; the algorithm behind method="fastica" (lives in sklearn.decomposition.FastICA).

Hyvärinen A, Oja E (2000) “Independent component analysis: algorithms and applications.” Neural Networks 13(4-5):411–430. DOI: 10.1016/S0893-6080(00)00026-5. — Canonical exposition of the ICA algorithmic family; cited for the pre-whitening requirement (step 3): all three method= choices require zero-mean, unit-variance, decorrelated input.

Winkler I, Debener S, Müller KR, Tangermann M (2015) “On the Influence of High-Pass Filtering on ICA-Based Artifact Reduction in EEG-ERP.” Proc. 37th Annual Int. Conf. IEEE EMBC, 4101–4105. DOI: 10.1109/EMBC.2015.7319296. — The 1 Hz fit-time high-pass (fit_highpass_hz).

Pion-Tonachini L, Kreutz-Delgado K, Makeig S (2019) “ICLabel: An Automated Electroencephalographic Independent Component Classifier, Dataset, and Website.” NeuroImage 198:181–197. DOI: 10.1016/j.neuroimage.2019.05.026. — The ICLabel CNN classifier behind selector="iclabel" (originally MATLAB/EEGLAB).

Li A, Feitelberg J, Saini AP, Höchenberger R, Scheltienne M (2022) “MNE-ICALabel: Automatically Annotating ICA Components with ICLabel in Python.” J Open Source Softw 7(76):4484. DOI: 10.21105/joss.04484. — The Python port we actually import (mne_icalabel.label_components); the original ICLabel shipped only as a MATLAB/EEGLAB plugin.

Dammers J, Schiek M, Boers F, Silex C, Zvyagintsev M, Pietrzyk U, Mathiak K (2008) “Integration of amplitude and phase statistics for complete artifact removal in independent components of neuromagnetic recordings.” IEEE Trans Biomed Eng 55(10):2353–2362. DOI: 10.1109/TBME.2008.926677. — Cross-trial phase statistics behind MNE’s find_bads_ecg.

Dharmaprani D, Nguyen HK, Lewis TW, DeLosAngeles D, Willoughby JO, Pope KJ (2016) “A comparison of independent component analysis algorithms and measures to discriminate between EEG and artifact components.” EMBC 825–828. DOI: 10.1109/EMBC.2016.7590833. — The three-criterion (slope + peripheral power + spatial smoothness) detector MNE’s find_bads_muscle implements.

Whitham EM, Pope KJ, Fitzgibbon SP, Lewis TW, Clark CR, Loveless S, Broberg M, Wallace A, DeLosAngeles D, Lillie P, Hardy A, Fronsko R, Pulbrook A, Willoughby JO (2007) “Scalp electrical recording during paralysis: quantitative evidence that EEG frequencies above 20 Hz are contaminated by EMG.” Clin Neurophysiol 118(8):1877–1888. DOI: 10.1016/j.clinph.2007.04.027. — Paralyzed-subject EMG reference dataset that the muscle-detector thresholds are tuned against.

Delorme A, Makeig S (2004) “EEGLAB: An Open Source Toolbox for Analysis of Single-Trial EEG Dynamics Including Independent Component Analysis.” J Neurosci Methods 134(1):9–21. DOI: 10.1016/j.jneumeth.2003.10.009. — Background for reject_uv (peak-to-peak amplitude rejection as a standard pre-ICA step).

Onton J, Westerfield M, Townsend J, Makeig S (2006) “Imaging Human EEG Dynamics Using Independent Component Analysis.” Neurosci Biobehav Rev 30(6):808–822. DOI: 10.1016/j.neubiorev.2006.06.007. — Canonical ICA-for-EEG best-practices review (motivates pre-ICA hygiene including reject_uv).

Gramfort A et al. (2014) “MNE software for processing MEG and EEG data.” NeuroImage 86:446–460. DOI: 10.1016/j.neuroimage.2013.10.027. — The framework providing mne.preprocessing.ICA (dispatcher, PCA whitening, state container, apply, plot_*) and mne.compute_rank.

3.10 Channel selection (manual exclusion)

GUI panel: Select channels (checkable, per-step Save)

Use this step to restrict analysis to a chosen channel subset, or to exclude named channels you don’t want — typically non-EEG technical channels (e.g. Photo, Audio, Sample_Counter). Nothing is dropped automatically; only channels you list in dropped_channels (or omit from selected_channels) are removed.

Algorithm.

  1. Parse selected_channels ("all" = keep everything, or a list of names).
  2. Parse dropped_channels (names to exclude).
  3. Final channel set: selected_channels \ dropped_channels.
  4. Drop all other channels from the MNE Raw object.
  5. Optional interactive review (check_final_channels, default true): open MNE’s signal browser for visual confirmation.

Parameter Default Description
dropped_channels [] Channels to exclude entirely (e.g. non-EEG technical channels); empty = drop none.
selected_channels "all" Channels to keep. "all" keeps everything except those in dropped_channels.

Channel types (eeg/eog/ecg/…) are reviewed and overridden separately on the Input tab — see §2.3 Channel types. The selection here acts after typing.
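The set arithmetic of steps 1 to 4 is small enough to show directly. The helper below is a hypothetical sketch (the name and the choice to preserve recording order are assumptions), not the pipeline's own function.

```python
def resolve_channels(available, selected_channels="all", dropped_channels=()):
    """Final channel list: selected_channels minus dropped_channels."""
    if selected_channels == "all":
        selected = list(available)
    else:
        wanted = set(selected_channels)
        # Keep the order in which channels appear in the recording.
        selected = [ch for ch in available if ch in wanted]
    dropped = set(dropped_channels)
    return [ch for ch in selected if ch not in dropped]


channels = ["Fp1", "Fp2", "Cz", "Pz", "Photo", "Audio"]
print(resolve_channels(channels, dropped_channels=["Photo", "Audio"]))
print(resolve_channels(channels, selected_channels=["Pz", "Cz", "Photo"],
                       dropped_channels=["Photo"]))
```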

3.11 Matching Pursuit decomposition

GUI panel: MP Decomposition (checkable, save is a locked indicator — on whenever the step is enabled, since the .db book is the step output, has [Run])

Decomposes the EEG signal into a sum of Gabor atoms — Gaussian-windowed sinusoids that are optimally matched to the signal’s time-frequency content. The result is an adaptive, parametric representation stored in a persistent SQLite book file (.db).

This step does not modify the signal — unlike the transform steps (and unlike MP filter, which reconstructs the signal), MP decomposition taps the signal, writes the .db atom book, and passes the signal through unchanged to the next step. The book is consumed downstream by the Analysis-tab EEG Profiles and Dipole analyses (via their mp_book_source = "decomposition" setting). In the default step order it sits right after Filtering / MP filter and before ICA, so by default it decomposes the filtered, pre-ICA signal; reorder it (↑/↓) to decompose at any other point in the chain. Its panel view button is 📖 (opens the Book Viewer), not the 👁 before/after signal view the transform steps use. Off by default.

Requires the empi binary. If Svarog is installed, empi is auto-detected from the mp/ subdirectory.

Mathematical foundation. Matching Pursuit (Mallat & Zhang 1993) is a greedy iterative algorithm:

  1. Start with the full signal as the residual: R_0(t) = x(t).
  2. At iteration n, find the Gabor atom g_n that maximises the inner product with the residual: g_n = argmax_g |<R_n, g>|.
  3. Subtract the atom: R_{n+1}(t) = R_n(t) - <R_n, g_n> * g_n(t).
  4. Repeat until the residual energy drops below (1 - explained_energy) of the original signal energy, or until the iterations cap (the maximum number of atoms) is reached.

Each Gabor atom is parameterised as:

g(t) = A * exp(-pi * (t - t0)^2 / s^2) * cos(2*pi*f*t + phi)

where A = amplitude, t0 = centre time, s = scale (temporal width), f = frequency, phi = phase.
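The greedy loop and the atom formula above can be tried out on a toy problem. This is a minimal NumPy sketch over a small, fixed grid of unit-norm Gabor atoms; the grid values and function names are arbitrary choices for illustration, and empi's continuous-dictionary optimisation is far more elaborate than this discrete search.

```python
import numpy as np


def gabor_atom(t, t0, s, f, phi):
    """Unit-norm Gabor atom following the parameterisation above."""
    g = np.exp(-np.pi * (t - t0) ** 2 / s ** 2) * np.cos(2 * np.pi * f * t + phi)
    return g / np.linalg.norm(g)


def build_dictionary(t):
    """Small discrete dictionary on a coarse (t0, s, f, phi) grid."""
    atoms, params = [], []
    for t0 in np.arange(0.25, 2.0, 0.25):
        for s in (0.1, 0.2, 0.4):
            for f in (5.0, 10.0, 20.0):
                for phi in (0.0, np.pi / 2):
                    atoms.append(gabor_atom(t, t0, s, f, phi))
                    params.append((t0, s, f, phi))
    return np.array(atoms), params


def matching_pursuit(x, dictionary, iterations=30, explained_energy=0.99):
    """Greedy MP: returns [(atom index, coefficient), ...] and the residual."""
    residual = np.asarray(x, dtype=float).copy()
    total_energy = float(residual @ residual)
    picks = []
    for _ in range(iterations):
        # Stop once the residual holds less than (1 - explained_energy).
        if residual @ residual <= (1.0 - explained_energy) * total_energy:
            break
        products = dictionary @ residual            # <R_n, g> for every atom
        k = int(np.argmax(np.abs(products)))        # best-matching atom
        picks.append((k, float(products[k])))
        residual = residual - products[k] * dictionary[k]
    return picks, residual


t = np.arange(0.0, 2.0, 1 / 128)
D, params = build_dictionary(t)
x = 3.0 * D[10]                                     # signal made of one atom
picks, residual = matching_pursuit(x, D)
print(picks[0][0], round(picks[0][1], 6), len(picks))
```

Because the test signal is a single dictionary atom, one iteration explains all of its energy and the loop stops on the explained_energy criterion rather than on the iteration cap.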

Multivariate MP (MMP) extends the algorithm to multiple channels simultaneously: at each iteration, the atom parameters (t0, s, f) are shared across all channels, while amplitude and phase are fitted per channel. This produces “macroatoms” that describe coherent multi-channel activity.

empi (Różański 2024) is a GPU-accelerated implementation that uses continuous parameter optimisation over an epsilon-dense dictionary, achieving faster convergence and more precise atom placement than discrete-dictionary approaches.

Algorithm types.

Algorithm Key Description
SMP smp Single-channel MP: each channel decomposed independently.
MMP1 mmp1 Multivariate MP, constant phase across derivations.
MMP3 mmp3 Multivariate MP, variable phase.

MMP requires ≥2 channels. MMP1 and MMP3 are multivariate — they fit atoms shared across channels, so they need at least two channels to decompose. On single-channel input (a single derivation, or matching_pursuit.channels narrowed to one) empi rejects an MMP run and exits with an error. IDE4EEG’s preflight catches this before the run and asks you to switch to SMP or select more channels — SMP (each channel decomposed independently) is the correct choice for single-channel recordings.

Parameters exposed in the GUI (Preprocess tab → MP Decomposition panel):

Parameter | TOML key | Default | GUI control | Description
Algorithm | matching_pursuit.algorithm | "smp" | dropdown | SMP / MMP1 / MMP3 (see the table above).
Channels | matching_pursuit.channels | all EEG | channel-pick panel | Subset of channels to decompose; empty = every EEG channel.
Empi binary path | matching_pursuit.empi_path | auto | file picker | Path to the empi executable. Auto-detected from Svarog’s mp/ subdir if Svarog is installed.
Iterations | matching_pursuit.iterations | 30 | spin | Maximum atoms to extract. 0 = no cap (then explained_energy is the only stop condition).
Explained energy | matching_pursuit.explained_energy | 0.99 | percent edit | Stop when this fraction of total energy is explained. 1.0 = stop only at the iteration cap.
CPU workers | matching_pursuit.cpu_workers | inherit | spin (0 = inherit) | Override for empi worker count (see the CPU workers note below).

Fixed defaults (round-tripped through config.toml but not editable from the GUI):

TOML key Value Why fixed
matching_pursuit.optimization "global" Continuous-dictionary refinement gives the best atom localisation accuracy; the "local" and "none" paths exist in empi for benchmarking against legacy implementations. Change in the TOML only if you know why.
matching_pursuit.include_delta true Vestigial: empi always runs with both Dirac-delta and Gabor atoms (--delta --gabor). Retained for config round-trip compatibility but not read when building the empi command.
matching_pursuit.energy_error 0.01 Dictionary density (ε²). Smaller = denser dictionary = more precise but quadratically slower. 0.01 is tuned for the accuracy / runtime balance of typical EEG.
matching_pursuit.full_atoms_in_signal false Vestigial: not wired to empi — no --full-atoms-in-signal flag is emitted, so this key currently has no effect (edge-atom handling is fixed by empi’s default). Retained for config round-trip compatibility.

CPU workers — why empi has its own override, separate from parallelism.n_jobs.

Empi’s parallelism uses a fundamentally different cost model than the joblib/loky workers that drive every other parallelisable step (MNE filter / resample, TFR, FOOOF, …). The pipeline-level parallelism.n_jobs heuristic (see §1.3) caps worker count by ram_gb // 2 because each loky worker spawns a fresh Python interpreter — typically 200–500 MB of resident memory per worker once MNE + NumPy are loaded. On a 16 GB box parallelism.n_jobs would top out around 8.

Empi runs as a single C++ subprocess; its workers share the atom dictionary and signal buffer through one address space. Total RSS is roughly 100 MB regardless of how many workers you configure — there is no per-worker Python interpreter to pay for. So a machine that’s appropriately set to parallelism.n_jobs = 4 for the rest of the pipeline can comfortably run matching_pursuit.cpu_workers = 8 (or as many as the CPU has physical cores). The override exists precisely to break out of the RAM-bound heuristic when the underlying workload is RAM-cheap.

Resolution order at empi invocation:

  1. matching_pursuit.cpu_workers > 0 — explicit user override (takes precedence).
  2. Otherwise, parallelism.n_jobs from [parallelism] (the unified pipeline knob).
  3. 1 as the final fallback for tests and callers that didn’t inject anything.

Each empi worker runs single-threaded (--cpu-threads 1); Gabor-dictionary matching is not BLAS-heavy, so adding BLAS threads per worker hurts more than it helps.
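The resolution order reads naturally as a three-branch function. A hypothetical sketch (function and argument names are illustrative; treating only positive n_jobs values as valid is an assumption):

```python
def resolve_empi_workers(cpu_workers=0, n_jobs=None):
    """Worker count passed to empi, following the resolution order above."""
    if cpu_workers is not None and cpu_workers > 0:
        return int(cpu_workers)      # 1. explicit matching_pursuit.cpu_workers
    if n_jobs is not None and n_jobs > 0:
        return int(n_jobs)           # 2. inherited parallelism.n_jobs
    return 1                         # 3. final fallback


print(resolve_empi_workers(cpu_workers=8, n_jobs=4))   # 8
print(resolve_empi_workers(cpu_workers=0, n_jobs=4))   # 4
print(resolve_empi_workers())                          # 1
```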

Output. SQLite .db book file containing all atoms with parameters: segment, channel, iteration, frequency, scale, amplitude, energy, phase. Consumed by EEG profiles and dipole fitting when those analyses have mp_book_source = "decomposition" (the default). MP filter has its OWN internal book produced independently — see §3.12.

MP book reuse — hash-controlled. MP decomposition is one of the most expensive preprocessing steps (minutes to hours). To avoid recomputing when nothing relevant has changed, MP follows a hash-controlled reuse policy. The freshness check runs whether or not mp_decomposition is in step_order. See §3.20.5 MP book reuse for the full policy table.

References:

Mallat SG, Zhang Z (1993) “Matching Pursuits with Time-Frequency Dictionaries.” IEEE Trans Signal Process 41(12):3397–3415. DOI: 10.1109/78.258082

Kuś R, Różański PT, Durka PJ (2013) “Multivariate matching pursuit in optimal Gabor dictionaries.” BioMed Eng OnLine 12:94. DOI: 10.1186/1475-925X-12-94

Różański PT (2024) “empi: GPU-Accelerated Matching Pursuit with Continuous Dictionaries.” ACM Trans Math Softw 50(3):17. DOI: 10.1145/3674832

3.12 MP filter

GUI panel: MP filter — sits right after Filtering, before ICA in the default step order (checkable, per-step Save). Prerequisite: none — the step runs its own internal MP decomposition independently of the standalone MP Decomposition step.

Reconstructs the signal from a selected subset of Matching Pursuit atoms, implementing a nonlinear filter. Unlike conventional linear filters which operate in the frequency domain, MP filtering can select atoms based on combined time-frequency-energy criteria, removing or keeping specific signal components with atomic precision.

Why an internal decomposition?

The standalone MP Decomposition step decomposes inside the pipeline’s segments — rest windows in rest mode, event-locked epochs in epochs mode. In epochs mode the atoms are too short to be useful for time-domain reconstruction across the whole recording. MP filter solves that by running its own internal decomposition in rest mode regardless of the pipeline’s segmentation mode, producing a separate .db book at <output>/mp_filter/book_<base>_filter.db that lives alongside the standalone book at <output>/mp_decomposition/. The two coexist; downstream analyses (EEG Profiles, MP-Dipoles) can choose either via the per-analysis mp_book_source setting on their Analysis-tab panels.

The book pulldown on those panels also offers a third option — “Use existing book file…” — which opens a file picker so you can point the analysis at any pre-existing .db book on disk (e.g. one produced in an earlier session, or by hand) without re-running a decomposition step. This sets mp_book_source = "file" and stores the chosen path in mp_book_file. Unlike the older auto-discovery this replaced, the file is explicit (no ambiguous directory scan); a missing or empty selection is caught before the run with a clear message. For MMP-dipoles the picked book must be an MMP1 book — this is verified when the book is opened (from its stored algorithm metadata).

Algorithm.

  1. Resolve the internal-decomp window length: [mp_filter].window_length (explicit override) → [rest].window_length (inherited) → a whole-signal window when [rest].whole_signal is set → 20 s (final fallback).
  2. Cut the signal into non-overlapping window_length-second rest windows from sample 0.
  3. Run empi on the concatenated buffer using [mp_filter.decomp] parameters (same shape as [matching_pursuit]: algorithm, iterations, explained_energy, channels, etc.). Skip empi when the on-disk book’s _input_hash matches the freshly-computed _compute_mp_filter_book_hash digest (reuse semantics analogous to standalone MP — editing reconstruction-only [mp_filter.filter] fields never invalidates the book).
  4. For each window and channel, filter the atom list by user-specified [mp_filter.filter] criteria (frequency range, scale range, amplitude range, time range, energy threshold, iteration cutoff).
  5. Depending on [mp_filter.filter].mode:
  6. Replace each window’s samples in the Raw object with the reconstruction.
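Steps 1 and 2 can be sketched as two small helpers. Names and signatures are hypothetical, and the handling of a trailing partial window is not specified in this section; the sketch simply returns full windows only and marks that as an assumption.

```python
def resolve_window_length(mp_filter_window=0.0, rest_window=0.0,
                          whole_signal=False, signal_duration=None):
    """Window length in seconds, following the fallback chain of step 1."""
    if mp_filter_window > 0:
        return float(mp_filter_window)      # [mp_filter].window_length
    if rest_window > 0:
        return float(rest_window)           # [rest].window_length
    if whole_signal and signal_duration:
        return float(signal_duration)       # one window spanning the signal
    return 20.0                             # final fallback


def window_bounds(n_samples, sfreq, window_length):
    """(start, stop) sample pairs of non-overlapping windows from sample 0.

    Assumption: only full windows are returned; a shorter tail is ignored.
    """
    step = int(round(window_length * sfreq))
    return [(start, start + step)
            for start in range(0, n_samples - step + 1, step)]


print(resolve_window_length())                               # 20.0
print(resolve_window_length(rest_window=10.0))               # 10.0
print(window_bounds(n_samples=1000, sfreq=100.0, window_length=4.0))
```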

Parameters — internal decomposition ([mp_filter.decomp], mirrors [matching_pursuit])

Parameter Default Description
algorithm "smp" SMP / MMP1 / MMP3 — see §3.11 table.
iterations 30 Max atoms per channel/segment.
explained_energy 0.99 Stop fraction; 1.0 = iteration cap only.
channels "" Subset of channels to decompose; empty = all EEG channels (falls back to all non-STIM data channels only if no EEG channels are present). Mirrors §3.11.
cpu_workers 0 Inherit from parallelism.n_jobs (override per §3.11).

Parameters — window length ([mp_filter])

Parameter Default Description
window_length 0 Seconds per internal rest window. 0 = inherit from [rest].window_length (default 20 s).

Parameters — reconstruction criteria ([mp_filter.filter], “osc. EEG” defaults)

The preset combo at the top of the criteria table offers osc. EEG (f>0.5 Hz, w>0.1 s) as the default selection — oscillatory atoms in the physiological EEG band, rejecting slow drift (< 0.5 Hz), impulse-like spikes (< 0.1 s), and the 50 Hz line and harmonics (> 49 Hz). Picking a different EEG-structure preset (Sleep spindles, Alpha, …) populates the same row.

Parameter Default Description
mode "keep" "keep" or "remove" matching atoms.
freq_min 0.5 Atoms with frequency ≥ this (Hz). 0 = no constraint.
freq_max 49 Atoms with frequency ≤ this (Hz). 0 = no constraint.
amp_min 0 Atom peak-to-peak amplitude lower bound (µV).
amp_max 0 Upper bound (µV).
scale_min 0.1 Atom scale (width) lower bound (s).
scale_max 0 Upper bound (s).
time_min 0 Atom-centre time lower bound (s, in-window).
time_max 0 Upper bound.
min_energy 0 Atom-energy lower bound.
max_iterations 0 Only first N atoms (by iteration). 0 = all.
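How the criteria combine can be illustrated with a small predicate. This is a sketch under assumptions, not the pipeline's matcher: the Atom class and function names are hypothetical, 0 is treated as "no constraint" for every bound (the table states this explicitly only for some rows), and iterations are assumed to be counted from 0. The mode setting then decides whether the matching atoms are the ones kept or the ones removed.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    iteration: int      # extraction order, counted from 0 (assumption)
    frequency: float    # Hz
    scale: float        # s
    amplitude: float    # µV, peak-to-peak
    time: float         # s, atom centre within the window
    energy: float


# Defaults of the "osc. EEG" preset as listed in the table above.
OSC_EEG = dict(freq_min=0.5, freq_max=49.0, amp_min=0.0, amp_max=0.0,
               scale_min=0.1, scale_max=0.0, time_min=0.0, time_max=0.0,
               min_energy=0.0, max_iterations=0)


def in_range(value, lower, upper):
    """True if value satisfies both bounds; a bound of 0 is ignored."""
    if lower > 0 and value < lower:
        return False
    if upper > 0 and value > upper:
        return False
    return True


def matches(atom, crit):
    """True if the atom satisfies every active criterion."""
    return (
        in_range(atom.frequency, crit["freq_min"], crit["freq_max"])
        and in_range(atom.amplitude, crit["amp_min"], crit["amp_max"])
        and in_range(atom.scale, crit["scale_min"], crit["scale_max"])
        and in_range(atom.time, crit["time_min"], crit["time_max"])
        and in_range(atom.energy, crit["min_energy"], 0)
        and (crit["max_iterations"] <= 0
             or atom.iteration < crit["max_iterations"])
    )


alpha = Atom(iteration=0, frequency=10.0, scale=0.5, amplitude=20.0, time=1.0, energy=5.0)
mains = Atom(iteration=1, frequency=50.0, scale=2.0, amplitude=5.0, time=3.0, energy=1.0)
spike = Atom(iteration=2, frequency=12.0, scale=0.05, amplitude=80.0, time=7.0, energy=2.0)

print([matches(a, OSC_EEG) for a in (alpha, mains, spike)])   # [True, False, False]
```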

Book reuse. Editing reconstruction criteria (the [mp_filter.filter] sub-block) does not trigger an empi rerun — the filter operates on already-decomposed atoms. Editing the internal decomp parameters or window_length DOES invalidate, and the next run re-executes empi. See §3.20.5 for the staleness machinery.
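The reuse rule boils down to hashing only the inputs that affect the decomposition. The sketch below is a hypothetical illustration with hashlib and json; it is not the project's _compute_mp_filter_book_hash, and the signal fingerprint argument stands in for whatever identifies the input data.

```python
import hashlib
import json


def book_hash(mp_filter_cfg, signal_fingerprint):
    """Digest over decomposition inputs only; the filter block is left out."""
    relevant = {
        "decomp": mp_filter_cfg.get("decomp", {}),
        "window_length": mp_filter_cfg.get("window_length", 0),
        "signal": signal_fingerprint,
    }
    blob = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def book_is_fresh(stored_hash, mp_filter_cfg, signal_fingerprint):
    """True if the on-disk book can be reused without rerunning empi."""
    return stored_hash == book_hash(mp_filter_cfg, signal_fingerprint)


cfg = {
    "window_length": 0,
    "decomp": {"algorithm": "smp", "iterations": 30, "explained_energy": 0.99},
    "filter": {"mode": "keep", "freq_min": 0.5, "freq_max": 49},
}
stored = book_hash(cfg, "recording-001")

cfg["filter"]["freq_max"] = 30          # reconstruction-only edit
print(book_is_fresh(stored, cfg, "recording-001"))   # True

cfg["decomp"]["iterations"] = 50        # decomposition edit
print(book_is_fresh(stored, cfg, "recording-001"))   # False
```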

Position in the pipeline. The default step_order puts MP filter right after Filtering, before ICA — so it reconstructs the filtered, pre-ICA signal. Being before artifact detection also means the artifact-rejection step sees the reconstructed signal — usually the user’s intent for “denoise before scoring”. Users can reorder via the panel’s ↑/↓ arrows; the historical hard constraint mp_decomposition → mp_filter was dropped (the two steps are now independent).

3.13 Detect EEG artifacts

GUI panel: Detect EEG artifacts (checkable, reorderable, on by default). The EEG-artifact detection suite: a robust-statistics, multi-scale family of detectors (p2p, slope, hf, flat) that mark contaminated spans for the downstream Mark detected artifacts stage. It runs by default — uncheck the panel to disable it. Its 🗄️ drawer is a locked indicator (lit while the step is active), not a toggle: the per-channel marks -tag.txt is the step’s primary output and is written unconditionally when the step runs (a marking step has no signal snapshot to gate). (The previous Klekowicz-2009 amplitude/slope/muscle detector was retired; the legacy-artifacts git tag preserves that code for anyone comparing.)

Tag-first — it marks, it does not drop. Each detector marks artifacts as a channel-scoped BAD_<type> carrier on the signal (MNE annotations, with the robust-z folded in as BAD_<type>|z=…), and writes a per-channel eeg_artifacts/<file>-tag.txt review file (MNE’s channel-scoped annotation .txt format, which replaced the legacy BrainTech XML .tag for these detectors) — written unconditionally when the step runs, since the marks file is the step’s primary output. It leaves the signal and segments intact, in either segmentation mode. Automatic rejection is a later phase.

Amplitudes, muscle, flat: p2p, slope, hf, flat

Four tag-first detectors. p2p and slope are per-sample detectors that share a detection frame and run on a dyadic ladder of window widths L = 2, 4, 8, … (octave-spaced), derived per sampling rate from physical-millisecond constants. Each scale is robust-z’d over the whole recording (median/MAD) and the scales are combined by max-z. The ladder is split at the crossover (50 ms): the slope detector owns the rungs below it (the steep band), p2p the rungs above (the amplitude band), up to max scale (300 ms; slower events are the drift detector’s / analysis high-pass’s job). hf is a windowed band-power detector on its own high-frequency band — a different method for a different artifact class (sustained muscle/EMG rather than transient steps and pops). flat is the low-amplitude counterpart of p2p — the low tail of the same peak-to-peak feature, detecting transient flat/dropout spans.
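
A rough sketch of the ladder split and of the median/MAD scoring follows. It assumes a rung shorter than the crossover belongs to the slope band and any rung from the crossover up to the maximum scale belongs to p2p; the 1.4826 consistency factor and all function names are likewise assumptions. The full specification is in the design documents cited at the end of this section.

```python
import statistics


def dyadic_ladder(fs, crossover_ms=50.0, max_scale_ms=300.0):
    """Return (slope_rungs, p2p_rungs) as window widths in samples.

    Assumption: rungs shorter than crossover_ms go to the slope band,
    rungs from crossover_ms up to max_scale_ms go to the p2p band.
    """
    slope_rungs, p2p_rungs = [], []
    width = 2
    while width * 1000.0 / fs <= max_scale_ms:
        if width * 1000.0 / fs < crossover_ms:
            slope_rungs.append(width)
        else:
            p2p_rungs.append(width)
        width *= 2
    return slope_rungs, p2p_rungs


def robust_z(values):
    """Median/MAD z-scores over a whole-recording feature series."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # assumed normal-consistency factor
    if scale == 0:
        return [0.0 for _ in values]
    return [(v - med) / scale for v in values]


def combine_max_z(per_scale_z):
    """Combine equally long per-scale z series by the largest |z| per sample."""
    return [max(abs(z) for z in column) for column in zip(*per_scale_z)]


slope_rungs, p2p_rungs = dyadic_ladder(fs=256)
```

At 256 Hz this gives slope rungs of 2, 4 and 8 samples (about 8 to 31 ms) and p2p rungs of 16, 32 and 64 samples (about 62 to 250 ms); the next power of two, 128 samples, would be 500 ms and exceeds the 300 ms maximum scale.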

Shared detection frame. All four run on a throwaway EEG-only copy with a mains notch only (50/60 Hz + optional harmonics — the full comb matters for hf, which spans the harmonics; no high-pass — p2p is DC-invariant and the slope is its own high-pass; no low-pass — it would smooth the sharp structures and erase the hf band). p2p, slope and flat read the wide notched signal directly (flat from its running peak-to-peak); hf additionally band-passes that copy to its high band. All emit one channel-scoped annotation (and one -tag.txt row) per contiguous flagged run per channel, with no padding or merge-gap — the run is the detection; extent comes from the multi-scale / multi-window response and final decisions are deferred to the aggregation/review stage. (The slope detector blanks a boundary region of width ≈ half the coarsest slope rung at each end of the recording, where the convolution edge would otherwise create phantom tags.)

Parameters (the panel)

Field Default Meaning
Notch 50 mains-notch frequency (off / 50 / 60 Hz), plus harmonics
Robust-z (p2p / slope / hf) 5.0 / 5.0 / 5.0 flag a scale/window where abs(z) exceeds it — a separate field in each detector’s box
flat (ptp/median <) 0.2 flat when a window’s peak-to-peak drops below this fraction of the channel’s median ptp
ignore jumps < 50 µV optional floor: ignore deflections smaller than this, and the too-clean-channel guard (uncheck = off)
p2p on run the peak-to-peak amplitude detector
slope on run the regression-slope detector
muscle (hf) on run the high-frequency muscle/EMG band-power detector
flat on run the relative flat-span detector (low tail of p2p)

Config-only constants (in [eeg_artifacts], no GUI field; physics/expert constants): crossover_ms (50); max_scale_ms (300); slope_min_samples (2, the first-difference floor); and min_windows (20, the n_eff = n / L_coarsest stability gate, applied to both dyadic detectors, each at its own coarsest rung, so p2p, which owns the coarser rungs, trips first; see the window-count feedback below). The muscle band/window constants are hf_lo_hz (90), hf_hi_cap_hz (250, upper-edge cap before the 0.45·fs Nyquist clamp) and hf_window_s (0.5). The flat-span windows are flat_window_s (0.2, the ptp window for flatness) and flat_min_duration_s (0.5, the shortest flat run worth marking).

The amplitude floor is adaptive (per recording). “Floor: k = N × channel median p2p” (floor_rel_k, default 1.0) sets the p2p and slope amplitude floor relative to each channel’s own amplitude: floor = floor_rel_k · median_ptp, where median_ptp is the per-channel median of the running 300 ms peak-to-peak (the same max−min scale the flat detector uses). This tracks each recording’s gain/montage instead of a fixed µV value. A multi-dataset calibration sweep (docs/_eeg_artifacts_v2_sweep.py, findings in .gitlab/RESEARCH_v2_calibration_sweep.md) showed the old fixed 50 µV floor suppressed 0.00 → 0.77 of flags purely as a function of recording amplitude — over-suppressing low-amplitude ERP (median_ptp ≈ 20 µV) and a no-op on high-amplitude recordings (median_ptp ≈ 66 µV; a 3.3× cross-dataset spread) — while a relative floor reproduced the same average aggression with under half the variance. Set floor_rel_k = 0 to fall back to the legacy absolute µV floors below.
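
The relative floor can be sketched in a few lines. This is an illustration under stated assumptions (a plain sliding max−min window, rounded to whole samples); the function names are hypothetical and the real implementation may compute the running peak-to-peak differently.

```python
import statistics


def running_ptp(signal, width):
    """Peak-to-peak (max - min) of every full window of `width` samples."""
    return [max(signal[i:i + width]) - min(signal[i:i + width])
            for i in range(len(signal) - width + 1)]


def adaptive_floor(signal_uv, fs, floor_rel_k=1.0, window_ms=300.0):
    """floor = floor_rel_k * median of the running 300 ms peak-to-peak."""
    width = max(2, round(window_ms * fs / 1000.0))
    return floor_rel_k * statistics.median(running_ptp(signal_uv, width))


# A deterministic toy channel (values in µV) sampled at 100 Hz.
channel = [((i * 37) % 41) - 20 for i in range(400)]
floor_low_gain = adaptive_floor(channel, fs=100)
floor_high_gain = adaptive_floor([2 * v for v in channel], fs=100)
```

Doubling the recording's amplitude doubles the floor, which is the property a fixed µV value lacks. Setting floor_rel_k to 0 yields a floor of 0 here; in IDE4EEG that value instead selects the legacy absolute floors described below.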

The absolute floor doubles as a guard (legacy path, floor_rel_k = 0). “Amplitude floor = N µV” (min_amplitude_uv) sets the smallest p2p deflection worth tagging and floors the robust scale so a near-flat or dead channel cannot blow up its own z. Used only when floor_rel_k = 0; set it as a deflection height in µV, or 0 to flag on robust-z alone.

The slope has its own absolute anchor (legacy path, floor_rel_k = 0). “Slope floor = N µV step” (slope_min_amplitude_uv, default 50, falls back to min_amplitude_uv for legacy configs) is the slope detector’s absolute backstop. It is used only when floor_rel_k = 0 and is decoupled from the p2p floor, so the two sensitivities can be tuned independently. It is entered as a step amplitude in µV, not a µV/ms slope. Internally it is converted per scale to a µV/ms backstop, A · P(L) · fs/1000 (see “Why the slope backstop is a µV step, not a µV/ms slope” below), so a step taller than A trips the slope detector at every dyadic scale. With the default adaptive floor the anchor A is instead floor_rel_k · median_ptp per channel.

Why the slope backstop is a µV step, not a µV/ms slope. The slope detector reads a least-squares regression slope over each dyadic window L; for a step of height h, that reading is h · P(L) · fs/1000 µV/ms, where P(L) = Σ_{k>0} c_k is the kernel’s step-response gain (the slope a unit step produces at scale L; P(L) ≈ 1.5/L, computed exactly from the kernel). A short window sees a step as steep, a long window “dilutes” it — so the same step reads a different µV/ms at every scale. The backstop is set to A · P(L) · fs/1000 — exactly the slope an A-µV step would produce at L — so the flag test |slope| > backstop reduces to h > A: the P(L)·fs/1000 cancels, and one µV step amplitude gives a consistent, scale- and fs-invariant threshold across the whole bank. A raw µV/ms number could not: it would mean a different step size at every rung and drift with the sampling rate.
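
The cancellation can be checked numerically. The sketch below builds the ordinary least-squares slope kernel for a window of L samples, sums its positive-lag coefficients to get P(L), and compares a step's reading with the backstop. The function names are illustrative, and the kernel is the textbook regression kernel, assumed here to match the detector's.

```python
def regression_kernel(L):
    """Least-squares slope coefficients c_k for a window of L samples."""
    centred = [k - (L - 1) / 2 for k in range(L)]
    denom = sum(x * x for x in centred)
    return [x / denom for x in centred]


def step_gain(L):
    """P(L): slope per sample produced by a unit step centred in the window."""
    return sum(c for c in regression_kernel(L) if c > 0)


def slope_flag(step_uv, anchor_uv, L, fs):
    """True when a step of height step_uv exceeds the per-scale backstop."""
    reading = step_uv * step_gain(L) * fs / 1000.0     # µV/ms read from the step
    backstop = anchor_uv * step_gain(L) * fs / 1000.0  # µV/ms threshold at scale L
    return reading > backstop


# For even L the gain has the closed form 1.5 * L / (L**2 - 1), about 1.5 / L.
gains = {L: step_gain(L) for L in (2, 4, 8, 16, 32)}
```

Because the same factor P(L) · fs/1000 multiplies both sides, slope_flag returns the same answer as the plain comparison step_uv > anchor_uv at every rung and sampling rate.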

The amplitude ceiling (Advanced, off by default). Its symmetric partner: “Amplitude ceiling = N µV” (p2p_ceiling_uv, 0 = off) also flags any p2p window above an absolute µV level, regardless of robust-z. A channel whose amplitude is uniformly huge has its MAD scale with it, so the per-channel robust-z never fires; only an absolute ceiling catches that gross-excursion class. It operates on the p2p detector’s sliding dyadic windows (crossover_ms to max_scale_ms, ≤300 ms), not the whole segment, so it catches fast / local gross excursions; a slow whole-segment swing (large only over tens of seconds) is the high-pass / drift detector’s job, by design. (This is the upstream home that replaced the legacy [segment_rejection] peak-to-peak reject_threshold for its fast-excursion subset; that criterion has been retired, see §3.16 and docs/EEG_artifacts_marking.pdf §“Consolidation as pure aggregation”.)

Window-count feedback. The panel warns when the trimmed signal is too short for a stable whole-record reference at the coarsest p2p scale (n_eff = n / L): 20–40 effective windows → an orange advisory; under 20 → a red message and p2p is switched off.
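
As a small illustration of the gate, the sketch below computes n_eff and maps it to the three panel states. The function name and the returned labels are hypothetical, and the handling of the exact boundary values (20 and 40) is an assumption.

```python
def window_count_status(n_samples, coarsest_width, min_windows=20,
                        advisory_windows=40):
    """Return (n_eff, status) for the coarsest p2p rung.

    status is "red" (p2p switched off), "orange" (advisory) or "ok".
    """
    n_eff = n_samples / coarsest_width
    if n_eff < min_windows:
        return n_eff, "red"
    if n_eff <= advisory_windows:
        return n_eff, "orange"
    return n_eff, "ok"


# 256 Hz recording with a coarsest p2p rung of 64 samples (250 ms).
short = window_count_status(4 * 256, 64)       # 4 s of signal
borderline = window_count_status(10 * 256, 64)  # 10 s of signal
long_enough = window_count_status(60 * 256, 64)  # 60 s of signal
```

In this example 4 s of signal yields 16 effective windows, 10 s yields 40, and 60 s yields 240.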

Defaults lean sensitive — raise z on busy data. The per-detector z = 5 is a deliberately sensitive default (tag-first: mark, don’t drop). A check on MNE-Python’s sample dataset against its own EOG blink detector (docs/_eeg_artifacts_v2_validation.py) found z = 5 over-flags that (busy ERP) recording ~10×, while z = 8–10 still recovers ~97 % of the blinks with far fewer tags. So on real, busy EEG consider raising the p2p / slope z; a single dataset can’t establish a resting/clinical default, so tune per recording / artifact type. (The multi-dataset calibration sweep — docs/_eeg_artifacts_v2_sweep.py — found z = 5 sits on the knee for every dataset and gave no signal to move it; its main outcome was the adaptive amplitude floor above.)

Full specification, literature, and design rationale: docs/EEG_artifacts_detectors.pdf and docs/eeg_artifacts_v2_implementation_plan.md (committed under docs/).

3.14 Gaze (video artifact detection)

GUI panel: Detect gaze artifacts (checkable, reorderable; its 🗄️ drawer is a locked indicator — the marks -tag.txt is written unconditionally when the step runs, like the EEG-artifacts detector)

Identifies time intervals where the subject is not looking at the screen, using a synchronised video recording. These intervals are marked as not_looking artifacts and excluded from EEG analysis. Particularly useful for infant EEG studies where attention monitoring is essential.

The pipeline processes each video frame through two stages.

Stage 1: Face detection and identity (InsightFace). InsightFace is the only supported backend (facetag_backend = "insightface").

Component Model What it does
Detection RetinaFace Finds all face bounding boxes in each frame
Identity ArcFace Computes a 512-d embedding per face, compared to reference photos
Tracking Custom Adaptive encoding + spatial/size/switch tracking across frames

InsightFace also produces 5 facial keypoints (left eye, right eye, nose tip, left and right mouth corners) as a byproduct of detection. These are reused by the InsightFace gaze backend.

Requirements: pip install insightface onnxruntime. Models (~200 MB) are downloaded automatically on first use to ~/.obci/ide4eeg/insightface/models/buffalo_l/. (IDE4EEG redirects InsightFace away from its upstream default ~/.insightface/ so all of our runtime data lives under one tree; existing users with ~/.insightface/ get migrated once on first launch.)

Stage 2: Gaze estimation. Two backends, selected via facetag_gaze_method:

Backend Config value What it measures Extra deps Speed
L2CS-Net (default) "l2cs" Eye-gaze direction (where the eyes are actually looking; ResNet50, 3.92° MAE on MPIIGaze) l2cs + PyTorch (~2 GB) ~20 ms / frame
InsightFace "insightface" Head pose only (where the head is pointing; reuses 5 keypoints) None (reuses Stage 1) ~0 ms extra

The two backends are categorically different. L2CS-Net answers “where are the eyes looking?”. InsightFace answers “where is the head pointing?”. A subject whose head faces the camera but whose eyes are darting to a side monitor is “looking at the screen” by InsightFace’s measure but not by L2CS-Net’s. Pick L2CS-Net unless you have a specific reason to fall back.

Choosing a backend. Default "l2cs" — true eye-gaze direction is what most experimental designs want. Use "insightface" only when PyTorch can’t be installed (e.g. Intel Mac on a Python version without torch wheels) or when L2CS-Net fails on a particular video (rare; e.g. very low-resolution face crops). For maximum throughput combine either backend with facetag_frame_skip = 3–5.

Parameters. All parameters are top-level (not in a TOML section).

Parameter Default Description
prepare_video_artifacts false Enable video artifact detection.
video_path "" Path to .mp4 video file. Auto-detected from EEG filename if empty.
video_reference_dir "" Directory with 3–5 reference photos of the subject looking at the camera. Empty = the GUI’s Reference faces picker creates <output>/IDE4EEG_OUT_<base>/preprocessing/reference_faces/. Batch (--run) caveat: there is no picker — an empty or non-existent video_reference_dir makes the gaze step silently skip (a WARNING is logged and the run continues, no error), so set it explicitly for headless runs or gaze detection does nothing.
video_exclude_dir "" Directory with photos of faces to exclude (e.g. mother). Empty = disabled.
facetag_backend "insightface" Face detection/identity backend. Only "insightface" is supported.
facetag_gaze_method "l2cs" Gaze backend: "insightface" or "l2cs".
facetag_max_angle 20.0 Maximum angle (degrees) between current and reference gaze vectors to classify as “looking”.
facetag_gaze_smoothing 0.0 Temporal smoothing (EMA alpha, 0–1). 0 = off; 0.3 = moderate; 0.1 = heavy.
facetag_frame_skip 3 Process every Nth frame (1 = every frame).
facetag_downscale 1.0 Resize factor before face detection (0.5 = half resolution).
facetag_det_size 0 InsightFace detection resolution. 0 = full frame; 640 = fast default.
facetag_face_tolerance 0.6 Maximum encoding distance for face match.
facetag_min_interval 0.0 Minimum duration (seconds) for a not-looking interval.
facetag_no_face_is_artifact true Treat frames with no detected face as “not looking”.
facetag_ref_use_roi false Use face ROI (not full image) for reference photo gaze vectors.
facetag_tracking_adapt_rate 0.5 Rate of adaptive encoding update (0 = static, 1 = instant).
facetag_position_weight 0.4 Weight of spatial continuity in face tracking score.
facetag_size_weight 0.2 Weight of face size consistency in tracking score.
facetag_switch_resistance 0.25 Extra cost for switching to a different face.
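
The roles of facetag_max_angle and facetag_gaze_smoothing can be sketched as follows. This is only an illustration: whether IDE4EEG smooths angles or gaze vectors, and how it represents gaze vectors, is not stated here, so the sketch assumes 3-component direction vectors and smooths a series of angles. All function names are hypothetical.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3-component direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


def ema(values, alpha):
    """Exponential moving average; alpha = 0 switches smoothing off."""
    if alpha <= 0 or not values:
        return list(values)
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def is_looking(gaze, reference, max_angle=20.0):
    """True when the gaze is within max_angle degrees of the reference."""
    return angle_deg(gaze, reference) <= max_angle


reference = (0.0, 0.0, 1.0)
toward_screen = (math.sin(math.radians(10)), 0.0, math.cos(math.radians(10)))
away = (math.sin(math.radians(30)), 0.0, math.cos(math.radians(30)))
```

A smaller alpha weights the history more heavily, which is why 0.1 is listed as heavier smoothing than 0.3.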

Outputs. When prepare_video_artifacts = true:

The Run-tab Stop button cancels the video frame loop cooperatively — cancel_event.is_set() is checked once per frame inside facetag_video, so a multi-hour video doesn’t keep running after Stop.
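
The cooperative pattern looks roughly like this. The loop below is a stand-in for the real frame loop, not the code of facetag_video; process_frames and analyse are hypothetical names.

```python
import threading


def process_frames(frames, cancel_event, analyse):
    """Process frames until done or until cancel_event is set.

    The event is checked once per frame, so cancellation takes effect
    at the next frame boundary rather than mid-frame.
    """
    processed = 0
    for frame in frames:
        if cancel_event.is_set():
            break
        analyse(frame)
        processed += 1
    return processed


cancel_event = threading.Event()
seen = []


def analyse(frame):
    """Toy per-frame work that requests a stop after the third frame."""
    seen.append(frame)
    if len(seen) == 3:
        cancel_event.set()  # in the GUI this would come from the Stop button


processed = process_frames(range(1000), cancel_event, analyse)
```

The loop never interrupts a frame in progress, so the latency of Stop is bounded by the cost of one frame.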

Speed tips.

These options combine multiplicatively.

3.15 Cut segments

After all reorderable steps complete, five fixed panels handle the final stages: cutting the continuous signal into segments, dropping bad ones (with optional manual review), saving the final epochs, and computing per-segment statistics.

The rejection rule. A segment is dropped on the basis of the same upstream artifact marks (there is no independent detector at this stage). The per-segment artifact rule (§3.16) drops a segment when any channel’s contaminated-sample fraction exceeds artifact_threshold or its longest contiguous run reaches artifact_min_run_ms (G1); both knobs live in [segment_rejection]. (The cut itself uses reject=None: MNE’s single-sample reject_dict was deliberately replaced by these proportion/run-based checks, issue #11.)

GUI panel: Cut Segments (always enabled, non-checkable)

Slices the continuous preprocessed signal into discrete segments (MNE Epochs objects) for analysis.

Uses MNE’s mne.Epochs() constructor with reject=None (and reject_by_annotation=False) — the cut itself drops nothing. Bad segments are derived afterwards in §3.16 from the upstream artifact marks; MNE’s single-sample reject_dict/flat_dict were deliberately removed (issue #11).

3.16 Mark detected artifacts

GUI panel: Mark detected artifacts — checkable header that auto-follows the detectors (off + disabled when neither Detect step is on, on otherwise; it’s a pure indicator, no pipeline effect of its own). Same title in both modes.

[Mark detected artifacts] owns no detector (it never decides what an artifact is — that is settled upstream), but it does two things, and only the second is dropping. (1) It conditions the single channel-scoped carrier (the BAD_<type> marks the Detect-EEG-artifacts p2p / slope / muscle / flat detectors and gaze attach) into the final marks — per-type floor / consolidate / dilate (the “Adjust time spans of marked artifacts” panel below). These conditioned marks are the one product every consumer reads: the gray overlay band, the continuous-mode PSD omission, EEG-profile atom rejection, and the segment drop — so they all agree by construction. This conditioning runs regardless of whether you drop, and is the panel’s primary work in continuous mode (where the drop is off by default). (2) It optionally drops segments scored against those marks via the “Drop marked” checkbox. The independent peak-to-peak reject_threshold / flat_threshold detection criteria of earlier versions were retired (the absolute amplitude ceiling moved upstream to the p2p detector as p2p_ceiling_uv, and a relative flat-span detector moved upstream as flat — both in §3.13), and the fixed-fraction auto_percentile tail-drop was removed (it discarded a fixed share of segments regardless of the marks).

What it does own is its aggregation policy — the knobs (in the [Mark detected artifacts] panel, revealed when “Drop marked” is ticked) that decide how the consolidated marks collapse into a drop. The policy reads as one line: “epoch bad in 1 channel if artifacts > N % or > M ms; whole epoch bad if > P % of channels bad”. There are no enable checkboxes — a 0 in any field disables that gate (the field is the toggle):

The GUI derives enable_fraction / enable_run from each field being > 0; a config that disables a gate by flag is shown with that field at 0.
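
The one-line policy can be written out as a short sketch, assuming each channel's marks are available as a per-sample boolean mask for the epoch. Function names are hypothetical; the parameter names and defaults follow the [segment_rejection] keys, and a value of 0 disables the corresponding gate.

```python
def longest_run(mask):
    """Length in samples of the longest contiguous run of True values."""
    best = current = 0
    for bad in mask:
        current = current + 1 if bad else 0
        best = max(best, current)
    return best


def channel_is_bad(mask, fs, artifact_threshold=0.3, artifact_min_run_ms=200.0):
    """Fraction rule OR run-length rule for one channel of one epoch."""
    fraction = sum(mask) / len(mask)
    run_ms = longest_run(mask) * 1000.0 / fs
    by_fraction = artifact_threshold > 0 and fraction > artifact_threshold
    by_run = artifact_min_run_ms > 0 and run_ms >= artifact_min_run_ms
    return by_fraction or by_run


def epoch_is_dropped(masks, fs, consensus_frac=0.0, **rule):
    """Drop when more than consensus_frac of the channels are bad."""
    n_bad = sum(channel_is_bad(mask, fs, **rule) for mask in masks)
    return n_bad / len(masks) > consensus_frac


fs = 1000
run_mask = [True] * 250 + [False] * 750      # one 250 ms contaminated run
sparse = [i % 10 == 0 for i in range(1000)]  # 10 % contaminated, scattered
dense = [i % 2 == 0 for i in range(1000)]    # 50 % contaminated, scattered
```

With consensus_frac = 0 a single bad channel is enough to drop the epoch; raising it to 0.5 would require more than half of the channels to be bad.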

👁 eye button (mode-aware). In continuous mode with the window-drop off (mark-only — the default), the eye opens a layered marks overlay in Svarog rather than a segment review: a gray conditioned band (the floor→consolidate→dilate union the spectra omit) with the per-channel colored source marks (p2p / slope / hf / flat / gaze) superimposed on top. The Show detailed source marks checkbox (continuous mode) sets whether that colored layer is shown by default — it stays toggleable live inside Svarog. In event-locked mode, or continuous with “Drop marked” ticked, the eye opens the whole-segment review instead (windows are the unit being dropped there). The layered overlay needs a Svarog jar that implements the --marks-* layering flags; older jars show a flat overlay.

Parameter Default Description
auto_reject true Event-locked drop toggle — the “Drop marked” checkbox. Checked → drop the marked epochs; unchecked → mark-only (cut/save/stats run, nothing drops). Ignored in continuous mode — see drop_in_continuous.
drop_in_continuous false Continuous drop toggle — the “Drop marked” checkbox in continuous mode, off by default (dropping fixed windows breaks the signal’s continuity ⚠). The PSD spectra (reject_by_annotation) and EEG profiles (via the conditioned artifact band) omit marked spans on the continuous signal; the remaining consumers — connectivity, dipole, TFR, per-segment stats, the saved *_clean-epo.fif — run on the cut windows (rest_clean) and see contaminated data unless this is on. The panel’s epoch-drop criteria are hidden until “Drop marked” is ticked.
artifact_threshold 0.3 Per-channel contaminated-sample fraction (0–1) that makes a channel bad; the GUI shows it as a percent (30 %). 0 % = gate off. Lower = stricter.
artifact_min_run_ms 200.0 G1 run-length trigger (ms), OR-ed with the fraction rule. 0 = off (fraction-only).
enable_fraction true Derived from the % field being > 0 (no checkbox); 0 % writes false.
enable_run true Derived from the ms field being > 0 (no checkbox); 0 ms writes false.
enable_consensus true Always written true (the quorum is always applied); consensus_frac = 0 reproduces the any-channel rule. A legacy false config still means any-channel.
consensus_frac 0.0 Quorum: drop only when more than this fraction of channels are bad (GUI shows a percent). 0 = >0 % = any single bad channel.
min_mark_ms all 0 Per-type {p2p, slope, hf, flat, gaze} duration floor (ms) widening short marks.
dilate_ms all 0 Per-type symmetric pad (ms); for p2p/slope/hf it scales per mark by × (z − z_thr + 1).
consolidate_ms all 100 Per-type gap-closing (ms): merge same-type marks whose clean gap is shorter than this (applied only when its per-type guard is on).
consolidate_enable all false Per-type guard for consolidate_ms (the “consolidate gaps” checkbox); off by default.
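
The floor, consolidate and dilate conditioning for one mark type can be sketched on plain intervals. This is a hedged illustration: marks are (start_ms, stop_ms, z) tuples, short marks are assumed to be widened symmetrically, a merged mark is assumed to keep the larger z, and z_thr is taken to be the detector's robust-z threshold. None of these details is confirmed here, and the function names are hypothetical.

```python
def widen(marks, min_mark_ms):
    """Duration floor: stretch marks shorter than min_mark_ms (symmetric)."""
    widened = []
    for start, stop, z in marks:
        missing = min_mark_ms - (stop - start)
        if missing > 0:
            start, stop = start - missing / 2, stop + missing / 2
        widened.append((start, stop, z))
    return widened


def consolidate(marks, gap_ms):
    """Merge marks whose clean gap is shorter than gap_ms."""
    merged = []
    for start, stop, z in sorted(marks):
        if merged and start - merged[-1][1] < gap_ms:
            prev_start, prev_stop, prev_z = merged[-1]
            merged[-1] = (prev_start, max(prev_stop, stop), max(prev_z, z))
        else:
            merged.append((start, stop, z))
    return merged


def dilate(marks, dilate_ms, z_thr):
    """Symmetric pad scaled per mark by (z - z_thr + 1)."""
    return [(start - dilate_ms * (z - z_thr + 1),
             stop + dilate_ms * (z - z_thr + 1), z)
            for start, stop, z in marks]


marks = [(100, 110, 6.0), (150, 400, 5.0)]
conditioned = dilate(consolidate(widen(marks, 50), 100), 10, z_thr=5.0)
```

In this example the 10 ms mark is widened to 50 ms, merged with its neighbour across a 20 ms gap, and then padded by 20 ms on each side because its z exceeds the threshold by 1.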

The TOML section is [segment_rejection]; the keys above are its policy. artifact_threshold / artifact_min_run_ms were moved here from [artifacts] (2026-06-13) — they are the aggregator’s own decision, not a detector knob; an old config that still carries them under [artifacts] is read for back-compat. The retired reject_threshold / flat_threshold / enabled / auto_percentile keys are no longer read (a stale one in an old config is harmless) — see docs/EEG_artifacts_marking.tex §“Consolidation as pure aggregation”.

3.17 Manual segment review

When Review segments manually is checked in the Mark detected artifacts panel (or rest.check_rest / epochs.check_epochs is true in TOML), an interactive epoch browser opens after Drop. Same Svarog → MNE dispatch as bad-channels and ICA: if the installed jar advertises --select-mode bad_epochs, Svarog opens the bad-epoch panel; otherwise MNE’s epochs.plot(...) runs on the GUI main thread.

The Mark detected artifacts panel’s eye button (👁) triggers a one-off preprocessing-only run (analyses skipped) terminating at this review — useful when you want to inspect rejections without committing to a full Run.

What the review window shows. The browser does not display the original continuous recording — it shows your selected epochs concatenated end-to-end into one signal, in event-time order. With start_offset = -0.2, stop_offset = 1.0, each epoch is a 1.2 s block, so 720 selected epochs become an ~864 s review signal that exists only for this step. That is also why the right-hand list (Svarog: Odcinki do odrzucenia, “segments to reject”) reads #index start–end s with times on this concatenated axis, not the recording’s clock — #0 -0.20–1.00 s, #1 1.00–2.20 s, and so on. The list is every epoch, not a list of already-rejected ones; a row marked for rejection is shown struck-through in red with (odrzucone).

The vertical lines delineate the epoch structure on that concatenated axis — epoch boundaries and each epoch’s locking-event (stimulus) onset, which sits +|start_offset| into the block. (The temporary review -epo.fif also carries the recording’s original event tags, including event types you did not select, e.g. tent — these are metadata, never rejection marks.) They are not artifact flags: at this point nothing has been auto-rejected unless the artifact aggregator dropped epochs upstream (see §3.16).
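
The arithmetic of the concatenated axis can be reproduced with a few lines. The helper name is hypothetical; it only restates the numbers given above.

```python
def review_spans(n_epochs, start_offset, stop_offset):
    """(start, stop) of each epoch on the concatenated review axis, in seconds."""
    length = stop_offset - start_offset
    return [(start_offset + i * length, stop_offset + i * length)
            for i in range(n_epochs)]


spans = review_spans(720, start_offset=-0.2, stop_offset=1.0)
total_s = spans[-1][1] - spans[0][0]

# First two rows as the right-hand list would label them.
labels = [f"#{i} {start:.2f}-{stop:.2f} s" for i, (start, stop) in enumerate(spans[:2])]
```

Each epoch's locking event sits |start_offset| = 0.2 s after the block start, that is at 0 s, 1.2 s, 2.4 s and so on along this axis.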

To mark/keep: double-click a row, or Alt-click the signal under the cursor, to toggle an epoch’s rejection; the selection is saved when you close the window (Gotowe). Closing without marking keeps every epoch. Two Svarog 4.20 ergonomics to be aware of: the time axis is unlabeled, and the page width can open close to one epoch — widen Skala czasu (time scale) so epoch boundaries sit inside the view rather than at its edges. Clearer epoch numbering and a labeled per-epoch timeline are requested upstream.

3.18 Save segments

The pipeline always writes the final cleaned epochs as <base>_clean_<mode>-epo.fif (e.g. subject01_clean_epochs-epo.fif) to preprocessing/saved_signals/. This is the primary pipeline output and is never suppressed by save toggles — every analysis depends on it.

3.19 Per-segment statistics

Computes per-epoch per-channel statistics (id, tag, channel, t_start, t_stop, bad-segment / bad-channel flags, and one <detector>_arts column per artifact detector giving that channel’s artifact fraction) into a pandas DataFrame. This table is currently built and passed to the analysis stage but is not yet written to disk or consumed by any analysis — there is no stats/ CSV for it.

3.20 Pipeline-level features

3.20.1 Save toggles (🗄️)

Every preprocessing step has a 🗄️ file-cabinet toggle in its header row, sitting between the step’s enable checkbox and its title. Opacity encodes state — bright when save is on, dimmed to ~25% when off. For most steps the toggle is user-driven and defaults off: enabling a step does not turn its save on automatically (clicking the 👁 eye button does — it force-flips the relevant saves on so the snapshot persists). Two steps are exceptions: their drawers are derived, locked indicators (greyed, not clickable) — see the Trim and MP Decomposition notes below.

What gets saved per step:

The toggle is always visible, even when the step is collapsed. Clicking the 👁 eye button (below) auto-flips the relevant save toggles on so the snapshots persist — useful if you wanted to see the before/after and keep it for later too.

The 🗄️ drawer represents signal snapshots only. Auxiliary outputs (filter plots, ICA component reports, gaze video plots) get their own widgets — typically separate inline checkboxes labelled “Save filter plots”, “Save components plot and table”, and so on. This separation lets you keep an ICA snapshot for the eye-button while skipping the heavy MNE Report HTML build, or vice versa.

3.20.2 View Step Result (👁)

Every preprocessing step has an eye button (👁) in its row header, next to the up/down reorder arrows (when visible). Click it to open a synchronised before/after view of the signal around that step — IDE4EEG’s answer to “I just changed a parameter, show me what that actually did.”

Clicking 👁 auto-flips the 🗄️ save toggles on for this step and its preceding step, so the snapshots persist across future runs.

Viewer dispatch.

  1. If Svarog is installed AND both files share the same sample rate, the button launches Svarog in split view: one window, two plots, horizontally scroll-locked. Uses Svarog’s --split-signal flag (requires a Svarog jar that supports it).
  2. Otherwise (no Svarog jar, or rate mismatch after a resample), it falls back to two independent MNE interactive windows side by side.

Freshness / staleness detection. Each saved snapshot FIF carries a hash of the preprocessing config that produced it, embedded in info["description"] as "after <step> [cfg:<hash>]". When you click 👁, IDE4EEG compares that hash to the hash of your current GUI config. If they match, the viewer opens directly. If you’ve changed any preprocessing parameter since the snapshot was written, the hash differs and the snapshot is treated as stale — IDE4EEG auto-runs the pipeline to regenerate it before opening the viewer.

The same hash drives a write-skip optimisation: when the pipeline reaches a step whose target snapshot already exists on disk with a matching [cfg:<hash>], the save is skipped (the bytes would be identical anyway). Saves hundreds of MB of FIF I/O per run when only downstream parameters changed.
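
The stamp format lends itself to a simple freshness test. In the sketch below the description string follows the documented "after <step> [cfg:<hash>]" pattern, but the hashing scheme (SHA-256 over sorted JSON, truncated to 12 hex digits) and all function names are assumptions made for illustration; the real fingerprint is described in Appendix E.6.

```python
import hashlib
import json
import re

STAMP = re.compile(r"^after (?P<step>.+) \[cfg:(?P<hash>[0-9a-f]+)\]$")


def config_hash(preproc_config):
    """Hypothetical config fingerprint: SHA-256 of the sorted JSON dump."""
    blob = json.dumps(preproc_config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


def stamp(step, preproc_config):
    """Build the description string stored with a snapshot."""
    return f"after {step} [cfg:{config_hash(preproc_config)}]"


def is_fresh(description, preproc_config):
    """True when the stored hash matches the current configuration."""
    match = STAMP.match(description or "")
    return bool(match) and match.group("hash") == config_hash(preproc_config)


config = {"filtering": {"l_freq": 1.0, "h_freq": 40.0}}
description = stamp("filtering", config)
edited = {"filtering": {"l_freq": 0.5, "h_freq": 40.0}}
```

The same predicate serves both uses described above: a fresh snapshot can be opened directly by the viewer, and a fresh snapshot on disk lets the pipeline skip rewriting it.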

Run-to-step — the eye button can drive the pipeline. If the saved files don’t exist yet (or are stale), clicking the eye button runs preprocessing up to that step instead of asking you to click Run manually. A truncated full pipeline launches via the Run tab: your config is deepcopied, step_order is sliced at the target step, every analysis flag is forced off, and save=True is forced on every enabled saveable step up to and including the target — so both the before and after snapshots land on disk, along with intermediate snapshots you can view later without re-running. Your GUI state is untouched. The viewer opens automatically when the pipeline finishes.
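
The config transformation behind run-to-step might look like the sketch below. The dictionary layout, the per-step "save" key, the step names and the prepare_psd flag are illustrative assumptions; only the overall recipe (deep copy, slice step_order, analyses off, saves on) comes from the description above.

```python
import copy


def truncated_run_config(config, target_step, analysis_flags):
    """Return a copy of config that runs preprocessing only up to target_step."""
    cfg = copy.deepcopy(config)  # the GUI state itself is never modified
    order = cfg["preprocessing"]["step_order"]
    kept = order[: order.index(target_step) + 1]
    cfg["preprocessing"]["step_order"] = kept
    for flag in analysis_flags:
        cfg[flag] = False  # every analysis is forced off
    for step in kept:
        cfg.setdefault(step, {})["save"] = True  # keep before/after snapshots
    return cfg


gui_config = {
    "preprocessing": {"step_order": ["trim", "filtering", "ica", "eeg_artifacts"]},
    "filtering": {"save": False},
    "prepare_psd": True,
}
run_cfg = truncated_run_config(gui_config, "filtering", ["prepare_psd"])
```

Working on a deep copy is what keeps the GUI state untouched while the truncated run executes.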

The eye button is available on every saveable preprocessing step: Trim, Select channels, Detect bad channels, Filtering, ICA, EEG Artifacts, Gaze, MP Filter. The Gaze eye drives its own gaze_artifacts step and opens the gaze-marked snapshot, distinct from the EEG-artifacts eye’s eeg-artifacts-marked snapshot. The Trim eye is a special case: it opens the Preview & Trim window directly rather than a Svarog split view, which would mislabel the trimmed segment’s time axis as 00:00. The Montage eye is likewise special-cased — it opens the head diagram of the current electrode layout (matched channels blue, unmatched positions grey) rather than a Svarog split view, because montage changes only electrode coordinates, not signal samples. MP Decomposition has no eye — it produces a .db book, not a -raw.fif. The before/after walk skips steps without a Raw FIF snapshot, so on a trim → mp_decomposition → mp_filter pipeline the MP Filter eye correctly pairs with the trim snapshot, not the MP book.

A second 👁 segments button on the Segmentation setup header previews the segment windows themselves — see §3.1.4.

If a pipeline run is already in progress when you click, the button refuses with a “Pipeline already running” dialog — wait for it to finish or click Stop on the Run tab first.

3.20.3 Status badges (Extras / Manual review)

Each preprocessing step’s header row carries up to two compact status badges (small monochrome icons) next to its title:

The badges let you see at a glance which steps will produce extra output or pause for user input, without expanding every panel.

3.20.4 Hash-based caching (overview)

IDE4EEG uses a Merkle-style hash chain to identify the exact configuration that produced each step’s signal snapshot. The [cfg:<hash>] marker stamped into every saved FIF lets the GUI detect staleness when you change a parameter, drives the eye-button’s freshness check (§3.20.2), and powers the write-skip optimisation when re-running a pipeline whose downstream parameters changed but whose upstream snapshots are still valid.
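
The idea of a hash chain can be illustrated with a minimal sketch: each step's hash folds in the previous step's hash, so a change invalidates that step and everything after it but nothing before it. The function name, the use of SHA-256 and the JSON serialisation are assumptions; the actual fingerprint contents are in Appendix E.6.

```python
import hashlib
import json


def chain_hashes(source_id, step_order, config):
    """Return {step: hash}, where each hash depends on all upstream steps."""
    hashes = {}
    previous = hashlib.sha256(source_id.encode("utf-8")).hexdigest()
    for step in step_order:
        params = json.dumps(config.get(step, {}), sort_keys=True)
        digest = hashlib.sha256((previous + step + params).encode("utf-8"))
        previous = digest.hexdigest()
        hashes[step] = previous
    return hashes


order = ["filtering", "ica"]
base = chain_hashes("rec.fif|1024|0", order,
                    {"filtering": {"l_freq": 1.0}, "ica": {"n_components": 20}})
downstream_edit = chain_hashes("rec.fif|1024|0", order,
                               {"filtering": {"l_freq": 1.0}, "ica": {"n_components": 15}})
upstream_edit = chain_hashes("rec.fif|1024|0", order,
                             {"filtering": {"l_freq": 0.5}, "ica": {"n_components": 20}})
```

Editing only the ICA parameters leaves the filtering hash intact, which is what allows the upstream snapshot to be reused.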

For internals — the per-step fingerprint, what’s included in / excluded from each step’s hash, and the three invariants that keep the chain stable across re-runs — see Appendix E.6.

3.20.5 MP book reuse

MP decomposition follows a hash-controlled reuse policy: the freshness check runs whether or not mp_decomposition is in step_order.

Preprocessing tab → “MP Decomposition” checkbox Book on disk Hash What happens
Checked none n/a Run empi, write book_*.db, stamp the current config’s hash.
Checked present matches Skip empi, reuse the existing book. Log: MP Decomposition: reusing existing book (hash match).
Checked present differs / legacy Run empi, overwrite the book with a fresh hash.
Unchecked none n/a Skip MP; downstream MP-consuming analyses fall back to ephemeral per-ERP decomposition.
Unchecked present matches Reuse the book silently.
Unchecked present differs Halt with an error explaining how to recover (re-check the box, or restore the upstream parameters that produced the book).
Unchecked present legacy (no hash) Reuse with a one-time soft warning.
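
The table above can be read as a small decision function. The sketch below encodes its seven rows; the function name and the returned action labels are hypothetical, and a stored hash of None stands for a legacy book without a hash.

```python
def mp_book_action(checked, book_exists, stored_hash, current_hash):
    """Map the MP checkbox and on-disk book state to an action label."""
    if not book_exists:
        return "run_empi" if checked else "skip_mp"
    if stored_hash is not None and stored_hash == current_hash:
        return "reuse"
    if checked:
        return "run_empi"  # hash differs or legacy book: overwrite
    if stored_hash is None:
        return "reuse_with_warning"  # legacy book, box unchecked
    return "error"  # hash differs, box unchecked


rows = [
    mp_book_action(True, False, None, "abc"),
    mp_book_action(True, True, "abc", "abc"),
    mp_book_action(True, True, "old", "abc"),
    mp_book_action(False, False, None, "abc"),
    mp_book_action(False, True, "abc", "abc"),
    mp_book_action(False, True, "old", "abc"),
    mp_book_action(False, True, None, "abc"),
]
```

The entries of rows follow the table from top to bottom.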

This means iterating on filter-criteria parameters (the [matching_pursuit.filter] sub-block, used by mp_filter and the atom-criteria analyses such as EEG profiles / dipole fitting) does not trigger an empi rerun — those criteria operate on already-decomposed atoms, so the sub-block is stripped from the book hash. Same for runtime-only fields (empi_path, cpu_workers, outputs).
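
Stripping fields before hashing can be sketched as follows. The function name, the hashing scheme and the example values are assumptions; the stripped keys (the filter sub-block plus empi_path, cpu_workers and outputs) are the ones named above.

```python
import copy
import hashlib
import json

RUNTIME_ONLY = ("empi_path", "cpu_workers", "outputs")


def book_hash(mp_section):
    """Hash the MP section after removing fields that cannot change the book."""
    view = copy.deepcopy(mp_section)
    view.pop("filter", None)  # criteria act on already-decomposed atoms
    for key in RUNTIME_ONLY:
        view.pop(key, None)
    blob = json.dumps(view, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


section = {"iterations": 50, "channels": "all", "cpu_workers": 4,
           "filter": {"freq_min": 0.5, "freq_max": 49}}
criteria_edit = dict(section, filter={"freq_min": 8, "freq_max": 13}, cpu_workers=8)
decomposition_edit = dict(section, iterations=100)
```

Changing the criteria or the worker count leaves the hash unchanged, whereas changing a decomposition parameter such as iterations produces a new hash and therefore a new book.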

The hash covers everything that determines what data MP sees: the source EEG file’s identity (name + size + mtime), preprocessing.step_order (so reordering invalidates), and every config section read by a step that runs before MP — including filters, ICA_EOG, montage, bad_channels, choosing_channels, and the decomposition-relevant fields of matching_pursuit (algorithm, channels, iterations, explained_energy, optimisation, …). Final-segment-handling sections (epochs, mp_filter) are excluded — they can’t influence MP input.

MP filter has its OWN internal book. Same hash-controlled reuse policy as above, but scoped to the [mp_filter] block (with filter sub-block stripped). Editing the internal decomp params or window_length invalidates and re-runs empi; editing reconstruction criteria never invalidates. See §3.12.

Consumer-channel check on reuse. Even when the hash matches, the on-disk book’s channel_names list may not include a channel that an enabled downstream consumer needs (eeg_profiles.channel). When that happens the analysis’s pre-run validation halts with a structured error pointing at two recovery paths: set the analysis channel to one in the book, or check MP Decomposition AND update matching_pursuit.channels to include the required channel (or "all") — the latter rewrites the book.

The same rule applies to ide4eeg --run config.toml and to scripts exported via api.generate_script. MP enable state lives in preprocessing.step_order, not in any top-level flag. Both Run Pipeline and the per-analysis Run buttons respect the MP checkbox identically: checking it runs empi unless a book with a matching hash is already on disk (see the table above), and unchecking it lets auto-discovery decide.

Migrating from older configs: the legacy prepare_mp_decomposition TOML key was removed. To enable MP, add "mp_decomposition" to preprocessing.step_order. Configs that still use the old key log an error and run without MP.


4. Analysis tab

The Analysis tab groups all post-preprocessing computations: ERPs, time-frequency analyses, statistical tests, connectivity, source localisation, and the external MNE-Python catalog. Each analysis has its own checkbox, parameter group, and per-analysis [Run] button. Three IDE4EEG-native analyses appear at the top, then a separator, then a series of MNE-Python wrappers grouped by category (ERP / Spectra / Time-frequency / Spatial / Comparison / Source estimation).

4.0 Run buttons & preprocessing-skip logic

4.0.1 What “Run” means per analysis

Most analysis sections (PSD, Connectivity, Statistics, EEG Profiles, MMP Dipole Fitting, …) have their own [Run] button. Clicking it narrows the run to just that analysis — every other prepare_* flag is forced off — and otherwise behaves like Run Pipeline. The current GUI state is collected into a config dict at the moment you click; a config.toml reflecting the per-analysis narrowing is saved to the output directory so you can see exactly what was executed.

4.0.2 Skipping preprocessing

What a per-analysis [Run] saves depends on whether the analysis is book-only (reads the MP book directly) or not.

Analysis Per-analysis [Run] behaviour
Book-only: EEG Profiles, MMP Dipole Fitting If a matching MP book is on disk, skips signal loading and preprocessing entirely — reads atoms straight from the book and runs only the analysis. Fast path. If no matching book exists, falls through to the full pipeline (preprocessing + the analysis).
Non-book-only: Connectivity, MNE-catalog ERP / Spectra / Time-frequency / Spatial / Comparison / Source-estimation panels Runs full preprocessing (same as Run Pipeline), then only this analysis. Saves the cost of other enabled analyses; no faster than Run Pipeline if this is your only enabled analysis.

Notes that apply to both cases:

4.0.3 Event-dependent vs continuous-signal analyses

Some analyses require event markers (≥ 2 event types are needed to compute differences); others work on continuous data. The Analysis tab marks each analysis with which modes it supports:

Analysis Available for
MMP → dipole sources Both modes
EEG profiles Both modes
Connectivity Both modes
MNE catalog (ERP / Comparison) Needs events
MNE catalog (Spectra / Time-frequency / Spatial) Both modes (the ERD/S topomap entry needs events, see §4.2.3)
MNE catalog (Source estimation) Needs events

When the loaded recording’s segmentation mode doesn’t supply what an analysis needs, the GUI greys out the unavailable entries and shows a tooltip explaining why.

4.1 UW-developed analyses

4.1.1 MMP → dipole sources

GUI panel: MMP → dipole sources (checkable, has [Run])  ·  Config section: [dipole_fitting]  ·  Prerequisite: MMP1 book (multivariate MP decomposition). SMP and MMP3 are rejected.

Fits equivalent current dipoles (ECDs) to Multivariate Matching Pursuit macroatoms, localising the brain sources of coherent multi-channel EEG activity on a template brain (fsaverage) or a subject-specific FreeSurfer reconstruction.

Mathematical foundation. An equivalent current dipole (Scherg 1990) is a point source of neural current characterised by position (x, y, z), orientation (o_x, o_y, o_z), and moment q (nAm). The scalp potential it produces at electrode i is:

V_i = L_i(x, y, z) * q * (o_x, o_y, o_z)

where L_i is the lead-field vector from the BEM forward model.
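As a numerical illustration of the forward equation, the sketch below evaluates V = L · (q · o) for every electrode at once. The lead-field values are made up; in practice L comes from the BEM forward model, and the function name is hypothetical.

```python
import numpy as np


def dipole_potentials(leadfield, orientation, moment):
    """Scalp potentials of one dipole.

    leadfield:   (n_electrodes, 3) array, row i is L_i at the dipole position
    orientation: (o_x, o_y, o_z), normalised to unit length here
    moment:      scalar dipole moment q
    """
    o = np.asarray(orientation, dtype=float)
    o = o / np.linalg.norm(o)
    # V_i is the dot product of L_i with the moment vector q * o.
    return np.asarray(leadfield, dtype=float) @ (moment * o)


# Toy lead field for two electrodes (illustrative numbers only).
L = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
v = dipole_potentials(L, orientation=(1.0, 0.0, 0.0), moment=3.0)
```

Dipole fitting inverts this relation: it searches for the position, orientation, and moment whose predicted V best matches the measured spatial pattern.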

Fitting procedure.

  1. Read macroatoms from the MMP book. Each macroatom groups all channels for one MP iteration, sharing time, frequency, and scale parameters while having per-channel amplitude and phase.
  2. Determine amplitude signs via circular mean of phases across channels.
  3. Construct an MNE EvokedArray from the macroatom’s spatial pattern.
  4. Fit the dipole using mne.fit_dipole() with the BEM model.
  5. Record: position (x, y, z), orientation, goodness-of-fit (% variance explained), amplitude.
  6. Optionally compute distance to nearest cortical surface voxel (requires nibabel + fsaverage source space).
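Step 2 can be sketched as follows. This is one plausible reading of "signs via circular mean of phases", not a transcription of IDE4EEG's code: channels whose phase lies within 90° of the circular mean get a positive sign, the rest a negative one. The function name is hypothetical.

```python
import numpy as np


def amplitude_signs(phases_rad):
    """Return +1/-1 per channel from per-channel atom phases (radians)."""
    phases = np.asarray(phases_rad, dtype=float)
    # Circular mean: the angle of the average unit phasor.
    mean_phase = np.angle(np.mean(np.exp(1j * phases)))
    # In phase with the mean -> +1, in antiphase -> -1.
    return np.where(np.cos(phases - mean_phase) >= 0.0, 1.0, -1.0)


# Three channels in phase, one roughly in antiphase.
signs = amplitude_signs([0.1, -0.1, 0.0, 3.1])
```

Multiplying the per-channel amplitudes by these signs gives a signed spatial pattern of the kind that step 3 wraps in an EvokedArray.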

First-run network requirement (fsaverage). The template head model (fsaverage BEM + head↔MRI transform) is not bundled in the desktop builds. It is downloaded once, over HTTPS, the first time you fit dipoles (the managed curated subset is ~259 MB; see Template mode below for where it is stored and the order in which existing copies are picked up). So the first dipole fit needs an internet connection; afterwards it runs fully offline from that cache. If you are offline on first use, the fit fails with a clear message pointing at ~/mne_data/ and the dipole_fitting.fsaverage_dir option (point it at a manually-downloaded fsaverage to work offline). Bundling it into the app was considered and deliberately declined, to keep the download small.

Atom selection (not classification): freq_min/freq_max and scale_min/scale_max filter which MP atoms get a dipole fit (e.g. the spindle band 11–16 Hz, scale ≥ 0.5 s as in the dipole_spindles example). There is no automatic structure labelling — the output CSV carries each atom’s frequency_Hz and scale_s, and you classify post hoc.

Parameter Default Description
ref_channel "average" Reference channel for dipole fitting.
montage "standard_1005" Electrode montage name (MNE-compatible). standard_1005 covers most 10-20 and 10-10 systems.
ignored_channels "" Comma-separated channel names to exclude from the dipole fit (e.g. known-bad electrodes). Empty = use all EEG channels.
max_iterations 0 Maximum number of dipoles to fit. 0 = all atoms.
min_gof 0 Minimum goodness-of-fit (%). 0 = no constraint.
cortical_distance false Compute cortical distances (requires nibabel).
fsaverage_dir "" FreeSurfer subject directory. Empty = fsaverage (auto-downloaded). For subject-specific, set to the subject’s recon-all output.
freq_min/freq_max 0 Atom frequency filter (Hz). 0 = no constraint.
scale_min/scale_max 0 Atom duration filter (s). 0 = no constraint.

Output files (in <output>/analysis/dipole_analysis/):

Template mode (fsaverage, the default). fsaverage is FreeSurfer’s average brain model: an average of 40 healthy adults’ MRI scans, distributed with FreeSurfer and aligned to the MNI305 coordinate space. It is the standard reference brain used across neuroscience when a subject’s own MRI is not available, and it backs dipole fitting, source estimation (MNE/dSPM/sLORETA/LCMV/MxNE), and the ConnectiVIS 3D views.

IDE4EEG provisions it as a managed tool. The first time a run needs it, IDE4EEG resolves fsaverage in this order: (1) an explicit dipole_fitting.fsaverage_dir override; (2) the managed curated subset in ~/.obci/ide4eeg/fsaverage/ (~259 MB — only the files the analyses read); (3) an existing full copy in ~/mne_data/; (4) as a last resort, MNE’s full ~0.7 GB OSF download. When nothing is present, it auto-downloads the curated subset from GitLab (the GUI asks once before a run that needs it; you can also pre-install it in Config → Tool Paths → Brain geometry (3D)). Progress shows in the Run log.

What the curated subset contains. The full FreeSurfer fsaverage subject is ~761 MB; IDE4EEG only reads a fraction of it, so the managed package ships exactly the files below (~259 MB, plus the bundled LICENSE-FreeSurfer.txt + ATTRIBUTION.txt). Everything else in the full subject — extra BEM resolutions, the T1/other MRI volumes, the inflated/sphere/curv surfaces, additional atlases and source spaces — is not used by IDE4EEG and is omitted.

File Size Used by
bem/fsaverage-5120-5120-5120-bem-sol.fif 226 MB BEM forward solution — dipole fitting + all source estimation (the bulk of the package)
bem/fsaverage-trans.fif 4 KB head↔︎MRI transform — fitting, MNI mapping, 3D placement
bem/fsaverage-5120-5120-5120-bem.fif 364 KB BEM surfaces
bem/fsaverage-inner_skull-bem.fif 484 KB inner-skull BEM
bem/{brain,inner_skull,outer_skull,outer_skin}.surf 4 × 364 KB head/brain outlines for plot_dipole_locations
surf/{lh,rh}.pial 2 × 5.6 MB cortical surface — ConnectiVIS brain PLY
surf/{lh,rh}.white 2 × 5.6 MB white-matter surface — 3D outlines
surf/lh.seghead 5.3 MB scalp surface — head PLY
label/{lh,rh}.aparc.annot 2 × 1.3 MB Desikan–Killiany atlas colours
mri/aparc.a2005s+aseg.mgz 368 KB a2005s cortical parcellation volume — cortical_distance feature

If the OSF last-resort path is ever reached and fails with a 429 Too Many Requests, that mirror is rate-limiting (your network is fine) — wait a few minutes and retry. To use a custom brain, set dipole_fitting.fsaverage_dir to a manually-placed fsaverage subject directory (the folder named fsaverage, containing bem/, surf/, label/ — specifically bem/fsaverage-5120-5120-5120-bem-sol.fif and bem/fsaverage-trans.fif); note this is the subject folder itself, not FreeSurfer’s parent SUBJECTS_DIR.

When fsaverage_dir is empty or points to the MNE-downloaded fsaverage, dipole fitting uses this template brain. The brain cortex PLY (with Desikan-Killiany atlas colours) is cached once in ~/.obci/connectivis/fsaverage_brain_aparc.ply (327k vertices, 13 MB) and shared across all runs.

ConnectiVIS renders the brain PLY inside its built-in head model (a stylised mesh with nose, ears, neck). The head is cosmetic — not registered to MNI space, so electrodes may hover slightly above or sink into the visible scalp. The electrode and dipole positions themselves are geometrically correct in MNI coordinates.

Limitation. Because fsaverage is an average of 40 people, individual facial features (nose, ears, chin) are literally averaged out. The exported head surface is a featureless smooth ellipsoid. That is why ConnectiVIS uses its built-in stylised head for visualisation. For a real face, use subject-specific mode.

Subject-specific mode. When fsaverage_dir points to a subject’s own FreeSurfer recon-all output (not fsaverage), the pipeline uses the subject’s individual anatomy. Requires a full FreeSurfer reconstruction (recon-all -all -s SUBJ01).

Per-run outputs (in the output directory, not cached):

All files share the subject’s FreeSurfer surface RAS frame, so brain, head, dipoles, and electrodes align perfectly in ConnectiVIS — no cosmetic AC3D head needed.

# Subject-specific dipole fitting
[dipole_fitting]
fsaverage_dir = "/path/to/subjects/SUBJ01"
ref_channel   = "average"
montage       = "standard_1005"

Prerequisites for subject-specific mode:

  1. FreeSurfer recon-all must have completed for the subject.
  2. BEM solution (*-bem-sol.fif), built with mne.make_bem_model + mne.make_bem_solution and saved with mne.write_bem_solution.
  3. Head-to-MRI transform (*-trans.fif) mapping HEAD coordinates to the subject’s MRI. Created by mne.gui.coregistration or mne coreg.
  4. Parcellation annotations (lh.aparc.annot + rh.aparc.annot) in the subject’s label/ directory.

Example (MP-based). See examples/dipole_spindles/, which follows the methodology of Durka et al. (2024) Sensors 24(3):842. It uses 24-channel sleep EEG, an MMP1 decomposition (200 iterations), and a dipole fit with spindle selection (11–16 Hz, scale ≥ 0.5 s) in template mode (fsaverage). This is a separate, non-standard approach based on multivariate Matching Pursuit.

References:

Scherg M (1990) “Fundamentals of dipole source potential analysis.” Adv Audiol 6:40–69.

Durka PJ (2018) “Matching Pursuit Dipole Fitting for EEG source localization.” University of Warsaw.

Kuś R, Różański PT, Durka PJ (2013) “Multivariate matching pursuit in optimal Gabor dictionaries.” BioMed Eng OnLine 12:94.

4.1.2 EEG profiles

GUI panel: EEG Profiles (checkable, has [Run])  ·  Config section: [eeg_profiles]  ·  Prerequisite: MP book (decomposition)

Detects and counts EEG graphoelements — stereotyped waveforms such as sleep spindles, alpha waves, delta waves, K-complexes — by filtering MP atoms on their Frequency, Amplitude, Scale, and Phase (FASP) parameters. See Malinowska et al. 2013 for the original disorders-of-consciousness application.

Use this for continuous (rest-mode) signals. EEG profiles are designed around a continuous time axis — graphoelement counts are binned over absolute time. The GUI flags this when you pair Profiles with event-locked mode.

FASP filtering. Each atom from the MP book is characterised by frequency f, amplitude A, scale s, phase phi. A structure template defines acceptable ranges for each parameter. An atom matches if all criteria are satisfied:

match = (freq_min <= f <= freq_max) AND
        (amp_min <= |A| <= amp_max) AND
        (scale_min <= s <= scale_max) AND
        (phase_min <= phi <= phase_max)

where 0 in any field means “no constraint” (always passes).
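The matching rule, including the "0 = no constraint" convention, can be sketched in a few lines. The dictionary keys mirror the names in the formula above; the function names and the atom layout are hypothetical.

```python
def in_range(value, lo, hi):
    """True if lo <= value <= hi, where a bound of 0 means 'unconstrained'."""
    if lo and value < lo:
        return False
    if hi and value > hi:
        return False
    return True


def matches(atom, template):
    """FASP test: every constrained parameter must fall inside its range."""
    t = template.get
    return (in_range(atom["frequency"], t("freq_min", 0), t("freq_max", 0))
            and in_range(abs(atom["amplitude"]), t("amp_min", 0), t("amp_max", 0))
            and in_range(atom["scale"], t("scale_min", 0), t("scale_max", 0))
            and in_range(atom["phase"], t("phase_min", 0), t("phase_max", 0)))


# Sleep-spindle preset: 11-16 Hz, amplitude >= 12 uV, scale 0.5-2.5 s.
spindle = {"freq_min": 11, "freq_max": 16, "amp_min": 12,
           "scale_min": 0.5, "scale_max": 2.5}
atom = {"frequency": 13.0, "amplitude": -20.0, "scale": 1.0, "phase": 0.3}
is_spindle = matches(atom, spindle)
```

Because the amplitude test uses |A|, a negative-going atom such as the one above still matches.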

Preset structure templates.

Structure Frequency [Hz] Amplitude p2p [µV] Scale [s] Phase
Sleep spindles 11–16 ≥ 12 0.5–2.5
SWA 0.5–2 ≥ 75 0.5–6
Alpha waves 8–12 ≥ 5
Delta waves 0.5–4 ≥ 75 0.5–6
Beta waves 15–25 ≥ 5 0–0.5
Theta waves 4–8 ≥ 15 ≥ 1.0

SWA (slow-wave activity) is the AASM N3 marker — narrower (0.5–2 Hz) than Delta waves. Sleep spindles and SWA are the staple sleep-staging structures, so they get the two clearest, maximally-contrasting lane colours (vivid red spindles vs strong blue SWA). On the SWA percentage panel a dashed N3 ≥ 20 % reference line is drawn (an epoch scores N3 when SWA fills ≥ 20 % of it); that line appears only on SWA, not on Delta waves (whose 0.5–4 Hz band is broader than the AASM SWA band).

Time-evolution metrics. Detections are binned into pages — intervals equal to the MP segment length (read from book metadata, usually 20 or 30 s). Four display modes:

Mode Metric Description
count Detections per page Total number of matching atoms per page.
percentage sum(scale_s) / interval * 100 Fraction of the page occupied by the structure (%).
amplitudes Per-occurrence markers One stem at each detection time, height = peak-to-peak amplitude [µV].
power Σ energy / epoch Mean power of this structure’s atoms over each page (e.g. all delta waves in the delta panel): the page’s summed atom energy (∫a²(t)dt, in µV²·s; Svarog C-BOOK-1) divided by the page length → µV². For Delta this is the slow-wave-activity (SWA) per epoch curve, its most useful application. (Binned per page; ÷epoch makes 20 s and 30 s pages comparable.)

Parameter Default Description
structures (built-in) List of structure definitions (FASP).
mode "amplitudes" Display mode — one of count, percentage, amplitudes, power.
channel 0 Channel name or mmp1 for the MMP-1 book; 0-based int also accepted.

Bin width is always equal to the MP segment length. It is not user-configurable — the legacy interval_sec config key is ignored if present.
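A sketch of the count and percentage metrics, assuming each detection is assigned to the page that contains its time centre (atoms straddling a page boundary are not split here; whether IDE4EEG splits them is not stated). The function name is hypothetical.

```python
import numpy as np


def page_metrics(times_s, scales_s, page_len_s, n_pages):
    """Return (count, percentage) per page for one structure's detections."""
    times = np.asarray(times_s, dtype=float)
    scales = np.asarray(scales_s, dtype=float)
    page = np.floor(times / page_len_s).astype(int)
    keep = (page >= 0) & (page < n_pages)
    # count mode: number of matching atoms per page
    count = np.bincount(page[keep], minlength=n_pages)
    # percentage mode: sum(scale_s) / interval * 100
    occupied = np.bincount(page[keep], weights=scales[keep], minlength=n_pages)
    return count, occupied / page_len_s * 100.0


# Three detections on 20 s pages: two on page 0, one on page 1.
count, percentage = page_metrics([1.0, 5.0, 25.0], [1.0, 2.0, 0.5],
                                 page_len_s=20.0, n_pages=2)
```

Here page 0 holds 3 s of structure in 20 s (15 %) and page 1 holds 0.5 s (2.5 %).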

👁 Open in Svarog. The EEG Profiles panel header carries a 👁 button that opens Svarog with the cleaned signal, the MP book synchronised below it (the book rendering is the profile view), and the detected structures overlaid from the per-channel -tag.txt mark file — one synchronised view, so you can scrub the recording and see detected structures inline.

Outputs (in eeg_profiles/):

Remove atoms in artifact-marked spans (panel checkbox, on by default). When the EEG Artifacts step has marked the signal, MP atoms whose time-center falls inside a BAD_* span (the channel-scoped artifact carrier, taken whole-montage) are dropped before binning, so every output — chart, CSV, -tag.txt overlay, histograms — excludes artifact-contaminated atoms. It is a no-op when no artifacts were marked, so leaving it on is harmless.

Reference: Malinowska U, Chatelle C, Bruno MA, Noirhomme Q, Laureys S, Durka PJ (2013) “Electroencephalographic profiles for differentiation of disorders of consciousness.” BioMedical Engineering OnLine 12:109. DOI: 10.1186/1475-925X-12-109

4.1.3 Connectivity (DTF/PDC)

GUI panel: Connectivity (checkable, has [Run])  ·  Config section: [connectivity]

Estimates directed and undirected information flow between EEG channels, revealing how brain regions communicate during cognitive tasks or rest. The implementation is based on ConnectiviPy (Krzemiński & Kamiński, University of Warsaw), developed during Google Summer of Code 2015 under INCF sponsorship.

Multivariate Autoregressive (MVAR) model. The k-channel EEG signal is modelled as:

X(t) = A_1 * X(t-1) + A_2 * X(t-2) + ... + A_p * X(t-p) + E(t)

where A_m are (k × k) coefficient matrices, p is the model order, and E(t) is Gaussian noise with covariance V.

Model order is selected via Akaike Information Criterion: AIC(p) = N * log(det(V(p))) + 2 * p * k², p_optimal = argmin AIC(p).
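The AIC rule can be illustrated with a plain least-squares MVAR fit. Note the swap: ordinary least squares is used here only because it is short, and it is not one of the yw / ns / vm estimators listed below. Function names are hypothetical.

```python
import numpy as np


def fit_mvar_ls(x, p):
    """Least-squares MVAR fit. x: (k, N). Returns A (p, k, k) and V (k, k)."""
    k, n = x.shape
    # Block m of the regressor matrix holds X(t-m) for t = p .. N-1.
    past = np.vstack([x[:, p - m:n - m] for m in range(1, p + 1)])
    target = x[:, p:]
    coef, *_ = np.linalg.lstsq(past.T, target.T, rcond=None)   # (p*k, k)
    resid = target - coef.T @ past
    v = resid @ resid.T / resid.shape[1]                        # noise covariance
    a = coef.T.reshape(k, p, k).transpose(1, 0, 2)              # a[m-1] is A_m
    return a, v


def select_order(x, max_order):
    """Return (p_optimal, {p: AIC(p)}) using N*log(det V(p)) + 2*p*k^2."""
    k, n = x.shape
    aic = {}
    for p in range(1, max_order + 1):
        _, v = fit_mvar_ls(x, p)
        _, logdet = np.linalg.slogdet(v)
        aic[p] = n * logdet + 2 * p * k ** 2
    return min(aic, key=aic.get), aic
```

Setting mvar_order = 0 corresponds to taking the first return value of such a search; the real implementation's search range is not documented here.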

MVAR fitting methods.

Method Key Description
Yule-Walker yw Solves the matrix Yule-Walker equations. Fast, default.
Nuttall-Strand ns Forward-backward lattice. Better for short data.
Vieira-Morf vm Harmonic-mean normalised lattice. Most robust.

Spectral decomposition. From the fitted MVAR coefficients:

A(f) = I - sum_{m=1}^{p} A_m * exp(-j*2*pi*f*m/fs)
H(f) = A(f)^(-1)          -- transfer function
S(f) = H(f) * V * H(f)^H  -- cross-spectral density
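The three formulas translate directly into NumPy. This is a sketch under the conventions above (A_m stored as a[m-1], frequencies in Hz); the function name is hypothetical.

```python
import numpy as np


def mvar_spectra(a, v, freqs, fs):
    """Return A(f), H(f), S(f), each of shape (n_freqs, k, k).

    a: (p, k, k) MVAR coefficient matrices, v: (k, k) noise covariance.
    """
    p, k, _ = a.shape
    lags = np.arange(1, p + 1)
    # phase[f, m] = exp(-j * 2*pi * f * m / fs)
    phase = np.exp(-2j * np.pi * np.outer(freqs, lags) / fs)
    af = np.eye(k) - np.einsum("fm,mij->fij", phase, a)
    hf = np.linalg.inv(af)                            # transfer function
    sf = hf @ v @ hf.conj().transpose(0, 2, 1)        # cross-spectral density
    return af, hf, sf
```

As a sanity check, a model with all-zero coefficients gives A(f) = H(f) = I and S(f) = V at every frequency, i.e. white noise with covariance V.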

Connectivity measures.

Method Type Formula Interpretation
DTF AR DTF(f)_{j->i} = abs(H_ij) / sqrt(sum_k abs(H_ik)^2) Directed transfer from j to i, normalised by the total inflow to channel i.
gDTF AR DTF weighted by noise covariance Accounts for differing noise levels across channels.
ffDTF AR abs(H_ij) / sqrt(sum_f sum_k abs(H_ik(f))^2) Full-frequency normalisation.
dDTF AR DTF * Partial_Coherence Direct DTF: removes indirect (mediated) paths.
iDTF AR Instantaneous DTF Captures contemporaneous interactions.
PDC AR PDC(f)_{j->i} = abs(A_ij) / sqrt(sum_k abs(A_kj)^2) Partial directed coherence: direct influence from j to i, normalised by the total outflow from source channel j.
gPDC AR PDC weighted by inverse noise covariance Generalised PDC.
iPDC AR Instantaneous PDC Zero-lag PDC variant.
PCoh AR Partial coherence Undirected, controls for volume conduction.
Coh Signal Standard coherency from FFT Undirected frequency-domain correlation.
PSI Signal Phase Slope Index Directed, based on the slope of the phase spectrum.
GCI Signal Granger Causality Index Directed, based on prediction error reduction.
AEC Signal Amplitude Envelope Correlation Undirected, correlation of Hilbert envelopes.
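The two basic normalisations differ only in the axis they sum over: DTF normalises each row of |H(f)|, PDC each column of |A(f)|. The sketch below uses a hypothetical one-lag, two-channel model in which channel 0 drives channel 1; entry [f, i, j] is read as the influence of channel j on channel i.

```python
import numpy as np


def dtf(hf):
    """abs(H_ij) / sqrt(sum_k abs(H_ik)^2), row-normalised."""
    mag = np.abs(hf)
    return mag / np.sqrt(np.sum(mag ** 2, axis=2, keepdims=True))


def pdc(af):
    """abs(A_ij) / sqrt(sum_k abs(A_kj)^2), column-normalised."""
    mag = np.abs(af)
    return mag / np.sqrt(np.sum(mag ** 2, axis=1, keepdims=True))


# Toy model: x0 depends only on itself, x1 depends on x0 and on itself.
a1 = np.array([[0.5, 0.0],
               [0.4, 0.3]])
freqs, fs = np.linspace(0.0, 50.0, 6), 100.0
phase = np.exp(-2j * np.pi * freqs / fs)          # single lag, m = 1
af = np.eye(2) - phase[:, None, None] * a1        # A(f), shape (n_freqs, 2, 2)
hf = np.linalg.inv(af)                            # H(f)
dtf_vals, pdc_vals = dtf(hf), pdc(af)
```

In this toy model both measures are zero at [f, 0, 1] (nothing flows from channel 1 to channel 0) and non-zero at [f, 1, 0].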

Short-time connectivity is optionally computed in a sliding window to track dynamics over time.

Parameter Default Description
methods ["dtf", "coh"] List of connectivity measures.
mvar_method "yw" MVAR fitting method.
mvar_order 0 AR model order. 0 = automatic via Akaike.
resolution 100 Frequency resolution for spectral estimation.
channels "all" Channels to include.
short_time false Enable sliding-window connectivity.
st_window 0 Short-time window length (s). 0 = auto.
st_overlap 0 Short-time window overlap (s).
significance false Compute bootstrap significance thresholds.
sig_reps 100 Bootstrap repetitions.
sig_alpha 0.05 Significance level.

Parallelism. The bootstrap / surrogate / short-time inner loops in connectivity/conn.py are dispatched via joblib.Parallel, inheriting n_jobs from [parallelism]. MVAR-based methods (DTF, PDC, iDTF, iPDC, gDTF, gPDC, dDTF, ffDTF) typically see 3–5× speedup on 8 cores when significance or short-time is on. Coherency/PSI/GCI per-rep work is small enough that joblib spawn overhead competes — those paths show flat or marginally-slower wall time for small inputs but break even on longer recordings.

AEC notes. AEC bandpass-filters the signal in five default bands (theta [6, 7], alpha [8, 13], beta [15, 25], low-gamma [30, 45], high-gamma [55, 70] Hz). Bands whose upper edge would exceed the Nyquist frequency are skipped with a one-shot warning rather than aborting (e.g. high-gamma drops out at fs ≤ 140 Hz, including standard 128 Hz BrainTech).

ConnectiVIS .dat export. For every successfully-computed method, the connectivity step writes one .dat file per frequency band so the user can scrub bands × methods in 3D.

Band Range (Hz)
fullband 0 → Nyquist
delta 1–4
theta 6–7
alpha 8–13
beta 15–25
low-gamma 30–45
high-gamma 55–70

Bands are defined locally in connectivity_analysis.py::_CONNECTIVIS_BANDS (separate from the AEC FQ_BANDS registry — adding delta does not affect AEC). fullband averages all bins from DC to Nyquist. Bands whose upper bound exceeds Nyquist are skipped with an info-level log line.

Filenames follow <base>___<tag>_<data_label>_<method>_<band>.dat. A run with DTF + PDC selected produces 14 files (2 methods × 7 bands). Per-file headers include ;title and ;band so each file self-describes. Significance and short-time results are intentionally not exported to .dat — their shapes (CI matrices, k×k×T tensors) need a different scheme. They remain available as PNG plots.
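The band table, the Nyquist rule, and the filename pattern combine as in the sketch below. The band edges and the pattern come from the text; the function names and the example base / tag / data_label values are hypothetical.

```python
# Band edges in Hz; None stands for "up to Nyquist".
BANDS = {
    "fullband": (0.0, None),
    "delta": (1, 4),
    "theta": (6, 7),
    "alpha": (8, 13),
    "beta": (15, 25),
    "low-gamma": (30, 45),
    "high-gamma": (55, 70),
}


def usable_bands(fs):
    """Drop every band whose upper bound exceeds the Nyquist frequency."""
    nyquist = fs / 2
    out = {}
    for name, (lo, hi) in BANDS.items():
        hi = nyquist if hi is None else hi
        if hi > nyquist:
            continue            # skipped (IDE4EEG logs an info-level line)
        out[name] = (lo, hi)
    return out


def dat_names(base, tag, data_label, methods, fs):
    """One <base>___<tag>_<data_label>_<method>_<band>.dat name per method and band."""
    return [f"{base}___{tag}_{data_label}_{method}_{band}.dat"
            for method in methods for band in usable_bands(fs)]


names_256 = dat_names("rec", "tagA", "epochs", ["dtf", "pdc"], fs=256)
names_128 = dat_names("rec", "tagA", "epochs", ["dtf", "pdc"], fs=128)
```

At 256 Hz this yields the 14 files mentioned above; at 128 Hz the high-gamma band (upper edge 70 Hz, Nyquist 64 Hz) drops out, leaving 12.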

Output. Per-method directional plots, channel × channel heatmaps, PSD plots, frequency-resolved connectivity, optional short-time spectrograms, and the .dat files for ConnectiVIS.

References:

Kamiński M, Blinowska KJ (1991) “A new method of the description of the information flow in the brain structures.” Biol Cybern 65:203–210.

Baccalá LA, Sameshima K (2001) “Partial directed coherence.” Biol Cybern 84:463–474.

Blinowska KJ, Kuś R, Kamiński M (2004) “Granger causality and information flow in multivariate processes.” Phys Rev E 70:050902.

4.2 MNE-wrapped analyses

Below the Analyses from MNE: separator, the Analysis tab exposes the MNE-Python catalog as six collapsible category panels — ERP, Spectra, Time-frequency, Spatial, Comparison, and Source estimation. Each panel has its own enable checkbox, a category-level [?] help button, an inline [Run] button, and a list of individual entries with per-entry [?] buttons that link straight to the MNE documentation. A few entries have inline parameter fields (frequency range, number of time points, …); the rest run with MNE’s defaults.

Config key: mne_catalog (list of analysis IDs). Outputs land in the MNE/ subdirectory of IDE4EEG_OUT_*.

4.2.0 Per-category channel selection

Four of the six category panels — ERP, Spectra, Time-frequency, Comparison — each carry a single Channels checkbox panel at the top of the section (same widget used by Connectivity and the MMP→dipole analysis). All channel-aware entries within that category share the panel’s selection at run time. (“Channel-aware” = an entry that plots or analyses per-channel data — a butterfly trace, a PSD overlay — as opposed to a whole-scalp topography.) The remaining categories (Spatial, Source estimation) have no panel — channel selection would distort topographic maps or break the inverse problem.

The 11 channel-aware entries are: erp_butterfly, erp_image, gfp, cluster_permutation, evoked_image, compare_per_channel (ERP) · psd_multitaper, psd_welch (Spectra) · tfr_morlet, tfr_multitaper (Time-frequency) · compare_evokeds (Comparison). erp_joint is intentionally excluded — plot_joint needs the full montage to draw the topomap inserts at GFP peaks; a small subset crashes the underlying MNE call.

Default = all EEG channels. Untick channels to restrict the whole category’s analyses to a region of interest (e.g. uncheck everything except motor-cortex electrodes in the Time-frequency panel while ERP butterfly elsewhere still shows whole-scalp). Filter buttons on the panel (All / None / EEG only) speed up bulk picks.

TOML back-compat. Old per-entry mne_params.<aid>.channels = "Fz, Cz" configs continue to work at runtime (the resolver accepts both list and comma-string forms). Save the config from the GUI to migrate to the per-category panel.

Validation: unknown channel names log a warning and are dropped from the subset; if NONE of the requested names exist in the data, that analysis raises an error (visible in the Run-tab log) and the rest of the catalog continues.
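The resolver behaviour described in the last two paragraphs (list or comma-string input, unknown names dropped with a warning, an error when nothing is left) might look like this. The function name, logger name, and exception type are assumptions.

```python
import logging

log = logging.getLogger("mne_channels_sketch")   # hypothetical logger name


def resolve_channels(requested, available):
    """Return the requested channels that exist in `available`, in request order."""
    if requested is None or requested == "all":
        return list(available)
    if isinstance(requested, str):
        # Legacy comma-string form, e.g. "Fz, Cz"
        requested = [name.strip() for name in requested.split(",") if name.strip()]
    known = [name for name in requested if name in available]
    unknown = [name for name in requested if name not in available]
    if unknown:
        log.warning("dropping unknown channels: %s", ", ".join(unknown))
    if not known:
        raise ValueError("none of the requested channels exist in the data")
    return known


picked = resolve_channels("Fz, Cz, Xx", ["Fz", "Cz", "Pz"])
```

In the pipeline the error would abort only the affected analysis; the sketch leaves that try/except to the caller.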

TOML keys:
- GUI persistence: mne_channels.<Category> = list[str] (subset) or "all". Per-category panel state, restored after the signal loads.
- Runtime (auto-injected): mne_params.<analysis_id>.channels = list[str]. Set by the GUI from the per-category panel selection right before pipeline launch (omitted when the panel is “all”); headless callers may still write either form.

4.2.1 ERP

Panel: ERP  ·  Available for: event-locked epochs (some entries need 2+ event types)

The ERP — Event-Related Potential — is the trial-averaged waveform per condition:

ERP(t) = (1/N) · Σ_i epoch_i(t)

Phase-locked neural responses sum constructively while non-phase-locked noise cancels out.
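A toy simulation makes the averaging argument concrete: a fixed waveform buried in unit-variance noise is recovered by averaging, and the residual noise shrinks roughly as 1/√N. The waveform, channel weights, and trial count are invented for illustration; gfp anticipates the "std across channels" definition used by the gfp entry in the table below.

```python
import numpy as np


def erp(epochs):
    """epochs: (n_epochs, n_channels, n_times) -> trial average (n_channels, n_times)."""
    return np.asarray(epochs, dtype=float).mean(axis=0)


def gfp(evoked):
    """Global field power: standard deviation across channels per time point."""
    return np.asarray(evoked, dtype=float).std(axis=0)


rng = np.random.default_rng(0)
times = np.linspace(-0.2, 0.8, 101)
# Phase-locked component: a Gaussian bump at 0.3 s with per-channel weights.
template = np.outer([1.0, -1.0, 0.5], np.exp(-((times - 0.3) / 0.05) ** 2))
epochs = template + rng.standard_normal((400, 3, times.size))
evoked = erp(epochs)
residual = evoked - template      # noise left after averaging 400 trials
```

With 400 trials the residual standard deviation is on the order of 1/√400 = 0.05, while the bump itself is preserved.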

ID Name Requires Parameters
erp_butterfly ERP butterfly (one line per channel)
erp_image ERP image (heatmap of single-trial ERPs)
erp_joint ERP joint (topos at peaks) montage
erp_topomap ERP topomap series montage n_times
gfp Global field power (std across channels)
cluster_permutation Spatio-temporal cluster permutation test 2+ tags step_down_p, n_permutations
evoked_image Evoked image (channels × time)
compare_per_channel Compare evokeds (per-channel grid) 2+ tags

Spatio-temporal cluster permutation test (Maris & Oostenveld 2007). Non-parametric method controlling the family-wise error rate when testing for differences across many time points and channels, without overly conservative corrections like Bonferroni. Per time point and channel, compute a t-statistic comparing two conditions; threshold to identify above-significance samples; group adjacent above-threshold samples (in time and space) into clusters; sum statistics within each cluster; randomly reassign condition labels and repeat (default n_permutations = 1024); report the fraction of permutations where the maximum cluster statistic exceeds the observed cluster statistic. Wraps mne.stats.spatio_temporal_cluster_test with channel adjacency from the electrode montage.

Parameter Default Description
step_down_p 0.05 P-value for step-down-in-jumps test.
n_permutations 1024 Number of permutations (higher = more precise).

Reference: Maris E, Oostenveld R (2007) “Nonparametric statistical testing of EEG- and MEG-data.” J Neurosci Methods 164(1):177–190.

4.2.2 Spectra

Panel: Spectra  ·  Available for: rest or epochs

ID Name Requires Parameters
psd_topomap PSD topomap (default 5 standard bands) montage bands, scale ("dB"/"power")
psd_multitaper PSD via multitaper (Spectrum.plot) fmin, fmax, scale ("dB"/"power"/"amplitude"), bandwidth, tmin, tmax, xscale, average, ci
psd_welch PSD via Welch’s method fmin, fmax, scale (same three choices), n_per_seg, n_overlap, average, xscale
psd_topo Per-channel PSD laid out at sensor positions montage fmin, fmax, scale ("dB"/"power")

psd_topomap covers the previous psd_bands_topomap use case via its bands parameter (the two were merged into one entry). Line-plot entries (psd_multitaper, psd_welch) expose three scale choices (dB / power / amplitude); topographic entries only the first two (no amplitude).

Rest mode — continuous estimation that omits artifacts. In rest / continuous mode all four PSD entries estimate on the continuous preprocessed signal rather than on the cut windows, and omit artifact-marked spans via MNE’s reject_by_annotation — so a sub-window blink or muscle burst is excluded at full granularity (with the same min_mark_ms / dilate_ms conditioning the [Mark detected artifacts] step applies, §3.16) instead of contaminating a whole window. This is controlled by [segmentation].reject_by_annotation (default true; set false to estimate on the raw continuous signal with no omission). Time-frequency (TFR) entries stay windowed — they need an intact epoch time axis and cannot drop sub-spans (§4.2.3). In event-locked mode every spectral entry is computed on the epochs as before.

4.2.3 Time-frequency

Panel: Time-frequency  ·  Available for: rest mode OR event-locked epochs (ERD/S Topomap is event-locked-only — it needs real per-condition baselines)

All three TFR entries call Epochs.compute_tfr(method=...) (MNE ≥ 1.7 API).

ID Name Requires Parameters
tfr_morlet TFR via Morlet wavelet freq_min, freq_max (0 = Nyquist), n_cycles_mode ("adaptive"/"constant"), n_cycles
tfr_multitaper TFR via multitaper freq_min, freq_max, n_cycles_mode, n_cycles, time_bandwidth (default 4.0, advanced)
erds_topomap ERD/S topographic maps — baseline-corrected montage freq_min, freq_max, n_cycles_mode, n_cycles

n_cycles_mode picks the time-frequency-resolution trade-off:
- "adaptive" (default): the number of cycles grows with frequency (n_cycles_i = freqs_i / value), so every frequency gets the same time-window length (MNE-tutorial convention). Default value = 2 gives 5 cycles at 10 Hz and 25 cycles at 50 Hz, both spanning 0.5 s.
- "constant": n_cycles cycles at every frequency, so low frequencies use a longer time window (Tallon-Baudry convention; commonly 7).

Wavelet length must fit the epoch. MNE’s Morlet kernel is truncated to ±5σ where σ = n_cycles / (2πf) (σ is the width of the wavelet’s Gaussian envelope; ±5σ captures essentially all its energy), giving an effective duration of ≈ 1.59·n_cycles/f seconds. At fmin this is the longest kernel in the bank — and the formula depends on which n_cycles_mode is active:

When the kernel exceeds the epoch length, the low-frequency rows of the TFR are essentially noise: MNE itself warns (“at least one of the wavelets is longer than the signal”). IDE4EEG pre-empts MNE’s terse message with an actionable line naming the entry, the kernel duration, and the epoch length — visible on the Run-tab log. Fix any of: raise freq_min, lower n_cycles, switch to adaptive mode, or use longer epochs. Multitaper uses DPSS windows with comparable nominal duration; the same check applies.
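The check is simple arithmetic on the ±5σ rule. In the sketch below, "adaptive" uses n_cycles = f / value and "constant" uses n_cycles = value; the function names are hypothetical, and the multitaper case is not modelled.

```python
import math


def kernel_duration(freq_hz, n_cycles):
    """Length in seconds of a Morlet kernel truncated at +/- 5 sigma."""
    sigma_t = n_cycles / (2 * math.pi * freq_hz)
    return 10 * sigma_t                      # about 1.59 * n_cycles / f


def longest_kernel(freq_min, mode, value):
    """Duration of the longest kernel in the bank, i.e. the one at freq_min."""
    if mode == "adaptive":
        n_cycles = freq_min / value
    elif mode == "constant":
        n_cycles = value
    else:
        raise ValueError(f"unknown n_cycles_mode: {mode!r}")
    return kernel_duration(freq_min, n_cycles)


def kernel_fits(freq_min, mode, value, epoch_len_s):
    return longest_kernel(freq_min, mode, value) <= epoch_len_s


constant_2hz = longest_kernel(2.0, "constant", 7)    # about 5.57 s
adaptive_2hz = longest_kernel(2.0, "adaptive", 2)    # about 0.80 s
```

With 7 constant cycles a 2 Hz wavelet needs about 5.6 s of data, far more than a typical 1 s epoch, whereas the adaptive kernel with value = 2 stays near 0.8 s at every frequency because the frequency cancels out of the formula.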

ERD/S Topomap behaviour: the catalog entry first computes a Morlet TFR per condition, then calls power.apply_baseline(baseline=(None, 0), mode="percent"). The baseline window is fixed to the entire pre-event interval (epoch start → event onset at t=0 s, relative time). The colorbar reads in %: negative values = ERD (event-related desynchronization, i.e. power decrease), positive values = ERS (synchronization, power increase). The cmap is diverging (RdBu_r) centred at 0.
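The percent baseline itself is a one-line formula, ERD/S = (P − B) / B, with B the mean power in the baseline window. The NumPy sketch below reproduces it outside MNE; to my understanding MNE's mode="percent" returns the fractional change, so the factor of 100 here is added only to match the colorbar's % reading. The function name is hypothetical.

```python
import numpy as np


def erds_percent(power, times, baseline=(None, 0.0)):
    """Percent change of power relative to the baseline mean.

    power: array whose last axis is time; times: matching time vector (s).
    None in `baseline` means 'from the first sample' / 'to the last sample'.
    """
    power = np.asarray(power, dtype=float)
    times = np.asarray(times, dtype=float)
    lo = times[0] if baseline[0] is None else baseline[0]
    hi = times[-1] if baseline[1] is None else baseline[1]
    mask = (times >= lo) & (times <= hi)
    base = power[..., mask].mean(axis=-1, keepdims=True)
    return 100.0 * (power - base) / base


# Baseline power 2; post-event power drops to 1 (ERD) and rises to 3 (ERS).
erds = erds_percent([2.0, 2.0, 1.0, 3.0], [-0.2, -0.1, 0.1, 0.2])
```

The result is 0 % in the baseline, −50 % for the power decrease, and +50 % for the increase.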

4.2.4 Spatial

Panel: Spatial  ·  Available for: rest or epochs

ID Name Requires Parameters
evoked_topomap_anim Evoked topomap animation (sequence of frames) montage n_times
channel_locations 2D plot of electrode positions on the head montage

4.2.5 Comparison

Panel: Comparison  ·  Available for: event-locked epochs

ID Name Requires Parameters
compare_evokeds Overlay grand-average ERPs for all conditions with confidence intervals 2+ tags ci
drop_log Bar chart of why each epoch was rejected (amplitude, flat, user)

4.2.6 Source estimation

Panel: Source estimation  ·  Subtitle: MNE + connectivis

Four source-estimation entries that share [dipole_fitting] paths / BEM / trans configuration but expose their own per-entry parameters. The output of every entry is also wrapped into ConnectiVIS scene companion files (brain PLY, head PLY, electrodes DAT) so the 3D view button on the Output tab works identically across them.

a. ERP dipole fit (erp_dipole_fit)

Standard MNE dipole-fit tutorial: average epochs per condition → compute noise covariance from baseline → mne.fit_dipole(evoked, cov, bem, trans) at each time point. No MP decomposition needed — the classical approach (Scherg 1990).

Parameter Default Description
time_min 0.0 Start of time window to fit (s). 0 = event onset.
time_max 0.0 End of time window. 0 = full epoch.
min_gof 50 Minimum goodness-of-fit (%).
min_dist (advanced) 5.0 Minimum distance (mm) the dipole keeps from the inner skull.
pos (advanced) "" Optional "x,y,z" in head-coordinate millimetres to fix location (orientation + amplitude only).

n_jobs is read from [parallelism] and passed straight to mne.fit_dipole. Outputs: cross-section plots, ___erp_dipoles.csv, and ConnectiVIS companion files. Reference: MNE dipole fit tutorial.

b. MNE / dSPM / sLORETA / eLORETA inverse (mne_inverse)

Minimum-norm distributed-source family. Builds a forward solution and an inverse operator from the epochs and applies the operator to the per-condition Evoked. The output is a cortical activation per vertex (~2.5k–8k vertices × time, depending on spacing), saved as a pair of *-lh.stc + *-rh.stc files (one per cortical hemisphere; STC is MNE’s native source-estimate format, reloaded with mne.read_source_estimate(stem)), plus a static PNG screenshot per condition rendered via PyVista.

| Parameter | Default | Description |
|---|---|---|
| method | "dSPM" | "MNE" / "dSPM" / "sLORETA" / "eLORETA". |
| snr | 3.0 | Signal-to-noise ratio (λ² = 1/SNR²). |
| loose (advanced) | 0.2 | Source orientation constraint (0 = fixed, 1 = free). |
| depth (advanced) | 0.8 | Depth-weighting exponent. |
| spacing (advanced) | "ico4" | Source-space subdivision. |
| n_peaks (GUI label: “Strongest sources to export”) | 5 | Extract the N cortical vertices with the strongest source amplitude (per-vertex peak over time, then top-N) and export them as ___mne_inverse_<method>_dipoles.csv + ___electrodes.dat for the ConnectiVIS 3D view. Lossy by construction (collapses ~2.5k–8k smooth vertices to N point sources); the full -lh.stc/-rh.stc pair is always saved separately. 0 = skip the CSV (STC + PNG only). |
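The snr parameter maps to the regularisation value λ² through the relation given in the table. The helper below is a minimal sketch of that arithmetic; the name snr_to_lambda2 is hypothetical, and how IDE4EEG passes the value on internally is not documented here.

```python
def snr_to_lambda2(snr: float) -> float:
    """Regularisation parameter for the minimum-norm family: lambda^2 = 1 / SNR^2."""
    if snr <= 0:
        raise ValueError("snr must be positive")
    return 1.0 / snr ** 2


lambda2 = snr_to_lambda2(3.0)   # default snr = 3.0 gives 1/9
print(round(lambda2, 4))

# With MNE installed and an inverse operator at hand, this value is what
# MNE's own API expects, e.g.:
#   mne.minimum_norm.apply_inverse(evoked, inverse_operator,
#                                  lambda2=lambda2, method="dSPM")
```

A higher snr therefore means weaker regularisation (smaller λ²), which suits clean averaged ERPs more than noisy single trials.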

Reference: MNE inverse tutorial.

c. LCMV beamformer (lcmv_beamformer)

Linearly Constrained Minimum-Variance beamformer (mne.beamformer.make_lcmv + apply_lcmv). Same output shape as mne_inverse.

| Parameter | Default | Description |
|---|---|---|
| reg | 0.05 | Tikhonov regularisation of the data covariance. |
| pick_ori (advanced) | "max-power" | "max-power" / "normal" / "vector". |
| weight_norm (advanced) | "unit-noise-gain" | "unit-noise-gain" / "nai" / "none". |
| spacing (advanced) | "ico4" | Source-space subdivision. |
| n_peaks (GUI label: “Strongest sources to export”) | 5 | Same as mne_inverse above: top-N beamformer-output peaks → dipoles CSV + electrodes DAT for ConnectiVIS. 0 = skip the CSV. |

Reference: MNE LCMV tutorial.

d. MxNE / iRMxNE sparse (mxne)

mne.inverse_sparse.mixed_norm. Returns a handful of focal sources (~5–20 vertices) instead of a smooth distribution — feeds ConnectiVIS automatically via ___mxne_dipoles.csv, no n_peaks opt-in needed.

| Parameter | Default | Description |
|---|---|---|
| alpha | 80.0 | Sparsity / regularisation (% of α_max). 50–90 typical for ERP. |
| depth (advanced) | 0.9 | Depth-weighting exponent. |
| loose (advanced) | 0.2 | Source orientation constraint. |
| n_mxne_iter (advanced) | 1 | 1 = vanilla MxNE. ≥ 2 enables iterative re-weighted MxNE (iRMxNE). |
| spacing (advanced) | "ico4" | Source-space subdivision. |

Reference: MNE mixed-norm example.

Choosing a source-estimation entry

| Method | Output shape | ConnectiVIS path |
|---|---|---|
| erp_dipole_fit | one ECD per time sample | direct (___erp_dipoles.csv) |
| mxne | ~5–20 sparse vertices | direct (___mxne_dipoles.csv) |
| mne_inverse (MNE/dSPM/sLORETA/eLORETA) | per-vertex cortical map | top-N peaks via n_peaks (default 5) |
| lcmv_beamformer | per-vertex cortical map | top-N peaks via n_peaks (default 5) |

The discrete-source methods feed ConnectiVIS natively. The distributed methods always write a *-lh.stc + *-rh.stc pair (one file per cortical hemisphere) plus a static PNG, and by default also extract the n_peaks=5 strongest cortical vertices to *_dipoles.csv for ConnectiVIS. Set n_peaks=0 to opt out of the CSV; the .stc files are unaffected.
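The n_peaks reduction (per-vertex peak over time, then top-N) can be sketched as follows. This is an illustration under stated assumptions rather than IDE4EEG's implementation: the function name strongest_sources is hypothetical, the data layout (one list of samples per vertex) is simplified, and taking the absolute value before ranking is an assumption.

```python
def strongest_sources(data, n_peaks=5):
    """Return the n_peaks strongest vertices as (vertex, peak_amplitude, peak_sample).

    data: one time course (list of floats) per cortical vertex.
    n_peaks=0 returns an empty list, mirroring the "skip the CSV" setting.
    """
    if n_peaks <= 0:
        return []
    peaks = []
    for vertex, course in enumerate(data):
        # Step 1: per-vertex peak over time (largest absolute amplitude).
        sample = max(range(len(course)), key=lambda i: abs(course[i]))
        peaks.append((vertex, abs(course[sample]), sample))
    # Step 2: keep the top-N vertices by peak amplitude.
    peaks.sort(key=lambda item: item[1], reverse=True)
    return peaks[:n_peaks]


# Hypothetical 4-vertex x 3-sample source estimate.
stc_data = [
    [0.10, 0.50, 0.20],
    [0.90, 0.30, 0.10],
    [0.00, -0.70, 0.20],
    [0.05, 0.02, 0.01],
]
top2 = strongest_sources(stc_data, n_peaks=2)
print(top2)
```

The example makes the lossy nature of the export visible: the smooth map is reduced to a handful of (vertex, amplitude, latency) triples, which is why the full .stc pair is kept alongside it.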

References: all MNE-wrapper analyses cite Gramfort A et al. (2014) “MNE software for processing MEG and EEG data.” NeuroImage 86:446–460. Each entry’s [?] button also links to the relevant MNE tutorial.


5. Run tab

The Run tab is where you actually execute the pipeline. It shows the live log, a per-stage checklist with progress bars, and a Stop button to cancel an in-progress run. The same code path is used for the GUI’s Run Pipeline button, the per-analysis [Run] buttons on the Analysis tab, and the command-line ide4eeg --run myconfig.toml.

A typical session: configure the steps you want on Preprocess and Analysis, then switch to the Run tab and click Run Pipeline.

5.1 Run Pipeline button

Executes the complete pipeline: all checked preprocessing steps followed by all checked analyses, for every input file. The current GUI state is collected into a config dict and saved as config.toml in the output directory before execution starts.

5.2 Input validation

IDE4EEG validates all numeric configuration parameters before running the pipeline. Invalid values are caught at two levels:

GUI (live feedback). Key numeric fields turn red as you type when the value is out of range:

When you click Run Pipeline, a dialog warns about invalid TF frequency ranges and lets you abort.

5.2.1 Stage 5 consistency-rule framework

Beyond the numeric range checks above, IDE4EEG ships a registry of consistency rules covering things that the pipeline would otherwise discover too late (typos, stale configs reused across recordings, ordering constraints between preprocessing steps, dipole/EEG-profile dependencies on Matching Pursuit, etc.). Rules fire at five points:

| Fire-point | When it runs |
|---|---|
| config_load | When a TOML config is loaded into the GUI/CLI |
| signal_load | When a recording is selected / loaded |
| field_edit | Live, after every edit to a rule-triggered field |
| pre_launch | When you click Run Pipeline (or --run in CLI) |
| step_entry | Just before each preprocessing step runs |

The same rule set runs in both the GUI and the CLI, so a config that passes preflight in one passes in the other.
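A rule registry with named fire-points can be pictured roughly as below. Everything in this sketch is hypothetical (the Rule class, the register and fire helpers, the rule-id, and the "filters" step name); only the five fire-point names, the prepare_dipole_fitting toggle, and the "mp_decomposition" entry of preprocessing.step_order come from this manual. It illustrates why one registry gives the GUI and the CLI identical verdicts.

```python
from dataclasses import dataclass
from typing import Callable, Optional

FIRE_POINTS = ("config_load", "signal_load", "field_edit", "pre_launch", "step_entry")


@dataclass(frozen=True)
class Rule:
    rule_id: str                              # hypothetical identifier
    fire_points: tuple                        # subset of FIRE_POINTS
    severity: str                             # "error" or "warning"
    check: Callable[[dict], Optional[str]]    # message if violated, else None


REGISTRY = []


def register(rule):
    """Add a rule, rejecting fire-point names that do not exist."""
    unknown = set(rule.fire_points) - set(FIRE_POINTS)
    if unknown:
        raise ValueError(f"unknown fire-points: {sorted(unknown)}")
    REGISTRY.append(rule)
    return rule


def fire(point, config):
    """Run every rule registered for `point`; return (rule_id, severity, message)."""
    findings = []
    for rule in REGISTRY:
        if point not in rule.fire_points:
            continue
        message = rule.check(config)
        if message:
            findings.append((rule.rule_id, rule.severity, message))
    return findings


def _dipole_needs_mp(config):
    steps = config.get("preprocessing", {}).get("step_order", [])
    if config.get("prepare_dipole_fitting") and "mp_decomposition" not in steps:
        return "Dipole fitting requires MP decomposition in the pipeline"
    return None


register(Rule("dipole.requires_mp", ("config_load", "pre_launch"), "error", _dipole_needs_mp))

bad = {"prepare_dipole_fitting": True, "preprocessing": {"step_order": ["filters"]}}
good = {"prepare_dipole_fitting": True,
        "preprocessing": {"step_order": ["filters", "mp_decomposition"]}}
print(fire("pre_launch", bad))
print(fire("pre_launch", good))
```

Because both front ends would call the same fire() on the same config dict, a config that is clean at pre_launch in one is clean in the other.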

Field-validation badges (live feedback). Some fields paint a small inline glyph the moment their value becomes inconsistent with the loaded signal or the rest of the config:

Hover the glyph for the full rule message and the rule-id (useful for filing bug reports). Badges currently appear next to:

When the issue is resolved the badge disappears automatically.

Step-entry rules (during a running pipeline). A few rules also fire just before each preprocessing step runs, catching last-moment state divergence – e.g. the dispatcher rejecting a configuration mismatch that pre-launch missed because some signal-side state was not yet known. These rules abort the step with a clean error message rather than letting the underlying tool fail mid-execution with a cryptic message.

Panel-header status surface. Some rules emit on a field that doesn’t have its own widget (e.g. “Dipole fitting requires MMP1 to be in the pipeline”). Those messages render inline on the panel’s header label, alongside the existing “needs MMP1 book” text. The two surfaces don’t fight: the Stage 5 status only clears when the rule that set it stops emitting.

Help → Pipeline invariants. The bottom of the Help tab lists every registered rule, grouped by module (eight modules: Reference, Dipole, Channels, Modes, Step order, Tools, Numeric ranges, Review mode). Each entry shows the rule-id, the fire-points it runs at, and a short rationale. Auto-updates as rules are added.

Pre-flight dialog (clicking Run Pipeline). Any rule that emitted at pre_launch shows up in the dialog: errors block execution (there is no Run anyway button), while warnings only ask for confirmation. The CLI counterpart is _run_stage5_preflight_rules_cli: errors raise ValueError, whereas warnings are logged via logging.warning and the run continues.
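The CLI behaviour (errors raise ValueError, warnings are logged and the run continues) can be sketched like this. The function run_preflight and the shape of the findings list are assumptions for illustration; this is not the body of _run_stage5_preflight_rules_cli.

```python
import logging


def run_preflight(findings):
    """findings: list of (rule_id, severity, message) emitted at pre_launch.

    Warnings are logged and do not stop the run; any error aborts it.
    """
    errors = []
    for rule_id, severity, message in findings:
        if severity == "error":
            errors.append(f"[{rule_id}] {message}")
        else:
            logging.warning("[%s] %s", rule_id, message)
    if errors:
        raise ValueError("pre-flight failed: " + "; ".join(errors))


# Warnings only: returns normally, so the pipeline would start.
run_preflight([("numeric.tf_fmax", "warning", "f_max above Nyquist will be capped")])

# At least one error: the run is blocked.
try:
    run_preflight([("dipole.requires_mp", "error", "MP decomposition is not enabled")])
    blocked = False
except ValueError as exc:
    blocked = True
    print(exc)
```

The rule-ids used in the example are made up; the real ones are listed under Help → Pipeline invariants.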

“Don’t show this again” per-rule dismissal. Right-click any badge to dismiss the rule that fired it. Dismissals persist across sessions in ~/.obci/ide4eeg/stage5_dismissed.json. A dismissal only silences the inline live-feedback (badge + panel-header status); the pre-launch preflight dialog still surfaces every rule, so you can’t accidentally bypass a run-time error.

To restore a dismissed rule: open Help → Pipeline invariants, scroll to the “Dismissed rules” list, and click Restore next to the rule-id.

CLI (auto-correction). When running from the command line, invalid values are automatically clamped to safe defaults and a warning is logged. The pipeline continues rather than crashing. Validated parameters and their ranges:

| Parameter | Valid range | On invalid (CLI) |
|---|---|---|
| segment_rejection.artifact_threshold | [0, 1] | clamped |
| segment_rejection.artifact_min_run_ms | [0, 60000] | clamped |
| filters.highpass_freq, lowpass_freq | ≥ 0; hp < lp | set to 0 (disabled) |
| trim_start | < trim_end | trim_start reset to 0 |
| rest.window_length | > 0 | set to 5 |
| epochs.start_offset | < stop_offset | reset to [-0.3, 0.7] |
| TF freq max (f_max / fmax) | ≤ Nyquist | capped at Nyquist when the TF analysis runs |
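The clamp-and-warn behaviour in the table can be sketched as follows. The flat dictionary keys and the function names are simplifications invented for this example, and two details are assumptions: that an invalid filter pair disables both cutoffs, and that the hp < lp ordering is only checked when both filters are enabled (0 meaning off).

```python
import logging

log = logging.getLogger("ide4eeg.sketch")   # hypothetical logger name


def clamp(name, value, low, high):
    """Clamp value into [low, high], logging a warning when it changes."""
    clamped = min(max(value, low), high)
    if clamped != value:
        log.warning("%s=%r outside [%r, %r]; clamped to %r", name, value, low, high, clamped)
    return clamped


def sanitize(cfg):
    """Return a corrected copy of cfg instead of raising on bad values."""
    out = dict(cfg)
    out["artifact_threshold"] = clamp("artifact_threshold", out["artifact_threshold"], 0.0, 1.0)
    out["artifact_min_run_ms"] = clamp("artifact_min_run_ms", out["artifact_min_run_ms"], 0, 60000)

    hp, lp = out["highpass_freq"], out["lowpass_freq"]
    if hp < 0 or lp < 0 or (hp > 0 and lp > 0 and hp >= lp):
        log.warning("invalid filter band hp=%r, lp=%r; filters disabled", hp, lp)
        out["highpass_freq"] = out["lowpass_freq"] = 0   # assumption: both reset

    if out["start_offset"] >= out["stop_offset"]:
        log.warning("epoch offsets out of order; reset to [-0.3, 0.7]")
        out["start_offset"], out["stop_offset"] = -0.3, 0.7
    return out


fixed = sanitize({
    "artifact_threshold": 1.7,
    "artifact_min_run_ms": -5,
    "highpass_freq": 40,
    "lowpass_freq": 30,
    "start_offset": 0.5,
    "stop_offset": 0.1,
})
print(fixed)
```

Returning a corrected copy rather than raising matches the stated CLI goal: an unattended batch job keeps running, and the log records every value that was changed.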

5.3 Live log panel

Below the progress bar, a scrolling log panel shows the same colored output that goes to the terminal in CLI runs (INFO / WARNING / ERROR levels). The panel is always read-only; you can select and copy text from it normally. Clicking inside the log does not disturb the running stream: IDE4EEG uses a dedicated end-of-document write cursor, so user selections and pipeline appends are independent.

The log is capped at 50 000 blocks (~a few MB of text) so long pipelines emitting tens of thousands of log lines (ICLabel, empi) don’t grow QTextEdit’s document into the multi-GB range.

The Auto-scroll checkbox at the top right controls whether new lines automatically scroll into view; Clear wipes the panel.

5.4 Stopping a run

The Stop button (next to Run Pipeline) is enabled only while a pipeline is running. Clicking it sets a cancel flag that all long-running loops in the pipeline check cooperatively — most stages bail within a second, with the notable exception of MP decomposition, which finishes its current empi invocation before the cancel takes effect.
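Cooperative cancellation of this kind is commonly built on a shared flag that each loop polls between units of work. The sketch below uses threading.Event from the standard library; the names run_stage and process are hypothetical and the real pipeline's loops are not shown in this manual.

```python
import threading


def run_stage(items, cancel, process):
    """Process items one by one, checking the cancel flag between units of work."""
    done = []
    for item in items:
        if cancel.is_set():          # polled only between items, never mid-item
            break
        done.append(process(item))
    return done


cancel = threading.Event()


def process(x):
    if x == 2:
        cancel.set()                 # simulates the user clicking Stop mid-run
    return x * x


results = run_stage(range(10), cancel, process)
print(results)
```

The item being processed when the flag is set still completes, which is the same reason an in-flight empi invocation finishes before a cancel takes effect.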

When a run is cancelled, partial outputs already on disk are kept (per-step snapshots, MP books). Subsequent runs see them and reuse them via the hash-based caching machinery if their [cfg:<hash>] markers still match.
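Hash-based reuse of this sort usually means hashing a canonical form of the relevant configuration and comparing it with the marker stored next to the cached output. The sketch below is an assumption-laden illustration: the manual does not state which hash function IDE4EEG uses, which keys feed it, or how long the marker is, and the function names are made up.

```python
import hashlib
import json


def config_hash(step_config, length=12):
    """Stable digest of a config dict (key order does not matter)."""
    canonical = json.dumps(step_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def marker(step_config):
    """Marker string in the [cfg:<hash>] shape mentioned above."""
    return f"[cfg:{config_hash(step_config)}]"


def can_reuse(stored_marker, step_config):
    """A cached output is reusable only if its marker matches the current config."""
    return stored_marker == marker(step_config)


mp_cfg = {"max_iterations": 50, "energy_error": 0.01}      # hypothetical keys
stored = marker(mp_cfg)

same_reordered = {"energy_error": 0.01, "max_iterations": 50}
changed = {"max_iterations": 80, "energy_error": 0.01}
print(stored, can_reuse(stored, same_reordered), can_reuse(stored, changed))
```

Sorting the keys before hashing is the design point: two configs that differ only in key order produce the same marker, while any changed value invalidates the cache.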

5.5 What gets saved on success or failure

When the pipeline runs (via Run Pipeline or per-analysis [Run]), output files land in IDE4EEG_OUT_<filename>/:

See Chapter 6 for the full output table. On a failed file (e.g. ICLabel raises ICAFailureError), the pipeline aborts that file and continues with the next in a multi-file batch — partial outputs already written stay on disk.

5.6 What config is actually used?

In all cases, the GUI collects the current state of all widgets into a config dict at the moment you click Run. This means:

5.7 Command-line execution

For batch / headless / scripted runs:

ide4eeg                             # open GUI, blank state (default)
ide4eeg myconfig.toml               # GUI with config preloaded
ide4eeg --run myconfig.toml         # batch run, no GUI
ide4eeg --run myconfig.toml -t      # batch with full Python tracebacks
ide4eeg --version                   # print version and exit

The batch run uses the same code path as the GUI, so results match (small floating-point differences are still possible under multi-core parallelism / BLAS scheduling).

For programmatic use, see ide4eeg/api.py (run_file() and step wrappers). The GUI’s Export Script… action writes a self-contained Python script reproducing the current configuration without a TOML file — useful for cluster submission or sharing reproducible analyses.


6. Output tab

The Output tab is a results browser. The left side shows a tree of every file in the output directory; the right side previews the selected file (image, table, FIF metadata, MP book info, or text). The bottom row has helper-app launchers (MNE / Svarog / ConnectiVIS).

When the pipeline finishes, the tab auto-refreshes; double-clicking a file dispatches it to the appropriate viewer. For a folder, a single click unveils its content (expands it in the tree) and a double-click opens it in your system file viewer (Finder / Explorer / file manager); collapse a folder with its disclosure triangle.

6.1 Output directory structure

Results are saved in a folder named IDE4EEG_OUT_<input_filename>/:

| Subfolder | When written | Contents |
|---|---|---|
| saved_signals/ | always | Clean epochs as -epo.fif (the primary preprocessing output) |
| saved_steps/ | when a step’s per-step Save checkbox is checked | Intermediate signal at that point as <base>-<suffix>-raw.fif |
| artifacts_detection/ICA_EOG/ | when Save components plot and table is checked on the ICA panel | ICA component topomaps, property plots, classification CSV, -ica.fif, MNE Report HTML |
| bad_channels_detection/ | reserved (no plots written by the current PREP detector; see §3.5 Output) | (empty) |
| artifacts_detection/found_artifacts/ | when Save is checked on the EEG Artifacts step | Amplitude / slope / muscle plots |
| artifacts_detection/video_artifacts/ | when Save is checked on the Gaze step | Gaze timeline + summary |
| filters_plot/ | when Save filter plots is checked on Filters | Filter frequency-response PNGs |
| mp_decomposition/ | when MP decomposition runs (forced save) | MP atom book .db file |
| analysis/MNE/<entry_id>/ | when MNE-based analyses run | One subdir per catalog entry (psd_multitaper/, tfr_morlet/, erp_dipole_fit/, …) holding that entry’s plots + CSVs |

Per-step save toggles default off. The cleaned epochs in saved_signals/ are always written — that’s the primary pipeline output that all analyses depend on.

The output path is controlled by two config parameters on the Input tab:

6.2 Browsing the tree

The navigation row above the tree has four buttons:

Hidden noise (.DS_Store, Thumbs.db, ___electrodes.dat, ___head.ply, ___brain.ply) is filtered from the tree — those companion files travel with their main artefact (the dipoles.csv) and aren’t useful to browse on their own.
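The filtering rule can be pictured as a simple name check. This is a sketch only: the function name and the example file names are hypothetical, and treating the three companion names as file-name suffixes (after a per-recording prefix) is an assumption based on how the ___erp_dipoles.csv outputs are written.

```python
HIDDEN_NAMES = {".DS_Store", "Thumbs.db"}
HIDDEN_SUFFIXES = ("___electrodes.dat", "___head.ply", "___brain.ply")


def visible_entries(names):
    """Drop OS debris and ConnectiVIS companion files from a directory listing."""
    return [
        name for name in names
        if name not in HIDDEN_NAMES and not name.endswith(HIDDEN_SUFFIXES)
    ]


listing = [
    ".DS_Store",
    "rec01___erp_dipoles.csv",
    "rec01___electrodes.dat",
    "rec01___head.ply",
    "rec01___brain.ply",
    "erp_butterfly.png",
]
shown = visible_entries(listing)
print(shown)
```

The companion files stay on disk next to the dipoles CSV; they are only hidden from the tree.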

The preview pane (right side) auto-renders by extension:

6.3 Opening files in helper apps

The view row below the tree has buttons that operate on the currently selected file:

Double-click in the tree dispatches by extension automatically:

The double-click dispatch obeys the Use MNE viewer instead of Svarog option (§1.5); the explicit “in SVAROG” / “in MNE” buttons override that preference.


7. Help tab

The Help tab is an in-app reader for this manual. The left pane is a filterable table of contents; the right pane shows the rendered Markdown with a search bar at the top.

You don’t have to read the manual cover-to-cover to use IDE4EEG — every interesting widget in the GUI has a [?] button or a hover tooltip. The Help tab is where the long form lives, and where the [?] buttons jump to when you click their Open in Manual link.

7.1 Context help — the [?] buttons

Most preprocessing and analysis sections have a [?] button next to the panel title (and in some cases, next to individual fields). Clicking it opens a popup with a short explanation of what the section does, followed by an Open in Manual: link that switches to the Help tab and scrolls to the corresponding section.

The link uses the heading text directly — no manual maintenance of anchor IDs. If a heading in this manual is renamed, the [?] button continues to work as long as the manual heading and the GUI [?] button reference the same wording.

7.2 Tooltips

Every label, input field, button, and checkbox in IDE4EEG has a tooltip describing what it does and (where applicable) the TOML key it controls. Hover over a widget for a second to see it.

Tooltips are hand-wrapped at ≤ 66 characters per line because Qt does not auto-wrap setToolTip() strings — every \n becomes a physical break. This means the wrap point you see in the GUI is the same one the source code carries.
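A wrap limit like this is easy to check or pre-compute with the standard library. The helper below is only a sketch: IDE4EEG's tooltips are wrapped by hand, the function name is hypothetical, and the tooltip text is an invented example.

```python
import textwrap

MAX_TOOLTIP_WIDTH = 66   # limit quoted above


def wrap_tooltip(text, width=MAX_TOOLTIP_WIDTH):
    """Collapse whitespace, then insert explicit newlines at <= width characters."""
    return textwrap.fill(" ".join(text.split()), width=width)


tooltip = wrap_tooltip(
    "High-pass cutoff in Hz. Set to 0 to disable the filter. "
    "Controls filters.highpass_freq in config.toml."
)
print(tooltip)

# The same check can serve as a lint for hand-wrapped strings.
too_long = [line for line in tooltip.split("\n") if len(line) > MAX_TOOLTIP_WIDTH]
```

The wrapped string is what would be passed to setToolTip(); each newline in it becomes a physical line break in the popup.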

7.3 In-app manual viewer

The right pane of the Help tab renders this USER_MANUAL.md file as HTML with a small CSS stylesheet that adapts to system light / dark mode. Hand-wrapped Markdown paragraphs (one source line ≈ one rendered line) are accumulated into single HTML paragraphs at render time so prose flows correctly regardless of source-file wrapping.
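The paragraph-accumulation step can be sketched as joining consecutive non-blank lines before emitting HTML. This simplified illustration handles plain prose only (no headings, lists, tables, or code blocks) and is not the viewer's actual renderer; the function name is hypothetical.

```python
import html


def paragraphs_to_html(markdown_text):
    """Join hand-wrapped lines into one <p> per blank-line-separated paragraph."""
    paragraphs, current = [], []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)          # still inside the same paragraph
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:                               # flush the last paragraph
        paragraphs.append(" ".join(current))
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


source = (
    "The Help tab is an in-app\n"
    "reader for this manual.\n"
    "\n"
    "Tooltips are hand-wrapped.\n"
)
rendered = paragraphs_to_html(source)
print(rendered)
```

Because line breaks inside a paragraph are replaced by single spaces, the rendered text reflows to the pane width regardless of where the source file was wrapped.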

Two search affordances:

The text-search box widens the main window to ≥ 900 px when activated so long lines aren’t broken across multiple visible lines (which would defeat scroll-to-match).

7.4 Citing IDE4EEG and references

When you use IDE4EEG in published research, please cite the methods you used. The full reference database lives in ide4eeg/references.py and is exposed via references.refs_for_step(step_key) and references.format_all(). The same database backs the references shown at the bottom of each preprocessing/analysis section in this manual.

Selected references by topic:

A complete BibTeX export will be added in a future release. For now, python -c "from ide4eeg.references import format_all; print(format_all())" prints a sorted, one-line-per-reference summary of every entry.


8. Helper applications

IDE4EEG bundles four helper applications. The first three are external programs launched on demand from the GUI as subprocesses; the fourth is built-in.

On first launch of a fresh install, SVAROG (with bundled empi), ConnectiVIS, and an Adoptium Temurin JDK 17 are auto-downloaded — they are treated as crucial parts of the package. The GUI shows a modal progress dialog with a Cancel button ~200 ms after the main window appears; CLI / batch mode (ide4eeg --run config.toml) runs the same flow silently with throttled stdout progress and auto-consents to the Java install. Only missing tools are fetched, so a hand-pinned Svarog at ~/.obci/svarog/ is preserved. Per-tool errors are accumulated rather than aborting the chain — if ConnectiVIS fails to download, SVAROG and Java still install. See §1.2 for the path layout and per-row Download recent overwrite contract.

8.1 Svarog

Svarog is the open-source EEG signal viewer developed at the Braintech Lab, University of Warsaw. IDE4EEG launches Svarog whenever a synchronised, time-aligned interactive view is needed:

When Svarog is unavailable (no jar configured, or the Use MNE viewer instead of Svarog option is on), the same review windows fall back to MNE’s interactive plots — except the MP Book Viewer (§8.4), which is built-in and Svarog-independent.

Runtime requirements (Svarog 4.20):

The Config tab’s Download recent button installs the latest CI build into ~/.obci/svarog/. The bundled empi binary inside the SVAROG artifact is auto-extracted alongside the jar in the same step — there is no separate download for empi.

8.2 ConnectiVIS

ConnectiVIS is a 3D viewer for EEG source-reconstruction results: it shows dipole sources as oriented cones on the cortex and directed connectivity as coloured arrows between electrodes. IDE4EEG launches it automatically when dipole or connectivity results are produced, or via the 3D view / 3D view all buttons on the Output tab. It is auto-downloaded on first launch alongside SVAROG (see §1.2); the Config tab’s Download recent button refreshes it independently.

Runtime requirements (ConnectiVIS 2.0+):

ConnectiVIS is the modernised rewrite of Trans3D / Glowa3D (CC Otwarte Systemy Komputerowe) for the Biomedical Physics Division, Faculty of Physics, University of Warsaw. The CLI accepts .dat connectivity matrices with bundled 10-20 electrode positions, or a custom --electrodes montage.dat.

Dipoles panel (right-side controls):

Signal arrows panel (connectivity):

Scalp / Cortex panels: show/hide checkbox + transparency slider each. The cortex carries per-vertex Desikan-Killiany atlas colours from FreeSurfer.

Show panel: four checkboxes gating visibility of electrodes, dipoles, electrode name labels, and signal arrows.

Brain atlas legend (View menu): clickable list of cortex regions — clicking a region flashes it on the mesh for 3 seconds. “Plain cortex” toggle removes atlas colours.

Colors window (View menu): colormaps for signal arrows and dipoles, electrode colour, background colour.

Mouse and keyboard.

File formats.

8.3 empi

empi is the GPU-accelerated Matching Pursuit decomposition engine (Różański 2024). IDE4EEG invokes it via subprocess for §3.11 MP decomposition; the binary ships inside the SVAROG artifact (Svarog’s mp/ subdirectory) and is auto-extracted in the same first-launch step that fetches Svarog — there is no separate empi download.

empi’s CLI is fully documented in its README; IDE4EEG exposes its parameters under [matching_pursuit] in TOML and the MP Decomposition panel in the GUI. The Config tab’s Download recent (Svarog row) refreshes the matching empi build alongside Svarog.

8.4 MP Book Viewer

The MP Book Viewer is a standalone interactive window for browsing Matching Pursuit decomposition results produced by empi. Unlike the other helpers, it ships built-in (ide4eeg.analysis.mp_bookviewer_qt) and runs as a Qt window inside the IDE4EEG process.

It displays Wigner time-frequency energy maps, original signal waveforms, and signal reconstructions from selected atoms.

8.4.1 Opening a file

Three ways to open an empi .db file:

  1. From the Output tab — select an empi .db file in the file tree and click Open in Book Viewer.

  2. From within the viewer — click Open… and choose a .db file.

  3. From the command line:

    python3 -m ide4eeg.analysis.mp_bookviewer_qt              # opens file dialog
    python3 -m ide4eeg.analysis.mp_bookviewer_qt path/to.db   # opens directly

8.4.2 Layout

Four panels stacked vertically:

| Panel | Content |
|---|---|
| Wigner map | Time-frequency energy density computed from Gabor atoms. Each atom contributes a 2D Gaussian blob. Atoms are shown as white dots; selected atoms have pink rings. Filtered-out atoms appear as small grey dots. |
| Signal | Original signal waveform (read from the samples table in the .db file). |
| Recon | Full reconstruction from the currently filtered atoms. |
| Selected | Reconstruction from only the manually selected atoms. |

A colour bar next to the Wigner map shows the energy scale. The bottom status bar shows atom parameters when an atom is clicked.

| Action | Control |
|---|---|
| Next / previous segment | Toolbar ◀ / ▶ buttons, or Left / Right arrow keys |
| Next / previous channel | Toolbar ▲ / ▼ buttons, or Up / Down arrow keys |
| Zoom in / out | Scroll wheel (centred on cursor). Wigner map: zooms time and frequency; signal panels: time only. |
| Reset zoom | Double-click on the Wigner map, or press Escape |
| Change colour palette | Palette dropdown (jet, hot, viridis, inferno, gray) |
| Change energy scale | Scale dropdown: linear (raw values), log (log₁₀(1 + energy)), sqrt (√energy) |
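The three energy scales listed in the table correspond to the following transformations. The function name scale_energy is hypothetical; the formulas are the ones given above.

```python
import math


def scale_energy(energy, mode="linear"):
    """Map a non-negative energy value to the displayed colour-scale value."""
    if mode == "linear":
        return energy                       # raw values
    if mode == "log":
        return math.log10(1.0 + energy)     # the +1 keeps zero energy at zero
    if mode == "sqrt":
        return math.sqrt(energy)
    raise ValueError(f"unknown scale mode: {mode!r}")


for mode in ("linear", "log", "sqrt"):
    print(mode, [round(scale_energy(e, mode), 3) for e in (0.0, 16.0, 99.0)])
```

The log and sqrt scales compress the dynamic range, so weak atoms remain visible next to a few dominant high-energy ones.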

8.4.4 Atom selection

Click an atom dot on the Wigner map to toggle its selection:

Useful for selective signal reconstruction — pick specific time-frequency components to see how they contribute to the signal.

8.4.5 Atom filter bar

Click Filter… on the toolbar to show the filter bar. Enter criteria to restrict which atoms are displayed and used for the Wigner map and reconstruction:

| Criterion | Description |
|---|---|
| Freq (min – max) | Keep atoms whose centre frequency is in this range (Hz). |
| Time (min – max) | Keep atoms whose centre time is in this range (s). |
| Energy ≥ | Keep atoms with energy at or above this threshold. |
| Iter ≤ | Keep atoms from the first N iterations only (higher-energy atoms are found first). |

Click Apply (or press Enter in any field) to recompute the Wigner map. Reset clears all criteria. Filtered-out atoms are shown as faint grey dots on the map.

This is nonlinear filtering: the energy map is recomputed from the selected atom subset, not just masked. The reconstruction panel reflects only the filtered atoms.
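The four filter criteria amount to a predicate applied to each atom before the map and reconstruction are recomputed. The sketch below uses a made-up Atom record and made-up values; the real .db schema, field names, and iteration numbering are not documented here and are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    iteration: int      # order in which the atom was found (assumed 1-based)
    frequency: float    # centre frequency, Hz
    time: float         # centre time, s
    energy: float


def passes(atom, freq=(None, None), time=(None, None), min_energy=None, max_iter=None):
    """True if the atom satisfies every criterion that is set (None = not set)."""
    def in_range(value, bounds):
        low, high = bounds
        return (low is None or value >= low) and (high is None or value <= high)

    return (
        in_range(atom.frequency, freq)
        and in_range(atom.time, time)
        and (min_energy is None or atom.energy >= min_energy)
        and (max_iter is None or atom.iteration <= max_iter)
    )


atoms = [
    Atom(1, 10.0, 0.50, 40.0),
    Atom(2, 13.5, 1.20, 25.0),
    Atom(3, 50.0, 0.80, 5.0),
    Atom(4, 12.0, 2.10, 1.5),
]

# Keep sigma-band atoms (11-16 Hz) that carry at least 2.0 units of energy.
kept = [a for a in atoms if passes(a, freq=(11.0, 16.0), min_energy=2.0)]
kept_energy = sum(a.energy for a in kept)
print(kept, kept_energy)
```

Only the surviving atoms would contribute to the recomputed map and to the Recon panel, which is what distinguishes this from masking an already-computed image.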

8.4.6 Keyboard shortcuts

| Key | Action |
|---|---|
| Left / Right | Previous / next segment |
| Up / Down | Previous / next channel |
| Escape | Reset zoom to full range |
| Enter (in filter field) | Apply filter |

Appendix A. Installation

You’re already running IDE4EEG, so you’ve cleared the basic install. This appendix is the reference for what’s actually required and where the optional pieces fit in.

A.1 Python environment

IDE4EEG supports Python 3.11–3.14. Python 3.11 is the floor because the input loader uses the stdlib tomllib module, which was added in 3.11; anything older lacks tomllib and fails at import. Active development targets 3.14. The code is tested on 3.12 and 3.14; 3.11 and 3.13 are expected to work but are not currently exercised.

python3 -m venv .venv && source .venv/bin/activate
pip install ide4eeg                  # lite: EEG core only, always succeeds
# or
pip install "ide4eeg[video]"         # adds OpenCV / PyAV / imageio / InsightFace / onnxruntime / albumentations
# or
pip install "ide4eeg[iclabel]"       # adds onnxruntime for the ICLabel ICA selector (no video stack)
# the quotes keep shells such as zsh from expanding the square brackets

Use [iclabel] if you want ICLabel-based ICA component classification but not the video/facetag pipeline.

The lite install is the recommended starting point: it succeeds on every supported (Python, OS, arch) cell and fits in ~500 MB. Video processing (face detection + gaze artifact tagging) is opt-in. You can also install the video stack from inside the GUI later — Config tab → Tool Paths → Video stack → Install all video tools drives the same pip-install in a streaming-log dialog. L2CS-Net (the default eye-gaze backend) is always installed in-app via the Download recent button next to L2CS-Net, regardless of which scope you pick on the command line — PyPI bans direct VCS dependencies and L2CS only exists at a GitHub URL.

To launch:

ide4eeg                                    # GUI, blank state (default)
ide4eeg myconfig.toml                      # GUI with config preloaded
ide4eeg --run myconfig.toml                # batch run, no GUI
ide4eeg --run myconfig.toml -t             # batch with full Python tracebacks
python -m ide4eeg                          # equivalent fallback if the
                                           # console script isn't on PATH

In GUI mode the optional positional path is just a shortcut for clicking Load pipeline settings .toml right after launch — it preloads the form so the user can review / edit before running. Batch mode (--run) requires a config path explicitly; there is no current-directory default.

Known wheel gaps

Some packages in the [video] extra have wheel gaps on specific (Python, OS, arch) cells. The lite install (pip install ide4eeg) is unaffected.

| Cell | Affected | Resolution |
|---|---|---|
| Intel Mac + Python 3.14 | onnxruntime (no cp314 x86_64 wheel) | Stay on the lite install; facetag will be unavailable. To enable facetag here, downgrade to Python 3.13 (cp313 wheels exist). The in-app Install button reaches the same wheel-resolution failure but surfaces it in a friendlier dialog. |
| Linux + Python 3.14 (any arch) | insightface, stringzilla (no cp314 wheels; pip falls back to building from sdist) | Install build tools once, then either pip install ide4eeg[video] or use the in-app installer. Debian/Ubuntu: sudo apt install build-essential. Fedora: sudo dnf groupinstall "Development Tools". Python 3.13 has cp313 wheels and avoids the build entirely. |
| Apple Silicon, Windows, Linux + Python ≤ 3.13 | none | Both lite and [video] install scopes work directly. |

Development install (from a checkout)

git clone https://gitlab.com/fuw_software/ide4eeg.git
cd ide4eeg
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt    # = pip install -e .[video]
ide4eeg                             # editable -- ide4eeg/*.py edits go live

A.2 Java + helper apps

IDE4EEG uses three external helper applications (Svarog, ConnectiVIS, empi) that are JVM- or native-binary-based.

A.3 Video processing libraries (PyTorch + L2CS)

The default requirements.txt ([video]) install ships face detection + the InsightFace head-pose backend, which serves as the fallback gaze estimator. L2CS-Net (true per-eye gaze direction) is the configured default backend but ships separately — install it with the Download L2CS for gaze detection button on the Config tab. When L2CS isn’t installed, gaze detection automatically falls back to InsightFace head-pose.

The button installs PyTorch + torchvision + the l2cs Python package + the Gaze360 weights checkpoint (~96 MB, MIT-mirrored from Ahmednull/L2CS-Net) in one shot. Total install ~500 MB on macOS / Linux-CPU (PyTorch CPU wheel), up to ~2 GB on Linux with CUDA wheels. There is no in-app uninstaller on the L2CS row; to reclaim the space either click Uninstall video tools in the Video-stack section header (see §1.2), or run pip uninstall torch torchvision l2cs and delete the Gaze360 weights file (~/.obci/ide4eeg/models/L2CSNet_gaze360.pkl).

See §3.14 Gaze for the choice between the two backends.


Appendix B. Supported EEG formats

| Format | Input file | Companion files required |
|---|---|---|
| BrainTech | .raw | .xml required; .tag optional (events; without it, resting-state analysis only) |
| BrainVision | .vhdr | .vmrk, .eeg |
| MNE-FIF | .fif | (none) |
| EEGLAB | .set | .fdt if the data is stored separately (single-file .set needs none) |
| EDF / EDF+ | .edf | (none); annotations read automatically |
| BDF / BDF+ | .bdf | (none); annotations read automatically |

Channel-type assignments differ per format — see §2.3.1 for the layered inference pipeline. For polysomnographic mixed-modality recordings (PSG, MASS, Sleep-EDF, Physionet-PSG), EDF/BDF channels default to eeg in MNE; IDE4EEG’s overlay re-runs chtype_heuristic on those defaults so EOG/EMG/ECG/respiration/oximetry channels are correctly typed.

To add a new format, see Appendix D.2.


Appendix C. Complete config.toml reference

Below is a fully-commented config.toml showing all parameters — both the commonly-set ones and the hidden defaults. Copy as a starting point and remove or adjust as needed.

# <<< GENERATED: scripts/gen_appendix_c.py (do not edit by hand) >>>
# =====================================================================
# IDE4EEG -- Complete Configuration Reference
# =====================================================================

# --- Paths and general settings ---
input_path = "examples/dipole_spindles/dipoles_example.raw"
output_path = "None"                    # "None" = save next to input file
segmentation = { mode = "epochs", reject_by_annotation = true }   # mode: "rest"/"epochs"; reject_by_annotation: rest-mode PSD omits marked spans (§4.2.2)
overwrite_output = true

# --- Preprocessing toggles ---
# EEG artifacts runs by default; disable it by removing "eeg_artifacts"
# from preprocessing.step_order (no top-level toggle).
prepare_ica = false
prepare_video_artifacts = false
# facetag_frame_skip = 3              # process every 3rd frame (faster)
# facetag_downscale = 0.5             # half resolution (faster)
# facetag_backend = "insightface"     # only "insightface" is supported
# facetag_gaze_method = "l2cs"        # "l2cs" (default) or "insightface"
# trim_start = 0.0                    # crop signal start (seconds)
# trim_end =                          # crop signal end (omit = full signal)

# --- Analysis toggles ---
# MP decomposition is enabled by adding "mp_decomposition" to
# preprocessing.step_order — there is no top-level toggle.
prepare_eeg_profiles = false
prepare_connectivity_analysis = false
prepare_dipole_fitting = false
# mne_catalog = ["erp_butterfly", "cluster_permutation"]  # external analyses

# --- Signal setup ---
electrodes_layout = "native (from file)" # default sentinel; replace with e.g. "standard_1005"
re_reference = []                       # default = no re-referencing; e.g. ["M1","M2"] / "average" / "None"

# --- Advanced: signal trimming (hidden defaults) ---
# resample_freq = 0                    # Hz; 0 keeps original; set to e.g. 256 to downsample
# cover_time = None                    # deprecated, ignored
# threshold_time = None                # deprecated, ignored

# --- Parallelism ---
[parallelism]
n_jobs = 0                              # default = auto (physical cores); 1 = sequential


[choosing_channels]
choose_bad_channels = "auto"            # "auto" or "none" (algorithm)
review_bad_channels = false             # interactive Svarog/MNE picker
dropped_channels = []                   # default = none; e.g. ["Audio","Sample_Counter","Photo"] to drop hardware-aux channels
selected_channels = "all"
check_final_channels = true

# --- Advanced: channel quality (hidden defaults) ---
# The live bad-channel detector is the PREP-aligned [choosing_channels.prep]
# sub-block; its hidden defaults are documented in Appendix E.2.
# (The legacy correlation_window / correlation_badch_threshold /
# [choosing_channels.badchs_params] keys are accepted in old configs for
# round-trip but no longer drive detection — see §E.2.)


[filters]
show_filt = false
plot_filt = false
method = "iir"                         # "iir" or "fir"
highpass_freq = 0                      # Hz; 0 = off (default). Common: 0.1, 0.5, 1.0
lowpass_freq = 0                       # Hz; 0 = off (default). Common: 30, 40, 100
notch_freq = 0                         # Hz; 0 = off (default). Presets: 50 or 60


[eeg_artifacts]
# [EEG artifacts] panel -> subpanel [Amplitudes: P2P and slope] (tag-first).
notch_hz          = 50.0     # mains notch (0 / None = off); + notch_harmonics
notch_harmonics   = true
floor_rel_k       = 1.0      # adaptive floor = k x channel median p2p; 0 = use absolute uV floors
min_amplitude_uv  = 50.0     # absolute p2p floor (uV); used only when floor_rel_k = 0
crossover_ms      = 50.0     # config-only: slope band below / p2p above
max_scale_ms      = 300.0    # config-only: coarsest window
min_windows       = 20       # config-only: n_eff gate, both detectors
p2p_enable        = true
p2p_z_threshold   = 5.0      # robust-z cutoff for p2p
p2p_ceiling_uv    = 0.0      # abs gross-excursion ceiling (uV); 0 = off
slope_enable      = true
slope_z_threshold = 5.0      # robust-z cutoff for slope
slope_min_amplitude_uv = 50.0  # absolute slope step anchor (uV); used only when floor_rel_k = 0
slope_min_samples = 2        # config-only: first-difference floor
hf_enable         = true     # muscle / EMG high-frequency band power
hf_lo_hz          = 90.0     # EMG band lower edge (Hz; above neural band)
hf_hi_cap_hz      = 250.0    # upper edge; effective = min(this, 0.45*fs)
hf_z_threshold    = 5.0      # robust-z cutoff for hf (one-sided)
hf_window_s       = 0.5      # config-only: HF power window (non-overlapping)
flat_enable       = true     # relative flat-span detector (low tail of p2p)
flat_fraction     = 0.2      # flat when ptp < this × median(ptp) per channel
flat_window_s     = 0.2      # config-only: ptp window for flatness
flat_min_duration_s = 0.5    # config-only: min flat run to mark (s)
save              = false


[ICA_EOG]
method              = "picard"          # "picard", "infomax", or "fastica"
selector            = "iclabel"         # "iclabel", "find_bads", "both", "none"
review_components   = false             # interactive Svarog/MNE picker
fit_highpass_hz     = 1.0               # Winkler 2015 fit-time HP (fit copy only); 0 = off
fit_lowpass_hz      = 0.0               # fit-time low-pass (fit copy only); 0 = off
fit_notch_hz        = 0.0               # fit-time notch (fit copy only); 0 = off
fit_notch_harmonics = true              # also remove harmonics (default ON)
reject_uv           = 500               # p2p µV — drop crazy segments pre-fit
reject_tstep        = 2.0               # window (s) MNE scores p2p against
n_components        = "rank"            # "rank" | int | float(0..1)
iclabel_keep        = ["brain", "other"]
iclabel_min_prob    = 0.0
# --- Only used when selector ∈ {find_bads, both} ---
find_bads_eog       = true
find_bads_eog_ch    = []
find_bads_ecg       = true
find_bads_muscle    = true
# --- Advanced ---
decim               = 1                 # fit subsampling factor
random_state        = 42
save                = false             # write audit tree + cleaned -raw.fif


[rest]
rest_duration = [0, ""]                 # [start_s, end_s]; "" = full signal
window_length = 20                      # seconds
check_rest = false                      # default; true = open interactive review (blocks batch runs)


[epochs]
start_offset = -0.3                     # seconds before event
stop_offset = 0.7                       # seconds after event
epochs_baseline = "None"                # "None", [start, end], or ["None", "None"]
check_epochs = false                    # default; true = open interactive review (blocks batch runs)
# --- Advanced (hidden defaults) ---
# (reject_dict / flat_dict are legacy [epochs] keys, no longer applied —
#  the epoch cut uses reject=None; bad segments are derived afterwards by
#  [Mark detected artifacts] aggregating the upstream artifact marks, §3.16.)

    [epochs.tags]
    selected = ["event_name_1", "event_name_2"]

[segment_rejection]
# [Mark detected artifacts] owns no DETECTOR (reject_threshold /
# flat_threshold / auto_percentile were retired — see §3.16), but does two
# things: it CONDITIONS the marks (floor/consolidate/dilate -> the final
# marks every consumer reads, always) and optionally DROPS via 'Drop marked'
# + its policy — the contaminated-fraction gate + the G1 run-length trigger
# (auto_reject in event mode, drop_in_continuous in continuous; both keys
# migrated here from [artifacts] 2026-06-13, old location read for back-compat).
auto_reject = true                      # master gate; false = mark-only
artifact_threshold = 0.3                # per-channel contaminated-% gate (GUI %)
artifact_min_run_ms = 200.0             # G1: drop a segment if any channel's
                                        #   longest contiguous contaminated run
                                        #   reaches this; 0 = off (fraction-only)
enable_fraction = true                  # GUI: derived from the % field > 0
enable_run = true                       # GUI: derived from the ms field > 0
enable_consensus = true                 # always on; the quorum is always applied
                                        #   (consensus_frac = 0 means "more than 0 channels", i.e. any channel)
consensus_frac = 0.0                    # quorum frac (0–1; 0 = any; GUI shows %)
# Per-type mark conditioning (all 0 = no-op): min_mark_ms = duration floor that
# widens short marks; dilate_ms = symmetric pad. Shared by the epoch drop and
# the rest-mode continuous PSD (§4.2.2). Keys: p2p / slope / hf / flat / gaze.
min_mark_ms = {p2p = 0.0, slope = 0.0, hf = 0.0, flat = 0.0, gaze = 0.0}
dilate_ms   = {p2p = 0.0, slope = 0.0, hf = 0.0, flat = 0.0, gaze = 0.0}


[time_domain]
    [time_domain.cluster_test]
    step_down_p = 0.05
    n_permutations = 1024


[connectivity]
methods = ["dtf", "coh"]            # list (or comma-separated string); default
mvar_method = "yw"                  # yw, ns, vm
mvar_order = 0                      # 0 = auto (Akaike)
resolution = 100
channels = "all"                    # default = all EEG channels; subset list to restrict, e.g. ["Fz","Cz"]
short_time = false                  # true = time-resolved sliding-window connectivity
st_window = 0                       # sliding-window length [s]; 0 = signal length / 5
st_overlap = 0                      # sliding-window overlap [s]; 0 = signal length / 10
significance = false                # true = surrogate / bootstrap significance test
sig_reps = 100                      # surrogate / bootstrap replications
sig_alpha = 0.05                    # significance level (α)


# --- EEG profiles ---
# [eeg_profiles]
# mode = "amplitudes"               # "count", "percentage", "amplitudes", "power"
# channel = 0
# (Bin width is always the MP segment length — not configurable.)

# --- Dipole fitting ---
# [dipole_fitting]
# ref_channel = "average"
# montage = "standard_1005"
# max_iterations = 0                # 0 = all atoms
# min_gof = 0                       # 0 = no constraint

# --- MP decomposition (enabled via preprocessing.step_order) ---
# [matching_pursuit]
# (decomposition parameters: algorithm, iterations, ... — see §3.20)
#     [matching_pursuit.outputs]
#     atom_stats_csv = false           # write per-atom statistics CSV
#     atom_histograms = false          # write atom-parameter histogram plots

# --- MP filter (preprocessing) ---
# Internal decomp + reconstruction; independent of [matching_pursuit].
# [mp_filter]
# window_length = 0                 # 0 = inherit from [rest].window_length
# [mp_filter.decomp]
# algorithm = "smp"
# iterations = 30
# explained_energy = 0.99
# [mp_filter.filter]
# mode = "keep"                     # "keep" or "remove"
# freq_min = 0.5                    # "osc. EEG" preset defaults
# freq_max = 49
# scale_min = 0.1

# --- Preprocessing step order and constraints ---
# [preprocessing]
# step_order = [...]                # omit = canonical default order
# hard_constraints = []
# soft_constraints = [["filtering", "ica"], ["bad_channels", "ica"]]
# allow_reorder = false             # permit GUI/config step reordering
# <<< END GENERATED >>>

Appendix D. FAQ

D.1 General

How do I check my Python version?

python3 --version

See Appendix A.1 for the supported range.

I get ModuleNotFoundError

Make sure the virtual environment is activated. You should see (venv) or (.venv) at the start of your terminal prompt.

How do I deactivate the virtual environment?

deactivate

D.2 File formats

IDE4EEG does not recognise my file

Check that:

  1. The file extension is .raw, .vhdr, .fif, .set, .edf, or .bdf.
  2. Required companion files are present (e.g. .raw needs .xml; .tag is optional — without it, resting-state analysis only).
  3. The input_path in your config points to the correct file or directory.
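
The checklist above can be scripted when you need to triage many recordings at once. The following is a minimal sketch, not IDE4EEG code: diagnose_input is a hypothetical helper, and it assumes that companion files share the recording's base name and sit in the same folder.

```python
from pathlib import Path

# Extensions listed in the checklist above.
SUPPORTED_EXTENSIONS = {".raw", ".vhdr", ".fif", ".set", ".edf", ".bdf"}


def diagnose_input(path: str) -> list[str]:
    """Hypothetical helper: return human-readable problems for an input file."""
    p = Path(path)
    ext = p.suffix.lower()
    problems = []
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if not p.exists():
        problems.append("file not found (check input_path)")
    elif ext == ".raw":
        # Assumption: companions are <base>.xml and <base>.tag next to the .raw file.
        if not p.with_suffix(".xml").exists():
            problems.append("missing companion .xml")
        if not p.with_suffix(".tag").exists():
            problems.append("no .tag file: resting-state analysis only")
    return problems
```

An empty list means none of the three checks failed; it does not guarantee that the reader will accept the file contents.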

How do I add a new format?

Add a reader function in ide4eeg/input/input.py that returns (mne_signal, events_desc_id) where mne_signal is an mne.io.RawArray with EEG channels and a STIM channel, and events_desc_id is a dict {'event_name': integer_id, ...}.

Then add the file extension to file_extension in find_file_paths and a dispatch case in read_file. Adding a call to _refine_channel_types(raw) at the end gives the new format the same channel-type heuristic used for BrainTech and EDF.

D.3 Preprocessing

How do I define custom filters?

Filters are specified by cutoff frequency, not by name (the old P3ACE filters_dict no longer exists). In the GUI use the Filters panel on the Preprocessing tab; in TOML set highpass_freq, lowpass_freq, and notch_freq (Hz; 0 disables each) and optionally method = "iir" (default) or "fir" inside [filters]. Transition bandwidths and filter order are derived automatically by design_iir_filter in ide4eeg/preprocessing/channels_and_signal.py (see §3.8 and Appendix D.5 for the design details). There is no per-filter Python dict to edit.

How do I select which events to include?

In the GUI, load a file and check the desired events in the Events panel (Preprocessing tab, Segmentation setup section). For TOML configs:

[epochs.tags]
selected = ["event_1", "event_2", "event_3"]

Where are the clean signals saved?

The cleaned epochs are always written to preprocessing/saved_signals/ as -epo.fif files — this is the primary pipeline output. Intermediate signals from individual preprocessing steps are written to preprocessing/saved_steps/ only when their per-step Save checkbox is checked.

I checked a step’s Save checkbox but I don’t see any output. Why?

Save checkboxes default to off. When checked, the step’s intermediate output (transformed signal, detection plots, ICA components, etc.) is written under preprocessing/ after the next pipeline run. If you ran the pipeline before checking the box, run it again to produce the saved output.

D.4 Analysis

All epochs were rejected — what should I do?

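Your rejection criteria may be too strict. The drop policy lives in [segment_rejection] (see Appendix E.3). Try raising artifact_threshold (lower = stricter), raising artifact_min_run_ms or setting it to 0 to switch the run-length rule off, or setting auto_reject = false to run in mark-only mode and inspect what is being marked.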

How do I customise the hidden default parameters?

Add them directly to your config.toml. Top-level parameters (like resample_freq or trim_start) go at the root level. Section-specific parameters go inside their section (e.g. auto_reject inside [segment_rejection]). See Appendix E for the full list, or Appendix C for a complete config template.

Cluster test plots show no clusters, but the log mentions some

Detailed cluster information is saved in a .txt file in the MNE/ folder.

D.5 Resampling & filtering

How does filtering work?

IDE4EEG supports two filtering methods, selected with method inside [filters]: IIR (method = "iir", the default) and FIR (method = "fir").

What frequencies should I use?

Filter | Typical value | Purpose
Highpass | 0.1–1.0 Hz | Remove electrode drift and DC offset
Lowpass | 30–100 Hz | Remove high-frequency noise (EMG, electrical)
Notch | 50 Hz (Europe/Asia) or 60 Hz (Americas) | Remove power line interference

Set any cutoff to 0 to disable that filter.

Do I need to resample?

No. Filters are designed automatically for the signal’s actual sampling rate. Resampling is optional — set the resample frequency to 0 to keep the original rate. Resampling may still be useful to reduce data size (e.g. 2048 Hz → 512 Hz) before analysis.

Comparison with previous versions

Previous versions of IDE4EEG (P3ACE) required resampling to 512 Hz before filtering because filters were defined as named presets (e.g. highpass_05hz) with hardcoded scipy.signal.iirdesign parameters. The current version specifies filters by cutoff frequency in Hz and designs them automatically for any sampling rate. Key technical changes:

D.6 Video artifacts

For a complete description of the pipeline, backends, gaze methods, and all parameters, see §3.14 Gaze.

What are exclude faces?

When recording EEG from children, the mother (or another adult) is often visible in the video. When the child turns away from the screen, only the mother’s face may be detected and falsely matched as the subject, producing incorrect “looking” reports.

The Exclude faces directory lets you specify photos of people to reject. Any detected face that is closer to an exclude reference than to the subject reference is automatically rejected. Leave the field empty to disable exclusion.
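
The rejection rule described above ("closer to an exclude reference than to the subject reference") can be illustrated with toy embedding vectors. This is only a sketch under stated assumptions: it uses Euclidean distance and the nearest reference in each set, whereas the metric the face backend actually uses is not specified here, and is_excluded is a hypothetical name.

```python
import math


def is_excluded(face, subject_refs, exclude_refs):
    """True when the face is nearer to an exclude reference than to any subject reference."""
    if not exclude_refs:
        # An empty exclude set disables exclusion entirely.
        return False
    d_subject = min(math.dist(face, ref) for ref in subject_refs)
    d_exclude = min(math.dist(face, ref) for ref in exclude_refs)
    return d_exclude < d_subject


# Toy 2-D "embeddings" (assumption: real embeddings have many more dimensions).
subject = [(0.0, 0.0), (0.1, 0.0)]
mother = [(1.0, 1.0)]

print(is_excluded((0.9, 0.9), subject, mother))  # face near the exclude reference
print(is_excluded((0.05, 0.0), subject, mother))  # face near the subject reference
```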

How can I speed up video artifact detection?

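Two acceleration options: facetag_frame_skip processes only every Nth frame (e.g. 3 = every 3rd frame), and facetag_downscale shrinks each frame before detection (e.g. 0.5 = half resolution).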

Both options can be combined. For example, facetag_frame_skip = 3 with facetag_downscale = 0.5 gives roughly 6× overall speedup.

How do I create reference photos?

Reference photos are images of the subject looking directly at the camera. You need 3–5 such images.

  1. Select from video (recommended): in the Preprocessing tab, click Reference faces on the Gaze panel. The tool opens the video, lets you scrub through frames and click on the subject’s face to save it. Use </> to step one frame, <</>> to jump to the next frame with a detected face. Saved faces appear as cropped thumbnails below the video; click a thumbnail to remove it.
  2. Manual: place 3–5 JPEG/PNG images in a folder. Each image must contain exactly one face (the subject) looking at the camera.

By default, picked photos are saved into <output>/IDE4EEG_OUT_<base>/preprocessing/reference_faces/ so they live alongside the rest of the per-recording results (and exclude_faces/ for the exclude set). Use Change folder to override the location.

How do I review detections?

Click the 👁 (eye) button on the Gaze panel header. This opens the Review Video Artifacts window (auto-loading any previous results), showing the video and EEG side-by-side with the detected intervals and Confirm / Dismiss / Save Changes actions.

Which signal is shown? The review window shows the most-recent upstream snapshot — i.e. the signal after any preceding preprocessing steps that have their Save checkbox enabled (filtering, resampling, bad-channel rejection, montage, ICA, etc.). When you’ve already run the pipeline once with [filters].save = true, the gaze review canvas shows the filtered signal — same data the gaze detector saw. When no upstream snapshot exists yet, the review window falls back to the raw input file. Freshness is checked the same way as the eye button — a stale [cfg:<hash>] is regenerated.

EEG signal plot colours.

Controls.

Double-click any interval row to jump to its start time.

D.7 GUI tips


Appendix E. Advanced / hidden parameters

The following parameters have sensible defaults set internally in check_and_prepare_config (input.py). They are not included in the default config.toml but can be added to override the defaults. Parameters that belong to a specific section ([choosing_channels], [epochs], etc.) must be placed in that section; top-level parameters go at the root.

E.1 Signal trimming

Top-level parameters.

Parameter | Type | Default | Description
resample_freq | int | 0 | Target sampling frequency (Hz). 0 = keep original rate.
cover_time | float or None | None | Deprecated, ignored. Use trim_start / trim_end.
threshold_time | float or None | None | Deprecated, ignored. Use trim_start / trim_end.

When both trim_start and trim_end are unset, the signal passes through uncropped.

E.2 Channel quality (advanced)

All five detectors and their shared pre-conditioning live under the [choosing_channels.prep] sub-block. Every key has a literature-backed default (PREP-published where the criterion comes from PREP; clean_flatlines-derived for the flatline-duration detector); only edit what you have a recording-specific reason to change.

Pre-conditioning (shared by all five detectors)

Parameter Default Description
hp_freq_hz 1.0 High-pass cutoff applied to the working copy before any statistic is computed. DC drift dominates raw amplitude and destroys correlation.
notch_hz 50.0 Line-noise notch frequency in Hz; null (or 0) = no notch. 50 Hz (Europe / most of Asia / Africa / Australia); 60 Hz (Americas / parts of Asia / Japan). Inherited from filters.notch_freq at config-load when that is set. The GUI dropdown only offers off / 50 / 60; CLI users hand-editing the TOML to a non-50/60 frequency (e.g. 47 or 100 Hz) get a WARNING log line on the next GUI populate naming the snap back to 50.
notch_harmonics true When ON, the notch removes the fundamental AND every integer harmonic up to Nyquist (50/100/150/… or 60/120/180/…).
max_iterations 1 Refinement passes after the first detection (0–4). Each pass drops the channels found so far and re-runs the three cross-channel detectors on the cleaner reference, surfacing borderline channels the obvious outliers masked. Stops early on convergence. 0 = single pass; 1 = recommended default; 4 = PREP’s cap. (Old iterate_once = true/false maps to 1/0.)
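
To see which frequencies notch_hz and notch_harmonics imply for a given sampling rate, the arithmetic can be sketched as follows. This is an illustration only: notch_frequencies is a hypothetical helper, and it assumes that only harmonics strictly below Nyquist are targeted.

```python
def notch_frequencies(notch_hz, sfreq, harmonics=True):
    """List the frequencies a notch at notch_hz would target below Nyquist."""
    if not notch_hz:
        # None (TOML null) or 0 means no notch.
        return []
    nyquist = sfreq / 2.0
    freqs = []
    k = 1
    while k * notch_hz < nyquist:
        freqs.append(float(k * notch_hz))
        if not harmonics:
            break  # fundamental only
        k += 1
    return freqs


print(notch_frequencies(50.0, 512))  # [50.0, 100.0, 150.0, 200.0, 250.0]
print(notch_frequencies(60.0, 250))  # [60.0, 120.0]
```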

Detector 1 — Absolute amplitude checks

A bundled if-then conditional: if the median SD across channels is plausible, then flag each channel whose SD is below the dead-amp floor. See §3.5 prose for the design rationale. The GUI surfaces this as one panel with a master checkbox; the TOML keeps enable_flat and enable_recording_amp_check as two independent keys for back-compat (the GUI ties them together via a single bool — saving normalises them to the same value).

Parameter Default Description
enable_recording_amp_check true If-clause master. Both this AND enable_flat must be true for the bundled GUI checkbox to render ON.
enable_flat true Then-clause master. Bundled with enable_recording_amp_check in the GUI.
recording_amp_check_min_uV 0.5 Lower bound of plausible median channel SD (µV). Below this, real scalp EEG is implausibly small.
recording_amp_check_max_uV 200.0 Upper bound. Above this is implausibly large — likely a wrong-units anomaly. Set both bounds wide (e.g. 0 and 1e9) to disable the gate and run the per-channel SD check unconditionally.
flat_sd_threshold_uV 1e-3 Per-channel SD floor in microvolts (= 10⁻⁹ V, PREP MATLAB’s literal findNoisyChannels.m:289 value). Anything with robust SD below this is treated as flat provided the median-SD sanity gate passes. See docs/EEG_bad_channels.pdf for the threshold rationale.
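
The if-then logic of Detector 1 can be sketched in a few lines. The sketch takes already-computed per-channel SD values in µV as input and treats the plausibility bounds as inclusive; both choices, and the name flag_flat_channels, are assumptions for illustration rather than a description of the IDE4EEG source.

```python
from statistics import median


def flag_flat_channels(sd_uv, min_uv=0.5, max_uv=200.0, flat_floor_uv=1e-3):
    """sd_uv maps channel name -> SD in microvolts.

    Returns (gate_passed, flat_channels).
    """
    med = median(sd_uv.values())
    gate_passed = min_uv <= med <= max_uv
    if not gate_passed:
        # Median SD is implausible: likely a calibration anomaly, so the
        # SD-floor check is skipped for this recording.
        return False, []
    flat = [ch for ch, sd in sd_uv.items() if sd < flat_floor_uv]
    return True, flat


print(flag_flat_channels({"Fz": 12.0, "Cz": 15.0, "Pz": 0.0}))
print(flag_flat_channels({"Fz": 1.2e-5, "Cz": 1.5e-5, "Pz": 0.0}))
```

The second call models a recording stored in the wrong units: the gate fails, so no channel is flagged as flat even though every SD is below the floor.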

Detector 2 — Flatline duration (eps-relative)

Mirrors EEGLAB clean_flatlines.m. Calibration-insensitive by construction; complements Detector 1.

Parameter Default Description
enable_flatline true Enable the duration-based flat detector. Recommended ON.
flatline_max_jitter 20 How many multiples of machine precision count as “no change” between adjacent samples. machine_precision = np.finfo(dtype).eps; on float64 the effective threshold is ~4.4×10⁻¹⁵ V at the default 20 — numerical-noise level, true only for bitwise-identical samples (silent ADC, railed channel, post-rereference collision). EEGLAB clean_flatlines.m default; almost never needs changing.
flatline_max_duration_s 5.0 Maximum contiguous flat-run duration tolerated, in seconds. EEGLAB clean_flatlines.m default. Shorten (e.g. 1 s) on brief event-locked recordings; lengthen (e.g. 10 s) on long resting-state recordings where a brief flat stretch is not yet pathological.
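
The two flatline parameters combine as follows: a sample-to-sample change below flatline_max_jitter × eps counts as "no change", and a channel is flagged when the longest such run exceeds flatline_max_duration_s. The sketch below follows that description using plain Python floats (float64); the function names are hypothetical and the strict comparisons are an assumption.

```python
import sys


def longest_flat_run_s(samples, sfreq, max_jitter=20):
    """Longest run, in seconds, of adjacent-sample changes below max_jitter * eps."""
    threshold = max_jitter * sys.float_info.epsilon  # about 4.4e-15 for float64
    longest = current = 0
    for prev, cur in zip(samples, samples[1:]):
        if abs(cur - prev) < threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest / sfreq


def is_flatline(samples, sfreq, max_jitter=20, max_duration_s=5.0):
    """Flag the channel when its longest flat run exceeds the tolerated duration."""
    return longest_flat_run_s(samples, sfreq, max_jitter) > max_duration_s


# 10 Hz toy signal (a list): 6 s of identical samples followed by a ramp.
stuck = [0.0] * 61 + [1e-6 * i for i in range(1, 20)]
print(longest_flat_run_s(stuck, sfreq=10.0), is_flatline(stuck, sfreq=10.0))
```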

Detector 3 — Amplitude outlier (PREP: “Robust deviation”)

Parameter Default Description
enable_deviation true Enable the deviation detector. Recommended ON.
deviation_z_threshold 5.0 Robust abs(z) threshold across channels for the per-channel robust SD. Default. Lower = more sensitive (more false positives).
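
A robust z-score replaces the mean and SD with outlier-resistant estimates, so one extreme channel cannot hide itself by inflating the scale. The sketch below uses the median and 1.4826 × MAD; this manual does not state which robust scale estimator IDE4EEG uses, so treat the estimator and the function names as assumptions.

```python
from statistics import median


def robust_z(values):
    """Robust z-scores: (x - median) / (1.4826 * MAD)."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad
    if scale == 0:
        return [0.0 for _ in values]
    return [(v - med) / scale for v in values]


def flag_deviation(sd_by_channel, z_threshold=5.0):
    """Flag channels whose per-channel SD is a robust outlier across channels."""
    names = list(sd_by_channel)
    scores = robust_z(sd_by_channel[name] for name in names)
    return [name for name, z in zip(names, scores) if abs(z) > z_threshold]


sd = {"Fz": 10.0, "Cz": 11.0, "Pz": 9.0, "Oz": 10.5, "T7": 80.0}
print(flag_deviation(sd))
```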

Detector 4 — Windowed correlation (max over other channels)

Parameter Default Description
enable_correlation false Enable the windowed-correlation detector. Off by default — on standard 19-ch 10-20 montages the peripheral ring naturally carries r ≈ 0.7-0.9, so it over-flags; enable it (and tune correlation_threshold) for high-density / research montages.
correlation_threshold 0.3 Minimum abs(r) with any other channel that counts as “OK” for a given window. Lowered from PREP/pyprep’s 0.4 (2026-06-13) after a 12-recording sweep — fewer false positives on standard 19-ch 10-20 montages, whose peripheral ring (T7/T8/F7/F8/O1/O2/P7/P8) naturally carries r ≈ 0.7-0.9. HAPPILEE (Lopez 2022) benchmarked 0.7 (low-electrode pediatric) — too strict here. Raise toward 0.7 on low-density / pediatric / high-impedance recordings; lower toward 0.2 if it over-flags.
correlation_window_seconds 1.0 Window length for the windowed correlation pass.
bad_window_fraction_threshold 0.05 Maximum fraction of windows that may fall below the correlation threshold before a channel is flagged. Default 5 % (raised from 0.01 so a few transiently decorrelated windows don’t condemn a channel).
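
The interplay of correlation_threshold and bad_window_fraction_threshold is easiest to see on precomputed values. The sketch assumes you already have, for each channel and window, the maximum abs(r) with any other channel; flag_low_correlation is a hypothetical name and the strict comparisons are an assumption.

```python
def flag_low_correlation(max_abs_r, threshold=0.3, bad_fraction=0.05):
    """max_abs_r maps channel -> list of per-window max |r| with any other channel."""
    flagged = []
    for channel, windows in max_abs_r.items():
        if not windows:
            continue
        # A window is "bad" when even the best-correlated neighbour is below threshold.
        n_bad = sum(1 for r in windows if r < threshold)
        if n_bad / len(windows) > bad_fraction:
            flagged.append(channel)
    return flagged


per_window = {
    "Fz": [0.8] * 40,               # always well correlated
    "T7": [0.8] * 36 + [0.1] * 4,   # 10 % bad windows
    "O1": [0.8] * 39 + [0.1],       # 2.5 % bad windows, tolerated
}
print(flag_low_correlation(per_window))
```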

Detector 5 — High-frequency noise ratio

Parameter Default Description
enable_hf_noise true Enable the HF-noise detector. Recommended ON.
hf_noise_z_threshold 5.0 Robust z threshold across channels of the HF/broadband SD ratio. Only positive deviations are flagged (high HF is bad; low HF is normal).
hf_lower_hz 50.0 Lower cutoff of the “high frequency” band for the HF-noise ratio. Lower this only if your recording is bandlimited (e.g. 30 Hz lowpass already applied).

Top-level (sibling block [bad_channels]):

Parameter Default Description
bad_channels.save_plots false Currently dormant — the PREP-aligned detector does not write diagnostic plots; the toggle is reserved for a planned follow-up. Use the Run-log per-channel diagnostic table (one line per channel per criterion with [FLAGGED] markers) in the meantime. GUI label: “Save detection plots”.

Default TOML block

[choosing_channels.prep]
# Pre-conditioning applied to an internal copy before detection.
hp_freq_hz = 1.0
notch_hz = 50.0           # 50 Hz Europe / 60 Hz Americas; null = off
notch_harmonics = true
max_iterations = 1        # refinement passes (0-4); 0 = single pass

# Recording-level amplitude sanity check (median σᴹ envelope).
# If the median is outside [min, max] µV, the SD-floor "flat" check
# is auto-disabled for the run (calibration anomaly detected).
enable_recording_amp_check = true
recording_amp_check_min_uV = 0.5
recording_amp_check_max_uV = 200.0

# Detector 1 — flat (SD floor, µV-absolute, calibration-gated).
enable_flat = true
flat_sd_threshold_uV = 1e-3   # = 10⁻⁹ V; PREP MATLAB literal value

# Detector 2 — flatline duration (eps-relative, calibration-insensitive).
enable_flatline = true
flatline_max_jitter = 20      # × machine precision (np.finfo.eps)
flatline_max_duration_s = 5.0

# Detector 3 — amplitude outlier (robust z of per-channel σᴹ).
enable_deviation = true
deviation_z_threshold = 5.0

# Detector 4 — windowed cross-channel correlation.
enable_correlation = false
correlation_threshold = 0.3
correlation_window_seconds = 1.0
bad_window_fraction_threshold = 0.05

# Detector 5 — HF / broadband σᴹ ratio (one-sided positive z).
enable_hf_noise = true
hf_noise_z_threshold = 5.0
hf_lower_hz = 50.0

TOML back-compat keys

badchs_params, correlation_window, correlation_badch_threshold and a top-level enable_correlation boolean are accepted in old TOML configs but no longer drive detection. They round-trip through save/load unchanged. The top-level enable_correlation boolean is migrated into prep.enable_correlation so a saved “correlation off” preference survives the load; check_and_prepare_config logs a one-shot informational message when any of the four legacy keys is seen.

E.3 Epoch rejection

[Mark detected artifacts] owns no detector but does two things (§3.16): it conditions the upstream marks into the final spans every consumer reads (always), and optionally drops segments. The [segment_rejection] block carries the span-conditioning knobs, the per-mode “Drop marked” toggle, and the drop policy (the two knobs that turn the conditioned marks into a drop):

Parameter Default Description
auto_reject true Apply automatic segment rejection (the event-mode “Drop marked” checkbox). Checked → drop the marked segments; unchecked → mark-only (cut/save/stats run, nothing drops).
artifact_threshold 0.3 Per-channel contaminated-sample fraction (0–1) that drops a segment (lower = stricter).
artifact_min_run_ms 200.0 G1 run-length trigger (ms), OR-ed with the fraction rule. 0 = off.

The per-segment artifact rule that decides which segments are dropped (fraction artifact_threshold or longest-run artifact_min_run_ms, both in [segment_rejection], §3.16) is where you tune drop sensitivity — moved here from [artifacts] (2026-06-13); an old config carrying them under [artifacts] is read for back-compat. The retired [segment_rejection] enabled / auto_percentile / reject_threshold / flat_threshold keys are no longer read (a stale one in an old config is harmless) — the absolute amplitude ceiling moved upstream to the p2p detector (p2p_ceiling_uv), flat spans to the flat detector, and the fixed-fraction tail-drop was removed. The legacy [epochs] reject_dict / flat_dict keys (V-valued dicts) are likewise no longer applied — the epoch cut uses reject=None (issue #11).
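
The drop policy (fraction rule OR-ed with the run-length rule) can be sketched on boolean contamination masks. This is an illustration under stated assumptions: it ignores mark conditioning and the consensus quorum (equivalent to consensus_frac = 0, i.e. any channel), it uses "greater than or equal" for both rules, and should_drop is a hypothetical name.

```python
def longest_run(mask):
    """Length of the longest run of True values."""
    best = current = 0
    for flag in mask:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def should_drop(channel_masks, sfreq, artifact_threshold=0.3, artifact_min_run_ms=200.0):
    """channel_masks: one boolean list per channel, True = contaminated sample."""
    for mask in channel_masks:
        if not mask:
            continue
        fraction = sum(mask) / len(mask)
        run_ms = 1000.0 * longest_run(mask) / sfreq
        by_fraction = artifact_threshold > 0 and fraction >= artifact_threshold
        by_run = artifact_min_run_ms > 0 and run_ms >= artifact_min_run_ms  # 0 = off
        if by_fraction or by_run:
            return True
    return False


# One-second segments at 100 Hz.
contiguous = [True] * 25 + [False] * 75        # 25 % contaminated, one 250 ms run
scattered = [i % 10 == 0 for i in range(100)]  # 10 % contaminated, 10 ms runs
print(should_drop([contiguous], 100.0), should_drop([scattered], 100.0))
```

The contiguous case is dropped by the run-length rule alone, which is the situation the G1 trigger exists for: a long burst that stays under the fraction gate.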

E.4 ICA / EOG (advanced)

Inside [ICA_EOG]. See §3.9 ICA for the full table; the parameters below are the ones not normally set in a standard config.

Parameter Default Description
decim 1 Subsampling factor for the fit. Only safe when the fit copy is band-limited (fit_highpass_hz and/or fit_lowpass_hz); without a low-pass leave at 1.
random_state 42 RNG seed.
iclabel_min_prob 0.0 Minimum classifier confidence for the top-1 label. Below this threshold the component falls into other.

E.5 Parallelism (advanced)

For the basic field on the Config tab and what n_jobs accelerates, see §1.3.

How to choose a value

The GUI shows you everything you need to decide, so you don’t have to guess. After typing a number in the Parallel jobs field, three lines appear below it:

  1. Primary label: "→ sequential (no parallelism)" when the field is 1, or "→ N parallel workers" when larger.
  2. System info line — detected hardware, e.g. 8 physical / 16 logical cores, 16 GB RAM, each worker ≈ 250 MB baseline + working memory. Always shown. Physical core count is probed via psutil if installed, otherwise via sysctl (macOS), /proc/cpuinfo (Linux), or ctypes (Windows), with logical-cores-divided-by-two as a conservative fallback.
  3. Warning line (yellow, only when your value is risky). Multiple warnings can stack:

Practical rule of thumb: start with 2 and raise it one step at a time while watching the warning line and your system monitor. On a typical 8-physical-core / 16 GB machine, 2 to 4 is the sweet spot for most pipelines. Expect a 3–5× speedup on long connectivity-bootstrap runs or MNE catalog batches at the top of that range; smaller jobs are dominated by joblib spawn overhead and will not speed up as much.

Why BLAS is pinned to one thread per worker

IDE4EEG installs a global joblib.parallel_config(backend="loky", inner_max_num_threads=1) once per pipeline run. The inner_max_num_threads=1 setting forces NumPy’s BLAS backend (OpenBLAS / MKL / Accelerate) to use a single thread inside each worker. Without this pin, N joblib workers on a machine where BLAS defaults to cpu_count threads would create N × cpu_count OS threads, exhausting CPUs and triggering OS-level scheduler pathologies. The pin guarantees that the total CPU load equals n_jobs — exactly what the warning line shows. It also makes bit-exact reproducibility of floating-point sums possible regardless of n_jobs, because BLAS summation order is deterministic with a single thread.

Concretely: setting n_jobs = 4 on a machine where BLAS defaults to 16 threads runs four compute threads (one per worker), not 4 × 16 = 64.

Packaged desktop builds clamp catalog analyses to one worker

When IDE4EEG runs as a frozen standalone build (the Windows / macOS installers and portable bundles), the MNE catalog analyses are forced to n_jobs = 1 no matter what the Parallel jobs field says. The reason is structural: joblib’s loky backend spawns workers by re-launching the Python interpreter, but a frozen build has no interpreter — it re-launches its own application executable. macOS tolerates this via multiprocessing.freeze_support(), but on Windows the worker-argument handoff fails (not enough values to unpack in _freeze_support), so the app crashes or hangs. The ERP cluster-permutation test was the first analysis to surface it in the field (it is the catalog analysis that most aggressively parallelises). The clamp lives in ide4eeg/utils/parallel.py (mne_n_jobs, which mne_catalog re-exports as _n_jobs for back-compat) and only triggers when sys.frozen is set, so it is invisible to a pip / conda install, which keeps full parallelism. This is the same mitigation the ICA step has carried for the same reason. A complementary global clamp in main._configure_parallelism sets the joblib default to 1 on frozen builds, so every bare joblib.Parallel() that inherits the global config — including connectivity bootstraps / short-time and dipole-fitting — is also single-process on frozen builds. MP decomposition is unaffected (empi is a standalone C++ subprocess and was never at risk).
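
The frozen-build check itself is a one-liner: bundlers such as PyInstaller set sys.frozen on the packaged interpreter, and it is absent in a normal pip / conda install. The sketch below shows the pattern only; effective_n_jobs is a hypothetical name and is not the mne_n_jobs function mentioned above.

```python
import sys


def effective_n_jobs(requested: int) -> int:
    """Hypothetical helper: clamp to one worker inside a frozen build."""
    if getattr(sys, "frozen", False):
        # Frozen bundle: spawning loky workers would re-launch the app
        # executable, so stay single-process.
        return 1
    return requested


print(effective_n_jobs(4))  # 4 in a regular Python install, 1 in a frozen build
```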

MP decomposition inherits this value

The empi Matching Pursuit binary has its own --cpu-workers switch (cpu_workers under [matching_pursuit] in TOML; the “use [ ] / N cores” field in the Preprocess tab). By default it inherits from parallelism.n_jobs: the Preprocess field shows the inherited value as grey italic placeholder text and refreshes live whenever you edit the Config tab field.

You only need to set matching_pursuit.cpu_workers explicitly if empi’s memory model lets you go higher than joblib allows: empi workers share memory inside a single C++ process (~10–50 MB each), while joblib loky workers are full Python interpreters (~250 MB each). On a RAM-constrained machine the two can reasonably diverge; the explicit override exists for exactly that case. Otherwise leave it blank.
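
The inheritance rule can be summarised as "explicit value wins, otherwise fall back to parallelism.n_jobs". A minimal sketch on a parsed config dict follows; resolve_cpu_workers is a hypothetical name, and treating a missing key or an empty string as "blank" is an assumption.

```python
def resolve_cpu_workers(cfg: dict) -> int:
    """Explicit matching_pursuit.cpu_workers wins; otherwise inherit parallelism.n_jobs."""
    explicit = cfg.get("matching_pursuit", {}).get("cpu_workers")
    if explicit not in (None, ""):
        return int(explicit)
    return int(cfg.get("parallelism", {}).get("n_jobs", 0))


inherited = resolve_cpu_workers({"parallelism": {"n_jobs": 4}})
overridden = resolve_cpu_workers(
    {"parallelism": {"n_jobs": 2}, "matching_pursuit": {"cpu_workers": 8}}
)
print(inherited, overridden)
```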

E.6 Hash-based caching internals

IDE4EEG’s hash-based caching (§3.20.4) uses a Merkle-style chain of per-step hashes: each step’s hash covers its own config subset plus the hash of every preceding step that influenced its input. The per-step chain is seeded with the empty string — it does not hash the input file itself (a snapshot’s own filename anchors it to its source). The separate MP-book hash (_compute_mp_book_hash) does fold in the source file’s identity. When a saved snapshot’s stamped [cfg:<hash>] matches the current config’s hash, the snapshot is reused; otherwise it’s regenerated.
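
The chaining idea can be illustrated with a short sketch: each step's hash is computed from the previous step's hash plus that step's own config subset, starting from the empty string. The digest (SHA-256), the JSON serialisation, the 12-character truncation, and the simplification that every earlier step influences every later one are all assumptions for illustration; the function names are hypothetical.

```python
import hashlib
import json


def step_hash(prev_hash: str, step_config: dict) -> str:
    """Hash one step: previous hash + canonical JSON of this step's config subset."""
    payload = prev_hash + json.dumps(step_config, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def chain_hashes(steps):
    """steps: ordered list of (name, config_subset). Returns {name: hash}."""
    hashes = {}
    prev = ""  # the chain is seeded with the empty string
    for name, cfg in steps:
        prev = step_hash(prev, cfg)
        hashes[name] = prev
    return hashes


base = [("filters", {"highpass_freq": 1.0}), ("ica", {"method": "picard"})]
edited_ica = [("filters", {"highpass_freq": 1.0}), ("ica", {"method": "fastica"})]
edited_filters = [("filters", {"highpass_freq": 0.5}), ("ica", {"method": "picard"})]

h_base = chain_hashes(base)
h_ica = chain_hashes(edited_ica)
h_filters = chain_hashes(edited_filters)
print(h_base["filters"] == h_ica["filters"], h_base["ica"] == h_ica["ica"])
```

Editing only the ICA settings leaves the filter snapshot's hash intact (it can be reused), while editing the filter settings invalidates both, because the ICA hash folds in the upstream hash.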

What’s included in / excluded from each step’s hash.

Three invariants keep the chain stable:

  1. Hash never includes mutated config. Side-effecting steps (bad-channel detection appending to choosing_channels.bad_channels, ICA populating ICA_EOG.bad_sources) freeze their hash before the mutation. Otherwise the second run’s hash wouldn’t match the first run’s stamp.
  2. No deepcopy of bound methods. Manual-review hooks (_review_bad_channels, _review_ica_components, _review_bad_epochs) are passed as closures, not bound methods. Deepcopying a bound method would try to deepcopy self (the QMainWindow), which fails on Qt’s C++ state.
  3. Pin BEFORE truncation. The GUI eye-click path force-pins the hash against the user-level pre-truncation cfg in _run_truncated_pipeline; the runner’s preamble (preprocessing.py) and _launch_pipeline use force=False so an upstream pin survives. This guarantees that a re-entry click on the same eye sees the same hash the truncated run stamped.

These invariants are tested by tests/test_view_step_result.py and tests/test_preamble_phase1.py.
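The second invariant follows from how Python's copy.deepcopy treats callables: a bound method is rebuilt around a deep copy of its self, whereas a plain function (including a closure) is treated as atomic and returned as-is. The sketch below shows this with a stand-in class; FakeWindow, make_hook, and the idea that the hook sits inside a dict being deep-copied are assumptions for illustration, not IDE4EEG code.

```python
import copy


class FakeWindow:
    """Stand-in for the QMainWindow; refuses to be deep-copied."""

    def __deepcopy__(self, memo):
        raise TypeError("cannot deepcopy Qt C++ state")

    def review_bad_channels(self, candidates):
        return list(candidates)


def make_hook(window):
    """Wrap the method in a closure; deepcopy leaves functions untouched."""
    def hook(candidates):
        return window.review_bad_channels(candidates)
    return hook


window = FakeWindow()

# Bound method: deepcopy recurses into `self` and fails.
try:
    copy.deepcopy({"hook": window.review_bad_channels})
    bound_method_ok = True
except TypeError:
    bound_method_ok = False

# Closure: deepcopy returns the very same function object.
cfg = {"hook": make_hook(window)}
cfg_copy = copy.deepcopy(cfg)
closure_ok = cfg_copy["hook"] is cfg["hook"]

print(bound_method_ok, closure_ok)
```

The closure still reaches the window through its captured variable, so the hook behaves identically while the config stays safe to copy.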


Appendix F. config.toml vs Export Script

There are two ways to capture and re-run a configuration outside the GUI:

  1. config.toml — declarative TOML, the same format the GUI uses internally. Save with Save Config…, run with ide4eeg --run myconfig.toml.
  2. Exported Python script — a self-contained .py file calling ide4eeg.api.run_file() with the current parameters. Generate with Export Script… (Config tab or File → Export Script…), run with python3 generated_script.py.

Both share the same code path under the hood — same preprocessing pipeline, same analysis modules, matching results (up to small floating-point differences under parallelism).
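For the config.toml route, batch processing means invoking the command-line entry point once per file. A minimal Python driver might look like the sketch below; it only relies on the ide4eeg --run <file> form shown above, and the helper name, the dry_run switch, and the example file names are all hypothetical.

```python
import subprocess


def run_configs(config_paths, dry_run=True):
    """Build (and optionally execute) one `ide4eeg --run <cfg>` call per file."""
    commands = [["ide4eeg", "--run", str(path)] for path in config_paths]
    if not dry_run:
        for argv in commands:
            # check=True stops the batch at the first failing run.
            subprocess.run(argv, check=True)
    return commands


# Dry run: just show what would be executed.
for argv in run_configs(["sub-01.toml", "sub-02.toml"]):
    print(" ".join(argv))
```

Leaving dry_run at its default prints the commands without requiring IDE4EEG to be installed, which is handy for checking a batch before submitting it.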

F.1 Side-by-side

| Aspect | config.toml | Exported script |
|---|---|---|
| Format | declarative TOML | imperative Python |
| Round-trips with the GUI | yes (Load Config…) | one-way export only |
| Verbosity | compact (only the keys you set) | one keyword arg per non-default value |
| Loops over files / subjects | needs an outer shell script | inline Python for loop |
| Conditional logic / parameter sweeps | needs templating | native Python branching |
| Custom callbacks (manual-review hooks) | not expressible | yes (run_file(..., bad_channels_hook=fn)) |
| Reads as documentation | requires manual lookup of TOML keys | reads as a worked example of ide4eeg.api |
| Diffability across runs | excellent (line-by-line text diff) | OK (kwargs block diffs cleanly) |
| External invocation | ide4eeg --run cfg.toml | python3 script.py |
| IDE4EEG location | implicit (whatever’s on PYTHONPATH) | explicit sys.path.insert(0, "<repo>") line at the top |
| Disabled features | every key persists in the file | feature-gated sub-configs are pruned (e.g. [ICA_EOG] is omitted when prepare_ica = false) |
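The "loops" and "custom callbacks" rows are the ones that most often tip the choice towards a script. The sketch below shows the shape of such a loop. Since the real signature of ide4eeg.api.run_file is not documented in this section, the example takes the function as a parameter and exercises it with a stand-in; the positional path argument, the hook's call signature, the return value, and the file names are all assumptions.

```python
def fake_run_file(path, **params):
    """Stand-in for ide4eeg.api.run_file so the sketch runs without IDE4EEG."""
    hook = params.get("bad_channels_hook")
    bads = hook(["Fp1", "T7"]) if hook else []
    return {"path": path, "bad_channels": bads}


def keep_frontal_only(candidates):
    """Hypothetical manual-review hook: accept only Fp* candidates as bad."""
    return [ch for ch in candidates if ch.startswith("Fp")]


def run_batch(run_file, recordings, **params):
    """Run the same parameters over several recordings in a plain for loop."""
    results = []
    for path in recordings:
        results.append(run_file(path, **params))
    return results


results = run_batch(
    fake_run_file,
    ["sub-01-raw.fif", "sub-02-raw.fif"],
    bad_channels_hook=keep_frontal_only,
)
for item in results:
    print(item["path"], item["bad_channels"])
```

In an actual exported script the stand-in would be replaced by the run_file imported from ide4eeg.api, with keyword arguments taken from the generated kwargs block.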

F.2 When to prefer config.toml

F.3 When to prefer the exported script

F.4 What both leave on disk

When a pipeline runs (via either route), the resolved config — including any defaults applied internally and any in-flight modifications (e.g. per-analysis narrowing) — is saved as config.toml in the output directory. Even if you launched with the exported script, the output folder still gets a TOML you can later Load Config… back into the GUI.