{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reproducing: Mwata-Velu et al. (2023)\n", "\n", "**Paper**: \"EEG-BCI Features Discrimination between Executed and Imagined Movements Based on FastICA, Hjorth Parameters, and SVM\" \n", "**Journal**: Mathematics 2023, 11, 4409 \n", "**DOI**: https://doi.org/10.3390/math11214409\n", "\n", "## ICA Strategy\n", "\n", "The paper does not specify whether ICA is fitted per-run, per-subject, or globally.\n", "This notebook supports all three strategies via `config.ICA_STRATEGY`.\n", "Change it in `config.py` before running.\n", "\n", "## Key Implementation Decisions\n", "\n", "| Decision | Paper says | We do | Rationale |\n", "|----------|-----------|-------|----------|\n", "| Pipeline order | Figure 1: filter→ICA; Algorithm 1: ICA with internal sub-band eval | ICA then sub-band eval | Energy criterion is meaningless on pre-filtered data |\n", "| ICA algorithm | Gram-Schmidt (Algorithm 1, Step 3) | `algorithm='deflation'` | Deflation uses Gram-Schmidt |\n", "| Energy criterion | ∀χ ∈ {α, β, **γ**} | ∀χ ∈ {**θ**, α, β} | γ never defined; likely typo for θ |\n", "| ICA scope | Not specified | Configurable | Reproducibility variable |\n", "| Classification Method | Methods 1 and 2 | Method 2 only (cross-subject) | Method 1 split is contradictory |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 
Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import mne\n", "import warnings\n", "warnings.filterwarnings('ignore', category=RuntimeWarning) # MNE verbosity\n", "mne.set_log_level('WARNING')\n", "\n", "from config import (\n", " ICA_STRATEGY, DATA_DIR, TARGET_CHANNELS,\n", " TRAIN_SUBJECTS, TEST_SUBJECTS, VAL_SUBJECTS, TARGET_RUNS,\n", ")\n", "from pipeline import (\n", " process_subject_per_run, process_subject_per_subject,\n", " process_subject_global, fit_global_ica,\n", " classify_with_averaging, print_results_summary,\n", ")\n", "\n", "print(f\"Data directory: {DATA_DIR}\")\n", "print(f\"ICA strategy: {ICA_STRATEGY}\")\n", "print(f\"Target channels: {TARGET_CHANNELS}\")\n", "print(f\"Subjects: {len(TRAIN_SUBJECTS)} train, {len(TEST_SUBJECTS)} test, {len(VAL_SUBJECTS)} val\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Process All Subjects (Method 2: Cross-Subject Split)\n", "\n", "Table 4 from the paper:\n", "- Training: Subjects 1-83 (80%)\n", "- Testing: Subjects 84-93 (10%)\n", "- Validation: Subjects 94-103 (10%)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If using global ICA, fit it first on training subjects\n", "global_ica_results = None\n", "if ICA_STRATEGY == 'global':\n", " print(\"Fitting global ICA on all training subjects...\")\n", " global_ica_results = fit_global_ica(TRAIN_SUBJECTS)\n", " print(\"Global ICA fitting complete.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def process_subject_group(subject_list, group_name):\n", " \"\"\"Process a group of subjects and collect features/labels.\"\"\"\n", " all_features = []\n", " all_labels = []\n", " total_kept = 0\n", " n_processed = 0\n", " \n", " print(f\"\\n{'='*60}\")\n", " print(f\"PROCESSING {group_name} ({len(subject_list)} subjects)\")\n", " print(f\"{'='*60}\")\n", " \n", " 
for subject_num in subject_list:\n", "        try:\n", "            if ICA_STRATEGY == 'global':\n", "                feats, labels, runs, n_kept = process_subject_global(\n", "                    subject_num, global_ica_results, verbose=True)\n", "            elif ICA_STRATEGY == 'per_subject':\n", "                feats, labels, runs, n_kept = process_subject_per_subject(\n", "                    subject_num, verbose=True)\n", "            elif ICA_STRATEGY == 'per_run':\n", "                feats, labels, runs, n_kept = process_subject_per_run(\n", "                    subject_num, verbose=True)\n", "            else:\n", "                # Fail fast instead of hitting a NameError on `feats` below\n", "                raise ValueError(f\"Unknown ICA_STRATEGY: {ICA_STRATEGY!r}\")\n", "\n", "            if len(feats) > 0:\n", "                combined_feats = np.vstack(feats)\n", "                combined_labels = np.concatenate(labels)\n", "                all_features.append(combined_feats)\n", "                all_labels.append(combined_labels)\n", "                total_kept += n_kept\n", "                n_processed += 1\n", "\n", "                n_me = np.sum(combined_labels == 'ME')\n", "                n_mi = np.sum(combined_labels == 'MI')\n", "                print(f\"  S{subject_num:03d}: {len(combined_labels)} epochs \"\n", "                      f\"(ME={n_me}, MI={n_mi}), {n_kept} ICA components kept\")\n", "        except Exception as e:\n", "            print(f\"  S{subject_num:03d}: ERROR - {e}\")\n", "            continue\n", "\n", "    if len(all_features) > 0:\n", "        X = np.vstack(all_features)\n", "        y = np.concatenate(all_labels)\n", "    else:\n", "        X = np.array([])\n", "        y = np.array([])\n", "\n", "    print(f\"\\n  {group_name} total: {len(y)} samples \"\n", "          f\"(ME={np.sum(y=='ME')}, MI={np.sum(y=='MI')})\")\n", "    if n_processed > 0:\n", "        print(f\"  Avg ICA components kept: {total_kept / n_processed:.1f}\")\n", "\n", "    return X, y" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_train, y_train = process_subject_group(TRAIN_SUBJECTS, \"TRAINING\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_test, y_test = process_subject_group(TEST_SUBJECTS, \"TESTING\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_val, y_val = process_subject_group(VAL_SUBJECTS, \"VALIDATION\")" ] }, { "cell_type": "code", 
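"execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sanity-check sketch (NOT the paper's code): Hjorth parameters on a\n", "# synthetic signal. The real feature extraction lives in `pipeline`; this\n", "# standalone version only illustrates the three descriptors the title\n", "# refers to (activity, mobility, complexity). The 160 Hz rate and the\n", "# 10 Hz test tone are assumptions for the demo, not values from the paper.\n", "def hjorth_parameters(x):\n", "    \"\"\"Return (activity, mobility, complexity) of a 1-D signal.\"\"\"\n", "    dx = np.diff(x)\n", "    ddx = np.diff(dx)\n", "    activity = np.var(x)                       # signal power\n", "    mobility = np.sqrt(np.var(dx) / activity)  # mean-frequency proxy\n", "    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility  # bandwidth proxy\n", "    return activity, mobility, complexity\n", "\n", "_t = np.arange(0, 1, 1 / 160)  # 1 s at an assumed 160 Hz\n", "_act, _mob, _cmp = hjorth_parameters(np.sin(2 * np.pi * 10 * _t))\n", "print(f\"activity={_act:.3f}, mobility={_mob:.3f}, complexity={_cmp:.3f}\")\n", "# A pure sinusoid should give complexity close to 1." ] }, { "cell_type": "code", 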
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Summary\n", "print(f\"\\n{'='*60}\")\n", "print(f\"DATASET SUMMARY\")\n", "print(f\"{'='*60}\")\n", "print(f\" Training: {X_train.shape} — ME={np.sum(y_train=='ME')}, MI={np.sum(y_train=='MI')}\")\n", "print(f\" Testing: {X_test.shape} — ME={np.sum(y_test=='ME')}, MI={np.sum(y_test=='MI')}\")\n", "print(f\" Validation: {X_val.shape} — ME={np.sum(y_val=='ME')}, MI={np.sum(y_val=='MI')}\")\n", "print(f\"\\n Paper expects: ~6972 train, ~840 test, ~840 val\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Classification with 5-Run Averaging" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results = classify_with_averaging(\n", "    X_train, y_train,\n", "    X_test, y_test,\n", "    X_val, y_val,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print_results_summary(results)" ] } ], "metadata": { "kernelspec": { "display_name": "THESIS", "language": "python", "name": "thesis" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.23" } }, "nbformat": 4, "nbformat_minor": 4 }