Differences
This shows you the differences between two versions of the page.
| workflow_dataset_to_rkns [2025/10/20 12:45] – fabricio | workflow_dataset_to_rkns [2025/10/20 14:03] (current) – [AI-Assisted Mapping Generation] fabricio | ||
|---|---|---|---|
| Line 37: | Line 37: | ||
| </ | </ | ||
| - | ==== Conversion Workflow ==== | + | ===== Conversion Workflow |
| The '' | The '' | ||
| - | ====== 1.1 Load and Standardize Signals | + | ==== 1.1 Load and Standardize Signals ==== |
| <code python> | <code python> | ||
| rkns_obj = rkns.from_external_format( | rkns_obj = rkns.from_external_format( | ||
| Line 51: | Line 51: | ||
| Uses regex mappings in '' | Uses regex mappings in '' | ||
| - | ===== 1.2 Add Annotations | + | ==== 1.2 Add Annotations ==== |
| Converts the TSV file to BIDS-compatible format and adds events: | Converts the TSV file to BIDS-compatible format and adds events: | ||
| <code python> | <code python> | ||
| Line 61: | Line 61: | ||
| </ | </ | ||
| - | ===== 1.3 Extract and Categorize Metadata | + | ==== 1.3 Extract and Categorize Metadata ==== |
| - Matches participant ID (e.g., " | - Matches participant ID (e.g., " | ||
| - Looks up the participant in '' | - Looks up the participant in '' | ||
| Line 67: | Line 67: | ||
| - Groups metadata by category and adds each to the RKNS object. | - Groups metadata by category and adds each to the RKNS object. | ||
| - | ===== 1.4 Finalize and Export | + | ==== 1.4 Finalize and Export ==== |
| <code python> | <code python> | ||
| rkns_obj.populate() | rkns_obj.populate() | ||
| Line 73: | Line 73: | ||
| </ | </ | ||
| - | ===== 1.5 Validate | + | ==== 1.5 Validate ==== |
| <code python> | <code python> | ||
| validate_rkns_checksum(output_file) | validate_rkns_checksum(output_file) | ||
| Line 110: | Line 110: | ||
| ---- | ---- | ||
| - | ===== 3. Standardizing Names via Regex Mappings ===== | + | ====== 3. Standardizing Names via Regex Mappings |
| Once you have a compliant TSV, normalize channel and event names non-destructively using regex mappings defined in two JSON files. | Once you have a compliant TSV, normalize channel and event names non-destructively using regex mappings defined in two JSON files. | ||
| - | ====== Channel Names → assets/ | + | ===== Channel Names → assets/ |
| EDF channel labels vary wildly (e.g., "EEG C3-A2", | EDF channel labels vary wildly (e.g., "EEG C3-A2", | ||
| Line 134: | Line 134: | ||
| Use '' | Use '' | ||
| - | | + | 1. Extract unique channel names from your EDF files: |
| <code bash> | <code bash> | ||
| find /data -name " | find /data -name " | ||
| </ | </ | ||
| - | | + | 2. Run the test: |
| <code bash> | <code bash> | ||
| python test_replace_channels.py | python test_replace_channels.py | ||
| </ | </ | ||
| - | | + | 3. Inspect the output: |
| <code bash> | <code bash> | ||
| cat out/ | cat out/ | ||
| Line 151: | Line 151: | ||
| If you have a list of unique channel names, use this prompt with your LLM to accelerate mapping creation: | If you have a list of unique channel names, use this prompt with your LLM to accelerate mapping creation: | ||
| - | > I will provide you a list of EEG channels | + | <code> |
| - | > I require an output in JSON format that maps channel | + | I will provide you a list of physiological signal channels (e.g., |
| - | > | + | I require an output in JSON format that maps each raw channel |
| - | > For example, this is a reference mapping. Note that replacements are applied in sequential order: | + | |
| - | > <code json> | + | |
| - | > { | + | |
| - | > " | + | |
| - | > " | + | |
| - | > " | + | |
| - | > " | + | |
| - | > } | + | |
| - | > </ | + | |
| - | > | + | |
| - | > The keys are regex patterns (case-insensitive), | + | |
| - | > | + | |
| - | > Based on the above example, generate a similar JSON mapping for the following list of channel names: | + | |
| - | > ``` | + | |
| - | > [INSERT YOUR UNIQUE CHANNEL LIST HERE] | + | |
| - | > ``` | + | |
| - | > | + | |
| - | > Provide the output within a JSON code block. | + | |
| + | Key requirements: | ||
| + | 1. **Include early normalization rules** to handle common delimiters (e.g., `_`, spaces, `.`, `/`, parentheses) by converting them to hyphens (`-`), collapsing multiple hyphens, and trimming leading/ | ||
| + | 2. All patterns must be **case-insensitive** (use `(?i)`). | ||
| + | 3. Use **physiologically meaningful, NSRR/ | ||
| + | - `EEG-C3-M2` (not `EEG-C3_A2` or ambiguous forms) | ||
| + | - `EMG-LLEG` / `EMG-RLEG` for leg EMG (not `LAT`/`RAT` as position) | ||
| + | - `RESP-AIRFLOW-THERM` or `RESP-AIRFLOW-PRES` (not generic `RESP-NASAL`) | ||
| + | - `EOG-LOC` / `EOG-ROC` for eye channels | ||
| + | - `EMG-CHIN` for chin EMG | ||
| + | - `PULSE` for heart rate or pulse signals (unless raw ECG → `ECG`) | ||
| + | 4. **Do not include a final catch-all rule** (e.g., `^(.+)$ → MISC-\1`) unless explicitly requested—most channels in the input list should be known and mapped specifically. | ||
| + | 5. Replacements are applied **in order**, with each rule operating on the result of the previous one. | ||
| + | |||
| + | Example reference snippet: | ||
| + | ```json | ||
| + | { | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | } | ||
| + | ``` | ||
| + | |||
| + | Now, generate a similar JSON mapping for the following list of channel names: | ||
| + | ``` | ||
| + | [INSERT YOUR UNIQUE CHANNEL LIST HERE] | ||
| + | ``` | ||
| + | |||
| + | Provide the output **within a JSON code block only**—no explanations. | ||
| + | </ | ||
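The ordered, case-insensitive replacement behaviour that the prompt describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `apply_mappings` and `example_mappings` are hypothetical names, the rules are invented for the example and are not the contents of the project's JSON file, and case-insensitivity is applied here through `re.IGNORECASE` rather than an inline `(?i)` flag. The sketch relies on Python dicts (and `json.load`) preserving key order, so rules run in the order they are written.

```python
import re


def apply_mappings(name: str, mappings: dict[str, str]) -> str:
    """Apply each regex replacement in insertion order to the running result."""
    result = name
    for pattern, replacement in mappings.items():
        # Each rule sees the output of the previous rule, not the raw name.
        result = re.sub(pattern, replacement, result, flags=re.IGNORECASE)
    return result


# Hypothetical rules for illustration only (not the project's real mapping file).
example_mappings = {
    r"[\s_./()]+": "-",            # early normalization: common delimiters -> hyphen
    r"-{2,}": "-",                 # collapse repeated hyphens
    r"^-+|-+$": "",                # trim leading and trailing hyphens
    r"^EEG-C3-A2$": "EEG-C3-M2",   # specific rule, written against the normalized form
    r"^(LOC|E1)(-.*)?$": "EOG-LOC",
}

raw_names = ["EEG C3-A2", "eeg_c3.a2", "LOC"]
standardized = {raw: apply_mappings(raw, example_mappings) for raw in raw_names}

for raw, clean in standardized.items():
    print(f"{raw!r} -> {clean!r}")
```

Because the normalization rules run first, the later, more specific rules only need to match the hyphenated form, which keeps the mapping short. The raw name stays available as the dict key, so the renaming remains non-destructive.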
| ===== Event Descriptions → assets/ | ===== Event Descriptions → assets/ | ||