workflow_dataset_to_rkns

Differences

This shows you the differences between two versions of the page.

workflow_dataset_to_rkns [2025/10/20 12:49] fabricio
workflow_dataset_to_rkns [2025/10/20 14:03] (current) – [AI-Assisted Mapping Generation] fabricio
Line 110: Line 110:
 ---- ----
  
-===== 3. Standardizing Names via Regex Mappings =====
+====== 3. Standardizing Names via Regex Mappings ======
  
 Once you have a compliant TSV, normalize channel and event names non-destructively using regex mappings defined in two JSON files.
Line 152: Line 152:
  
 <code> <code>
-I will provide you a list of EEG channels extracted from the original EDFs.
-I require an output in JSON format that maps channel names to new standardized names using grep-style regex replacements.
+I will provide you a list of physiological signal channels (e.g., EEG, EOG, EMG, respiratory, cardiac) extracted from original EDF files.
+I require an output in JSON format that maps each raw channel name to a standardized name using **sequential, grep-style regex replacements**.
 
-For example, this is a reference mapping. Note that replacements are applied in sequential order:
-```
+Key requirements:
+1. **Include early normalization rules** to handle common delimiters (e.g., `_`, spaces, `.`, `/`, parentheses) by converting them to hyphens (`-`), collapsing multiple hyphens, and trimming leading/trailing hyphens.
 +2. All patterns must be **case-insensitive** (use `(?i)`). 
 +3. Use **physiologically meaningful, NSRR/AASM-aligned names**, such as: 
 +   - `EEG-C3-M2` (not `EEG-C3_A2` or ambiguous forms) 
 +   - `EMG-LLEG` / `EMG-RLEG` for leg EMG (not `LAT`/`RAT` as position) 
 +   - `RESP-AIRFLOW-THERM` or `RESP-AIRFLOW-PRES` (not generic `RESP-NASAL`) 
 +   - `EOG-LOC` / `EOG-ROC` for eye channels 
 +   - `EMG-CHIN` for chin EMG 
 +   - `PULSE` for heart rate or pulse signals (unless raw ECG → `ECG`) 
 +4. **Do not include a final catch-all rule** (e.g., `^(.+)$ → MISC-\1`) unless explicitly requested; most channels in the input list should be known and mapped specifically.
 +5. Replacements are applied **in order**, with each rule operating on the result of the previous one. 
 + 
 +Example reference snippet:
 +```json
 { {
-  "(?i)^ABDO\\sRES\\s*$": "RESP-ABD",
-  "(?i)^THOR\\sRES\\s*$": "RESP-CHEST",
-  "(?i)^EEG\\(sec\\)\\s*$": "EEG-C3-A2",
-  "_": "-"
+  "(?i)[\\s_\\./\\(\\),]+": "-",
+  "-+": "-",
+  "^-|-$": "",
+  "(?i)^abdomen$": "RESP-ABD",
 +  "(?i)^c3-m2$": "EEG-C3-M2",
 +  "(?i)^lat$": "EMG-LLEG"
 } }
 ``` ```
  
-The keys are regex patterns (case-insensitive), and the values are replacement strings. Groups in the pattern can be referenced in the replacement (e.g., ''\1'').
-
-Based on the above example, generate a similar JSON mapping for the following list of channel names:
+Now, generate a similar JSON mapping for the following list of channel names:
 ``` ```
 [INSERT YOUR UNIQUE CHANNEL LIST HERE] [INSERT YOUR UNIQUE CHANNEL LIST HERE]
 ``` ```
  
-Provide the output within a JSON code block.
+Provide the output **within a JSON code block only**, with no explanations.
 </code> </code>
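
The prompt asks for rules that are applied in order, each operating on the result of the previous one. A minimal Python sketch of that behaviour follows. It is an illustration, not the actual RKNS implementation: the function name `apply_mapping` and the inline `MAPPING_JSON` string are hypothetical. The rules are copied from the reference snippet, except that the C3-M2 pattern is written with a hyphen so that it still matches after the delimiter-normalization rule has turned `_` into `-`.

```python
import json
import re

# Hypothetical inline copy of the reference snippet. In JSON, "\\s" decodes
# to the regex token \s. The C3-M2 key uses a hyphen (our adjustment) because
# the first rule has already replaced underscores by the time it runs.
MAPPING_JSON = r'''
{
  "(?i)[\\s_\\./\\(\\),]+": "-",
  "-+": "-",
  "^-|-$": "",
  "(?i)^abdomen$": "RESP-ABD",
  "(?i)^c3-m2$": "EEG-C3-M2",
  "(?i)^lat$": "EMG-LLEG"
}
'''


def apply_mapping(name, mapping):
    """Apply every (pattern, replacement) pair in order to a single name.

    Each rule sees the output of the previous rule, so rule order matters.
    """
    for pattern, replacement in mapping.items():
        name = re.sub(pattern, replacement, name)
    return name


# json.loads keeps the key order of the file (dicts preserve insertion order).
mapping = json.loads(MAPPING_JSON)

for raw in ["C3_M2", " Abdomen ", "LAT", "C3 (M2)"]:
    print(repr(raw), "->", apply_mapping(raw, mapping))
```

Because the normalization rules run first, every later pattern has to be written against the hyphenated form of the name. A pattern that still contains `_` or a space can never match.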
 ===== Event Descriptions → assets/event_description_mapping.json =====
  • Last modified: 2025/10/20 12:49
  • by fabricio