{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting Started with CML Readers" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import json\n", "import pandas as pd\n", "import cmlreaders as cml" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding Files on RHINO\n", "\n", "The PathFinder helper class can be used to locate files on RHINO. Its sole responsibility is to locate the requested file and return its path. In many cases, a file could be stored in more than one location. In these situations, PathFinder searches the list of possible locations in order and returns the first path where the file is found. This assumes that the list of locations is ordered by priority, so that the preferred location comes before any fall-back location. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# If not working on RHINO, specify the mount point.\n", "# Alternatively, set the CML_ROOT environment variable and never\n", "# have to explicitly pass the rootdir keyword argument.\n", "rhino_root = \"/mnt/rhino/\"\n", "\n", "# Instantiate the finder object\n", "finder = cml.PathFinder(subject=\"R1389J\", experiment=\"catFR5\", session=1, \n", " localization=0, montage=0, rootdir=rhino_root)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What can you request?\n", "\n", "The PathFinder has a few built-in properties to help you understand what data types are currently supported. Different file types require that the finder be instantiated with different fields. For example, if you are planning to request localization files, there is no need to specify an experiment, session, or montage. However, it is not a problem to specify too many fields: any field that the requested data type does not need is simply ignored. 
The following properties are defined:\n", "\n", "- requestable_files: All supported data types\n", "- localization_files: Files related to localization\n", "- montage_files: Files associated with a specific montage\n", "- session_files: Files that are specific to a session. These could be processed events, Ramulator files, etc.\n", "\n", "For high-level information about each of these data types, see the [Data Guide](https://pennmem.github.io/cmlreaders/html/data_guide.html) section of the documentation." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['r1_index',\n", " 'ltp_index',\n", " 'pyfr_index',\n", " 'pyfr_root',\n", " 'localization',\n", " 'voxel_coordinates',\n", " 'prior_stim_results',\n", " 'electrode_coordinates',\n", " 'jacksheet',\n", " 'area',\n", " 'electrode_categories',\n", " 'good_leads',\n", " 'leads',\n", " 'classifier_excluded_leads',\n", " 'matlab_bipolar_talstruct',\n", " 'matlab_monopolar_talstruct',\n", " 'pairs',\n", " 'contacts',\n", " 'session_summary',\n", " 'classifier_summary',\n", " 'math_summary',\n", " 'target_selection_table',\n", " 'baseline_classifier',\n", " 'all_events',\n", " 'task_events',\n", " 'math_events',\n", " 'ps4_events',\n", " 'sources',\n", " 'processed_eeg',\n", " 'experiment_log',\n", " 'session_log',\n", " 'ramulator_session_folder',\n", " 'event_log',\n", " 'experiment_config',\n", " 'raw_eeg',\n", " 'odin_config',\n", " 'used_classifier',\n", " 'excluded_pairs',\n", " 'all_pairs']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "finder.requestable_files" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('localization',)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "finder.localization_files" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": 
[ "('pairs',\n", " 'contacts',\n", " 'voxel_coordinates',\n", " 'prior_stim_results',\n", " 'electrode_coordinates',\n", " 'jacksheet',\n", " 'good_leads',\n", " 'leads',\n", " 'area',\n", " 'classifier_excluded_leads',\n", " 'electrode_categories',\n", " 'target_selection_file',\n", " 'baseline_classifier')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "finder.montage_files" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('session_summary',\n", " 'classifier_summary',\n", " 'math_summary',\n", " 'used_classifier',\n", " 'excluded_pairs',\n", " 'all_pairs',\n", " 'experiment_log',\n", " 'session_log',\n", " 'event_log',\n", " 'experiment_config',\n", " 'raw_eeg',\n", " 'odin_config',\n", " 'all_events',\n", " 'task_events',\n", " 'math_events',\n", " 'ps4_events')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "finder.session_files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Finding File Paths" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/mnt/rhino/protocols/r1/subjects/R1389J/localizations/0/montages/0/neuroradiology/current_processed/pairs.json\n", "/mnt/rhino/protocols/r1/subjects/R1389J/experiments/catFR5/sessions/1/behavioral/current_processed/task_events.json\n", "/mnt/rhino/data10/RAM/subjects/R1389J/tal/VOX_coords_mother.txt\n" ] } ], "source": [ "# Find some example files\n", "example_data_types = ['pairs', 'task_events', 'voxel_coordinates']\n", "for data_type in example_data_types:\n", " print(finder.find(data_type=data_type))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Identifying Available Sessions\n", "\n", "CMLReaders contains a utility function for loading the json-formatted index files located in the protocols/ directory on RHINO as a dataframe. 
Once loaded, the standard pandas selection idioms can be used to answer questions such as:\n", "\n", "1. Which subjects completed FR1?\n", "2. Which experiments did subject R1111M complete?\n", "3. How many sessions of PAL1 have been collected?\n", "\n", "For many analyses, this will be the first step in determining the sample of subjects to be used." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "from cmlreaders import get_data_index" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | Recognition | all_events | contacts | experiment | import_type | localization | math_events | montage | original_experiment | original_session | pairs | ps4_events | session | subject | subject_alias | system_version | task_events |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | protocols/r1/subjects/R1001P/experiments/FR1/s... | protocols/r1/subjects/R1001P/localizations/0/m... | FR1 | build | 0 | protocols/r1/subjects/R1001P/experiments/FR1/s... | 0 | NaN | 0 | protocols/r1/subjects/R1001P/localizations/0/m... | NaN | 0 | R1001P | R1001P | NaN | protocols/r1/subjects/R1001P/experiments/FR1/s... |
| 1 | NaN | protocols/r1/subjects/R1001P/experiments/FR1/s... | protocols/r1/subjects/R1001P/localizations/0/m... | FR1 | build | 0 | protocols/r1/subjects/R1001P/experiments/FR1/s... | 0 | NaN | 1 | protocols/r1/subjects/R1001P/localizations/0/m... | NaN | 1 | R1001P | R1001P | NaN | protocols/r1/subjects/R1001P/experiments/FR1/s... |
| 2 | NaN | protocols/r1/subjects/R1001P/experiments/FR2/s... | protocols/r1/subjects/R1001P/localizations/0/m... | FR2 | build | 0 | protocols/r1/subjects/R1001P/experiments/FR2/s... | 0 | NaN | 0 | protocols/r1/subjects/R1001P/localizations/0/m... | NaN | 0 | R1001P | R1001P | NaN | protocols/r1/subjects/R1001P/experiments/FR2/s... |
| 3 | NaN | protocols/r1/subjects/R1001P/experiments/FR2/s... | protocols/r1/subjects/R1001P/localizations/0/m... | FR2 | build | 0 | protocols/r1/subjects/R1001P/experiments/FR2/s... | 0 | NaN | 1 | protocols/r1/subjects/R1001P/localizations/0/m... | NaN | 1 | R1001P | R1001P | NaN | protocols/r1/subjects/R1001P/experiments/FR2/s... |
| 4 | NaN | protocols/r1/subjects/R1001P/experiments/PAL1/... | protocols/r1/subjects/R1001P/localizations/0/m... | PAL1 | build | 0 | protocols/r1/subjects/R1001P/experiments/PAL1/... | 0 | NaN | 0 | protocols/r1/subjects/R1001P/localizations/0/m... | NaN | 0 | R1001P | R1001P | NaN | protocols/r1/subjects/R1001P/experiments/PAL1/... |
| | eegoffset | category | category_num | eegfile | exp_version | experiment | intrusion | is_stim | item_name | item_num | ... | recog_rt | recognized | rectime | rejected | serialpos | session | stim_list | stim_params | subject | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -1 | X | -999 | | | catFR5 | -999 | False | X | -999 | ... | -999 | -999 | -999 | -999 | -999 | 1 | False | [] | R1389J | STIM_ARTIFACT_DETECTION_START |
| 1 | 5831 | X | -999 | R1389J_catFR5_1_28Feb18_1552.h5 | | catFR5 | -999 | False | | -1 | ... | -999 | -999 | -999 | -999 | -999 | 1 | False | [{'amplitude': 500.0, 'anode_label': 'STG6', '... | R1389J | STIM_ON |
| 2 | 7790 | X | -999 | R1389J_catFR5_1_28Feb18_1552.h5 | | catFR5 | -999 | False | | -1 | ... | -999 | -999 | -999 | -999 | -999 | 1 | False | [{'amplitude': 500.0, 'anode_label': 'STG6', '... | R1389J | STIM_ON |
| 3 | 9786 | X | -999 | R1389J_catFR5_1_28Feb18_1552.h5 | | catFR5 | -999 | False | | -1 | ... | -999 | -999 | -999 | -999 | -999 | 1 | False | [{'amplitude': 500.0, 'anode_label': 'STG6', '... | R1389J | STIM_ON |
| 4 | 11782 | X | -999 | R1389J_catFR5_1_28Feb18_1552.h5 | | catFR5 | -999 | False | | -1 | ... | -999 | -999 | -999 | -999 | -999 | 1 | False | [{'amplitude': 500.0, 'anode_label': 'STG6', '... | R1389J | STIM_ON |
5 rows × 28 columns
\n", "