API
CML Reader
-
class
cmlreaders.
CMLReader
(subject: str, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = None, montage: Union[int, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Generic reader for all CML-specific files
Notes
At import, all the readers from cmlreaders.readers register the data types they handle by updating the reader_names dictionary. reader_names is a dict whose keys are the data types understood by cmlreaders.PathFinder and defined in cmlreaders.constants. Its values are the names of the reader classes that should be used for loading each data type. When cmlreaders.cmlreader.CMLReader is instantiated, a new dictionary is created that maps the data types to the actual reader classes, rather than just the class names. In essence, cmlreaders.cmlreader.CMLReader is a factory that routes each request to load a particular data type to the reader defined to handle that data.
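The two-step registration described in the notes above can be sketched as follows. This is a minimal illustration, not the library's implementation: the toy reader classes, the data_types attribute, the Factory class, and the data type names are all assumptions made for the example.

```python
# Hypothetical stand-ins for reader classes; the real ones live in
# cmlreaders.readers and are not reproduced here.
class EventReader:
    data_types = ["events", "task_events"]

    def load(self):
        return "events loaded"


class TextReader:
    data_types = ["jacksheet"]

    def load(self):
        return "text loaded"


# Step 1: at import, each reader registers data type -> class *name*.
reader_names = {}
for cls in (EventReader, TextReader):
    for data_type in cls.data_types:
        reader_names[data_type] = cls.__name__


class Factory:
    """Toy counterpart of CMLReader that routes load requests."""

    def __init__(self):
        available = {cls.__name__: cls for cls in (EventReader, TextReader)}
        # Step 2: on instantiation, map data type -> actual class.
        self.readers = {dt: available[name] for dt, name in reader_names.items()}

    def load(self, data_type):
        # Route the request to the reader registered for this data type.
        return self.readers[data_type]().load()
```

Calling Factory().load("jacksheet") in this sketch dispatches to the toy TextReader, which is the routing behaviour the notes describe.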
static
get_data_index
(protocol: str = 'all', rootdir: Union[str, NoneType] = None) → pandas.core.frame.DataFrame[source]¶ Shortcut for the global
get_data_index() function, so that only CMLReader needs to be imported.
-
get_reader
(data_type)[source]¶ Return an instance of the reader class for the given data type.
Notes
Reader instances get cached via functools.lru_cache().
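As a rough illustration of what caching with functools.lru_cache() implies, the sketch below uses a hypothetical Reader and Factory (neither is the library's class). Repeated requests for the same data type return the same object.

```python
import functools


class Reader:
    """Hypothetical reader; it only records which data type it handles."""

    def __init__(self, data_type):
        self.data_type = data_type


class Factory:
    @functools.lru_cache()
    def get_reader(self, data_type):
        # The cache key is (self, data_type), so each factory instance
        # ends up with one cached reader per data type.
        return Reader(data_type)


factory = Factory()
first = factory.get_reader("events")
second = factory.get_reader("events")  # served from the cache
```

One consequence of caching on a method is that the cache holds a reference to the instance, so cached readers live as long as the cache does.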
-
load
(data_type: str, **kwargs)[source]¶ Load requested data into memory.
Parameters: data_type – Type of data to load (see readers for available options)
Notes
The accepted keyword arguments depend on the type of data being loaded. See load_eeg() for details.
-
load_eeg
(events: Union[pandas.core.frame.DataFrame, NoneType] = None, rel_start: int = None, rel_stop: int = None, scheme: Union[pandas.core.frame.DataFrame, NoneType] = None)[source]¶ Load EEG data.
Keyword Arguments: - events – Events to load EEG epochs from. Incompatible with passing epochs.
- rel_start – Start time in ms relative to passed event onsets. This parameter is required when passing events and not used otherwise.
- rel_stop – Stop time in ms relative to passed event onsets. This parameter is required when passing events and not used otherwise.
- scheme – When specified, a bipolar scheme to rereference the data with and/or filter by channel. Rereferencing is only possible if the data were recorded in monopolar (a.k.a. common reference) mode.
Return type: EEGContainer
Raises: - RereferencingNotPossibleError – When passing scheme and the data do not support rereferencing.
- IncompatibleParametersError – When both events and epochs are specified, or events are used without passing rel_start and/or rel_stop.
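The parameter rules for load_eeg can be summarized in a small check. This is a sketch under stated assumptions: check_eeg_kwargs is a hypothetical helper, and the exception class is a local stand-in that only shares its name with the one documented above.

```python
class IncompatibleParametersError(Exception):
    """Local stand-in for the exception named in the documentation."""


def check_eeg_kwargs(events=None, epochs=None, rel_start=None, rel_stop=None):
    """Hypothetical helper mirroring the documented parameter rules."""
    if events is not None and epochs is not None:
        raise IncompatibleParametersError(
            "events and epochs cannot both be specified")
    if events is not None and (rel_start is None or rel_stop is None):
        raise IncompatibleParametersError(
            "rel_start and rel_stop are required when passing events")
    return True
```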
-
classmethod
load_events
(subjects: Union[str, typing.List[str], NoneType] = None, experiments: Union[str, typing.List[str], NoneType] = None, rootdir: Union[str, NoneType] = None) → pandas.core.frame.DataFrame[source]¶ Load events from multiple sessions.
Parameters: - subjects – Subject or list of subjects.
- experiments – Experiment or list of experiments to include.
- rootdir – Path to root data directory.
-
localization
Determine the localization number.
-
montage
Determine the montage number.
-
path_finder
Return a path finder using the proper kwargs.
-
static
Unit conversions
-
cmlreaders.convert.
events_to_epochs
(events: pandas.core.frame.DataFrame, rel_start: int, rel_stop: int, sample_rate: Union[int, float], basenames: Union[typing.List[str], NoneType] = None) → List[Tuple[[int, int], int]][source]¶ Convert events to epochs.
Parameters: - events – Events to read.
- rel_start – Start time relative to events in ms.
- rel_stop – Stop time relative to events in ms.
- sample_rate – Sample rate in Hz.
- basenames – EEG file basenames.
Returns: A list of tuples giving absolute start and stop times in number of samples.
Return type: epochs
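The arithmetic behind this conversion can be sketched as below. It assumes each event carries a sample offset into the recording (an eegoffset-style value) and that millisecond windows are rounded to whole samples. The helper names are hypothetical, and the sketch leaves out the basenames handling and the extra tuple element shown in the signature.

```python
def ms_to_samples(ms, sample_rate):
    """Convert a duration in ms to a whole number of samples."""
    return int(round(ms * sample_rate / 1000))


def events_to_epochs_sketch(eegoffsets, rel_start, rel_stop, sample_rate):
    """Return absolute (start, stop) sample pairs around each event offset."""
    start = ms_to_samples(rel_start, sample_rate)
    stop = ms_to_samples(rel_stop, sample_rate)
    return [(offset + start, offset + stop) for offset in eegoffsets]


# Two events at samples 1000 and 5000, recorded at 500 Hz,
# with a window from -100 ms to +100 ms around each event.
epochs = events_to_epochs_sketch([1000, 5000], rel_start=-100,
                                 rel_stop=100, sample_rate=500)
# epochs == [(950, 1050), (4950, 5050)]
```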
-
cmlreaders.convert.
milliseconds_to_events
(onsets: List[Union[int, float]], sample_rate: Union[int, float]) → pandas.core.frame.DataFrame[source]¶ Take times and produce a minimal events
pd.DataFrame to load EEG data with.
Parameters: - onsets – Onset times in ms.
- sample_rate – Sample rate in samples per second.
Returns: A pd.DataFrame with eegoffset as the only column.
Return type: events
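A minimal sketch of the onset conversion, assuming eegoffset is the onset expressed in samples and rounded to the nearest whole sample. The helper name is hypothetical, and plain dict records are used in place of a pd.DataFrame so the example has no dependencies.

```python
def onsets_to_records(onsets, sample_rate):
    """Convert onset times in ms to records holding only an eegoffset."""
    return [{"eegoffset": int(round(onset * sample_rate / 1000))}
            for onset in onsets]


# Onsets at 0 ms, 250 ms and 1000 ms in a 500 Hz recording.
records = onsets_to_records([0, 250, 1000], sample_rate=500)
# records == [{"eegoffset": 0}, {"eegoffset": 125}, {"eegoffset": 500}]
```

A list of records like this can be passed to the pd.DataFrame constructor to obtain a single-column frame.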
Custom Readers
-
class
cmlreaders.readers.readers.
BaseCSVReader
(data_type: str, subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, file_path: Union[str, NoneType] = None, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Base class for reading CSV files.
-
class
cmlreaders.readers.readers.
BaseJSONReader
(data_type: str, subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, file_path: Union[str, NoneType] = None, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Generic reader class for loading simple JSON files.
Returns a pd.DataFrame.
-
class
cmlreaders.readers.readers.
ClassifierContainerReader
(data_type, subject, experiment, session, localization, file_path=None, rootdir='/', **kwargs)[source]¶ Reader class for loading a serialized classifier
Notes
By default, a classiflib.container.ClassifierContainer class is returned.
-
class
cmlreaders.readers.readers.
EventReader
(data_type: str, subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, file_path: Union[str, NoneType] = None, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Reader for all experiment events.
Returns a pd.DataFrame.
-
class
cmlreaders.readers.readers.
MNICoordinatesReader
(data_type: str, subject: str, **kwargs)[source]¶
-
class
cmlreaders.readers.readers.
RAMCSVReader
(data_type, subject, localization, experiment=None, file_path=None, rootdir='/', **kwargs)[source]¶ CSV reader type for RAM data.
-
class
cmlreaders.readers.readers.
RamulatorEventLogReader
(data_type, subject, experiment, session, file_path=None, rootdir='/', **kwargs)[source]¶ Reader for Ramulator event log
-
class
cmlreaders.readers.readers.
TextReader
(data_type: str, subject: str, **kwargs)[source]¶ Generic reader class for reading RAM text files
-
class
cmlreaders.readers.eeg.
BaseEEGReader
(filename: str, dtype: Type[numpy.dtype], epochs: List[Tuple[int, Union[int, NoneType]]], scheme: Union[pandas.core.frame.DataFrame, NoneType])[source]¶ Base class for actually reading EEG data. Subclasses will be used by
EEGReader to actually read the format-specific EEG data.
Parameters: - filename – Base name for EEG file(s) including absolute path
- dtype – numpy dtype to use for reading data
- epochs – Epochs to include. Epochs are defined with start and stop sample counts.
- scheme – Scheme data to use for rereferencing/channel filtering. This should be loaded/manipulated from pairs.json data.
Notes
The read() method must be implemented by subclasses to return a tuple containing a 3-D array with dimensions (epochs x channels x time) and a list of contact numbers.
-
include_contact
(contact_num: int)[source]¶ Filter to determine if we need to include a contact number when reading data.
-
rereference
(data: numpy.ndarray, contacts: List[int]) → Tuple[numpy.ndarray, List[str]][source]¶ Rereference and/or select a subset of raw channels.
Parameters: - data – Input timeseries data shaped as (epochs, channels, time).
- contacts – List of contact numbers (1-based) that index the data.
Returns: - reref – Rereferenced timeseries.
- labels – List of channel labels used (included in case some don’t get used).
Notes
This method is meant to be used when loading data and so returns a raw Numpy array. If used externally, an EEGContainer will need to be constructed manually.
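To make the idea of bipolar rereferencing concrete, the sketch below subtracts the second contact of each pair from the first. It is an illustration only: the function name and the use of nested lists instead of Numpy arrays are assumptions, and it does not reproduce the method's label handling.

```python
def bipolar_rereference(data, contacts, pairs):
    """Subtract contact_2 from contact_1 for each pair.

    data is nested lists shaped (epochs, channels, time); contacts gives the
    1-based contact number of each channel; pairs lists (contact_1, contact_2).
    """
    index = {contact: i for i, contact in enumerate(contacts)}
    reref = []
    for epoch in data:
        channels = []
        for c1, c2 in pairs:
            first, second = epoch[index[c1]], epoch[index[c2]]
            channels.append([x - y for x, y in zip(first, second)])
        reref.append(channels)
    return reref


# One epoch, three monopolar channels, three samples each.
data = [[[1, 2, 3], [1, 1, 1], [0, 5, 0]]]
reref = bipolar_rereference(data, contacts=[1, 2, 3], pairs=[(1, 2), (2, 3)])
# reref == [[[0, 1, 2], [1, -4, 1]]]
```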
-
scheme_type
¶ Returns “contacts” when the input scheme is in the form of monopolar contacts and “pairs” when bipolar.
Returns: The scheme type, or None if no scheme was specified.
Raises: KeyError – When the passed scheme doesn’t include any of the following keys: contact_1, contact_2, contact
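The decision described for scheme_type can be sketched as a function over the scheme's column names. This is an assumption-laden illustration: scheme_type_sketch is hypothetical and takes a plain collection of column names rather than a pd.DataFrame.

```python
def scheme_type_sketch(columns):
    """Classify a scheme by its column names (None means no scheme)."""
    if columns is None:
        return None
    if "contact_1" in columns and "contact_2" in columns:
        return "pairs"      # bipolar scheme
    if "contact" in columns:
        return "contacts"   # monopolar scheme
    raise KeyError("scheme must include contact_1/contact_2 or contact")
```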
-
class
cmlreaders.readers.eeg.
EDFReader
(filename: str, dtype: Type[numpy.dtype], epochs: List[Tuple[int, Union[int, NoneType]]], scheme: Union[pandas.core.frame.DataFrame, NoneType])[source]¶
-
class
cmlreaders.readers.eeg.
EEGMetaReader
(data_type: str, subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, file_path: Union[str, NoneType] = None, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Reads the
sources.json or params.txt files, which describe metadata about the EEG data.
EEGMetaReader uses the following logic to combine entries in sources.json:
- If all recordings in sources.json have the same value for a field, then the dictionary returned by EEGMetaReader has that value for the field
- Otherwise, that field should be populated by a list of the values present in sources.json
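The combination rule above can be sketched as follows. The function name, the recording keys, and the field names are hypothetical, and the input is a plain dict standing in for the parsed contents of sources.json.

```python
def combine_sources(sources):
    """Merge per-recording metadata dicts following the rule above."""
    fields = {}
    for info in sources.values():
        for key, value in info.items():
            fields.setdefault(key, []).append(value)
    # Collapse a field to a single value only when every recording agrees.
    return {
        key: values[0] if all(v == values[0] for v in values) else values
        for key, values in fields.items()
    }


sources = {
    "rec_a": {"sample_rate": 1000, "n_samples": 5000},
    "rec_b": {"sample_rate": 1000, "n_samples": 7200},
}
meta = combine_sources(sources)
# meta == {"sample_rate": 1000, "n_samples": [5000, 7200]}
```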
-
class
cmlreaders.readers.eeg.
EEGReader
(data_type: str, subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, file_path: Union[str, NoneType] = None, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶ Reads EEG data.
Returns an EEGContainer.
Examples
All examples start by defining a reader:
>>> from cmlreaders import CMLReader
>>> reader = CMLReader("R1111M", experiment="FR1", session=0)
Loading a subset of EEG based on brain region (this automatically re-references):
>>> pairs = reader.load("pairs")
>>> filtered = pairs[pairs["avg.region"] == "middletemporal"]
>>> eeg = reader.load_eeg(scheme=filtered)
Loading EEG from -100 ms to +100 ms relative to a set of events:
>>> events = reader.load("events")
>>> eeg = reader.load_eeg(events, rel_start=-100, rel_stop=100)
Loading an entire session:
>>> eeg = reader.load_eeg()
Loading multiple sessions from the same subject:
>>> events = CMLReader.load_events(["R1111M"], ["FR1"])
>>> words = events[events["type"] == "WORD"]
>>> reader = CMLReader("R1111M")
>>> eeg = reader.load_eeg(events=words, rel_start=-100, rel_stop=100)
-
as_recarray
()[source]¶ Return data as a numpy recarray. By default, this calls
as_dataframe() and converts to a recarray with pd.DataFrame.to_records().
-
as_timeseries
(events: pandas.core.frame.DataFrame, rel_start: Union[float, int], rel_stop: Union[float, int]) → cmlreaders.eeg_container.EEGContainer[source]¶ Read the timeseries.
Parameters: - events – Events to read EEG data from
- rel_start – Relative start times in ms
- rel_stop – Relative stop times in ms
Returns: A time series with shape (channels, epochs, time). By default, this returns data as it was physically recorded (e.g., if recorded with a common reference, each channel will be a contact’s reading referenced to the common reference, a.k.a. “monopolar channels”).
Raises: RereferencingNotPossibleError – When rereferencing is not possible.
-
load
(**kwargs)[source]¶ Overrides the generic load method so as to accept keyword arguments to pass along to as_timeseries().
-
-
class
cmlreaders.readers.eeg.
NumpyEEGReader
(filename: str, dtype: Type[numpy.dtype], epochs: List[Tuple[int, Union[int, NoneType]]], scheme: Union[pandas.core.frame.DataFrame, NoneType])[source]¶ Read EEG data stored in Numpy’s .npy format.
Notes
This reader is currently only used to do some testing so lacks some features such as being able to determine what contact numbers it’s actually using. Instead, it will just give contacts as a sequential list of ints.
-
class
cmlreaders.readers.eeg.
RamulatorHDF5Reader
(filename: str, dtype: Type[numpy.dtype], epochs: List[Tuple[int, Union[int, NoneType]]], scheme: Union[pandas.core.frame.DataFrame, NoneType])[source]¶ Reads Ramulator HDF5 EEG files.
PathFinder
The cmlreaders.PathFinder class can be used to identify the location of
various file types on RHINO. In an ideal world, all historic data would be
processed to have consistent file names, locations, and types. However, because
this has not been the case and individuals analyzing the data have come to
expect and deal with these inconsistencies, the safer approach is to leave the
data in its original form and attempt to abstract away these underlying
inconsistencies for future users.
-
class
cmlreaders.
PathFinder
(subject: Union[str, NoneType] = None, experiment: Union[str, NoneType] = None, session: Union[int, NoneType] = None, localization: Union[int, NoneType] = 0, montage: Union[int, NoneType] = 0, eeg_basename: Union[str, NoneType] = None, rootdir: Union[str, NoneType] = None)[source]¶
-
find
(data_type)[source]¶ Given a specific file type, find the corresponding file on RHINO and return the full path
Parameters: file_type – The type of file to load. The given name should match one of the keys from rhino_paths
Returns: path – The path of the file found based on the request
Return type: str
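One way such a lookup could work is sketched below. Everything here is an assumption for illustration: the template strings, the idea that a data type may have several candidate locations tried in order, and the standalone find function are not taken from the library, whose actual templates are defined in cmlreaders.constants.

```python
import os
import tempfile

# Hypothetical path templates keyed by data type.
rhino_paths = {
    "jacksheet": [
        "{subject}/docs/jacksheet.txt",
        "{subject}/tal/jacksheet.txt",
    ],
}


def find(data_type, rootdir, **fields):
    """Return the first candidate path for data_type that exists on disk."""
    for template in rhino_paths[data_type]:
        path = os.path.join(rootdir, template.format(**fields))
        if os.path.exists(path):
            return path
    raise FileNotFoundError("no file found for data type " + repr(data_type))


# Demonstrate with a throwaway directory in which only the second
# candidate location exists.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "R1111M", "tal", "jacksheet.txt")
    os.makedirs(os.path.dirname(target))
    open(target, "w").close()
    found = find("jacksheet", root, subject="R1111M")
    matches = found == target
```

Trying several templates in order is one way to hide inconsistent historical file locations behind a single request name.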
-
localization_files
All localization related files
-
montage_files
All files that vary by montage number
-
requestable_files
All files that can be requested with PathFinder.find()
-
session_files
All files that vary by session
-
subject_files
All files that vary only by subject
-
Path and File Constants
cmlreaders.PathFinder internally uses the cmlreaders.constants
module. The usefulness of cmlreaders.PathFinder relies on these
constants being well maintained.