Extract metadata

Metadata values can be extracted from a variety of sources and assigned to the current workspace.

Description

Metadata values can be extracted from a variety of sources and assigned to the current workspace. These metadata values can subsequently be accessed in the form M{[NAME]}, where [NAME] is the metadata name. Some common file and foldername formats are included as pre-defined metadata extraction methods, while other forms can be constructed using regular expressions.
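
As a rough illustration of the regular-expression route, the Python sketch below pairs each capture group in a pattern with a name taken from a comma-separated list. It is not the module's own implementation: the function name extract_metadata, the example pattern and the group names (Well, Field, Timepoint, Slice, Channel) are assumptions made for this example, and it uses a search rather than a full-string match, which may differ from the module's behaviour.

```python
import re


def extract_metadata(text, pattern, group_names, case_insensitive=False):
    """Map each regex capture group in `pattern` to a name from `group_names`.

    `group_names` is a comma-separated string of metadata names.
    Returns an empty dict if the pattern does not match.
    """
    flags = re.IGNORECASE if case_insensitive else 0
    match = re.search(pattern, text, flags)
    if match is None:
        return {}
    names = [name.strip() for name in group_names.split(",")]
    # zip stops at the shorter sequence, so surplus names or groups are ignored
    return dict(zip(names, match.groups()))


# Hypothetical pattern and names for a CV1000-style filename
pattern = r"W(\d+)F(\d+)T(\d+)Z(\d+)C(\d+)"
groups = "Well,Field,Timepoint,Slice,Channel"
metadata = extract_metadata("W1F001T0001Z00C1.tif", pattern, groups)
print(metadata)
# {'Well': '1', 'Field': '001', 'Timepoint': '0001', 'Slice': '00', 'Channel': '1'}
```

In this sketch every extracted value is kept as a string, so leading zeros are preserved; a value stored under the name Well would then correspond to M{Well}.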

Parameters

Parameter Description
Extractor mode Data source for metadata extraction:
  • "Filename" Metadata taken from filename (not including folder path)
  • "Foldername" Metadata taken from parent foldername (including full system path to that folder)
  • "Metadata file" Metadata taken from a separate file, specified by the "Metadata file" parameter
  • "Series name" Metadata taken from the series name (if not from a multi-series file, this is just the filename)
Filename extractor The format of the filename to be converted to metadata values:
  • "Generic" Name format is compiled using regular expressions defined in the "Pattern" parameter. Each group (pattern enclosed in parentheses) specified in the pattern is assigned to a metadata value. Metadata value names are set by the comma-separated list in "Groups (comma separated)".
  • "CV1000 filename" The Yokogawa CellVoyager CV1000 format (e.g. W1F001T0001Z00C1.tif)
  • "CV7000 filename" The Yokogawa CellVoyager CV7000 format (e.g. AssayPlate_Greiner_#655090_C02_T0001F001L01A01Z01C01.tif)
  • "IncuCyte long filename" The Incucyte long format, where each timepoint is stored as a separate image file and accordingly the filename records the time of acquisition (e.g. MySample1_A1_1_2021y08m06d_11h39m.tif)
  • "IncuCyte short filename" The Incucyte short format, where all timepoints are stored in a single file and the filename only records the well and field (e.g. MySample1_A1_1.tif)
  • "Opera filename" The Perkin Elmer Opera LX name format, which specifies row, column and field (e.g. 001001001.flex)
Foldername extractor The format of the foldername to be converted to metadata values:
  • "Generic" Name format is compiled using regular expressions defined in the "Pattern" parameter. Each group (pattern enclosed in parentheses) specified in the pattern is assigned to a metadata value. Metadata value names are set by the comma-separated list in "Groups (comma separated)".
  • "CV1000 foldername" The Yokogawa CV1000 format (e.g. 20210806T113905_10x_K01_MySample1)
  • "Opera measurement foldername" The Perkin Elmer Opera LX foldername format (e.g. Meas_01(2021-08-06_11-39-05))
Metadata file extractor The format of the metadata file to be converted to metadata values:
  • "CSV file" Metadata values stored in a two-column CSV file, where the first column defines an identifier to the current file being processed (e.g. filename, or series name) and the second column defines a value that will be assigned as metadata. Optionally, this value can be split into multiple metadata values using a regular expression.
  • "Opera file (.flex)" Specifically extracts the "Area name" property from the current .flex file.
Input source If extracting metadata from a CSV file, this controls whether a single, static CSV file is used or whether there is one provided (with a fixed name) in the current folder (i.e. where the current image file is loaded from). Choices are: File in input folder, Static file
Metadata file If extracting metadata from a static CSV file (i.e. the same file for all processed images), this is the path to that metadata CSV file.
Metadata file name If extracting metadata from a dynamic CSV file (i.e. the file is in the current image folder), this is the name of that metadata CSV file (it must be the same name in all folders).
Metadata item to match For CSV-based metadata extraction, the first column of the CSV file identifies which row to read. This parameter defines the source of the identifier that is compared against that column (e.g. filename, series name, etc.).
Pattern Regular expression pattern to use when interpreting generic metadata formats. This pattern must contain at least one group (specified using standard regex parentheses notation).
Groups (comma separated) When interpreting generic metadata formats, these group names will be assigned to each group matched using regular expressions. The metadata values will subsequently be accessed via these names in the form M{[NAME]}, where [NAME] is the group name.
Case insensitive When selected, regular expression matches will be found irrespective of case.
Show pattern matching test When selected (and constructing a regular expression extractor), an example string can be provided and the identified groups displayed. This allows for regular expression forms to be tested during workflow assembly.
Example string If testing a regular expression form ("Show pattern matching test" selected), this is the example string which will be processed and broken down into its individual metadata values.
Identified groups If testing a regular expression form ("Show pattern matching test" selected), the extracted metadata values will be displayed here.
Split using regular expressions When selected, the metadata value taken from a CSV file will itself be broken down into multiple metadata values using a regular expression.
Metadata value name If a metadata value loaded from CSV is not to be sub-divided into multiple metadata values, the entire value loaded will be stored as a metadata item with this name.
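
The CSV-based route described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the module's code: the CSV content, the helper name lookup_csv_value, the metadata names (Condition, Treatment, Dose) and the splitting pattern are all hypothetical, and the CSV is read from an in-memory string rather than a file so that the example is self-contained.

```python
import csv
import io
import re


def lookup_csv_value(csv_text, identifier):
    """Return the second-column value of the first row whose first column equals `identifier`."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2 and row[0] == identifier:
            return row[1]
    return None


# Hypothetical two-column metadata file: identifier, value
csv_text = "MySample1_A1_1.tif,Control_10uM\nMySample1_A2_1.tif,Treated_50uM\n"

# The identifier here is the filename of the image being processed
value = lookup_csv_value(csv_text, "MySample1_A2_1.tif")

# Without splitting: the whole value is stored under a single (assumed) name
whole = {"Condition": value}

# With splitting: a regex breaks the value into separately named parts
match = re.search(r"([A-Za-z]+)_(\d+)uM", value)
split = dict(zip(["Treatment", "Dose"], match.groups())) if match else {}

print(whole)  # {'Condition': 'Treated_50uM'}
print(split)  # {'Treatment': 'Treated', 'Dose': '50'}
```

The two dictionaries correspond to the two behaviours controlled by "Split using regular expressions": either one metadata item named by "Metadata value name", or several items named by the group list.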