Sequence Assays

scRNAseq-10xGenomics-v2 / scRNAseq-10xGenomics-v3 / snRNAseq-10xGenomics-v2 / snRNAseq-10xGenomics-v3 / scRNAseq / sciRNAseq / snRNAseq / SNARE2-RNAseq

This schema is for single-cell RNA sequencing (scRNAseq). Version 3 adds the umi_read, umi_offset, and umi_size fields.

Metadata schema

Version 3 (use this one)
Shared by all types

version
description
donor_id
tissue_id
execution_datetime
protocols_io_doi
operator
operator_email
pi
pi_email
assay_category
assay_type
analyte_class
is_targeted
acquisition_instrument_vendor
acquisition_instrument_model

Unique to this type

sc_isolation_protocols_io_doi
sc_isolation_entity
sc_isolation_tissue_dissociation
sc_isolation_enrichment
sc_isolation_quality_metric
sc_isolation_cell_number
rnaseq_assay_input
rnaseq_assay_method
library_construction_protocols_io_doi
library_layout
library_adapter_sequence
library_id
is_technical_replicate
cell_barcode_read
umi_read
umi_offset
umi_size
cell_barcode_offset
cell_barcode_size
expected_cell_count
library_pcr_cycles
library_pcr_cycles_for_sample_index
library_final_yield_value
library_final_yield_unit
library_average_fragment_size
sequencing_reagent_kit
sequencing_read_format
sequencing_read_percent_q30
sequencing_phix_percent
contributors_path
data_path

Shared by all types

version

Version of the schema to use when validating this metadata.

constraint value
enum 3
required True

description

Free-text description of this assay.

constraint value
required True

donor_id

HuBMAP Display ID of the donor of the assayed tissue. Example: ABC123.

constraint value
pattern (regular expression) [A-Z]+[0-9]+
required True

tissue_id

HuBMAP Display ID of the assayed tissue. Example: ABC123-BL-1-2-3_456.

constraint value
pattern (regular expression) (([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)(,([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)*
required True
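
The donor_id and tissue_id patterns above can be tried out with a few lines of Python. This is a minimal sketch rather than the validator's own code: it assumes the whole field value must match the pattern, and the helper name matches and the comma-separated and lowercase sample values are made up for illustration.

```python
import re

# Patterns copied from the donor_id and tissue_id constraints above.
DONOR_ID = re.compile(r"[A-Z]+[0-9]+")
TISSUE_ID = re.compile(
    r"(([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)"
    r"(,([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)*"
)


def matches(pattern, value):
    # Assumption: the entire value must match, not just a substring.
    return pattern.fullmatch(value) is not None


donor_ok = matches(DONOR_ID, "ABC123")
tissue_ok = matches(TISSUE_ID, "ABC123-BL-1-2-3_456")
# Hypothetical value: two tissue IDs joined by a comma.
tissue_list_ok = matches(TISSUE_ID, "ABC123-BL-1-2-3_456,ABC123-BL-1-2-4")
# Hypothetical value: lowercase organ code, which the pattern rejects.
tissue_bad = matches(TISSUE_ID, "ABC123-bl-1")
```

The second half of the tissue_id pattern is what allows a comma-separated list of IDs in a single field.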

execution_datetime

Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, and MM (month), DD (day), hh (hour), and mm (minutes) are written with leading zeros.

constraint value
type datetime
format %Y-%m-%d %H:%M
required True
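
The format string in the table is a strptime-style format, so a value can be checked with Python's standard library. This is a sketch only: the function name and the sample timestamp are invented, and Python's strptime may accept some values without leading zeros, so it can be slightly more permissive than the description.

```python
from datetime import datetime

# Format string copied from the execution_datetime constraint above.
EXECUTION_DATETIME_FORMAT = "%Y-%m-%d %H:%M"


def parse_execution_datetime(value):
    # Raises ValueError when the value does not follow the format.
    return datetime.strptime(value, EXECUTION_DATETIME_FORMAT)


# Made-up example value.
parsed = parse_execution_datetime("2021-03-09 14:05")
```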

protocols_io_doi

DOI for protocols.io referring to the protocol for this assay.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/
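
One reading of the pattern and url prefix rows is that the field holds the bare DOI and the prefix is prepended to form a link. The sketch below follows that assumption; the function name is hypothetical and the DOI is a placeholder, not a real protocol.

```python
import re

# Pattern and prefix copied from the protocols_io_doi constraint above.
DOI_PATTERN = re.compile(r"10\.17504/.*")
URL_PREFIX = "https://dx.doi.org/"


def doi_to_url(doi):
    # Assumption: the field holds the bare DOI, without the URL prefix.
    if DOI_PATTERN.fullmatch(doi) is None:
        raise ValueError(f"not a protocols.io DOI: {doi!r}")
    return URL_PREFIX + doi


# Placeholder DOI for illustration only.
url = doi_to_url("10.17504/protocols.io.example")
```

Under this reading, a value that already includes the https://dx.doi.org/ prefix would fail the pattern check.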

operator

Name of the person responsible for executing the assay.

constraint value
required True

operator_email

Email address for the operator.

constraint value
format email
required True

pi

Name of the principal investigator responsible for the data.

constraint value
required True

pi_email

Email address for the principal investigator.

constraint value
format email
required True

assay_category

Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence.

constraint value
enum sequence
required True

assay_type

The specific type of assay being executed. The UMI sequence length in the 10xGenomics-v2 kit is 10 base pairs; in the 10xGenomics-v3 kit it is 12 base pairs.

constraint value
enum scRNAseq-10xGenomics-v2, scRNAseq-10xGenomics-v3, snRNAseq-10xGenomics-v2, snRNAseq-10xGenomics-v3, scRNAseq, sciRNAseq, snRNAseq, or SNARE2-RNAseq
required True

analyte_class

Analytes are the target molecules being measured with the assay.

constraint value
enum RNA
required True

is_targeted

Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay.

constraint value
type boolean
required True

acquisition_instrument_vendor

An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.

constraint value
required True

acquisition_instrument_model

Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.

constraint value
required True

Unique to this type

sc_isolation_protocols_io_doi

Link to a protocols document answering the question: How were single cells separated into a single-cell suspension?

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

sc_isolation_entity

The type of single cell entity derived from isolation protocol.

constraint value
enum whole cell, nucleus, cell-cell multimer, or spatially encoded cell barcoding
required True

sc_isolation_tissue_dissociation

The method by which tissues are dissociated into single cells in suspension.

constraint value
required True

sc_isolation_enrichment

The method by which specific cell populations are sorted or enriched.

constraint value
enum none or FACS
required True

sc_isolation_quality_metric

A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level.

constraint value
required True

sc_isolation_cell_number

Total number of cells/nuclei yielded post dissociation and enrichment.

constraint value
type integer
required True

rnaseq_assay_input

Number of cells/nuclei input to the assay.

constraint value
type integer
required True

rnaseq_assay_method

The kit used for the RNA sequencing assay.

constraint value
required True

library_construction_protocols_io_doi

A link to the protocol document containing the library construction method (including version) that was used, e.g. “Smart-Seq2”, “Drop-Seq”, “10X v3”.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

library_layout

State whether the library was generated for single-end or paired-end sequencing.

constraint value
enum single-end or paired-end
required True

library_adapter_sequence

Adapter sequence to be used for adapter trimming. Example: CTGTCTCTTATACACATCT.

constraint value
pattern (regular expression) [ATCG]+(\+[ATCG]+)?
required True
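
The adapter pattern allows one sequence, or two sequences joined by a plus sign. A minimal sketch of splitting such a value follows; the function name is hypothetical, and the two-adapter value is an invented illustration (the source does not say what the second sequence represents).

```python
import re

# Pattern copied from the library_adapter_sequence constraint above.
ADAPTER = re.compile(r"[ATCG]+(\+[ATCG]+)?")


def split_adapters(value):
    # Returns a list with one or two adapter sequences.
    if ADAPTER.fullmatch(value) is None:
        raise ValueError(f"invalid adapter sequence: {value!r}")
    return value.split("+")


single = split_adapters("CTGTCTCTTATACACATCT")
# Invented two-adapter value for illustration.
pair = split_adapters("CTGTCTCTTATACACATCT+AGATGTGTATAAGAGACAG")
```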

library_id

A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked.

constraint value
required True

is_technical_replicate

Is the sequencing reaction run in replicate, TRUE or FALSE.

constraint value
type boolean
required True

cell_barcode_read

Which read file(s) contain the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). Leave blank if not applicable.

constraint value
required False

umi_read

Which read file(s) contain the UMI (unique molecular identifier) barcode. Example: R1.

constraint value
pattern (regular expression) [^/]+
required True

umi_offset

Position in the read at which the UMI barcode starts.

constraint value
type integer
required True

umi_size

Length of the UMI barcode in base pairs.

constraint value
type integer
required True
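
Together, umi_offset and umi_size describe a window in the read named by umi_read. The sketch below slices that window out of a read sequence. It assumes umi_offset is a 0-based index, which the source does not state; the function name and the read layout (16 bases, then a 12-base UMI, then 4 more bases) are invented for illustration.

```python
def extract_umi(read_sequence, umi_offset, umi_size):
    # Assumption: umi_offset is a 0-based index into the read.
    end = umi_offset + umi_size
    if umi_offset < 0 or umi_size <= 0 or end > len(read_sequence):
        raise ValueError("UMI window falls outside the read")
    return read_sequence[umi_offset:end]


# Hypothetical read: 16 leading bases, a 12-base UMI, 4 trailing bases.
read = "ACGTACGTACGTACGT" + "TTGGCCAATTGG" + "TTTT"
umi = extract_umi(read, umi_offset=16, umi_size=12)
```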

cell_barcode_offset

Position(s) in the read at which the cell barcode starts. Leave blank if not applicable. Example: 0,0,38,76.

constraint value
required False
pattern (regular expression) \d+(,\d+)*

cell_barcode_size

Length of the cell barcode in base pairs. Leave blank if not applicable. Example: 16,8,8,8.

constraint value
required False
pattern (regular expression) \d+(,\d+)*
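
The example values for cell_barcode_offset and cell_barcode_size have the same number of entries, which suggests they are meant to be read as pairs. The sketch below parses both lists and pairs them element by element; that pairing, the 0-based offsets, and the function names are assumptions, not something the schema states.

```python
import re

# Pattern shared by cell_barcode_offset and cell_barcode_size above.
INT_LIST = re.compile(r"\d+(,\d+)*")


def parse_int_list(value):
    if INT_LIST.fullmatch(value) is None:
        raise ValueError(f"not a comma-delimited list of integers: {value!r}")
    return [int(item) for item in value.split(",")]


def barcode_windows(offsets, sizes):
    # Assumption: the n-th offset belongs with the n-th size.
    parsed_offsets = parse_int_list(offsets)
    parsed_sizes = parse_int_list(sizes)
    if len(parsed_offsets) != len(parsed_sizes):
        raise ValueError("offset and size lists differ in length")
    # Each window is (start, end) with an exclusive end.
    return [(o, o + s) for o, s in zip(parsed_offsets, parsed_sizes)]


# Example values taken from the two field descriptions above.
windows = barcode_windows("0,0,38,76", "16,8,8,8")
```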

expected_cell_count

How many cells are expected? This may be used in downstream pipelines to guide selection of cell barcodes or segmentation parameters. Leave blank if not applicable.

constraint value
type integer
required False

library_pcr_cycles

Number of PCR cycles to amplify cDNA.

constraint value
type integer
required True

library_pcr_cycles_for_sample_index

Number of PCR cycles performed for library indexing.

constraint value
type integer
required True

library_final_yield_value

Total number of ng of library after the final PCR amplification step. This is the concentration (ng/ul) * volume (ul).

constraint value
type number
required True

library_final_yield_unit

Units of final library yield. Leave blank if not applicable.

constraint value
enum ng
required False
required if library_final_yield_value present
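
The yield arithmetic and the conditional requirement on the unit can be sketched as follows. The function names, the dict-of-strings row representation, and the numbers are all assumptions made for illustration; the validator itself may express this rule differently.

```python
def final_yield_ng(concentration_ng_per_ul, volume_ul):
    # Yield (ng) = concentration (ng/ul) * volume (ul).
    return concentration_ng_per_ul * volume_ul


def check_yield_unit(row):
    # row: hypothetical dict of field name -> string; "" means blank.
    value = row.get("library_final_yield_value", "")
    unit = row.get("library_final_yield_unit", "")
    errors = []
    if value != "" and unit == "":
        errors.append("library_final_yield_unit is required when "
                      "library_final_yield_value is present")
    if unit != "" and unit != "ng":
        errors.append("library_final_yield_unit must be ng")
    return errors


# Invented measurements: 2.5 ng/ul in 20 ul.
yield_ng = final_yield_ng(2.5, 20.0)
ok = check_yield_unit({"library_final_yield_value": str(yield_ng),
                       "library_final_yield_unit": "ng"})
missing_unit = check_yield_unit({"library_final_yield_value": str(yield_ng),
                                 "library_final_yield_unit": ""})
```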

library_average_fragment_size

Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation.

constraint value
type integer
required True

sequencing_reagent_kit

Reagent kit used for sequencing.

constraint value
required True

sequencing_read_format

Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. Example: 12/34/56.

constraint value
pattern (regular expression) \d+(/\d+)+
required True
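
A slash-delimited value can be split into per-read cycle counts. In this minimal sketch the function name is hypothetical; note that the pattern requires at least two numbers, so a single number on its own is rejected.

```python
import re

# Pattern copied from the sequencing_read_format constraint above.
READ_FORMAT = re.compile(r"\d+(/\d+)+")


def parse_read_format(value):
    if READ_FORMAT.fullmatch(value) is None:
        raise ValueError(f"invalid sequencing_read_format: {value!r}")
    return [int(cycles) for cycles in value.split("/")]


# Example value from the field description above.
cycles = parse_read_format("12/34/56")
total_cycles = sum(cycles)
```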

sequencing_read_percent_q30

Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …)

constraint value
type number
required True
minimum 0
maximum 100
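
The weighted average can be sketched as below. The description shows only the weighted sum, so dividing by the total number of bases is an assumption, as are the function name and the invented read lengths and per-read Q30 percentages.

```python
def weighted_percent_q30(reads):
    # reads: list of (number_of_bases, percent_q30) pairs, one per read.
    total_bases = sum(bases for bases, _ in reads)
    if total_bases <= 0:
        raise ValueError("total number of bases must be positive")
    weighted_sum = sum(bases * q30 for bases, q30 in reads)
    # Assumption: normalise the weighted sum by the total base count.
    return weighted_sum / total_bases


# Invented values: a 28-base read at 95% Q30 and a 91-base read at 90% Q30.
percent_q30 = weighted_percent_q30([(28, 95.0), (91, 90.0)])
within_bounds = 0 <= percent_q30 <= 100
```

With these numbers the result is roughly 91.18, which falls inside the 0 to 100 range required by the constraint.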

sequencing_phix_percent

Percent PhiX loaded to the run.

constraint value
type number
required True
minimum 0
maximum 100

contributors_path

Relative path to file with ORCID IDs for contributors for this dataset.

constraint value
required True

data_path

Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions.

constraint value
required True

Version 2

Shared by all types

version

Version of the schema to use when validating this metadata.

constraint value
enum 2
required True

description

Free-text description of this assay.

constraint value
required True

donor_id

HuBMAP Display ID of the donor of the assayed tissue. Example: ABC123.

constraint value
pattern (regular expression) [A-Z]+[0-9]+
required True

tissue_id

HuBMAP Display ID of the assayed tissue. Example: ABC123-BL-1-2-3_456.

constraint value
pattern (regular expression) (([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)(,([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)*
required True

execution_datetime

Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, and MM (month), DD (day), hh (hour), and mm (minutes) are written with leading zeros.

constraint value
type datetime
format %Y-%m-%d %H:%M
required True

protocols_io_doi

DOI for protocols.io referring to the protocol for this assay.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

operator

Name of the person responsible for executing the assay.

constraint value
required True

operator_email

Email address for the operator.

constraint value
format email
required True

pi

Name of the principal investigator responsible for the data.

constraint value
required True

pi_email

Email address for the principal investigator.

constraint value
format email
required True

assay_category

Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence.

constraint value
enum sequence
required True

assay_type

The specific type of assay being executed.

constraint value
enum scRNAseq-10xGenomics-v2, scRNAseq-10xGenomics-v3, snRNAseq-10xGenomics-v2, snRNAseq-10xGenomics-v3, scRNAseq, sciRNAseq, snRNAseq, or SNARE2-RNAseq
required True

analyte_class

Analytes are the target molecules being measured with the assay.

constraint value
enum RNA
required True

is_targeted

Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay.

constraint value
type boolean
required True

acquisition_instrument_vendor

An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.

constraint value
required True

acquisition_instrument_model

Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.

constraint value
required True

Unique to this type

sc_isolation_protocols_io_doi

Link to a protocols document answering the question: How were single cells separated into a single-cell suspension?

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

sc_isolation_entity

The type of single cell entity derived from isolation protocol.

constraint value
enum whole cell, nucleus, cell-cell multimer, or spatially encoded cell barcoding
required True

sc_isolation_tissue_dissociation

The method by which tissues are dissociated into single cells in suspension.

constraint value
required True

sc_isolation_enrichment

The method by which specific cell populations are sorted or enriched.

constraint value
enum none or FACS
required True

sc_isolation_quality_metric

A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level.

constraint value
required True

sc_isolation_cell_number

Total number of cells/nuclei yielded post dissociation and enrichment.

constraint value
type integer
required True

rnaseq_assay_input

Number of cells/nuclei input to the assay.

constraint value
type integer
required True

rnaseq_assay_method

The kit used for the RNA sequencing assay.

constraint value
required True

library_construction_protocols_io_doi

A link to the protocol document containing the library construction method (including version) that was used, e.g. “Smart-Seq2”, “Drop-Seq”, “10X v3”.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

library_layout

State whether the library was generated for single-end or paired-end sequencing.

constraint value
enum single-end or paired-end
required True

library_adapter_sequence

Adapter sequence to be used for adapter trimming. Example: CTGTCTCTTATACACATCT.

constraint value
pattern (regular expression) [ATCG]+(\+[ATCG]+)?
required True

library_id

A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked.

constraint value
required True

is_technical_replicate

Is the sequencing reaction run in replicate, TRUE or FALSE.

constraint value
type boolean
required True

cell_barcode_read

Which read file(s) contain the cell barcode. Multiple cell_barcode_read files must be provided as a comma-delimited list (e.g. file1,file2,file3). Leave blank if not applicable.

constraint value
required False

cell_barcode_offset

Position(s) in the read at which the cell barcode starts. Leave blank if not applicable. Example: 0,0,38,76.

constraint value
required False
pattern (regular expression) \d+(,\d+)*

cell_barcode_size

Length of the cell barcode in base pairs. Leave blank if not applicable. Example: 16,8,8,8.

constraint value
required False
pattern (regular expression) \d+(,\d+)*

expected_cell_count

How many cells are expected? This may be used in downstream pipelines to guide selection of cell barcodes or segmentation parameters. Leave blank if not applicable.

constraint value
type integer
required False

library_pcr_cycles

Number of PCR cycles to amplify cDNA.

constraint value
type integer
required True

library_pcr_cycles_for_sample_index

Number of PCR cycles performed for library indexing.

constraint value
type integer
required True

library_final_yield_value

Total number of ng of library after the final PCR amplification step. This is the concentration (ng/ul) * volume (ul).

constraint value
type number
required True

library_final_yield_unit

Units of final library yield. Leave blank if not applicable.

constraint value
enum ng
required False
required if library_final_yield_value present

library_average_fragment_size

Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation.

constraint value
type integer
required True

sequencing_reagent_kit

Reagent kit used for sequencing.

constraint value
required True

sequencing_read_format

Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. Example: 12/34/56.

constraint value
pattern (regular expression) \d+(/\d+)+
required True

sequencing_read_percent_q30

Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …)

constraint value
type number
required True
minimum 0
maximum 100

sequencing_phix_percent

Percent PhiX loaded to the run.

constraint value
type number
required True
minimum 0
maximum 100

contributors_path

Relative path to file with ORCID IDs for contributors for this dataset.

constraint value
required True

data_path

Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions.

constraint value
required True

Version 1

Shared by all types

version

Version of the schema to use when validating this metadata.

constraint value
enum 1
required True

description

Free-text description of this assay.

constraint value
required True

donor_id

HuBMAP Display ID of the donor of the assayed tissue. Example: ABC123.

constraint value
pattern (regular expression) [A-Z]+[0-9]+
required True

tissue_id

HuBMAP Display ID of the assayed tissue. Example: ABC123-BL-1-2-3_456.

constraint value
pattern (regular expression) (([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)(,([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?)*
required True

execution_datetime

Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, and MM (month), DD (day), hh (hour), and mm (minutes) are written with leading zeros.

constraint value
type datetime
format %Y-%m-%d %H:%M
required True

protocols_io_doi

DOI for protocols.io referring to the protocol for this assay.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

operator

Name of the person responsible for executing the assay.

constraint value
required True

operator_email

Email address for the operator.

constraint value
format email
required True

pi

Name of the principal investigator responsible for the data.

constraint value
required True

pi_email

Email address for the principal investigator.

constraint value
format email
required True

assay_category

Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence.

constraint value
enum sequence
required True

assay_type

The specific type of assay being executed.

constraint value
enum scRNAseq-10xGenomics, snRNAseq-10xGenomics-v2, snRNAseq-10xGenomics-v3, scRNAseq, sciRNAseq, snRNAseq, or SNARE2-RNAseq
required True

analyte_class

Analytes are the target molecules being measured with the assay.

constraint value
enum RNA
required True

is_targeted

Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay.

constraint value
type boolean
required True

acquisition_instrument_vendor

An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.

constraint value
required True

acquisition_instrument_model

Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.

constraint value
required True

Unique to this type

sc_isolation_protocols_io_doi

Link to a protocols document answering the question: How were single cells separated into a single-cell suspension?

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

sc_isolation_entity

The type of single cell entity derived from isolation protocol.

constraint value
enum whole cell, nucleus, cell-cell multimer, or spatially encoded cell barcoding
required True

sc_isolation_tissue_dissociation

The method by which tissues are dissociated into single cells in suspension.

constraint value
required True

sc_isolation_enrichment

The method by which specific cell populations are sorted or enriched.

constraint value
enum none or FACS
required True

sc_isolation_quality_metric

A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level.

constraint value
required True

sc_isolation_cell_number

Total number of cells/nuclei yielded post dissociation and enrichment.

constraint value
type integer
required True

rnaseq_assay_input

Number of cells/nuclei input to the assay.

constraint value
type integer
required True

rnaseq_assay_method

The kit used for the RNA sequencing assay.

constraint value
required True

library_construction_protocols_io_doi

A link to the protocol document containing the library construction method (including version) that was used, e.g. “Smart-Seq2”, “Drop-Seq”, “10X v3”.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

library_layout

State whether the library was generated for single-end or paired-end sequencing.

constraint value
enum single-end or paired-end
required True

library_adapter_sequence

Adapter sequence to be used for adapter trimming. Example: CTGTCTCTTATACACATCT.

constraint value
pattern (regular expression) [ATCG]+(\+[ATCG]+)?
required True

library_id

A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked.

constraint value
required True

is_technical_replicate

Is the sequencing reaction run in replicate, TRUE or FALSE.

constraint value
type boolean
required True

cell_barcode_read

Which read file contains the cell barcode.

constraint value
required True

cell_barcode_offset

Position(s) in the read at which the cell barcode starts. Example: 0,0,38,76.

constraint value
pattern (regular expression) \d+(,\d+)*
required True

cell_barcode_size

Length of the cell barcode in base pairs. Example: 16,8,8,8.

constraint value
pattern (regular expression) \d+(,\d+)*
required True

library_pcr_cycles

Number of PCR cycles to amplify cDNA.

constraint value
type integer
required True

library_pcr_cycles_for_sample_index

Number of PCR cycles performed for library indexing.

constraint value
type integer
required True

library_final_yield_value

Total number of ng of library after the final PCR amplification step. This is the concentration (ng/ul) * volume (ul).

constraint value
type number
required True

library_final_yield_unit

Units of final library yield. Leave blank if not applicable.

constraint value
enum ng
required False
required if library_final_yield_value present

library_average_fragment_size

Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation.

constraint value
type integer
required True

sequencing_reagent_kit

Reagent kit used for sequencing.

constraint value
required True

sequencing_read_format

Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. Example: 12/34/56.

constraint value
pattern (regular expression) \d+(/\d+)+
required True

sequencing_read_percent_q30

Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …)

constraint value
type number
required True
minimum 0
maximum 100

sequencing_phix_percent

Percent PhiX loaded to the run.

constraint value
type number
required True
minimum 0
maximum 100

contributors_path

Relative path to file with ORCID IDs for contributors for this dataset.

constraint value
required True

data_path

Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions.

constraint value
required True

Version 0

Shared by all types

donor_id

HuBMAP Display ID of the donor of the assayed tissue. Example: ABC123.

constraint value
pattern (regular expression) [A-Z]+[0-9]+
required True

tissue_id

HuBMAP Display ID of the assayed tissue. Example: ABC123-BL-1-2-3_456.

constraint value
pattern (regular expression) ([A-Z]+[0-9]+)-[A-Z]{2}\d*(-\d+)+(_\d+)?
required True

execution_datetime

Start date and time of the assay, typically taken from a date-time stamped folder generated by the acquisition instrument. Format: YYYY-MM-DD hh:mm, where YYYY is the year, and MM (month), DD (day), hh (hour), and mm (minutes) are written with leading zeros.

constraint value
type datetime
format %Y-%m-%d %H:%M
required True

protocols_io_doi

DOI for protocols.io referring to the protocol for this assay.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

operator

Name of the person responsible for executing the assay.

constraint value
required True

operator_email

Email address for the operator.

constraint value
format email
required True

pi

Name of the principal investigator responsible for the data.

constraint value
required True

pi_email

Email address for the principal investigator.

constraint value
format email
required True

assay_category

Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence.

constraint value
enum sequence
required True

assay_type

The specific type of assay being executed.

constraint value
enum scRNAseq-10xGenomics, snRNAseq-10xGenomics-v2, snRNAseq-10xGenomics-v3, scRNAseq, sciRNAseq, snRNAseq, or SNARE2-RNAseq
required True

analyte_class

Analytes are the target molecules being measured with the assay.

constraint value
enum RNA
required True

is_targeted

Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay.

constraint value
type boolean
required True

acquisition_instrument_vendor

An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.

constraint value
required True

acquisition_instrument_model

Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.

constraint value
required True

Unique to this type

sc_isolation_protocols_io_doi

Link to a protocols document answering the question: How were single cells separated into a single-cell suspension?

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

sc_isolation_entity

The type of single cell entity derived from isolation protocol.

constraint value
required True

sc_isolation_tissue_dissociation

The method by which tissues are dissociated into single cells in suspension.

constraint value
required True

sc_isolation_enrichment

The method by which specific cell populations are sorted or enriched. Leave blank if not applicable.

constraint value
required False

sc_isolation_quality_metric

A quality metric by visual inspection prior to cell lysis or defined by known parameters such as wells with several cells or no cells. This can be captured at a high level.

constraint value
required True

sc_isolation_cell_number

Total number of cells/nuclei yielded post dissociation and enrichment.

constraint value
type integer
required True

rnaseq_assay_input

Number of cells/nuclei input to the assay.

constraint value
type integer
required True

rnaseq_assay_method

The kit used for the RNA sequencing assay.

constraint value
required True

library_construction_protocols_io_doi

A link to the protocol document containing the library construction method (including version) that was used, e.g. “Smart-Seq2”, “Drop-Seq”, “10X v3”.

constraint value
required True
pattern (regular expression) 10\.17504/.*
url prefix: https://dx.doi.org/

library_layout

State whether the library was generated for single-end or paired-end sequencing.

constraint value
enum single-end or paired-end
required True

library_adapter_sequence

Adapter sequence to be used for adapter trimming.

constraint value
required True

library_id

A library ID, unique within a TMC, which allows corresponding RNA and chromatin accessibility datasets to be linked.

constraint value
required True

is_technical_replicate

Is the sequencing reaction run in replicate, TRUE or FALSE.

constraint value
type boolean
required True

cell_barcode_read

Which read file contains the cell barcode.

constraint value
required True

cell_barcode_offset

Position(s) in the read at which the cell barcode starts.

constraint value
required True

cell_barcode_size

Length of the cell barcode in base pairs.

constraint value
required True

library_pcr_cycles

Number of PCR cycles to amplify cDNA.

constraint value
type integer
required True

library_pcr_cycles_for_sample_index

Number of PCR cycles performed for library indexing.

constraint value
type integer
required True

library_final_yield_value

Total number of ng of library after the final PCR amplification step. This is the concentration (ng/ul) * volume (ul).

constraint value
type number
required True

library_final_yield_unit

Units of final library yield. Leave blank if not applicable.

constraint value
enum ng
required False
required if library_final_yield_value present

library_average_fragment_size

Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation.

constraint value
type integer
required True

sequencing_reagent_kit

Reagent kit used for sequencing.

constraint value
required True

sequencing_read_format

Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. Example: 12/34/56.

constraint value
pattern (regular expression) \d+(/\d+)+
required True

sequencing_read_percent_q30

Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + …)

constraint value
type number
required True
minimum 0
maximum 100

sequencing_phix_percent

Percent PhiX loaded to the run.

constraint value
type number
required True
minimum 0
maximum 100

contributors_path

Relative path to file with ORCID IDs for contributors for this dataset.

constraint value
required True

data_path

Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions.

constraint value
required True


Directory schemas

Version 0 (use this one)
pattern required? description
[^/]+\.fastq\.gz Compressed FastQ file
extras\/.*   Folder for general lab-specific files related to the dataset. [Exists in all assays]
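
The two directory patterns can be applied to a list of relative paths to see which files they do not cover. This is a sketch under assumptions: paths are taken to be relative to the dataset directory, each pattern must match the whole path, and the function name and file names are invented.

```python
import re

# Patterns copied from the directory schema table above.
DIRECTORY_PATTERNS = [r"[^/]+\.fastq\.gz", r"extras\/.*"]


def unmatched_paths(paths):
    # Returns the paths that match none of the directory patterns.
    compiled = [re.compile(pattern) for pattern in DIRECTORY_PATTERNS]
    return [
        path for path in paths
        if not any(regex.fullmatch(path) for regex in compiled)
    ]


# Invented file listing for illustration.
leftover = unmatched_paths([
    "sample_R1.fastq.gz",
    "extras/notes.txt",
    "nested/sample_R2.fastq.gz",
    "readme.md",
])
```

Because the FastQ pattern excludes slashes, a FastQ file inside a subdirectory is not covered by it under this reading.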