senotype_editor

Senotype Editor

User Documentation

Introduction

Senotype

A senotype is a senescent cell subtype with distinct transcriptional, spatial, and functional characteristics. Each senotype is described by a cellular senescence functional definition and associated multidimensional characteristics. A senotype refers to the complex interplay of features that associates a cell phenotype with other characteristics that vary widely by tissue type, cell lineage, inducing stimulus, and senescence hallmarks such as cell cycle arrest, SASP production, DNA damage, and chromatin remodeling.

A senotype associates a phenotype with characteristics that include:

A senotype can be represented as a knowledge graph of assertions–i.e., discrete relationships between the senotype and other entities. For example, a senotype can assert a in_taxon relationship with an entity representing “human”.

img.png

The entities and relationships of a senotype can be encoded–i.e., represented with codes from standard biomedical vocabularies.

For example, the gene and protein markers associated with a senotype can be identified with codes from HGNC and UniProt, with a relationship encoded by the Relations Ontology.

The SenLib Database

Senotype definitions are maintained in the Senotype Library (SenLib). Senotype definitions are stored in SenLib in JSON format and conform to the schema defined in the senlib GitHub repository.

The Senotype Editor

The Senotype Editor application allows users to manage senotype definitions in the SenLib database. Because the majority of the assertions in a senotype definition involve categorical data, most of the work of defining a senotype will involve the selection of values from lists.

img_29.png

With the Senotype Editor, users can:

  1. review a senotype definition
  2. create a new senotype definition
  3. revise a senotype definition

User Authentication

Only users with Globus accounts with proper authentication will be able to use the Senotype Editor.

Upon startup, the SenoType Editor will display a login page. The user will be authenticated in Globus. If the user’s Globus account has the necessary privileges, the user will be able to continue to the Edit page. img_30.png

Components of the Senotype Editor

The Senotype Editor offers two types of tools:

  1. the Senotype Library Navigator a tool to view the Senotype library of senotype submissions
  2. tools to maintain senotype submissions

img_23.png

Senotype Library Navigator

The Senotype Library Navigator represents the Senotype library with an expandable tree view. The Navigator organizes the different submissions of a senotype definition in version order.

Senotype Definition

The Senotype Definition composes the remainder of the Senotype Editor. The Senotype Definition section allows the user to build or revise a senotype definition. The components of the Definition section are linked to both the data of a particular senotype submission and to possible values for data.

Because most of a senotype definition is encoded, another role of the Definition section is to translate the codes of a senotype definition into descriptive terms.

Business Rules of the Senotype Editor

A user can only edit a senotype submission file for which they are authorized. Authorization is controlled by means of the SenNet Consortium’s Globus environment.

Using the Senotype Library Navigator

A senotype definition is represented in the Navigator tree with a folder node ( img_1.png ). Expanding the node for a senotype definition displays the senotype submissions associated with the definition.

Senotype submission files are represented with a file node (img_2.png ). Submission nodes are organized hierarchically in terms of provenance in the SenLib database. If a submission has a predecessor, the node for the submission can be expanded to view the predecessor.

When the user selects a submission file node in the Navigator, the Senotype Editor will load data for the submission into the Definition section.

img_32.png

Edit state

The icon next to a submission node indicates whether the submission can be edited.

A pencil icon (img_6.png) indicates that the submission can be edited.

Authorization state

Only authorized users may edit a senotype. If a senotype shows a “prohibited” icon ( img_8.png), the user is not authorized to edit the submission file. Although the data for the submission will still display in the Definition section, editing controls that can add, delete, or change data will be disabled.

img_7.png

Creating a new senotype

The img_28.png node in the Navigator allows the user to define a new senotype. The Editor will load into the Definition section defaults for the new submission, including:

  1. a new SenNet ID
  2. the Globus name and email of the user, as submitter

Using the Senotype Definition tools

Required data

A senotype definition requires a minimal set of data. Data that are required for a senotype are indicated with a red asterisk.

Submission information: name and description

The name and description of the senotype definition are required data.

Both the name and description fields are free text.

The Senotype Library Navigator tree view will use the value of name for the latest version of a senotype definition in the representation of the senotype definition.

img_37.png

Assertions

The Definition section allows the definition of assertions between a senotype and its characteristics.

Each input in the Definition section corresponds to an assertion in the senotype definition. For example, the Taxon input corresponds to the senotype definition’s in_taxon assertion.

There are four types of assertions. Each type has its own tool for selection.

Valueset-based assertions

Most of a senotype definition’s assertions will be categorical, in which the possible values for the object of the assertion will be in a valueset. For example, the categories for a senotype definition’s taxon might be the valueset {human, mouse}.

The elements of valuesets are codes from biomedical vocabularies.

Valueset-based assertions include:

type of assertion source vocabulary
taxon NCBI
hallmark SENOTYPE
microenvironment SENOTYPE
inducer SENOTYPE
assay OBI
sex SNOMED_CT

Because there can be multiple assertions of the same type, the inputs for valueset-based assertions are lists. For example, a senotype can assert relationships with multiple taxa:

img_14.png

List buttons

Add

The img_9.png button opens a select window that displays a list with the members of the valueset associated with the assertion. img_15.png Selecting an element in the list in the select window adds the selected value to the list for the assertion.

Remove

The img_13.png button removes an element from the assertion list.

Context assertions

Context assertions are allow the user to define numerical ranges for an assertion. img_16.png For example, the age assertion can be bounded to apply only to the range of 18 to 89 years.

The current types of context assertions are

External assertions

A number of the objects of assertions are encoded with codes from large vocabularies or ontologies, such as Cell Ontology. Instead of using internal valuesets for these inputs, the Senotype Editor obtains codes by searching external sources.

assertion type source vocabulary
location (organ) UBKG Uberon
cell type UBKG Cell Ontology (CL)
diagnosis UBKG Disease Ontology (DOID)
citation NCBI PubMed PMID
origin (RRID) SciCrunch RRID
dataset SenNet Data Portal SenNet ID
gene marker UBKG HGNC
protein marker UBKG UniProtKB

Edit page

Translation of codes

Senotype definitions represent entities with codes. For example, if a senotype asserts a relationship with cell type “fibroblast of lung”, its definition represents the cell type with the code CL:0002553.

When the Edit page loads information for an existing senotype, it translates codes into terms by searching external sources. The terms are only to provide human-readable descriptions: they are not stored in SenLib.

img_33.png

Add button

The img_9.png button in a list opens a select window that displays a list with the members of the valueset associated with the assertion.

Details button

The img_12.png button next to an assertion object links to a detail page in the source that corresponds to the object.

For example, the details button next to a cell type opens the corresponding page for the cell type in EMBL-EBI’s Ontology Lookup Service: img_34.png

assertion object type reference page
location SenNet Organs detail
cell type EMBL-EBI Ontology Lookup Service (OLS)
diagnosis Disease Ontology
citation PubMed
origin RRID Portal (see below)
dataset SenNet Data Portal
gene marker HGNC
protein marker UniprotKB

Search window

In the Search window for an external assertion object, entering input into the search box triggers a search against an external source. The results of a search appear below the search box as a list of links. Selecting a link from the result list will add the result to the corresponding object list in the Edit page.

img_25.png

The forms of search terms depend on the external source. For example, to search for a cell type, a user can enter either the Cell Ontology ID or a portion of the preferred term (e.g,. “fibroblast”)

input type search source forms of search term example
citation NCBI EUtils API exact PMID; portion of publication title 41247924; predictors
origin RRID Portal (SciCrunch) exact RRID 4850064
dataset SenNet entity-api; SenNet Data Portal exact SenNet ID SNT699.FVQD.882
location hs-ontology API (UBKG API) portion of SenNet organ name lung
diagnosis hs-ontology API (UBKG API) exact DOID; exact match for a preferred term or synonym DOID:3083; 3083; chronic obstructive pulmonary disease; copd
celltype hs-ontology API (UBKG API) exact CLID; portion of a preferred term CL:4006000; 4006000; fibroblast
gene hs-ontology API (UBKG API) exact: HGNC ID; approved symbol; alias; prior approved symbol; or prior alias HGNC:1100; 1100; BRCA1; BRCC1
protein hs-ontology API (UBKG API) exact: UniProtKB ID; UniProtKB entry name UNPROTKB:Q13201; Q13201; MMRN1_HUMAN

Explore button

In an external search window, the img_12.png button opens the corresponding home page of an external site to facilitate finding an appropriate identifier. For example, the Explore button for the cell type selection opens the page of the OLS Search for the Cell Ontology.

The Senotype Editor is not integrated with these external sites. Once the user finds an appropriate identifier or search term, they can enter the id or term into the search box of the Select window.

Details for specific external assertions

SenNet datasets

The Search feature img_12.png for datasets will direct to the Senotype Consortium pages, including the Data Portal.

By default, the Senotype Data Portal displays only published datasets. If the user wishes to associate a senotype with a dataset that is not published, they will need to log into the Data Portal again with their SenNet consortium user ID.

Origin (SciCrunch RRID)

A senotype definition can be associated with an origin, or Research Resource Identifier (RRID). RRIDs are managed in the RRID Portal. The RRID Portal uses the SciCrunch API for searches.

RRIDs are organized by resource type:

The Search feature img_12.png of the Origin section directs to the RRID Portal. Once the user identifies a RRID, they must enter the ID in the search window. The ‘RRID:’ part of the ID is optional; however, any other letters are required (e.g., “SCR”, “AddGene”, etc.).

Antibodies and cell lines have identifiers with a higher level of resolution than the RRID because they also identify vendors. The Search feature displays the names provided by vendors to distinguish different offerings of the same origin.

img_26.png

The Details page for an origin will open a link to the SciCrunch detail page for the origin. Depending on the type of origin, the SciCrunch detail page may redirect to another site, such as that of a vendor.

Marker assertions

img_18.png

A senotype definition can have two types of markers. The association of a senotype definition with a marker is a statement of tentative expert opinion. Associations are not just the result of a single experiment, but also not necessarily clear evidence.

Specified markers

These gene or protein markers are a list an investigator might recommend be used to describe the markers that would help identify or characterize a senescent cell of this senotype–e.g., to use as a gene panel for a probed assay.

Regulating markers

These are typically a longer list of gene or protein markers that have been tested for the senotype. The investigator observes these markers to be up-regulated; down-regulated; or tested but inconclusive whether up- or down- regulated (e.g., using log2FC and p-value).

Collapsible section

Because it is anticipated that a senotype will be associated with many markers, the entire Markers section is collapsible.

Marker assertion management for both specified and regulating markers is similar:

  1. The img_9.png button opens a search window that allows the addition of a single marker.
  2. The img_19.png button opens a window for bulk import of a group of markers.

Search windows (individual addition)

A marker Search window searches for gene or protein markers.

The user can enter different types of identifiers for markers, including:

  1. HGNC numbers (e.g., 7178)
  2. HGNC approved symbols (e.g., MMRN1)
  3. HGNC aliases (ECM)
  4. HGNC prior symbols (e.g., MMRN)
  5. UniProtKB ID (e.g., Q13201)
  6. UniProtKB entry name (e.g., MMRN1_HUMAN)

img_35.png

The Search window for regulating markers includes a field for type of regulating:

img_36.png

Both types of Search window feature an Explore button ( img_12.png ) that will open a home page for a marker vocabulary (HGNC or UniProtKB), based on the selected type of marker.

Bulk Addition windows

Bulk addition windows allow the user to load a large number of markers from a local CSV file that the user specifies.

The bulk addition windows will only add information from a CSV if:

Specified markers

The CSV used for bulk upload of specified markers must have the following structure:

column values
type either gene or protein
id * if a gene, the HGNC symbol
* if a protein, the UniProtKB ID

Example:

type,id
gene,BRCA1
protein,Q13201

Regulating markers

The CSV used for bulk upload of regulating markers must have the following structure:

column values
type either gene or protein
id * if a gene, the HGNC symbol
* if a protein, the UniProtKB ID
action one of the following:
* 1 for up regulation
* 0 for inconclusive regulation
* -1 for down regulation

Example:

type,id,action
gene,BRCA1,1
protein,Q13201,0
gene,BRAF,-1

Validation

When the user clicks the Update/Create button at the bottom of the Definition section, the Editor validates the inputs.

If the data is invalid or incomplete, the Editor:

img_24.png

The Editor verifies that:

  1. All required fields have at least one value.
  2. Context assertions are ordered such that lowerbound <= value <= upperbound.

If the data passed validation, the Editor: