
Imports an intercellular network by combining intercellular annotations and protein interactions. First, it imports a network of protein-protein interactions. Then it retrieves annotations about the proteins' intercellular communication roles twice: once for the transmitter side (delivering information from the expressing cell) and once for the receiver side (receiving the signal and relaying it towards the expressing cell). These three queries can be customized by providing parameters in lists, which are passed to the respective methods (import_omnipath_interactions for the network and import_omnipath_intercell for the annotations). Finally, the three data frames are combined so that the source protein of each interaction is annotated by the transmitter categories and the target protein by the receiver categories. If undirected interactions are present (these are disabled by default), they are duplicated, i.e. both partners can be both receiver and transmitter.
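The combination step described above can be illustrated with a minimal base R sketch. The toy data frames, their values and the column names (category_intercell_source, category_intercell_target) are assumptions made for illustration only; this is not the package's actual implementation, which works on the much wider data frames returned by the OmniPath queries.

```r
# Toy interaction network: each row is a directed source -> target interaction.
interactions <- data.frame(
    source = c('L1', 'L2', 'K1'),
    target = c('R1', 'R1', 'S1'),
    stringsAsFactors = FALSE
)

# Toy annotations for the transmitter and the receiver side (made-up values).
transmitters <- data.frame(
    source = c('L1', 'L2'),
    category_intercell_source = c('ligand', 'secreted_enzyme'),
    stringsAsFactors = FALSE
)
receivers <- data.frame(
    target = 'R1',
    category_intercell_target = 'receptor',
    stringsAsFactors = FALSE
)

# Annotate sources by transmitter categories and targets by receiver
# categories; inner joins drop interactions lacking either annotation.
network <- merge(
    merge(interactions, transmitters, by = 'source'),
    receivers,
    by = 'target'
)
```

In this sketch the K1 to S1 interaction is dropped because neither partner carries an intercellular annotation, leaving only the two interactions that target R1.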

Usage

import_intercell_network(
  interactions_param = list(),
  transmitter_param = list(),
  receiver_param = list(),
  resources = NULL,
  entity_types = NULL,
  ligand_receptor = FALSE,
  high_confidence = FALSE,
  simplify = FALSE,
  unique_pairs = FALSE,
  consensus_percentile = NULL,
  loc_consensus_percentile = NULL,
  omnipath = TRUE,
  ligrecextra = TRUE,
  kinaseextra = !high_confidence,
  pathwayextra = !high_confidence,
  ...
)

Arguments

interactions_param

a list with arguments for an interactions query: import_omnipath_interactions, import_pathwayextra_interactions, import_kinaseextra_interactions, import_ligrecextra_interactions

transmitter_param

a list with arguments for import_omnipath_intercell, to define the transmitter side of intercellular connections

receiver_param

a list with arguments for import_omnipath_intercell, to define the receiver side of intercellular connections

resources

A character vector of resources to be applied to both the interactions and the annotations. For example, resources = 'CellChatDB' will download the transmitters and receivers defined by CellChatDB, connected by connections from CellChatDB.

entity_types

Character, possible values are "protein", "complex" or both.

ligand_receptor

Logical. If TRUE, only ligand and receptor annotations will be used instead of the more generic transmitter and receiver categories.

high_confidence

Logical: shortcut to apply filters that keep only higher confidence interactions. The intercell database of OmniPath covers a very broad range of possible ways of cell to cell communication, and the pieces of information, such as localization, topology, function and interaction, are combined from many, often independent sources. This unavoidably results in some unexpected combinations which are false positives in the context of intercellular communication. This option sets some minimum criteria to remove most (but definitely not all!) of the wrong connections. These criteria are the following:
1) the receiver must be plasma membrane transmembrane;
2) the curation effort for interactions must be larger than one;
3) the consensus score for annotations must be larger than the 50th percentile within the generic category (you can override this by consensus_percentile);
4) the transmitter must be secreted or exposed on the plasma membrane;
5) the major localizations have to be supported by at least 30 percent of the relevant resources (you can override this by loc_consensus_percentile);
6) the datasets with a lower level of curation (kinaseextra and pathwayextra) will be disabled.
These criteria are of medium stringency; you can always tune them to be more relaxed or more stringent by filtering manually, using filter_intercell_network.

simplify

Logical: keep only the most often used columns. This function combines a network data frame with two copies of the intercell annotation data frame, each of which already has many columns. With this option, only the names of the interacting pair, their intercellular communication roles, and minimal information about the origin of both the interaction and the annotations are kept.

unique_pairs

Logical: instead of having separate rows for each pair of annotations, drop the annotations and reduce the data frame to unique interacting pairs. See unique_intercell_network for details.

consensus_percentile

Numeric: a percentile cutoff for the consensus score of generic categories in intercell annotations. The consensus score is the number of resources supporting the classification of an entity into a category, based on the combined information of many resources. Here you can apply a cutoff, keeping only the annotations supported by a higher number of resources than a certain percentile of each category. If NULL, no filtering is performed. The value is either in the 0-1 range, or it is divided by 100 if greater than 1. The percentiles are calculated against the generic composite categories and then applied to their resource specific annotations and specific child categories.

loc_consensus_percentile

Numeric: similar to consensus_percentile for major localizations. For example, with a value of 50, the secreted, plasma membrane transmembrane or peripheral attributes will be TRUE only where at least 50 percent of the resources support these.

omnipath

Logical: shortcut to include the omnipath dataset in the interactions query.

ligrecextra

Logical: shortcut to include the ligrecextra dataset in the interactions query.

kinaseextra

Logical: shortcut to include the kinaseextra dataset in the interactions query.

pathwayextra

Logical: shortcut to include the pathwayextra dataset in the interactions query.

...

If simplify or unique_pairs is TRUE, additional column names can be passed here to dplyr::select on the final data frame. Otherwise ignored.
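The defaults kinaseextra = !high_confidence and pathwayextra = !high_confidence in the Usage section rely on R evaluating default arguments lazily, so they follow high_confidence unless set explicitly. The sketch below illustrates this behaviour only; resolve_datasets is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper mirroring the dataset flags of the Usage section.
# The defaults of kinaseextra and pathwayextra are evaluated lazily,
# so they depend on the value of high_confidence at call time.
resolve_datasets <- function(
    high_confidence = FALSE,
    omnipath = TRUE,
    ligrecextra = TRUE,
    kinaseextra = !high_confidence,
    pathwayextra = !high_confidence
) {
    flags <- c(
        omnipath = omnipath,
        ligrecextra = ligrecextra,
        kinaseextra = kinaseextra,
        pathwayextra = pathwayextra
    )
    # Return the names of the datasets that are switched on.
    names(flags)[flags]
}

all_datasets <- resolve_datasets()
strict <- resolve_datasets(high_confidence = TRUE)
# An explicit value overrides the default derived from high_confidence.
strict_plus_kinase <- resolve_datasets(high_confidence = TRUE, kinaseextra = TRUE)
```

With high_confidence = TRUE, only omnipath and ligrecextra remain enabled in this sketch unless one of the other flags is passed explicitly.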

Value

A data frame containing information about protein-protein interactions and the intercellular roles of the proteins involved in those interactions.

Details

By default, this function creates almost the largest possible network of intercellular interactions. However, this network might contain a large number of false positives. Please refer to the documentation of the arguments, especially high_confidence, and to the filter_intercell_network function. Note: if you restrict the query to certain intercell annotation resources or small categories, using the consensus_percentile or high_confidence options is not recommended. Instead, filter the network with filter_intercell_network for more consistent results.
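To make the consensus_percentile cutoff more concrete, here is a minimal base R sketch of a per-category percentile filter. It assumes a toy annotation table with made-up scores, and both helper functions (normalize_percentile, filter_by_consensus) are hypothetical names introduced for this illustration. The package's own filtering may differ in details, for example in how ties at the threshold are handled.

```r
# Hypothetical helper: accept a percentile either in the 0-1 range
# or in the 0-100 range (values above 1 are divided by 100).
normalize_percentile <- function(p) {
    if (is.null(p)) return(NULL)
    if (p > 1) p / 100 else p
}

# Toy annotations: consensus_score stands for the number of supporting
# resources (made-up values).
annot <- data.frame(
    uniprot = c('P1', 'P2', 'P3', 'P4', 'P5', 'P6'),
    category = c('ligand', 'ligand', 'ligand',
                 'receptor', 'receptor', 'receptor'),
    consensus_score = c(1, 4, 9, 2, 3, 7),
    stringsAsFactors = FALSE
)

# Keep only annotations scoring strictly above the given percentile
# of their own category; NULL disables the filter.
filter_by_consensus <- function(annot, percentile) {
    p <- normalize_percentile(percentile)
    if (is.null(p)) return(annot)
    thresholds <- tapply(
        annot$consensus_score,
        annot$category,
        quantile,
        probs = p
    )
    keep <- annot$consensus_score > thresholds[annot$category]
    annot[keep, , drop = FALSE]
}

filtered <- filter_by_consensus(annot, 50)
```

Because the thresholds are computed within each category, a small category with few supporting resources yields a very different cutoff than a large one, which is one reason manual filtering can give more consistent results on restricted queries.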

Examples

intercell_network <- import_intercell_network(
    interactions_param = list(datasets = 'ligrecextra'),
    receiver_param = list(categories = c('receptor', 'transporter')),
    transmitter_param = list(categories = c('ligand', 'secreted_enzyme'))
)