
Imports the DoRothEA dataset from the OmniPath web service. It contains transcription factor (TF)-target interactions. DoRothEA is a comprehensive resource of transcriptional regulation, consisting of 16 original resources, in silico TFBS prediction, gene expression signatures and ChIP-Seq binding site analysis.


dorothea(
  resources = NULL,
  organism = 9606,
  dorothea_levels = c("A", "B"),
  fields = NULL,
  default_fields = TRUE,
  references_by_resource = TRUE,
  exclude = NULL,
  strict_evidences = TRUE,
  genesymbol_resource = NULL,
  ...
)



resources
Names of resources (databases); interactions not reported in these are removed. See get_interaction_resources for more information.


organism
Character or integer: name or NCBI Taxonomy ID of one or more organisms. The web service currently provides interactions for human, mouse and rat. For other organisms, the data will be translated by orthologous gene pairs from human. In this case, only one organism can be provided.


dorothea_levels
Vector of the confidence levels of the interactions to be downloaded. In DoRothEA, every TF-target interaction has a confidence level ranging from A to E, with A being the most reliable. By default, A and B level interactions are retrieved (`c("A", "B")`). Note that E level interactions are not available in OmnipathR.


fields
Additional fields (columns) to be added to the result. If used, set the next argument, `default_fields`, to FALSE.


default_fields
Logical: whether to include the default fields (columns) for the query type. If FALSE, only the fields defined by the user in the `fields` argument will be added.


references_by_resource
Logical: if FALSE, removes the resource name prefixes from the references (PubMed IDs). The information about which reference comes from which resource is then lost, and each PubMed ID appears only once.
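
As a rough illustration of what dropping the prefixes means, the base R sketch below assumes that each reference string is a semicolon-separated list of `Resource:PMID` entries. The helper `strip_resource_prefix` and the example strings are hypothetical and not part of OmnipathR; the package does this work itself when `references_by_resource = FALSE`.

```r
# Hypothetical helper (not part of OmnipathR): drop the "Resource:" prefix
# from each reference and keep every PubMed ID only once.
# Assumption: entries look like "Resource:PMID" and are separated by ";".
strip_resource_prefix <- function(refs) {
    vapply(
        strsplit(refs, ";", fixed = TRUE),
        function(entries) {
            # Remove everything up to and including the first colon
            pmids <- sub("^[^:]*:", "", entries)
            paste(unique(pmids), collapse = ";")
        },
        character(1)
    )
}

# Made-up reference strings, for illustration only
refs <- c("ResourceA:111;ResourceB:111;ResourceB:222", "ResourceA:333")
strip_resource_prefix(refs)
```

In the first string the ID 111 is reported by two resources, so after stripping the prefixes it is kept once and its per-resource attribution is no longer recoverable.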


exclude
Character: datasets or resources to exclude.


strict_evidences
Logical: restrict the evidences to the queried datasets and resources. If set to FALSE, the directions, effect signs and references might be based on other datasets and resources. For DoRothEA this is undesirable in most applications, hence the default here is TRUE. For most of the other interaction query functions it is `FALSE` by default.


genesymbol_resource
Character: either "uniprot" or "ensembl". The former leaves intact the gene symbols returned by the web service, originally set from UniProt. The latter updates the gene symbols from Ensembl, which uses a slightly different gene symbol standard. In this case a few records will be duplicated where Ensembl provides an ambiguous translation.


...
Optional additional arguments.


A data frame (tibble) of TF-target interactions from DoRothEA.


dorothea_grn <- dorothea(
    resources = c('DoRothEA', 'ARACNe-GTEx_DoRothEA'),
    organism = 9606,
    dorothea_levels = c('A', 'B', 'C')
)
dorothea_grn
#> # A tibble: 32,617 × 16
#>    source target source_genesymbol target_genesymbol is_directed is_stimulation
#>    <chr>  <chr>  <chr>             <chr>                   <int>          <int>
#>  1 P01106 O14746 MYC               TERT                        1              1
#>  2 P84022 P05412 SMAD3             JUN                         1              1
#>  3 Q13485 P05412 SMAD4             JUN                         1              1
#>  4 P08047 P04075 SP1               ALDOA                       1              1
#>  5 P04637 P08069 TP53              IGF1R                       1              0
#>  6 Q05516 P20248 ZBTB16            CCNA2                       1              0
#>  7 Q01196 P08700 RUNX1             IL3                         1              0
#>  8 P42224 P38936 STAT1             CDKN1A                      1              1
#>  9 P40763 P38936 STAT3             CDKN1A                      1              1
#> 10 Q04206 P08183 RELA              ABCB1                       1              1
#> # ℹ 32,607 more rows
#> # ℹ 10 more variables: is_inhibition <int>, consensus_direction <int>,
#> #   consensus_stimulation <int>, consensus_inhibition <int>, sources <chr>,
#> #   references <chr>, curation_effort <int>, dorothea_level <chr>,
#> #   n_references <int>, n_resources <int>
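
Because the result is a regular tibble, it can be filtered and reshaped with ordinary data frame tools. The sketch below does not query the web service: it builds a toy data frame from a few of the rows printed above (the name `toy_grn` is made up for this illustration), keeps the interactions flagged as stimulation, and groups the targets by TF.

```r
# Toy data frame copied from a few rows of the printed output;
# a real result has many more rows and columns.
toy_grn <- data.frame(
    source_genesymbol = c("MYC", "SMAD3", "TP53", "ZBTB16", "STAT1"),
    target_genesymbol = c("TERT", "JUN", "IGF1R", "CCNA2", "CDKN1A"),
    is_stimulation = c(1L, 1L, 0L, 0L, 1L),
    stringsAsFactors = FALSE
)

# Keep only the interactions flagged as stimulation
activating <- toy_grn[toy_grn$is_stimulation == 1L, ]

# Collect the targets of each TF in a named list
targets_by_tf <- split(
    activating$target_genesymbol,
    activating$source_genesymbol
)
targets_by_tf
```

The same subsetting should apply unchanged to the full `dorothea_grn` object, since the column names used here are taken from its printed header.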