
Downloads one pathway diagram from the KEGG Pathways database in KGML format and processes the XML to extract the interactions.

Usage

kegg_pathway_download(
  pathway_id,
  process = TRUE,
  max_expansion = NULL,
  simplify = FALSE
)

Arguments

pathway_id

Character: a KEGG pathway identifier, for example "hsa04350".

process

Logical: process the data or return it in raw format. Processing means joining the entries and relations into a single data frame and adding UniProt IDs.

max_expansion

Numeric: the maximum number of relations derived from a single relation record. Because one entry might represent more than one molecular entity, a single relation record can yield a large number of relations during processing. The expansion is combinatorial: for example, if the two entries represent 3 and 4 entities, the result is 12 relations. If NULL, all relations will be expanded.
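The combinatorial expansion described above can be illustrated with a small base R sketch. It is not the package's implementation: the entity names and the helper within_limit() are hypothetical, and the helper only checks whether a record would stay within a given cap.

```r
# Hypothetical entities behind the two entries of one relation record.
source_entities <- c("src_a", "src_b", "src_c")           # 3 entities
target_entities <- c("tgt_a", "tgt_b", "tgt_c", "tgt_d")  # 4 entities

# Every source entity is paired with every target entity.
expanded <- expand.grid(
  source = source_entities,
  target = target_entities,
  stringsAsFactors = FALSE
)
nrow(expanded)  # 3 * 4 = 12

# Hypothetical helper: would the expansion stay within max_expansion?
# NULL means no limit, mirroring the documented default.
within_limit <- function(n_source, n_target, max_expansion = NULL) {
  is.null(max_expansion) || n_source * n_target <= max_expansion
}

within_limit(3, 4)                      # TRUE, no limit set
within_limit(3, 4, max_expansion = 10)  # FALSE, 12 exceeds 10
```

The number of derived relations grows as the product of the two entry sizes, which is why a cap can be useful for entries that represent many entities.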

simplify

Logical: remove KEGG's internal identifiers and the pathway annotations, keep only unique interactions with direction and effect sign.
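The idea behind simplify can be sketched on a toy table in base R. This is only an illustration under assumptions: the column names (internal_id, from, to, sign) and all values are made up, and the columns the function actually keeps are not listed here.

```r
# Toy interaction table; the first two rows differ only in the internal ID.
interactions <- data.frame(
  internal_id = c("rec:1", "rec:2", "rec:3"),
  from        = c("protein_a", "protein_a", "protein_c"),
  to          = c("protein_b", "protein_b", "protein_d"),
  sign        = c("positive", "positive", "negative"),
  stringsAsFactors = FALSE
)

# Drop the internal identifier, then keep each directed, signed
# interaction only once.
keep <- c("from", "to", "sign")
simplified <- unique(interactions[, keep])
rownames(simplified) <- NULL

nrow(simplified)  # 2
```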

Value

A data frame (tibble) of interactions if process is TRUE, otherwise a list with two data frames: "entries" is a raw table of the entries while "relations" is a table of relations extracted from the KGML file.

Examples

tgf_pathway <- kegg_pathway_download('hsa04350')
tgf_pathway
#> # A tibble: 55 × 12
#>    source target type  effect arrow relation_id kegg_id_source genesymbol_source
#>    <chr>  <chr>  <chr> <chr>  <chr> <chr>       <chr>          <chr>            
#>  1 51     49     PPrel activ… -->   hsa04350:1  hsa:7040 hsa:… TGFB1            
#>  2 57     55     PPrel activ… -->   hsa04350:2  hsa:151449 hs… GDF7             
#>  3 34     32     PPrel activ… -->   hsa04350:3  hsa:3624 hsa:… INHBA            
#>  4 20     17     PPrel activ… -->   hsa04350:4  hsa:4838       NODAL            
#>  5 60     46     PPrel activ… -->   hsa04350:5  hsa:4086 hsa:… SMAD1            
#>  6 43     41     PPrel activ… -->   hsa04350:6  hsa:4087 hsa:… SMAD2            
#>  7 22     16     PPrel activ… -->   hsa04350:7  hsa:4087 hsa:… SMAD2            
#>  8 19     15     PPrel activ… -->   hsa04350:8  hsa:4087 hsa:… SMAD2            
#>  9 27     26     PPrel inhib… --|   hsa04350:11 hsa:4609       MYC              
#> 10 47     43     PPrel inhib… --|   hsa04350:13 hsa:4091 hsa:… SMAD6            
#> # ℹ 45 more rows
#> # ℹ 4 more variables: uniprot_source <chr>, kegg_id_target <chr>,
#> #   genesymbol_target <chr>, uniprot_target <chr>
# # A tibble: 50 x 12
#    source target type  effect arrow relation_id kegg_id_source
#    <chr>  <chr>  <chr> <chr>  <chr> <chr>       <chr>
#  1 51     49     PPrel activ. -->   hsa04350:1  hsa:7040 hsa:.
#  2 57     55     PPrel activ. -->   hsa04350:2  hsa:151449 hs.
#  3 34     32     PPrel activ. -->   hsa04350:3  hsa:3624 hsa:.
#  4 20     17     PPrel activ. -->   hsa04350:4  hsa:4838
#  5 60     46     PPrel activ. -->   hsa04350:5  hsa:4086 hsa:.
# # . with 45 more rows, and 5 more variables: genesymbol_source <chr>,
# #   uniprot_source <chr>, kegg_id_target <chr>,
# #   genesymbol_target <chr>, uniprot_target <chr>