Downloads one pathway diagram from the KEGG Pathways database in KGML format and processes the XML to extract the interactions.
Arguments
- pathway_id
Character: a KEGG pathway identifier, for example "hsa04350".
- process
Logical: process the data or return it in raw format. Processing means joining the entries and relations into a single data frame and adding UniProt IDs.
- max_expansion
Numeric: the maximum number of relations derived from a single relation record. As one entry might represent more than one molecular entity, one relation might yield a large number of relations in the processing. This happens in a combinatorial way, e.g. if the two entries represent 3 and 4 entities, that results in 12 relations. If NULL, all relations will be expanded.
- simplify
Logical: remove KEGG's internal identifiers and the pathway annotations, keep only unique interactions with direction and effect sign.
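The combinatorial expansion mentioned under max_expansion can be illustrated with a small base R sketch. The entry members and the limit below are made-up placeholders, and the code only counts the pairs; it is not the package's implementation and does not show what the function does once the limit is exceeded.

```r
# Hypothetical members of two KGML entries (placeholder identifiers).
entry_a <- c("a1", "a2", "a3")
entry_b <- c("b1", "b2", "b3", "b4")

# Each member of the source entry is paired with each member of the target.
expanded <- expand.grid(
    source = entry_a,
    target = entry_b,
    stringsAsFactors = FALSE
)

n_expanded <- nrow(expanded)  # 3 * 4 pairs

# A hypothetical limit, standing in for a max_expansion value.
limit <- 10
exceeds_limit <- n_expanded > limit
```

Here one relation record between a 3-member and a 4-member entry gives 12 pairwise relations, which is more than the assumed limit of 10.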
Value
A data frame (tibble) of interactions if process is TRUE, otherwise a list with two data frames: "entries" is a raw table of the entries while "relations" is a table of relations extracted from the KGML file.
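To give a rough idea of what joining entries and relations into one data frame involves, here is a minimal base R sketch on toy tables. The column names id and kegg_id and all values are assumptions made for illustration; the actual raw tables returned with process = FALSE may be laid out differently, and the package's own processing also adds UniProt IDs, which is not shown here.

```r
# Toy stand-ins for the raw "entries" and "relations" tables (made-up values).
entries <- data.frame(
    id = c("e1", "e2"),
    kegg_id = c("hsa:A", "hsa:B"),
    stringsAsFactors = FALSE
)
relations <- data.frame(
    source = "e1",
    target = "e2",
    type = "PPrel",
    stringsAsFactors = FALSE
)

# Attach the entry annotation for the source side of each relation.
joined <- merge(relations, entries, by.x = "source", by.y = "id")
names(joined)[names(joined) == "kegg_id"] <- "kegg_id_source"

# Repeat for the target side.
joined <- merge(joined, entries, by.x = "target", by.y = "id")
names(joined)[names(joined) == "kegg_id"] <- "kegg_id_target"

joined
```

The result has one row per relation, with the source and target entry identifiers resolved to their KEGG IDs, similar in spirit to the kegg_id_source and kegg_id_target columns of the processed output.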
Examples
tgf_pathway <- kegg_pathway_download('hsa04350')
tgf_pathway
#> # A tibble: 55 × 12
#> source target type effect arrow relation_id kegg_id_source genesymbol_source
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 51 49 PPrel activ… --> hsa04350:1 hsa:7040 hsa:… TGFB1
#> 2 57 55 PPrel activ… --> hsa04350:2 hsa:151449 hs… GDF7
#> 3 34 32 PPrel activ… --> hsa04350:3 hsa:3624 hsa:… INHBA
#> 4 20 17 PPrel activ… --> hsa04350:4 hsa:4838 NODAL
#> 5 60 46 PPrel activ… --> hsa04350:5 hsa:4086 hsa:… SMAD1
#> 6 43 41 PPrel activ… --> hsa04350:6 hsa:4087 hsa:… SMAD2
#> 7 22 16 PPrel activ… --> hsa04350:7 hsa:4087 hsa:… SMAD2
#> 8 19 15 PPrel activ… --> hsa04350:8 hsa:4087 hsa:… SMAD2
#> 9 27 26 PPrel inhib… --| hsa04350:11 hsa:4609 MYC
#> 10 47 43 PPrel inhib… --| hsa04350:13 hsa:4091 hsa:… SMAD6
#> # ℹ 45 more rows
#> # ℹ 4 more variables: uniprot_source <chr>, kegg_id_target <chr>,
#> # genesymbol_target <chr>, uniprot_target <chr>