Reads the contents of an OBO file and processes it into data frames or a list-based data structure.
Usage
obo_parser(
path,
relations = c("is_a", "part_of", "occurs_in", "regulates", "positively_regulates",
"negatively_regulates"),
shorten_namespace = TRUE,
tables = TRUE
)
Arguments
- path
Path to the OBO file.
- relations
Character vector: process only these relations.
- shorten_namespace
Logical: shorten the namespace to a single letter code (as usual for Gene Ontology, e.g. cellular_component = "C").
- tables
Logical: return data frames (tibbles) instead of nested lists.
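The effect of shorten_namespace can be pictured with a small base R sketch. This is an illustration only, not the package's internal code: the names `go_aspect_codes` and `shorten_ns` are hypothetical, and the mapping assumes the three standard Gene Ontology aspect codes (P, F, C).

```r
# Standard single-letter aspect codes used by Gene Ontology.
# `go_aspect_codes` and `shorten_ns` are hypothetical names for this sketch.
go_aspect_codes <- c(
    biological_process = "P",
    molecular_function = "F",
    cellular_component = "C"
)

shorten_ns <- function(namespace) {
    # Look up each namespace by name; unknown namespaces come back as NA.
    unname(go_aspect_codes[namespace])
}

shorten_ns(c("cellular_component", "biological_process"))
#> [1] "C" "P"
```

How obo_parser treats namespaces from ontologies other than Gene Ontology is not stated here; the sketch simply returns NA for them.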
Value
A list with the following elements: 1) "names", a list with
terms as names and names as values; 2) "namespaces", a list with
terms as names and namespaces as values; 3) "relations", a list with
relations between terms: terms are keys, values are lists with
relations as names and character vectors of related terms as
values; 4) "subsets", a list with terms as keys and character
vectors of subset names as values (or NULL if the term
does not belong to any subset); 5) "obsolete", a character vector
with all the terms labeled as obsolete. If the tables
parameter is TRUE, "names", "namespaces", "relations"
and "subsets" will be data frames (tibbles).
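To make the nested "relations" structure concrete, the following base R sketch builds a toy list of the shape described above and flattens it into one row per term, relation and related term. It is only an assumption-laden illustration: `flatten_relations` and the column names `term`, `relation` and `related` are hypothetical and may not match the tibble that obo_parser returns with tables = TRUE.

```r
# Toy data shaped like the documented "relations" element (tables = FALSE):
# terms are keys, values are lists of relation -> related terms.
relations <- list(
    "GO:0000228" = list(is_a = "GO:0005694", part_of = "GO:0005634"),
    "GO:0000001" = list(is_a = c("GO:0048308", "GO:0048311"))
)

# Flatten to one row per (term, relation, related term).
# The function and column names are hypothetical, chosen for this sketch.
flatten_relations <- function(rels) {
    rows <- lapply(names(rels), function(term) {
        per_rel <- lapply(names(rels[[term]]), function(rel) {
            data.frame(
                term = term,
                relation = rel,
                related = rels[[term]][[rel]],
                stringsAsFactors = FALSE
            )
        })
        do.call(rbind, per_rel)
    })
    do.call(rbind, rows)
}

rel_df <- flatten_relations(relations)
nrow(rel_df)
#> [1] 4
```

A long table like this is convenient for filtering and joining, while the nested list gives direct lookup of one term's relations, e.g. relations[["GO:0000228"]]$part_of.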
Examples
goslim_url <-
"http://current.geneontology.org/ontology/subsets/goslim_generic.obo"
path <- tempfile()
httr::GET(goslim_url, httr::write_disk(path, overwrite = TRUE))
#> Response [http://current.geneontology.org/ontology/subsets/goslim_generic.obo]
#> Date: 2024-04-07 15:00
#> Status: 200
#> Content-Type: text/obo
#> Size: 115 kB
#> <ON DISK> /tmp/RtmpgsvP1J/file165a6038fedfc0
obo <- obo_parser(path, tables = FALSE)
unlink(path)
names(obo)
#> [1] "names" "namespaces" "relations" "subsets" "obsolete"
#> [6] "rel_lst_c2p"
# [1] "names" "namespaces" "relations" "subsets" "obsolete"
head(obo$relations, n = 2)
#> $`GO:0000228`
#> $`GO:0000228`$is_a
#> [1] "GO:0005694"
#>
#> $`GO:0000228`$part_of
#> [1] "GO:0005634"
#>
#>
#> $`GO:0000278`
#> $`GO:0000278`$has_part
#> [1] "GO:0140014"
#>
#>
# $`GO:0000001`
# $`GO:0000001`$is_a
# [1] "GO:0048308" "GO:0048311"
#
# $`GO:0000002`
# $`GO:0000002`$is_a
# [1] "GO:0007005"