
Reads the contents of an OBO file and processes it into data frames or a list-based data structure.

Usage

obo_parser(
  path,
  relations = c("is_a", "part_of", "occurs_in", "regulates", "positively_regulates",
    "negatively_regulates"),
  shorten_namespace = TRUE,
  tables = TRUE
)

Arguments

path

Path to the OBO file.

relations

Character vector: process only these relations.

shorten_namespace

Logical: shorten the namespace to a single-letter code (as usual for Gene Ontology, e.g. cellular_component = "C").

tables

Logical: return data frames (tibbles) instead of nested lists.
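To make the effect of shorten_namespace concrete, the following base R sketch maps the three Gene Ontology aspect names to their conventional single-letter codes. The helper shorten_go_namespace is hypothetical and written only for illustration; it is not part of the package and does not claim to reproduce how obo_parser performs the conversion internally.

```r
# Conventional single-letter codes for the three Gene Ontology aspects.
go_namespace_codes <- c(
    biological_process = "P",
    molecular_function = "F",
    cellular_component = "C"
)

# Hypothetical helper (not part of the package): translate namespace
# names to single-letter codes, leaving unknown namespaces unchanged.
shorten_go_namespace <- function(namespaces) {
    short <- unname(go_namespace_codes[namespaces])
    ifelse(is.na(short), namespaces, short)
}

shorten_go_namespace(c("cellular_component", "biological_process"))
#> [1] "C" "P"
```

Namespaces without a known code pass through unchanged in this sketch, which keeps it usable on ontologies other than Gene Ontology.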

Value

A list with the following elements: 1) "names": a list with term IDs as names and term names as values; 2) "namespaces": a list with term IDs as names and namespaces as values; 3) "relations": a list of relations between terms, where term IDs are keys and values are lists with relation types as names and character vectors of related terms as values; 4) "subsets": a list with term IDs as keys and character vectors of subset names as values (or NULL if the term does not belong to any subset); 5) "obsolete": a character vector of all terms labeled as obsolete. If the tables parameter is TRUE, "names", "namespaces", "relations" and "subsets" will be data frames (tibbles).
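The nested-list shape described above can be illustrated with a small hand-built object. The sketch below is a mock: obo_mock and all of its values (including the subset name and the obsolete ID, which is a placeholder) are assumptions made for illustration and were not produced by obo_parser.

```r
# Mock object mirroring the documented shape for tables = FALSE.
# All values are illustrative; they were not produced by obo_parser().
obo_mock <- list(
    names = list(
        `GO:0000228` = "nuclear chromosome",
        `GO:0005634` = "nucleus"
    ),
    namespaces = list(`GO:0000228` = "C", `GO:0005634` = "C"),
    relations = list(
        `GO:0000228` = list(is_a = "GO:0005694", part_of = "GO:0005634")
    ),
    subsets = list(
        `GO:0005634` = "goslim_generic",  # illustrative subset name
        `GO:0000228` = NULL               # term without any subset
    ),
    obsolete = "GO:9999999"               # placeholder ID, not a real term
)

term <- "GO:0000228"
obo_mock$names[[term]]               # name of the term
obo_mock$namespaces[[term]]          # its (shortened) namespace
obo_mock$relations[[term]]$part_of   # terms it is part of
is.null(obo_mock$subsets[[term]])    # TRUE when the term has no subset
term %in% obo_mock$obsolete          # whether the term is obsolete
```

Because every element is keyed by term ID, a lookup with `[[` is enough to retrieve the information for a single term.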

Examples

goslim_url <-
    "http://current.geneontology.org/ontology/subsets/goslim_generic.obo"
path <- tempfile()
httr::GET(goslim_url, httr::write_disk(path, overwrite = TRUE))
#> Response [http://current.geneontology.org/ontology/subsets/goslim_generic.obo]
#>   Date: 2024-04-07 15:00
#>   Status: 200
#>   Content-Type: text/obo
#>   Size: 115 kB
#> <ON DISK>  /tmp/RtmpgsvP1J/file165a6038fedfc0
obo <- obo_parser(path, tables = FALSE)
unlink(path)
names(obo)
#> [1] "names"       "namespaces"  "relations"   "subsets"     "obsolete"   
#> [6] "rel_lst_c2p"
head(obo$relations, n = 2)
#> $`GO:0000228`
#> $`GO:0000228`$is_a
#> [1] "GO:0005694"
#> 
#> $`GO:0000228`$part_of
#> [1] "GO:0005634"
#> 
#> 
#> $`GO:0000278`
#> $`GO:0000278`$has_part
#> [1] "GO:0140014"
#> 
#> 
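When working with tables = FALSE, it can be convenient to flatten the nested relations list into one row per term, relation and related term. The base R sketch below does this for the two entries printed above; the column names term, relation and related are assumptions and need not match the columns returned with tables = TRUE.

```r
# The two entries shown in the printed output above.
relations <- list(
    `GO:0000228` = list(is_a = "GO:0005694", part_of = "GO:0005634"),
    `GO:0000278` = list(has_part = "GO:0140014")
)

# Build one row per (term, relation, related term) triple.
rows <- lapply(names(relations), function(term) {
    rels <- relations[[term]]
    n <- sum(lengths(rels))
    data.frame(
        term = rep(term, n),
        relation = rep(names(rels), lengths(rels)),
        related = unlist(rels, use.names = FALSE),
        stringsAsFactors = FALSE
    )
})

relations_long <- do.call(rbind, rows)
relations_long
```

Repeating each relation name by the length of its vector keeps the columns aligned when a term has several related terms under the same relation.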