Retrieves NCBI HomoloGene data without any processing. Processed tables are more useful for most purposes; see the other functions below that provide those. Genes of various organisms are grouped into homology groups (the "hgroup" column). Organisms are identified by NCBI Taxonomy IDs, and genes are identified by four different identifier types: Entrez Gene ID, gene symbol, GI number and RefSeq protein accession.
Examples
hg <- homologene_raw()
hg
#> # A tibble: 275,237 × 6
#>    hgroup ncbi_taxid entrez  genesymbol      gi        refseqp
#>     <int>      <int> <chr>   <chr>           <chr>     <chr>
#>  1      3       9606 34      ACADM           4557231   NP_000007.1
#>  2      3       9598 469356  ACADM           160961497 NP_001104286.1
#>  3      3       9544 705168  ACADM           109008502 XP_001101274.1
#>  4      3       9615 490207  ACADM           545503811 XP_005622188.1
#>  5      3       9913 505968  ACADM           115497690 NP_001068703.1
#>  6      3      10090 11364   Acadm           6680618   NP_031408.1
#>  7      3      10116 24158   Acadm           292494885 NP_058682.2
#>  8      3       7955 406283  acadm           390190229 NP_998175.2
#>  9      3       7227 38864   CG12262         24660351  NP_648149.1
#> 10      3       7165 1276346 AgaP_AGAP005662 58387602  XP_315683.2
#> # ℹ 275,227 more rows
# which organisms are available?
common_name(unique(hg$ncbi_taxid))
#>  [1] "Human"                               "Chimpanzee"
#>  [3] "Macaque"                             "Dog"
#>  [5] "Cow"                                 "Mouse"
#>  [7] "Rat"                                 "Zebrafish"
#>  [9] "Drosophila melanogaster (Fruit fly)" NA
#> [11] "Caenorhabditis elegans (PRJNA13758)" "Tropical clawed frog"
#> [13] "Chicken"                             NA
#> [15] NA                                    NA
#> [17] NA                                    NA
#> [19] NA                                    NA
#> [21] NA
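Because every row carries a homology group ID, orthologous genes of two organisms can be paired by joining the table to itself on "hgroup". The following is only a minimal sketch in base R, assuming a small hand-copied subset of the rows printed above rather than a live download; `hg_subset` and the helper `pair_by_hgroup()` are hypothetical names introduced for illustration and are not part of the package.

```r
# A hand-copied subset of the rows printed above (not a live download).
hg_subset <- data.frame(
  hgroup     = c(3L, 3L, 3L, 3L),
  ncbi_taxid = c(9606L, 9598L, 10090L, 10116L),
  entrez     = c("34", "469356", "11364", "24158"),
  genesymbol = c("ACADM", "ACADM", "Acadm", "Acadm"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair the genes of two organisms that fall into
# the same homology group, by joining on the "hgroup" column.
pair_by_hgroup <- function(hg, source_taxid, target_taxid) {
  cols <- c("hgroup", "entrez", "genesymbol")
  src <- hg[hg$ncbi_taxid == source_taxid, cols]
  tgt <- hg[hg$ncbi_taxid == target_taxid, cols]
  merge(src, tgt, by = "hgroup", suffixes = c("_source", "_target"))
}

# Human (9606) to mouse (10090)
human_mouse <- pair_by_hgroup(hg_subset, 9606L, 10090L)
human_mouse
```

On this subset the result is a single row pairing human ACADM (Entrez 34) with mouse Acadm (Entrez 11364). On the full table, a group containing several genes from one organism would yield one row per combination, which is one reason the processed tables are usually the better starting point.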