Orthologous gene pairs from Ensembl


ensembl_orthology(
  organism_a = 9606,
  organism_b = 10090,
  attrs_a = NULL,
  attrs_b = NULL,
  colrename = TRUE
)



organism_a
Character or integer: organism name or identifier for the left side organism. We query the Ensembl dataset of this organism and add the orthologues of the other organism to it. Ideally this is the organism you translate from.


organism_b
Character or integer: organism name or identifier for the right side organism. We add orthology information of this organism to the gene records of the left side organism.


attrs_a
Character: further attributes about organism_a genes. These are simply appended to the attribute list of the query.


attrs_b
Character: further attributes about organism_b genes (orthologues). The available attributes are: "associated_gene_name", "chromosome", "chrom_start", "chrom_end", "wga_coverage", "goc_score", "perc_id_r1", "perc_id", "subtype". Attributes included by default: "ensembl_gene", "ensembl_peptide", "canonical_transcript_protein", "orthology_confidence" and "orthology_type".


colrename
Logical: replace the organism-specific prefixes of the organism_b attribute column names, so that the returned table always has the same column names, regardless of the organism. E.g. for mouse these columns all have the prefix "mmusculus_homolog_", which this option changes to "b_".
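
The prefix replacement described for colrename can be illustrated in base R. This is only a minimal sketch, not the package's own implementation: the helper `rename_b_prefix` is hypothetical, and the column names are made up for illustration from the mouse prefix and the default attributes listed above.

```r
# Hypothetical helper: replace an organism-specific homolog prefix by "b_".
rename_b_prefix <- function(cols, prefix) {
    # Anchor the pattern so only a leading prefix is replaced.
    sub(paste0('^', prefix), 'b_', cols)
}

# Illustrative column names as they might look for mouse before renaming.
cols <- c(
    'ensembl_gene_id',
    'mmusculus_homolog_ensembl_gene',
    'mmusculus_homolog_orthology_type'
)

renamed <- rename_b_prefix(cols, 'mmusculus_homolog_')
print(renamed)
```

Columns without the prefix, such as the organism_a identifiers, pass through unchanged, so code written against the "b_" names would work whichever organism was queried.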


A data frame of orthologous gene pairs with gene, transcript and peptide identifiers and confidence values.


Only the records with orthology information are returned. The order of columns is the following: defaults of organism_a, extra attributes of organism_a, defaults of organism_b, extra attributes of organism_b.
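
What dropping the records without orthology information might look like can be sketched on a toy table. The identifiers below are made up, and the criterion used here (a non-missing, non-empty orthologue gene identifier) is an assumption, not necessarily the one the function applies.

```r
# Toy table with made-up identifiers; column names follow the "b_" layout.
toy <- data.frame(
    ensembl_gene_id = c('GENE_A1', 'GENE_A2', 'GENE_A3'),
    b_ensembl_gene = c('GENE_B1', NA, 'GENE_B3'),
    stringsAsFactors = FALSE
)

# Assumption: a record has orthology information if the orthologue
# gene identifier is neither NA nor an empty string.
has_orthology <- !is.na(toy$b_ensembl_gene) & nzchar(toy$b_ensembl_gene)
with_orthology <- toy[has_orthology, , drop = FALSE]
print(with_orthology)
```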


if (FALSE) {
# Orthologues between human (the default organism_a, 9606) and
# Siamese fighting fish, with gene names added on both sides.
sffish <- ensembl_orthology(
    organism_b = 'Siamese fighting fish',
    attrs_a = 'external_gene_name',
    attrs_b = 'associated_gene_name'
)
# # A tibble: 175,608 × 10
#    ensembl_gene_id ensembl_transcript_id ensembl_peptide… external_gene_n…
#    <chr>           <chr>                 <chr>            <chr>
#  1 ENSG00000277196 ENST00000621424       ENSP00000481127  NA
#  2 ENSG00000277196 ENST00000615165       ENSP00000482462  NA
#  3 ENSG00000278817 ENST00000613204       ENSP00000482514  NA
#  4 ENSG00000274847 ENST00000400754       ENSP00000478910  MAFIP
#  5 ENSG00000273748 ENST00000612919       ENSP00000479921  NA
# # … with 175,603 more rows, and 6 more variables:
# #   b_ensembl_peptide <chr>, b_ensembl_gene <chr>,
# #   b_orthology_type <chr>, b_orthology_confidence <dbl>,
# #   b_canonical_transcript_protein <chr>, b_associated_gene_name <chr>
}
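
A table of this shape can be narrowed down to a translation map with base R. The sketch below runs on a toy stand-in rather than a real query result: all identifiers are made up, and the orthology type labels ("ortholog_one2one", "ortholog_one2many") and the 0/1 coding of the confidence column are assumptions about the values Ensembl returns.

```r
# Toy stand-in for the returned table; all values are made up.
orthologs <- data.frame(
    ensembl_gene_id = c('GENE_A1', 'GENE_A2', 'GENE_A3'),
    b_ensembl_gene = c('GENE_B1', 'GENE_B2', 'GENE_B3'),
    b_orthology_type = c(
        'ortholog_one2one', 'ortholog_one2many', 'ortholog_one2one'
    ),
    b_orthology_confidence = c(1, 1, 0),
    stringsAsFactors = FALSE
)

# Keep only one-to-one orthologues flagged as high confidence.
keep <- orthologs$b_orthology_type == 'ortholog_one2one' &
    orthologs$b_orthology_confidence == 1
high_conf <- orthologs[keep, , drop = FALSE]

# Named vector: organism_a gene IDs as names, organism_b gene IDs as values.
a_to_b <- setNames(high_conf$b_ensembl_gene, high_conf$ensembl_gene_id)
print(a_to_b)
```

Restricting to one-to-one pairs keeps the lookup unambiguous. Whether that restriction suits a given analysis is a design choice, not something the function imposes.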