Orthologous gene pairs from Ensembl
Usage
ensembl_orthology(
organism_a = 9606,
organism_b = 10090,
attrs_a = NULL,
attrs_b = NULL,
colrename = TRUE
)
Arguments
- organism_a
Character or integer: organism name or identifier for the left side organism. We query the Ensembl dataset of this organism and add the orthologues of the other organism to it. Ideally this is the organism you translate from.
- organism_b
Character or integer: organism name or identifier for the right side organism. We add orthology information of this organism to the gene records of the left side organism.
- attrs_a
Further attributes about organism_a genes. These are simply appended to the list of queried attributes.
- attrs_b
Further attributes about organism_b genes (orthologues). The available attributes are: "associated_gene_name", "chromosome", "chrom_start", "chrom_end", "wga_coverage", "goc_score", "perc_id_r1", "perc_id", "subtype". Attributes included by default: "ensembl_gene", "ensembl_peptide", "canonical_transcript_protein", "orthology_confidence" and "orthology_type".
- colrename
Logical: replace the organism-specific prefix of the organism_b attribute column names, so that the returned table always has the same column names, regardless of the organism. E.g. for mouse these columns all have the prefix "mmusculus_homolog_", which this option changes to "b_".
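The effect of colrename can be sketched on a plain character vector. This is only an illustration of the renaming described above, not the package's actual implementation; the toy column names and the regular expression are assumptions.

```r
# Toy column names as they might come back for mouse (organism_b = 10090).
# These names are illustrative, not output captured from Ensembl.
cols <- c(
    'ensembl_gene_id',
    'mmusculus_homolog_ensembl_gene',
    'mmusculus_homolog_orthology_type'
)

# Replace any "<species>_homolog_" prefix with "b_".
# The pattern is an assumption made for this sketch only.
renamed <- sub('^[a-z]+_homolog_', 'b_', cols)

print(renamed)
```

Columns of organism_a carry no such prefix, so they pass through unchanged.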
Value
A data frame of orthologous gene pairs with gene, transcript and peptide identifiers and confidence values.
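A common follow-up is to keep only confident one-to-one pairs. The sketch below uses a toy data frame in place of a real result: the identifiers are placeholders, and the value labels ("ortholog_one2one", confidence coded as 0 or 1) are assumptions about how Ensembl encodes these fields, so check them against your own output first.

```r
# Toy stand-in for a result of ensembl_orthology(); all values are made up.
orthologs <- data.frame(
    ensembl_gene_id = c('GENE_A1', 'GENE_A2', 'GENE_A3'),
    b_ensembl_gene = c('GENE_B1', 'GENE_B2', 'GENE_B3'),
    b_orthology_type = c(
        'ortholog_one2one',
        'ortholog_one2many',
        'ortholog_one2one'
    ),
    b_orthology_confidence = c(1, 1, 0),
    stringsAsFactors = FALSE
)

# Keep rows that are one-to-one orthologues with high confidence.
one2one_confident <- orthologs[
    orthologs$b_orthology_type == 'ortholog_one2one' &
        orthologs$b_orthology_confidence == 1,
]

print(one2one_confident)
```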
Details
Only the records with orthology information are returned. The order of columns is the following: defaults of organism_a, extra attributes of organism_a, defaults of organism_b, extra attributes of organism_b.
Examples
if (FALSE) {
sffish <- ensembl_orthology(
organism_b = 'Siamese fighting fish',
attrs_a = 'external_gene_name',
attrs_b = 'associated_gene_name'
)
sffish
# # A tibble: 175,608 × 10
# ensembl_gene_id ensembl_transcript_id ensembl_peptide… external_gene_n…
# <chr> <chr> <chr> <chr>
# 1 ENSG00000277196 ENST00000621424 ENSP00000481127 NA
# 2 ENSG00000277196 ENST00000615165 ENSP00000482462 NA
# 3 ENSG00000278817 ENST00000613204 ENSP00000482514 NA
# 4 ENSG00000274847 ENST00000400754 ENSP00000478910 MAFIP
# 5 ENSG00000273748 ENST00000612919 ENSP00000479921 NA
# # … with 175,603 more rows, and 6 more variables:
# # b_ensembl_peptide <chr>, b_ensembl_gene <chr>,
# # b_orthology_type <chr>, b_orthology_confidence <dbl>,
# # b_canonical_transcript_protein <chr>, b_associated_gene_name <chr>
#
}