Translate a column of identifiers by orthologous gene pairs
Source: R/orthology.R
orthology_translate_column.Rd
Usage
orthology_translate_column(
  data,
  column,
  id_type = NULL,
  target_organism = "mouse",
  source_organism = "human",
  resource = "oma",
  replace = FALSE,
  one_to_many = NULL,
  keep_untranslated = FALSE,
  translate_complexes = FALSE,
  uniprot_by_id_type = "entrez"
)
Arguments
- data
A data frame with the column to be translated.
- column
Name of a character column with identifiers of the source organism of type `id_type`.
- id_type
Type of identifiers in `column`. Available ID types include "uniprot", "entrez", "ensg", "refseq" and "swissprot" for OMA, and "uniprot", "entrez", "genesymbol", "refseq" and "gi" for NCBI HomoloGene. To translate an ID type that is not directly available in your preferred resource, first use `translate_ids` to convert it to an ID type that the orthology resource supports. If not provided, the column name is assumed to be the ID type.
- target_organism
Name or NCBI Taxonomy ID of the target organism.
- source_organism
Name or NCBI Taxonomy ID of the source organism.
- resource
Character: source of the orthology mapping. Currently Orthologous Matrix (OMA) and NCBI HomoloGene are available; refer to them as "oma" and "homologene", respectively.
- replace
Logical or character: replace the column with the translated identifiers, or create a new column. If it is character, it will be used as the name of the new column.
- one_to_many
Integer: maximum number of orthologous pairs for one gene of the source organism. Genes mapping to a higher number of orthologues will be dropped.
- keep_untranslated
Logical: keep records without orthologous pairs. If `replace` is TRUE, this option is ignored, and untranslated records will be dropped. Genes with more than `one_to_many` orthologues will always be dropped.
- translate_complexes
Logical: translate the complexes by translating their components.
- uniprot_by_id_type
Character: translate NCBI HomoloGene to UniProt by this ID type. One of "genesymbol", "entrez", "refseq" or "gi".
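
The interaction between `one_to_many` and `keep_untranslated` described above can be illustrated with a small base R sketch. This is not the package's implementation: the orthology table, the identifiers and the helper `sketch_translate` are all made up for illustration, and the sketch only mimics the documented filtering rules for the case where a new column is created.

```r
# Toy orthology table (made-up identifiers, not real OMA or HomoloGene data).
orthologs <- data.frame(
  source = c("A", "B", "B", "C", "C", "C"),
  target = c("a1", "b1", "b2", "c1", "c2", "c3"),
  stringsAsFactors = FALSE
)

# Input records; "D" has no orthologue in the toy table.
records <- data.frame(id = c("A", "B", "C", "D"), stringsAsFactors = FALSE)

# Hypothetical helper that mimics the documented filtering rules.
sketch_translate <- function(data, pairs, one_to_many = NULL,
                             keep_untranslated = FALSE) {
  if (!is.null(one_to_many)) {
    # Count orthologues per source gene and find genes above the limit.
    n_targets <- table(pairs$source)
    allowed <- names(n_targets)[as.vector(n_targets) <= one_to_many]
    too_many <- setdiff(names(n_targets), allowed)
    # Genes above the limit are removed from the input as well, so
    # keep_untranslated cannot bring them back.
    data <- data[!data$id %in% too_many, , drop = FALSE]
    pairs <- pairs[pairs$source %in% allowed, , drop = FALSE]
  }
  # all.x = TRUE keeps unmatched records, with NA in the new column.
  merge(data, pairs, by.x = "id", by.y = "source", all.x = keep_untranslated)
}

strict <- sketch_translate(records, orthologs, one_to_many = 2)
lenient <- sketch_translate(records, orthologs, one_to_many = 2,
                            keep_untranslated = TRUE)
```

In this sketch, `strict` contains only the translated records for "A" and "B", while `lenient` additionally keeps "D" with a missing value in `target`. "C" is absent from both results because it maps to three orthologues, which exceeds the limit of two.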