
The "locus-to-gene" (L2G) model prioritizes likely causal genes at each GWAS locus using features derived from genetic and functional genomics data. The main categories of predictive features are:

  • Distance: Distance from credible set variants to the gene.

  • Molecular QTL colocalization: Colocalization with molecular QTLs.

  • Chromatin interaction: Interactions, such as promoter-capture Hi-C.

  • Variant pathogenicity: Pathogenicity scores from VEP (Variant Effect Predictor).

Usage

studiesAndLeadVariantsForGeneByL2G(gene, l2g = NA, pvalue = NA, vtype = NULL)

Arguments

gene

Character: Gene Ensembl ID (e.g. ENSG00000169174) or gene symbol (e.g. PCSK9). A character vector of several genes is also accepted.

l2g

Numeric: Locus-to-gene (L2G) cutoff score. (Default: NA)

pvalue

Numeric: P-value cutoff (e.g. 1e-8). (Default: NA)

vtype

Character: Most severe consequence(s) used to filter variants by type, including "intergenic_variant", "upstream_gene_variant", "intron_variant", "missense_variant", "5_prime_UTR_variant", "non_coding_transcript_exon_variant", "splice_region_variant". A character vector of several types is accepted. (Default: NULL)

Value

Returns a data frame containing the input gene ID(s) together with the associated studies, lead variants, and L2G model data. The table consists of the following columns:

  • yProbaModel: Numeric. L2G score.

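  • yProbaDistance: Numeric. Model score for the distance feature category.

  • yProbaInteraction: Numeric. Model score for the chromatin interaction feature category.

  • yProbaMolecularQTL: Numeric. Model score for the molecular QTL feature category.

  • yProbaPathogenicity: Numeric. Model score for the variant pathogenicity feature category.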

  • pval: Numeric. P-value.

  • beta.direction: Character. Beta direction.

  • beta.betaCI: Numeric. Beta confidence interval.

  • beta.betaCILower: Numeric. Lower bound of the beta confidence interval.

  • beta.betaCIUpper: Numeric. Upper bound of the beta confidence interval.

  • odds.oddsCI: Numeric. Odds ratio confidence interval.

  • odds.oddsCILower: Numeric. Lower bound of the odds ratio confidence interval.

  • odds.oddsCIUpper: Numeric. Upper bound of the odds ratio confidence interval.

  • study.studyId: Character. Study ID.

  • study.traitReported: Character. Reported trait.

  • study.traitCategory: Character. Trait category.

  • study.pubDate: Character. Publication date.

  • study.pubTitle: Character. Publication title.

  • study.pubAuthor: Character. Publication author.

  • study.pubJournal: Character. Publication journal.

  • study.pmid: Character. PubMed ID.

  • study.hasSumstats: Logical. Indicates if the study has summary statistics.

  • study.nCases: Integer. Number of cases in the study.

  • study.numAssocLoci: Integer. Number of associated loci.

  • study.nTotal: Integer. Total number of samples in the study.

  • study.traitEfos: Character. Trait EFOs.

  • variant.id: Character. Variant ID.

  • variant.rsId: Character. Variant rsID.

  • variant.chromosome: Character. Variant chromosome.

  • variant.position: Integer. Variant position.

  • variant.refAllele: Character. Variant reference allele.

  • variant.altAllele: Character. Variant alternate allele.

  • variant.nearestCodingGeneDistance: Integer. Distance to the nearest coding gene.

  • variant.nearestGeneDistance: Integer. Distance to the nearest gene.

  • variant.mostSevereConsequence: Character. Most severe consequence.

  • variant.nearestGene.id: Character. Nearest gene ID.

  • variant.nearestCodingGene.id: Character. Nearest coding gene ID.

  • ensembl_id: Character. Ensembl ID.

  • gene_symbol: Character. Gene symbol.
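Once a result is in hand, a common next step is to rank the rows by L2G score and keep a compact set of columns. The sketch below uses a small hypothetical data frame (`mock_result`, with made-up values and study IDs) that only imitates a few of the columns listed above, so it runs without network access.

```r
# Hypothetical stand-in for the returned data frame. The values and study
# IDs are made up for illustration; only the column names follow the list above.
mock_result <- data.frame(
  yProbaModel = c(0.62, 0.91, 0.75),
  pval = c(3e-9, 1e-12, 4e-8),
  study.studyId = c("STUDY_A", "STUDY_B", "STUDY_C"),
  variant.mostSevereConsequence = c("intron_variant", "missense_variant",
                                    "intergenic_variant"),
  gene_symbol = "PCSK9",
  stringsAsFactors = FALSE
)

# Rank associations from highest to lowest L2G score.
ranked <- mock_result[order(mock_result$yProbaModel, decreasing = TRUE), ]

# Keep a compact view of the columns of interest.
top_hits <- ranked[, c("gene_symbol", "study.studyId", "yProbaModel", "pval")]
print(top_hits)
```

The same indexing should apply to a real result, provided the columns named here are present in it.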

Details

The function also provides optional filtering parameters (l2g, pvalue, and vtype; see Arguments) to narrow the results.
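To make the effect of the three filters concrete, here is a minimal sketch that applies them to a hypothetical data frame. The helper `filter_l2g_sketch` is not part of the package, and the comparison directions (L2G score at or above the cutoff, p-value at or below the cutoff) are assumptions, since the documentation does not state them.

```r
# Hypothetical data imitating a few of the returned columns (made-up values).
mock_result <- data.frame(
  yProbaModel = c(0.62, 0.91, 0.75, 0.80),
  pval = c(3e-9, 1e-12, 4e-8, 2e-10),
  study.studyId = c("STUDY_A", "STUDY_B", "STUDY_C", "STUDY_D"),
  variant.mostSevereConsequence = c("intron_variant", "missense_variant",
                                    "intergenic_variant", "intron_variant"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not the package's implementation: each filter is
# applied only when its argument is supplied (not NA / not NULL).
filter_l2g_sketch <- function(df, l2g = NA, pvalue = NA, vtype = NULL) {
  keep <- rep(TRUE, nrow(df))
  # Assumed: keep rows whose L2G score is at or above the cutoff.
  if (!is.na(l2g)) keep <- keep & df$yProbaModel >= l2g
  # Assumed: keep rows whose p-value is at or below the cutoff.
  if (!is.na(pvalue)) keep <- keep & df$pval <= pvalue
  # Keep rows whose most severe consequence is one of the requested types.
  if (!is.null(vtype)) keep <- keep & df$variant.mostSevereConsequence %in% vtype
  df[keep, , drop = FALSE]
}

filtered <- filter_l2g_sketch(mock_result, l2g = 0.7, pvalue = 1e-8,
                              vtype = c("intergenic_variant", "intron_variant"))
print(filtered)
```

Leaving an argument at its default skips that filter, which mirrors the NA and NULL defaults in the usage signature.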

Examples

if (FALSE) { # \dontrun{
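# Query several genes by Ensembl ID with an L2G cutoff of 0.7
result <- studiesAndLeadVariantsForGeneByL2G(gene = c("ENSG00000163946",
     "ENSG00000169174", "ENSG00000143001"), l2g = 0.7)
# Combine the L2G, p-value, and variant-type filters for a single gene
result <- studiesAndLeadVariantsForGeneByL2G(gene = "ENSG00000169174",
     l2g = 0.6, pvalue = 1e-8, vtype = c("intergenic_variant", "intron_variant"))
# Query by gene symbol without any filters
result <- studiesAndLeadVariantsForGeneByL2G(gene = "TMEM61")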
} # }