ryansunwork

GBJ - Generalized Berk-Jones Test for Set-Based Inference in Genetic Association Studies

Offers the Generalized Berk-Jones (GBJ) test for set-based inference in genetic association studies. The GBJ is designed as an alternative to tests such as Berk-Jones (BJ), Higher Criticism (HC), Generalized Higher Criticism (GHC), Minimum p-value (minP), and Sequence Kernel Association Test (SKAT). All of these other methods (except for SKAT) are also implemented in this package, and we additionally provide an omnibus test (OMNI) which integrates information from each of the tests. The GBJ has been shown to outperform other tests in genetic association studies when signals are correlated and moderately sparse. Please see the vignette for a quickstart guide or Sun and Lin (2017) <arXiv:1710.02469> for more details.

cpp

4.69 score 2 stars 1 dependents 33 scripts 236 downloads
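
To make the comparison tests named in the GBJ description more concrete, here is a minimal Python sketch of two of them, minP and Higher Criticism, computed from a set of Z-statistics. It is an illustration only: it is not the GBJ package's R interface, the function names are hypothetical, it uses one common form of the HC statistic, and it assumes independent test statistics, ignoring the correlation between them.

```python
import math

def two_sided_pvalue(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def min_p(z_stats):
    """Smallest marginal p-value in the set (the minP statistic)."""
    return min(two_sided_pvalue(z) for z in z_stats)

def higher_criticism(z_stats):
    """One common form of the Higher Criticism statistic.

    Assumes independent statistics (an assumption of this sketch):
    max over i of sqrt(d) * (i/d - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(d) are the ordered p-values.
    """
    d = len(z_stats)
    pvals = sorted(two_sided_pvalue(z) for z in z_stats)
    best = -math.inf
    for i, p in enumerate(pvals, start=1):
        if 0.0 < p < 1.0:  # skip degenerate p-values to avoid dividing by zero
            hc_i = math.sqrt(d) * (i / d - p) / math.sqrt(p * (1.0 - p))
            best = max(best, hc_i)
    return best

# Hypothetical toy sets of Z-statistics for one gene with four SNPs.
null_like = [0.1, -0.3, 0.5, 0.2]
with_signal = [3.5, -3.2, 0.5, 0.2]
print(higher_criticism(null_like), higher_criticism(with_signal))
print(min_p(null_like), min_p(with_signal))
```

Both statistics only summarize the set; turning them into a set-level p-value requires the null distribution of the statistic, which this sketch does not attempt.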

csmGmm - Conditionally Symmetric Multidimensional Gaussian Mixture Model

Implements the conditionally symmetric multidimensional Gaussian mixture model (csmGmm) for large-scale testing of composite null hypotheses in genetic association applications such as mediation analysis, pleiotropy analysis, and replication analysis. In such analyses, we typically have J sets of K test statistics where K is a small number (e.g. 2 or 3) and J is large (e.g. 1 million). For each one of the J sets, we want to know if we can reject all K individual nulls. Please see the vignette for a quickstart guide. The paper describing these methods is "Testing a Large Number of Composite Null Hypotheses Using Conditionally Symmetric Multidimensional Gaussian Mixtures in Genome-Wide Studies" by Sun R, McCaw Z, & Lin X (Journal of the American Statistical Association 2025, <doi:10.1080/01621459.2024.2422124>).

3.83 score 135 scripts 521 downloads
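
The composite null setting in the csmGmm description (J sets of K statistics, rejecting only when all K individual nulls can be rejected) can be illustrated with a naive baseline. The Python sketch below uses the classical max-p rule with a Bonferroni correction across the J sets. This is not the csmGmm method and not the package's interface; the function names and toy data are hypothetical, and the rule is known to be conservative.

```python
import math

def two_sided_pvalue(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def max_p_composite(z_sets, alpha=0.05):
    """Naive composite-null test for J sets of K Z-statistics.

    A set is rejected only if every one of its K p-values is below
    alpha / J, i.e. if its largest p-value clears the Bonferroni threshold.
    Returns a list of (max_p, reject) pairs, one per set.
    """
    threshold = alpha / len(z_sets)
    results = []
    for z_set in z_sets:
        max_p = max(two_sided_pvalue(z) for z in z_set)
        results.append((max_p, max_p < threshold))
    return results

# Hypothetical toy data: J = 3 sets, K = 2 statistics per set.
z_sets = [(4.5, 5.0), (4.5, 0.3), (0.1, -0.2)]
rejections = [reject for _, reject in max_p_composite(z_sets)]
print(rejections)
```

Only the first set has strong evidence in both coordinates, so it is the only one rejected. The second set shows why the problem is hard: one large statistic alone is not enough.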

ICSKAT - Interval-Censored Sequence Kernel Association Test

Implements the Interval-Censored Sequence Kernel Association (ICSKAT) test for testing the association between interval-censored time-to-event outcomes and groups of single nucleotide polymorphisms (SNPs). Interval-censored time-to-event data occur when the event time is not known exactly but can be deduced to fall within a given interval. For example, some medical conditions like bone mineral density deficiency are generally only diagnosed at clinical visits. If a patient goes for clinical checkups yearly and is diagnosed at, say, age 30, then the onset of the deficiency is only known to fall between the date of their age 29 checkup and the date of their age 30 checkup. Interval-censored data include right- and left-censored data as special cases. This package also implements the interval-censored Burden test and the ICSKATO test, which is the optimal combination of the ICSKAT and Burden tests. Please see the vignette for a quickstart guide. The paper describing these methods is "Inference for Set-Based Effects in Genetic Association Studies with Interval-Censored Outcomes" by Sun R, Zhu L, Li Y, Yasui Y, & Robison L (Biometrics 2023, <doi:10.1111/biom.13636>).

cpp

2.48 score 1 dependents 3 scripts 225 downloads
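
The checkup example in the ICSKAT description can be written down as data. The Python sketch below shows one possible way to turn a visit history into an interval-censored observation and to recognize right- and left-censoring as special cases. The class and function names are hypothetical, time 0 is assumed to be the time origin (for example birth), and none of this reflects the ICSKAT package's own input format.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class IntervalObservation:
    left: float   # last time the subject was known to be event-free
    right: float  # first time the event was known to have occurred (inf if never)

    def kind(self):
        if math.isinf(self.right):
            return "right-censored"      # event not seen by the last visit
        if self.left == 0.0:
            return "left-censored"       # event already present at first visit
        return "interval-censored"

def interval_from_visits(visits):
    """Build an observation from (age, diagnosed) pairs, one per checkup."""
    left, right = 0.0, math.inf
    for age, diagnosed in sorted(visits):
        if diagnosed:
            right = age  # onset happened at or before this visit
            break
        left = age       # still event-free at this visit
    return IntervalObservation(left, right)

# Yearly checkups, diagnosed at the age 30 visit: onset lies in (29, 30].
obs = interval_from_visits([(28, False), (29, False), (30, True)])
print(obs, obs.kind())
```

A subject never diagnosed yields a right endpoint of infinity, and a subject diagnosed at the first visit yields a left endpoint of 0, which is how the two special cases arise.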

reconstructKM - Reconstruct Individual-Level Data from Published KM Plots

Functions for reconstructing individual-level data (time, status, arm) from Kaplan-Meier curves published in academic journals (e.g. NEJM, JCO, JAMA). The individual-level data can be used for re-analysis, meta-analysis, methodology development, etc. This package was used to generate the data for commentary such as Sun, Rich, & Wei (2018) <doi:10.1056/NEJMc1808567>. Please see the vignette for a quickstart guide.

2.30 score 2 scripts 259 downloads
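
To show what individual-level (time, status, arm) data look like and why they are useful, the Python sketch below recomputes a Kaplan-Meier curve per arm from such rows. It covers only the re-analysis step, not the reconstruction from a published plot, and it is not the reconstructKM interface; the function names and the toy rows are hypothetical.

```python
def kaplan_meier(times, statuses):
    """Kaplan-Meier estimate from individual-level data.

    status 1 = event observed, 0 = censored. Returns a list of
    (event_time, survival_probability) pairs, one per distinct event time.
    """
    data = sorted(zip(times, statuses))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Collect everyone whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events > 0:
            surv *= 1.0 - events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving  # events and censorings both leave the risk set
    return curve

def kaplan_meier_by_arm(rows):
    """rows: iterable of (time, status, arm). Returns {arm: curve}."""
    by_arm = {}
    for time, status, arm in rows:
        by_arm.setdefault(arm, []).append((time, status))
    return {arm: kaplan_meier(*zip(*pairs)) for arm, pairs in by_arm.items()}

# Hypothetical toy rows in (time, status, arm) form.
rows = [(1, 1, "A"), (2, 1, "A"), (2, 0, "A"), (3, 1, "A"), (4, 0, "A"),
        (2, 1, "B"), (5, 0, "B")]
curves = kaplan_meier_by_arm(rows)
print(curves)
```

With rows in this form, the same data could also feed other analyses that need individual records rather than a plotted curve.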

GEint - Misspecified Models for Gene-Environment Interaction

The first major functionality is to compute the bias in regression coefficients of misspecified linear gene-environment interaction models. The most general function for this objective is GE_bias(). However, GE_bias() requires specification of many higher order moments of covariates in the model. If users are unsure about how to calculate or estimate these higher order moments, it may be easier to use GE_bias_normal_squaredmis(). This function places many more assumptions on the covariates (most notably that they are all jointly generated from a multivariate normal distribution) and is thus able to calculate many of the higher order moments automatically, requiring only that the user specify some covariances. There are also functions to solve for the bias through simulation and non-linear equation solvers; these can be used to check your work. The second major functionality is to implement the Bootstrap Inference with Correct Sandwich (BICS) testing procedure, which we have found to provide better finite-sample performance than other inference procedures for testing GxE interaction. More details on these functions are available in Sun, Carroll, Christiani, and Lin (2018) <doi:10.1111/biom.12813>.

2.30 score 20 scripts 139 downloads