Statistical Methodological Publications (in reverse chronological order):
- Garcia T, Marder K, Wang Y*. (2017). Time-varying Proportional Odds Model for Mega-analysis of Clustered Event Times. Biostatistics. In press.
- Qiu X, Zeng D, Wang Y*. (2017). Estimation and Evaluation of Linear Individualized Treatment Rules to Guarantee Performance. Biometrics. In press.
- Li X, Xie S, Zeng D, Wang Y*. (2017). Efficient Method to Optimally Identify Important Biomarkers for Disease Outcomes with High-dimensional Data. Statistics in Medicine. In press.
- Lee A, Marder K, Alcalay R, Bressman S, Orr-Urtreger A, Giladi N, Wang Y*. (2017). Estimation of Genetic Risk Function with Covariates in the Presence of Missing Genotypes. Statistics in Medicine. In press.
- Xu K, Ma Y, Wang Y. (2017). Nonparametric Distribution Estimation in the Presence of Familial Correlation and Censoring. Electronic Journal of Statistics. 11(1), 1928-1948.
- Garcia T, Ma Y, Marder K, Wang Y*. (2017). Robust mixed-effects model for clustered failure time data: Application to Huntington’s disease event measures. Annals of Applied Statistics. 11(2), 1085-1116.
- Wang Y*, Fu H, Zeng D. (2017). Learning Optimal Personalized Treatment Rules under Risk Constraint. Journal of the American Statistical Association. In press. Paper.
- Chen H, Zeng D, Wang Y*. (2017). Penalized Nonlinear Mixed Effects Model to Identify Biomarkers that Predict Disease Progression. Biometrics. In press. PMID: 28182831 DOI: 10.1111/biom.12663
- Wang Q, Ma Y, Wang Y. (2017). Predicting Disease Risk by Transformation Models in the Presence of Unspecified Subgroup Membership. Statistica Sinica. 27(4), 1857.
- Liu Y, Wang Y*, Huang C, Zeng D. (2017). Estimating Individualized Diagnostic Rules in the Era of Personalized Medicine. Statistics in Medicine. 36(7):1099-1117. PMID: 27917508
- Wang Y*, Wu P, Liu Y, Weng C, and Zeng D. (2016). Learning optimal individualized treatment rules from electronic health records data. IEEE International Conference on Healthcare Informatics: ICHI 2016 Proceedings: 4-7 October 2016, Chicago, Illinois, USA. In press.
- Liu Y, Wang Y, Zeng D (2016). Sequential Multiple Assignment Randomization Trials with Enrichment for Dynamic Treatment Regimes. Biometrics. DOI: 10.1111/biom.12576. PMID: 27598622 PMCID: PMC5339073
- Wang Y*, Chen T, Zeng D (2016). Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes. Journal of Machine Learning Research. 17(167):1-37. Paper.
- Liang B, Tong X, Zeng D, Wang Y (2016). Semiparametric Regression Analysis of Repeated Current Status Data. Statistica Sinica. In press. NIHMSID: 796289.
- Liu Y, Wang Y*, Feng Y*, Wall M (2016). Variable Selection and Prediction with Incomplete High-dimensional Data. Annals of Applied Statistics. 10:418-450.
- Chen T, Zeng D, Wang Y* (2015). Multiple kernel learning with random effects for predicting longitudinal outcomes and data integration. Biometrics. 71:918-928. (An earlier version won the ASA Statistical Learning and Data Mining Section Student Paper Award). PMID: 26177419.
- Wang Y*, Liang B, Tong X, Marder K, Bressman S, Orr-Urtreger A, Giladi N, Zeng D (2015). Efficient Estimation of Nonparametric Genetic Risk Function with Censored Data. Biometrika. 102(3):515-532. PubMed PMID: 26412864; PubMed Central PMCID: PMC4581539. Paper.
- Chen T, Ma Y, Wang Y* (2015). Predicting Cumulative Risk of Disease Onset by Redistributing Weights. Statistics in Medicine. 34(16):2427-43. PMID: 25847392; PMCID: PMC4457675.
- Jiang F, Ma Y, Wang Y (2015). Fused Kernel-Spline Smoothing for Repeatedly Measured Outcomes in a Generalized Partially Linear Model with Functional Single Index. Annals of Statistics. 43(5), 1929-1958. (An earlier version won ENAR Distinguished Student Paper Competition, 2014). NIHMSID: 686160.
- Chen T, Wang Y*, Chen H, Marder K, Zeng D. (2014). Targeted local support vector machine for age-dependent classification. Journal of the American Statistical Association. In press.
- Qin J, Garcia TP, Ma Y, Tang M, Marder K, and Wang Y* (2014). Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint and unknown genotypes. Annals of Applied Statistics. In press.
- Chen H, Wang Y, Li R, Shear K. (2014). On testing a nonparametric function through penalized splines. Statistica Sinica. 24, 1143-1160.
- Ma Y and Wang Y*. (2014). Nonparametric modeling and analysis of association between Huntington’s disease onset and CAG repeats. Statistics in Medicine. 33(8): 1369-1382. PubMed PMID: 24027120; PubMed Central PMCID: PMC3947445.
- Ma Y and Wang Y*. (2014). Estimating Disease Distribution Functions from Censored Mixture Data. Journal of the Royal Statistical Society, Series C. 63(1), 1-23.
- Wang Y*, Chen H, Zeng D, Mauro C, Duan N, and Shear K. (2013). Auxiliary marker-assisted classification in the absence of class labels. Journal of the American Statistical Association. 108 (502), 553-565.
- Chen H, Wang Y*, Paik CM, Choi A. (2013). A marginal approach to reduced-rank penalized spline smoothing for multilevel data. Journal of the American Statistical Association. 108(504): 1216-1229. Paper.
- Zeng D, and Wang Y. (2013). Discussion on “Statistical Learning With Time Series Dependence: An Application to Scoring Sleep in Mice” by McShane et al. Journal of the American Statistical Association. 108(504): 1154.
- Wang Y*, Garcia T, and Ma Y. (2012). Nonparametric estimation for censored mixture data with application to the Cooperative Huntington’s Observational Research Trial. Journal of the American Statistical Association. 107 (500), 1324-1338. Paper.
- Fan R, Zhang Y, Albert P, Liu A, Wang Y, and Xiong M. (2012). Longitudinal genetic analysis of quantitative traits. Genetic Epidemiology. 36: 856-869.
- Ma Y, Wang Y* (2012). Efficient Distribution Estimation for Data with Unobserved Sub-population Identifiers. Electronic Journal of Statistics. 6, 710-737. Paper.
- Wang Y*, Chen H (2012). On testing a variance component in a linear mixed effects model with multiple variance components. Biometrics. In press. Paper. Sample Code.
- Wang Y*, Chen Y, Yang Q. (2012). Joint rare variant association test of the average and individual effects for sequencing studies. PLoS ONE. In press. Paper. Code-SAS
- Yang Q, Wang Y. (2012). Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies. Journal of Probability and Statistics. In press.
- Wang Y*, Huang C. (2012). Semiparametric variance components models for genetic studies with longitudinal phenotypes. Biostatistics. In press. Paper.
- Chen T, Wang Y*, Ma Y, Marder K, Langbehn D. (2012). Predicting disease onset from mutation status using proband and family data with applications to Huntington’s disease. Journal of Probability and Statistics. In press. Paper.
- Wang Y*, Huang C, Fang Y, Qiong Y, and Li R. (2012). Flexible semiparametric analysis of longitudinal genetic studies by reduced rank smoothing. Applied Statistics: Journal of the Royal Statistical Society, Series C. 61, 1-24. Paper. Appendix.
- Wang Y*. (2011). Flexible estimation of covariance function by penalized spline with application to longitudinal family data. Statistics in Medicine. 30(15), 1883-1897. Paper. Code.
- Chen H and Wang Y*. (2011). A penalized spline approach to functional mixed effects model analysis. Biometrics. 67, 861-870. Paper. Appendix. Code
- Wang Y*, Yang Q, and Rabinowitz D (2011). Unbiased and efficient estimation of the effect of candidate genes on quantitative traits in the presence of population admixture. Biometrics. 67, 331-343. Paper. Code.
- Wang Y*, Chen H, Li R, Duan N and Lewis-Fernandez R (2011). Prediction based structured variable selection through penalized support vector machine. Biometrics. 67, 896-905. Paper.
- Wang Y*, Chen H, Schwartz T, Duan N, Parcesepe A, and Lewis-Fernandez R (2011). Assessment of a disease screener by hierarchical all subset selection using area under the receiver operating characteristic curves. Statistics in Medicine. 30, 1751-1760. Paper.
- Wang Y*, Fang Y (2011). Adjusting for treatment effects when estimating or testing genetic effects is the main interest. Journal of Data Science. 9, 127-138.
- Wang Y*, Rabinowitz D (2010). Efficient non-parametric estimation from kin-cohort data. Communications in Statistics: Theory and Methods. 39, 3622-3634.
- Fang Y, Wang Y*. (2009). Testing for genetic effect on functional traits by functional principal components analysis based on heritability. Statistics in Medicine. 28(29), 3611-3625. Paper.
- Fang Y, Wang Y, Sha N. (2009). Armitage’s trend test for genomewide association analysis: one-sided or two-sided? BMC Genetics. 3(Suppl 7): S37.
- Wang Y*, Fang Y (2009). Least square and empirical Bayes approaches for estimating random change points. Journal of Data Science. 7, 1-12.
- Wang Y*, Sha N, Fang Y. (2009). Analysis of genome-wide association data by large-scale Bayesian logistic regression. BMC Genet. 3(Suppl 7): S16.
- Beyene J, Tritchler D, Bull SB, Cartier KC, Jonasdottir G, Kraja AT, Li N, Nock NL, Parkhomenko E, Rao JS, Stein CM, Sutradhar R, Waaijenborg S, Wang KS, Wang Y and Wolkow P (2007). Multivariate analysis of complex gene expression and clinical phenotypes with genetic marker data. Genetic Epidemiology. 31 Suppl 1:S103-9.
- Wang S, Zheng T, Wang Y. (2007). Transcription activity hotspot, is it real or an artifact? BMC Genet. Suppl 1: S94
- Wang Y*, Clark LN, Marder K and Rabinowitz D (2007). Non-parametric estimation of genotype-specific age-at-onset distributions from censored kin-cohort data. Biometrika, 94(2): 403-414. Paper.
- Wang Y*, Fang Y, Jin M. (2007). A ridge penalized principal-components approach based on heritability for high-dimensional data. Human Heredity. 64(3), 182-91. Paper. Code
- Wang Y*, Fang Y, Wang S. (2007). Clustering and principal component analysis for mapping co-regulated genome-wide variation using family data. BMC Genet. Suppl 1:S121
- Wang Y*, Ottman R, and Rabinowitz D. (2006). A method for estimating penetrance from families sampled for linkage analysis. Biometrics. 62, 1081-1088. Paper.