Statistical Methodology Research

Statistical Methodology Publications (in reverse chronological order):

  1. Garcia T, Marder K, Wang Y*. (2017). Time-varying Proportional Odds Model for Mega-analysis of Clustered Event Times. Biostatistics. In press.
  2. Qiu X, Zeng D, Wang Y*. (2017). Estimation and Evaluation of Linear Individualized Treatment Rules to Guarantee Performance. Biometrics. In press.
  3. Li X, Xie S, Zeng D, Wang Y*. (2017). Efficient Method to Optimally Identify Important Biomarkers for Disease Outcomes with High-dimensional Data. Statistics in Medicine. In press.
  4. Lee A, Marder K, Alcalay R, Bressman S, Orr-Urtreger A, Giladi N, Wang Y*. (2017). Estimation of Genetic Risk Function with Covariates in the Presence of Missing Genotypes. Statistics in Medicine. In press.
  5. Xu K, Ma Y, Wang Y. (2017). Nonparametric Distribution Estimation in the Presence of Familial Correlation and Censoring. Electronic Journal of Statistics. 11(1), 1928-1948.
  6. Garcia T, Ma Y, Marder K, Wang Y*. (2017). Robust mixed-effects model for clustered failure time data: Application to Huntington’s disease event measures. Annals of Applied Statistics. 11(2), 1085-1116.
  7. Wang Y*, Fu H, Zeng D. (2017). Learning Optimal Personalized Treatment Rules under Risk Constraint. Journal of the American Statistical Association. In press. Paper.
  8. Chen H, Zeng D, Wang Y*. (2017). Penalized Nonlinear Mixed Effects Model to Identify Biomarkers that Predict Disease Progression. Biometrics. In press. PMID: 28182831. DOI: 10.1111/biom.12663.
  9. Wang Q, Ma Y, Wang Y. (2017). Predicting Disease Risk by Transformation Models in the Presence of Unspecified Subgroup Membership. Statistica Sinica. 27(4), 1857.
  10. Liu Y, Wang Y*, Huang C, Zeng D. (2017). Estimating Individualized Diagnostic Rules in the Era of Personalized Medicine. Statistics in Medicine. 36(7):1099-1117. PMID: 27917508.
  11. Wang Y*, Wu P, Liu Y, Weng C, and Zeng D. (2016). Learning optimal individualized treatment rules from electronic health records data. IEEE International Conference on Healthcare Informatics: ICHI 2016 Proceedings: 4-7 October 2016, Chicago, Illinois, USA. In press.
  12. Liu Y, Wang Y, Zeng D (2016). Sequential Multiple Assignment Randomization Trials with Enrichment for Dynamic Treatment Regimes. Biometrics. DOI: 10.1111/biom.12576. PMID: 27598622; PMCID: PMC5339073.
  13. Wang Y*, Chen T, Zeng D (2016). Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes. Journal of Machine Learning Research. 17(167):1-37. Paper.
  14. Liang B, Tong X, Zeng D, Wang Y (2016). Semiparametric Regression Analysis of Repeated Current Status Data. Statistica Sinica. In press. NIHMSID: 796289.
  15. Liu Y, Wang Y*, Feng Y*, Wall M (2016). Variable Selection and Prediction with Incomplete High-dimensional Data. Annals of Applied Statistics. 10:418-450.
  16. Chen T, Zeng D, Wang Y* (2015). Multiple kernel learning with random effects for predicting longitudinal outcomes and data integration. Biometrics. 71:918-928. (An earlier version won the ASA Statistical Learning and Data Mining Section Student Paper Award). PMID: 26177419.
  17. Wang Y*, Liang B, Tong X, Marder K, Bressman S, Orr-Urtreger A, Giladi N, Zeng D (2015). Efficient Estimation of Nonparametric Genetic Risk Function with Censored Data. Biometrika. 102(3):515-532. PubMed PMID: 26412864; PubMed Central PMCID: PMC4581539. Paper.
  18. Chen T, Ma Y, Wang Y* (2015). Predicting Cumulative Risk of Disease Onset by Redistributing Weights. Statistics in Medicine. 34(16):2427-43. PMID: 25847392; PMCID: PMC4457675.
  19. Jiang F, Ma Y, Wang Y (2015). Fused Kernel-Spline Smoothing for Repeatedly Measured Outcomes in a Generalized Partially Linear Model with Functional Single Index. Annals of Statistics. 43(5), 1929-1958. (An earlier version won the ENAR Distinguished Student Paper Competition, 2014.) NIHMSID: 686160.
  20. Chen T, Wang Y*, Chen H, Marder K, Zeng D. (2014). Targeted local support vector machine for age-dependent classification. Journal of the American Statistical Association. In press.
  21. Qin J, Garcia TP, Ma Y, Tang M, Marder K, and Wang Y* (2014). Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint and unknown genotypes. Annals of Applied Statistics. In press.
  22. Chen H, Wang Y, Li R, Shear K. (2014). On testing a nonparametric function through penalized splines. Statistica Sinica. 24, 1143-1160.
  23. Ma Y and Wang Y*. (2014). Nonparametric modeling and analysis of association between Huntington’s disease onset and CAG repeats. Statistics in Medicine. 33(8): 1369-1382. PubMed PMID: 24027120; PubMed Central PMCID: PMC3947445.
  24. Ma Y and Wang Y*. (2014). Estimating Disease Distribution Functions from Censored Mixture Data. Journal of the Royal Statistical Society, Series C. 63(1), 1-23.
  25. Wang Y*, Chen H, Zeng D, Mauro C, Duan N, and Shear K. (2013). Auxiliary marker-assisted classification in the absence of class labels. Journal of the American Statistical Association. 108 (502), 553-565.
  26. Chen H, Wang Y*, Paik CM, Choi A. (2013). A marginal approach to reduced-rank penalized spline smoothing for multilevel data. Journal of the American Statistical Association. 108(504): 1216-1229. Paper.
  27. Zeng D, and Wang Y. (2013). Discussion on “Statistical Learning With Time Series Dependence: An Application to Scoring Sleep in Mice” by McShane et al. Journal of the American Statistical Association. 108(504): 1154.
  28. Wang Y*, Garcia T, and Ma Y. (2012). Nonparametric estimation for censored mixture data with application to the Cooperative Huntington’s Observational Research Trial. Journal of the American Statistical Association. 107 (500), 1324-1338. Paper.
  29. Fan R, Zhang Y, Albert P, Liu A, Wang Y, and Xiong M. (2012). Longitudinal genetic analysis of quantitative traits. Genetic Epidemiology. 36: 856-869.
  30. Ma Y, Wang Y* (2012). Efficient Distribution Estimation for Data with Unobserved Sub-population Identifiers. Electronic Journal of Statistics. 6, 710-737. Paper.
  31. Wang Y*, Chen H (2012). On testing a variance component in a linear mixed effects model with multiple variance components. Biometrics. In press. Paper. Sample Code.
  32. Wang Y*, Chen Y, Yang Q. (2012). Joint rare variant association test of the average and individual effects for sequencing studies. PLoS ONE. In press. Paper. Code-SAS
  33. Yang Q, Wang Y. (2012). Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies. Journal of Probability and Statistics. In press.
  34. Wang Y*, Huang C. (2012). Semiparametric variance components models for genetic studies with longitudinal phenotypes. Biostatistics. In press. Paper.
  35. Chen T, Wang Y*, Ma Y, Marder K, Langbehn D. (2012). Predicting disease onset from mutation status using proband and family data with applications to Huntington’s disease. Journal of Probability and Statistics. In press. Paper.
  36. Wang Y*, Huang C, Fang Y, Qiong Y, and Li R. (2012). Flexible semiparametric analysis of longitudinal genetic studies by reduced rank smoothing. Applied Statistics: Journal of the Royal Statistical Society, Series C. 61, 1-24. Paper. Appendix.
  37. Wang Y*. (2011). Flexible estimation of covariance function by penalized spline with application to longitudinal family data. Statistics in Medicine. 30(15), 1883-1897. Paper. Code.
  38. Chen H and Wang Y*. (2011). A penalized spline approach to functional mixed effects model analysis. Biometrics. 67, 861-870. Paper. Appendix. Code
  39. Wang Y*, Yang Q, and Rabinowitz D (2011). Unbiased and efficient estimation of the effect of candidate genes on quantitative traits in the presence of population admixture. Biometrics. 67, 331-343. Paper. Code.
  40. Wang Y*, Chen H, Li R, Duan N and Lewis-Fernandez R (2011). Prediction based structured variable selection through penalized support vector machine. Biometrics. 67, 896-905. Paper.
  41. Wang Y*, Chen H, Schwartz T, Duan N, Parcesepe A, and Lewis-Fernandez R (2011). Assessment of a disease screener by hierarchical all subset selection using area under the receiver operating characteristic curves. Statistics in Medicine. 30, 1751-1760. Paper.
  42. Wang Y*, Fang Y (2011). Adjusting for treatment effects when estimating or testing genetic effects is the main interest. Journal of Data Science. 9, 127-138.
  43. Wang Y*, Rabinowitz D (2010). Efficient non-parametric estimation from kin-cohort data. Communications in Statistics: Theory and Methods. 39, 3622-3634.
  44. Fang Y, Wang Y*. (2009). Testing for genetic effect on functional traits by functional principal components analysis based on heritability. Statistics in Medicine. 28(29), 3611-3625. Paper.
  45. Fang Y, Wang Y, Sha N. (2009). Armitage’s trend test for genomewide association analysis: one-sided or two-sided? BMC Genetics. 3(Suppl 7): S37.
  46. Wang Y*, Fang Y (2009). Least square and empirical Bayes approaches for estimating random change points. Journal of Data Science. 7, 1-12.
  47. Wang Y*, Sha N, Fang Y. (2009). Analysis of genome-wide association data by large-scale Bayesian logistic regression. BMC Genet. 3(Suppl 7): S16.
  48. Beyene J, Tritchler D, Bull SB, Cartier KC, Jonasdottir G, Kraja AT, Li N, Nock NL, Parkhomenko E, Rao JS, Stein CM, Sutradhar R, Waaijenborg S, Wang KS, Wang Y and Wolkow P (2007). Multivariate analysis of complex gene expression and clinical phenotypes with genetic marker data. Genetic Epidemiology. 31 Suppl 1:S103-9.
  49. Wang S, Zheng T, Wang Y. (2007). Transcription activity hotspot, is it real or an artifact? BMC Genet. Suppl 1: S94
  50. Wang Y*, Clark LN, Marder K and Rabinowitz D (2007). Non-parametric estimation of genotype-specific age-at-onset distributions from censored kin-cohort data. Biometrika. 94(2): 403-414. Paper.
  51. Wang Y*, Fang Y, Jin M. (2007). A ridge penalized principal-components approach based on heritability for high-dimensional data. Human Heredity. 64(3), 182-91. Paper. Code
  52. Wang Y*, Fang Y, Wang S. (2007). Clustering and principal component analysis for mapping co-regulated genome-wide variation using family data. BMC Genet. Suppl 1:S121
  53. Wang Y*, Ottman R, and Rabinowitz D. (2006). A method for estimating penetrance from families sampled for linkage analysis. Biometrics. 62, 1081-1088. Paper.