Guanghua Lecture Series: Forum of Distinguished Public Figures and Entrepreneurs, Session 6756
Topic: Bayesian Knockoff Filter for False Discovery Control
Speaker: Professor Guosheng Yin, Associate Director, School of Computing and Data Science, The University of Hong Kong
Host: Professor Huazhen Lin, School of Statistics and Data Science
Time: June 9, 10:30-11:30
Venue: Conference Room 408, Hongyuan Building, Liulin Campus
Organizers: School of Statistics and Data Science; Office of Scientific Research
About the speaker:
Guosheng Yin is the Patrick Poon Endowed Chair Professor in the Department of Statistics and Actuarial Science and Associate Director of the School of Computing and Data Science at the University of Hong Kong. After receiving his Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill in 2003, he worked as Assistant/Associate Professor in the Department of Biostatistics at the University of Texas M.D. Anderson Cancer Center and as Chair in Statistics in the Department of Mathematics at Imperial College London. He was Head of the Department of Statistics and Actuarial Science at the University of Hong Kong from 2017 to 2023. He was elected a Fellow of the American Statistical Association and a Fellow of the Institute of Mathematical Statistics. He served as associate editor for the Journal of the American Statistical Association, Bayesian Analysis, Contemporary Clinical Trials, and other journals. He has published over 260 peer-reviewed papers in statistical and medical journals and in AI and machine learning conferences, as well as two books on clinical trial designs.
Abstract:
In many scientific fields, researchers are interested in discovering, from a large number of features, the important ones that have a substantial effect on the response, while controlling the proportion of false discoveries. By incorporating the knockoff procedure in a fully Bayesian framework, we develop the Bayesian knockoff filter (BKF) for selecting features that have an important effect on the response. In contrast to the fixed knockoff variables in a frequentist procedure, we allow the knockoff variables to be continuously updated in each iteration of the Markov chain Monte Carlo (MCMC) algorithm. Based on the posterior samples and a carefully designed greedy selection procedure, our method can distinguish the truly important features from the unimportant ones, and the Bayesian false discovery rate can be controlled at a desired level. Numerical experiments on both synthetic and real data demonstrate the advantages of our BKF over existing knockoff methods and Bayesian variable selection approaches: the BKF possesses higher power and yields a lower false discovery rate, especially for weak signals.
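For readers unfamiliar with Bayesian false discovery rate control, the following is a minimal sketch of one standard greedy rule, not the speaker's BKF procedure, whose details the abstract does not give. It assumes that MCMC output has already been reduced to a 0/1 matrix of per-iteration "feature is important" indicators; the names `null_probabilities`, `select_by_bayesian_fdr`, and the level `q` are hypothetical and chosen only for illustration. Features are added in order of increasing posterior null probability for as long as the average null probability of the selected set, which estimates the Bayesian FDR, stays at or below `q`.

```python
import numpy as np


def null_probabilities(importance_draws):
    """Posterior probability that each feature is null.

    importance_draws: array of shape (n_iterations, n_features) holding 0/1
    indicators, one row per MCMC iteration (an assumed input format).
    """
    draws = np.asarray(importance_draws, dtype=float)
    return 1.0 - draws.mean(axis=0)


def select_by_bayesian_fdr(null_prob, q=0.1):
    """Greedily select the largest feature set whose estimated Bayesian FDR <= q.

    Returns (sorted indices of selected features, estimated Bayesian FDR).
    """
    null_prob = np.asarray(null_prob, dtype=float)
    # Most promising features (smallest null probability) come first.
    order = np.argsort(null_prob, kind="stable")
    # Mean null probability of the top-k features, for k = 1, ..., p.
    running_fdr = np.cumsum(null_prob[order]) / np.arange(1, null_prob.size + 1)
    admissible = np.nonzero(running_fdr <= q)[0]
    if admissible.size == 0:
        return np.array([], dtype=int), 0.0
    k = int(admissible[-1]) + 1
    return np.sort(order[:k]), float(running_fdr[k - 1])


if __name__ == "__main__":
    # Toy indicators: 4 iterations, 3 features (made-up numbers, not real results).
    draws = [[1, 0, 1], [1, 0, 1], [1, 1, 1], [1, 0, 0]]
    probs = null_probabilities(draws)
    selected, fdr = select_by_bayesian_fdr(probs, q=0.2)
    print("null probabilities:", probs)
    print("selected features:", selected, "estimated Bayesian FDR:", fdr)
```

Because the features are sorted by null probability, the running average never decreases, so the last admissible prefix is the largest set that meets the target level.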