As part of ENAR's education initiative, our webinars promote continuing education for professional and student statisticians by disseminating cutting-edge knowledge to our membership. An ENAR webinar (or "webENAR") can strengthen your background in methodology and software, provide an opportunity to learn about a topic outside of your primary area of specialization, or deepen your understanding of an area in which you already work. We invite you to participate and benefit from the expertise of some of North America's leading statisticians and biostatisticians.
The Webinar Committee of the ENAR Regional Advisory Board (RAB) is coordinating this ongoing series of 1- to 2-hour webinars given by renowned experts. Registration fees vary by membership category, with a reduced fee for student members. The webinars are designed to be broadly available, and we encourage groups at your institution or workplace to participate together. WebENARs provide excellent learning opportunities for students and professionals alike.
(Almost) All of Entity Resolution
October 2, 2020
10 a.m. to 12 p.m. Eastern
Rebecca C. Steorts
Assistant Professor, Department of Statistical Science
Rebecca C. Steorts received her BS in Mathematics in 2005 from Davidson College, her MS in Mathematical Sciences in 2007 from Clemson University, and her PhD in 2012 from the Department of Statistics at the University of Florida under the supervision of Malay Ghosh, where she was a U.S. Census Dissertation Fellow and received an Honorable Mention (second place) for the 2012 Leonard J. Savage Thesis Award in Applied Methodology. Rebecca was a Visiting Assistant Professor from 2012 to 2015, during which time she worked closely with Stephen E. Fienberg.
Rebecca is currently an Assistant Professor in the Department of Statistical Science at Duke University. She is affiliated faculty in the Departments of Computer Science and Biostatistics and Bioinformatics, the information initiative at Duke (iiD), and the Social Science Research Institute.
Rebecca was named to MIT Technology Review's 35 Innovators Under 35 for 2015 as a humanitarian in the field of software. Her work was profiled in the September/October issue of MIT Technology Review and she was recognized with an invited talk at EmTech in November 2015. In addition, Rebecca is a recipient of an NSF CAREER award, a collaborative NSF award, a collaborative grant with the Laboratory of Analytic Sciences (LAS) at NC State University, a Metaknowledge Network Templeton Foundation Grant, the University of Florida (UF) Graduate Alumni Fellowship Award, the U.S. Census Bureau Dissertation Fellowship Award, and support from the UF Innovation through Institutional Integration Program (I-Cubed) and NSF for the development of an introductory Bayesian course for undergraduates. Her research interests are in large-scale clustering, record linkage (entity resolution or de-duplication), privacy, network analysis, and machine learning for computational social science applications.
Whether the goal is to estimate the number of people who live in a congressional district, to estimate the number of individuals who have died in an armed conflict, or to disambiguate individual authors using bibliographic data, all these applications have a common theme: integrating information from multiple sources. Before such questions can be answered, databases must be cleaned and integrated in a systematic and accurate way, a process commonly known as record linkage, de-duplication, or entity resolution. In this article, we review motivational applications and seminal papers that have led to the growth of this area. Specifically, we review the foundational work that began in the 1940s and 1950s and has led to modern probabilistic record linkage. We review clustering approaches to entity resolution, semi- and fully supervised methods, and canonicalization, which are being used throughout industry and academia in applications such as human rights, official statistics, medicine, and citation networks, among others. Finally, we discuss current research topics of practical importance.
Role of Statisticians in a Pandemic
Friday, November 13, 2020
10 a.m. to 12 p.m. Eastern
Bhramar Mukherjee, PhD
Department of Biostatistics, School of Public Health
University of Michigan
Bhramar Mukherjee is John D. Kalbfleisch Collegiate Professor and Chair, Department of Biostatistics; Professor, Department of Epidemiology, University of Michigan (UM) School of Public Health; Research Professor and Core Faculty Member, Michigan Institute of Data Science (MIDAS), University of Michigan. She also serves as the Associate Director for Quantitative Data Sciences, The University of Michigan Rogel Cancer Center. She is the cohort development core co-director in the University of Michigan's institution-wide Precision Health Initiative. Her research interests include statistical methods for analysis of electronic health records, studies of gene-environment interaction, Bayesian methods, shrinkage estimation, and analysis of multiple pollutants. Collaborative areas are mainly in cancer, cardiovascular diseases, reproductive health, exposure science and environmental epidemiology. She has co-authored more than 240 publications in statistics, biostatistics, medicine and public health and is serving as PI on NSF and NIH funded methodology grants. She is the founding director of the University of Michigan's summer institute on Big Data. Bhramar is a fellow of the American Statistical Association and the American Association for the Advancement of Science. She is the recipient of many awards for her scholarship, service and teaching at the University of Michigan and beyond, including the Gertrude Cox Award from the Washington Statistical Society in 2016 and, most recently, the L. Adrienne Cupples Award from Boston University in 2020.
Jeffrey S. Morris, PhD
Department of Biostatistics, Epidemiology and Informatics
Perelman School of Medicine, University of Pennsylvania
Jeffrey S. Morris is Professor and Director of the Division of Biostatistics at the Perelman School of Medicine at the University of Pennsylvania, having moved there in 2019 after 19 years at the University of Texas M.D. Anderson Cancer Center. He obtained his PhD in Statistics from Texas A&M University under the supervision of Raymond J. Carroll in 2000. His research combines biomedical collaborative research and statistical methodological research, with a focus on developing flexible methods for integrating information across modern, complex big data, including multi-platform genomics data, biomedical imaging data, and wearable device data, and with a statistical focus on functional data analysis and Bayesian modeling. Additionally, he has become involved in numerous COVID-19 related research projects at the University of Pennsylvania, and authors the website http://covid-datascience.com. This website contains a blog in which he uses his perspective and skills as a statistical data scientist to evaluate constantly emerging COVID-19 information, filter out biases, aggregate information, identify key insights along with a sense of their uncertainty, and communicate them in an accessible, balanced way. The blog contains more than 160 posts with upward of 100k views.
Xihong Lin, PhD
Department of Biostatistics
Harvard T.H. Chan School of Public Health
Xihong Lin is Professor and Former Chair of Biostatistics and Coordinating Director of the Program in Quantitative Genomics at the Harvard T.H. Chan School of Public Health, Professor of Statistics at Harvard University, and Associate Member of the Broad Institute of MIT and Harvard. Dr. Lin's research interests lie in the development and application of scalable statistical and computational methods for analysis of massive data from the genome, exposome and phenome, such as large-scale Whole Genome Sequencing studies, integrative analysis of different types of data, biobanks, and complex epidemiological and observational studies. She is an elected member of the US National Academy of Medicine. Dr. Lin received the 2002 Mortimer Spiegelman Award from the American Public Health Association, and the 2006 Presidents' Award and the 2017 FN David Award from the Committee of Presidents of Statistical Societies (COPSS). She is the PI of the Outstanding Investigator Award (R35) from the National Cancer Institute, and the contact PI of the Harvard Analysis Center of the Genome Sequencing Program of the National Human Genome Research Institute. She has been active in COVID-19 research.
Usha Govindarajulu, PhD
Center for Biostatistics
Icahn School of Medicine at Mount Sinai
Usha Govindarajulu is an Associate Professor in the Center for Biostatistics in the Department of Population Health Sciences of the Icahn School of Medicine at Mount Sinai. She earned an AB from Cornell University, an MS in Natural Resources from the University of Michigan, an MS in Biostatistics from George Washington University, and a PhD in Biostatistics from Boston University. After this, she spent two years as a postdoctoral fellow at Harvard School of Public Health. She then worked for a year as research faculty at Yale University before moving back to Boston and working at Brigham & Women's and Harvard Medical School. After about 5 years there, she moved to New York and took a position as an Assistant Professor of Biostatistics at SUNY Downstate School of Public Health. She was there approximately 7 years before leaving for her current position. Her research interests are in survival analysis, frailty models, causal inference, genetic epidemiology, and machine learning. She is currently the 2020 Chair-Elect of the Section on Statistical Computing of the American Statistical Association.
Natalie Dean, PhD
Department of Biostatistics, College of Health & Health Professions
University of Florida
Dr. Natalie Dean is an assistant professor in the Department of Biostatistics at the University of Florida specializing in infectious disease epidemiology and study design. She is principal investigator on an NIH R01 to develop and evaluate innovative trial and observational study designs for assessing the efficacy of vaccines targeting emerging pathogens. Dr. Dean received her PhD in Biostatistics from Harvard University. She has been active in science communications during the COVID-19 pandemic, with recently published pieces in the New York Times, Washington Post, Medscape, Boston Review, and BMJ Opinion.
The format will be for each of four speakers to provide a 10-minute opening statement on their topic, followed by a series of predetermined questions posed by our moderator for perhaps 30 minutes, and then 20-30 minutes of open questions and discussion.
While the topic is very broad, we shall try to: (1) highlight some specific, unique challenges arising from the nature of the pandemic, e.g., our lack of knowledge about the virus coming in, and the urgency to learn and act quickly alongside the necessity to think carefully and rigorously to avoid false steps and conclusions; and (2) clearly communicate how important it is for our profession and people with our quantitative skill sets to engage and have a seat at the table, so that our perspective is heard by both policymakers and the media during this crisis.
People in our profession need to communicate better with policymakers, and many in our field might not recognize their potential or the importance of our skill set and perspective to the big decisions being made in society. We hope our panel discussion can inspire more statisticians to get engaged in this way.
Registration coming soon!