March 27-30, 2022

ENAR 2022 Educational Program | TUTORIALS

Tutorials are roughly 2 hours in length and focus on a particular topic or software package. The sessions are more interactive than a standard lecture, often encouraging participants' active engagement and hands-on participation.


Monday, March 28 | 8:30 am – 10:15 am
T1 | Incorporating functional data into statistical models

Todd Ogden, Department of Biostatistics, Columbia University

Course Description:

Functional data is all around us. Vast quantities of data are routinely gathered from accelerometers and other wearable devices, smartphones, imaging modalities, and many other sources. The term “functional data” refers to any data that can be thought of as multiple observations over some continuum. Examples include near-infrared spectra, growth curves, 2D or 3D images, time series of ecological momentary assessments (EMA), density functions or histograms, data from electroencephalography (EEG) or magnetic resonance spectroscopy (MRS) studies, and many others. To make the best use of such data, it is necessary to adapt statistical models and techniques to take advantage of the particular structure of functional data. This tutorial will survey some of the advances that have been made and provide examples of analyses that make good use of functional data. It will focus primarily on concepts and interpretation rather than on mathematical or computational details.

Todd Ogden is a professor in the Department of Biostatistics at Columbia University, where he has worked on various aspects of functional data analysis for more than 20 years. He has worked extensively with brain imaging data in collaboration with colleagues in psychiatry and neurology but also has broad interest in real-world applications involving functional data.

Monday, March 28 | 10:30 am – 12:15 pm
T2 | Reproducible Workflow in R: Ready to Share

Andrew Brown, Indiana University School of Public Health-Bloomington

Course Description:

Importing and harmonizing varied data and file formats is difficult, which makes editing the data files themselves tempting. However, editing the data files risks introducing unreproducible steps or, worse, alterations of the raw data. Most of the time, formatting and editing the data can be accomplished programmatically, allowing a reproducible pipeline from raw data to analytical data and, ultimately, analysis. Developed with the Indiana University Biostatistics Consulting Center, this tutorial will share examples of cleaning and harmonizing data in R in the spirit of Wickham’s ‘tidy data,’ creating well-documented and human-readable code and variables, and using R Markdown to avoid copy/paste and typographical errors in moving from analysis to sharing. We will conclude with a brief discussion of public data and code sharing. A working understanding of R is required, and familiarity with R Markdown is helpful.
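
As a rough illustration of the programmatic cleaning idea described above, the sketch below leaves a raw file untouched and derives an analysis-ready table from it in code. The tutorial itself uses R; this Python version, along with the column names, the sex coding, and the sample rows, is a hypothetical stand-in and not taken from the tutorial materials.

```python
import csv
import io

# Hypothetical raw export, kept exactly as received (never edited by hand).
RAW_CSV = """Subject ID,Weight (kg),Sex
 001 ,70.5,M
002,,f
003,82.0,Female
"""

# Assumed harmonization rule for inconsistent sex codes.
SEX_CODES = {"m": "male", "male": "male", "f": "female", "female": "female"}


def clean_row(row):
    """Return one tidy record with readable names and consistent values."""
    weight = row["Weight (kg)"].strip()
    return {
        "subject_id": row["Subject ID"].strip(),
        # An empty cell becomes an explicit missing value.
        "weight_kg": float(weight) if weight else None,
        "sex": SEX_CODES[row["Sex"].strip().lower()],
    }


def clean(raw_text):
    """Apply every cleaning step in code so the pipeline can be rerun."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [clean_row(row) for row in reader]


analytic = clean(RAW_CSV)
for record in analytic:
    print(record)
```

Because every change lives in `clean_row`, rerunning the script on the same raw file always yields the same analytical data.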

Andrew W Brown, PhD, is an Assistant Professor with the Indiana University School of Public Health-Bloomington. Formally trained in nutrition, biochemistry, and statistics, he has conducted research using simulation, in vitro, ex vivo, animal, and observational and interventional human models. In addition, he conducts research on research through qualitative and quantitative research summaries, characterizing reporting practices that may perpetuate scientific misinformation, and evaluating methodological and statistical choices that may result in ambiguous or misinterpreted results. He is a PI of three R25 grants to strengthen the scientific enterprise, including one focused on encouraging and facilitating reproducible data and code sharing. This tutorial was developed with the IU Biostatistics Consulting Center under the direction of Stephanie Dickinson, MS. The Consulting Center works with investigators generating data from fruit flies to elephants, human interventions to observational studies, and biomarkers to social constructs, and brings practical experience handling the varied data, formats, and norms across the spectrum of biomedical and public health sciences.

Monday, March 28 | 1:45 pm – 3:30 pm
T3 | Data Visualization for Biomedical Research

Susan Mayo, Office of Biostatistics, Center for Drug Evaluation, FDA

Course Description:

There is both an art and science to making impactful graphs. What are the human brain’s visual superpowers? How can a graph be more impactful with its audience? The objective of this tutorial is to address some overlooked factors beyond the technical aspects of graphing data, including concepts and examples from Statistics in Medicine, 2015, “Seeing is believing: Good graphic design principles for medical research.” This tutorial will cover:

Susan Mayo is a senior mathematical statistician at the Food and Drug Administration, Center for Drug Evaluation’s Office of Biostatistics, with a demonstrated interest and impact in areas that help to make sound regulatory and drug development decisions: graphical design, drug safety and benefit-risk assessment, and the estimand framework. She has been with FDA for 4 years, and previously worked as an industry statistician and internal company consultant in biotech and big pharma for a few more than that.

Monday, March 28 | 3:45 pm – 5:30 pm
T4 | Snakes and Ladders: Strategies for Professional Success

David Banks, Dept. of Statistical Science, Duke University

Course Description:

Everyone wants to climb the ladder, but we all encounter obstacles. The tutorial provides a number of tips for presenting yourself and your work in ways that favor your interests. It also describes some useful habits and strategies to grow one’s career. Not all comments are applicable to all people, but as Eisenhower said, “It isn’t the plan—it’s the planning.”

David Banks obtained his PhD from Virginia Tech in 1984, then did a postdoc at Berkeley. He was a visiting assistant lecturer at Cambridge, and then an assistant/associate professor at Carnegie Mellon. In 1997 he joined the federal government and worked at NIST, DOT, and the FDA. In 2003 he joined Duke University. He has been the coordinating editor of the Journal of the American Statistical Association and the Director of the Statistical and Applied Mathematical Sciences Institute. He was also cofounder of the journal Statistics and Public Policy.

Tuesday, March 29 | 1:45 pm – 3:30 pm
T5 | Spatial Disease Modeling and Visualization using INLA and R

Paula Moraga, King Abdullah University of Science and Technology (KAUST)

Course Description:

Disease risk models are essential to inform public health and policy. These models can be used to quantify disease burden, understand geographic and temporal patterns, identify risk factors, and measure inequalities. In this tutorial, we will learn how to estimate disease risk and quantify risk factors using areal and geostatistical data. We will learn how to fit and interpret spatial models using the INLA and SPDE approaches in different settings. We will also create interactive maps of disease risk, and introduce presentation options such as interactive maps and dashboards. We will work through two disease mapping examples using data of malaria in The Gambia and cancer in Pennsylvania, USA. We will provide clear descriptions of the R code for data analysis and visualization. The tutorial materials are drawn from the book "Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny" by Paula Moraga (2019, Chapman & Hall/CRC).

Dr. Paula Moraga is an Assistant Professor of Statistics at King Abdullah University of Science and Technology (KAUST) and the Principal Investigator of the GeoHealth research group. Paula's research focuses on the development of innovative statistical methods and computational tools for geospatial data analysis and health surveillance, and the impact of her work has directly informed strategic policy in reducing disease burden in several countries. She has developed modeling architectures to understand the spatial and spatio-temporal patterns and identify targets for intervention of diseases such as malaria in Africa, leptospirosis in Brazil, and cancer in Australia, and has worked on the development of a number of R packages for Bayesian risk modeling, detection of disease clusters, and risk assessment of travel-related spread of disease. Paula has published extensively in leading journals and is the author of the book "Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny" (2019, Chapman & Hall/CRC), which is available online. Paula received her Ph.D. in Mathematics from the University of Valencia, and her Master's in Biostatistics from Harvard University.

Tuesday, March 29 | 3:45 pm – 5:30 pm
T6 | A Primer for Meta-analysis with Real-world Applications

Houssein Assaad, StataCorp, Principal Statistician and Software Developer

Course Description:

This tutorial will cover the use of meta-analysis (MA) for combining the results of multiple studies such as clinical trials and will demonstrate how to do this using several real-world applications. MA is a statistical technique for combining the results from several similar studies often available in the literature. Some of these studies report inconclusive or even conflicting results. The goal of MA is to provide a unified conclusion or explain why such a conclusion cannot be reached.
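
To make the idea of combining study results concrete, here is a minimal sketch of one common pooling approach, the inverse-variance weighted common-effect (fixed-effect) estimate, together with Cochran's Q as a heterogeneity statistic. It is written in Python rather than Stata, it is not the Stata meta suite, and the effect sizes and standard errors are invented for illustration only.

```python
import math


def pool_common_effect(effects, std_errors):
    """Inverse-variance pooled estimate, its standard error, and Cochran's Q."""
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect estimate")
    # More precise studies (smaller standard errors) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Q measures how far the studies scatter around the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical effect estimates (e.g., log odds ratios) from three studies.
effects = [0.10, 0.30, 0.20]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se, q = pool_common_effect(effects, std_errors)
print(f"pooled = {pooled:.4f}, se = {pooled_se:.4f}, Q = {q:.4f}")
```

A large Q relative to its degrees of freedom (number of studies minus one) suggests the studies disagree more than chance would explain, in which case a random-effects model is often considered instead.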

The tutorial will use the meta suite in Stata 17 for demonstration, but no prior knowledge of Stata is required. Participants will receive a temporary Stata license in advance, and those who bring their own laptops will be able to interactively follow along provided they have Stata 17 installed and up to date. Interactive participation is not required. The notes will provide sufficient information to reproduce all analyses at the attendees' convenience.

Houssein Assaad is a Principal Statistician and Software Developer at StataCorp LLC and the primary developer of Stata's meta-analysis suite. Houssein's other contributions to Stata include nonlinear mixed-effects models and zero-inflation models. Houssein has a PhD degree in statistics from the University of Texas at Dallas. He is a former research assistant professor at Texas A&M University, where his research focused on longitudinal and functional data analysis.