Analysis of Survey Data from Complex Sample Designs

taught by Brady West

Aim of Course:

To extract maximum information at minimum cost, sample designs are typically more complex than simple random samples. Cluster sampling and stratified designs are common. But how do you analyze the resulting data, and in particular, how do you determine margins of error? This online course, "Analysis of Survey Data from Complex Sample Designs," teaches you how to estimate variances when analyzing survey data from complex samples, and how to fit linear and logistic regression models to complex sample survey data.
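To make the margin-of-error question concrete, here is a minimal sketch of one common approach: a weighted mean with a Taylor-linearized standard error that treats primary sampling units (PSUs) as sampled with replacement within strata. It is an illustration only, not taken from the course materials; the function name `weighted_mean_with_se`, the row layout, and the toy data are all assumptions.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(rows):
    """Weighted mean and linearized standard error (hypothetical helper).

    rows: iterable of (stratum, psu, weight, y) tuples.
    Assumes PSUs are sampled with replacement within strata, so no
    finite population correction is applied.
    """
    rows = list(rows)
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Linearized contribution of each observation, summed to PSU totals.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    # Between-PSU variability within each stratum, summed over strata.
    variance = 0.0
    for stratum, psus in psu_totals.items():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} needs at least 2 PSUs")
        z_bar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in psus.values())

    return mean, sqrt(variance)


# Toy data (made up): two strata, two PSUs each, unit weights.
toy_rows = [
    ("A", 1, 1.0, 1.0), ("A", 1, 1.0, 3.0),
    ("A", 2, 1.0, 2.0), ("A", 2, 1.0, 2.0),
    ("B", 1, 1.0, 4.0),
    ("B", 2, 1.0, 6.0),
]
toy_mean, toy_se = weighted_mean_with_se(toy_rows)
```

The specialized survey procedures in the software packages used in this course compute estimates of this kind from the stratum, cluster, and weight variables supplied with a data set; the sketch only shows the idea that the standard error comes from variation between PSU totals within strata rather than from treating observations as independent.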

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

WEEK 1: Overview

  • Applied Survey Data Analysis: An Overview
  • Important terms, concepts, and notation
  • Software Overview
  • Getting to Know the Complex Sample Design
    • Classification of Sample Designs
    • Target Populations and Survey Populations
    • Simple Random Sampling
    • Complex Sample Design Effects
    • Complex Samples: Clustering and Stratification
    • Weighting in Analysis of Survey Data
    • Multi-stage Area Probability Sample Designs

WEEK 2: Overview continued

  • Foundations and Techniques for Design-based Estimation and Inference
  • Finite Populations and Superpopulation Models
  • Confidence Intervals for Population Parameters
  • Weighted Estimation of Population Parameters
  • Probability Distributions and Design-based Inference
  • Variance Estimation
  • Hypothesis Testing in Survey Data Analysis
  • Total Survey Error
  • Preparation for Complex Sample Survey Data Analysis
    • Analysis Weights: Review by the Data User
    • Understanding and Checking the Sampling Error Calculation Model
    • Addressing Item Missing Data in Analysis Variables
    • Preparing to Analyze Data from Sample Subclasses
    • A Final Checklist for Data Users

WEEK 3: Descriptive Statistics

  • Descriptive Analysis for Continuous Variables
  • Special Considerations in Descriptive Analysis of Complex Sample Survey Data
  • Simple Statistics for Univariate Continuous Distributions
  • Bivariate Relationships between Two Continuous Variables
  • Descriptive Statistics for Subpopulations
  • Linear Functions of Descriptive Estimates and Differences of Means
  • Categorical Data Analysis
    • A Framework for Analysis of Categorical Survey Data
    • Univariate Analysis of Categorical Data
    • Bivariate Analysis of Categorical Data
    • Analysis of Multivariate Categorical Data

WEEK 4: Regression Models

  • Linear Regression Models
  • The Linear Regression Model
    • Fitting linear regression models to survey data
  • Four Steps in Linear Regression Analysis
  • Some Practical Considerations and Tools
  • Application: Modeling Diastolic Blood Pressure with the NHANES Data
  • Logistic Regression and Generalized Linear Models for Binary Survey Variables
    • Generalized Linear Models (GLMs) for Binary Survey Responses
    • Building the Logistic Regression Model: Stage 1-Model Specification
    • Building the Logistic Regression Model: Stage 2-Estimation of Model Parameters and Standard Errors
    • Building the Logistic Regression Model: Stage 3-Evaluation of the Fitted Model
    • Building the Logistic Regression Model: Stage 4-Interpretation and Inference
    • Analysis Application
    • Comparing the Logistic, Probit, and Complementary-Log-Log (C-L-L) GLMs for Binary Dependent Variables



The homework in this course consists of short-answer questions to test concepts, guided exercises in writing code, and guided data analysis problems using software.

In addition to assigned readings, this course provides example software code, supplemental readings available online, and an end-of-course data modeling project.

Who Should Take This Course:

Anyone designing surveys or analyzing survey data.




Students should also have some familiarity with generalized linear models (GLMs), specifically logistic regression. These topics are covered as part of the Categorical Data Analysis course, and in greater depth in the GLM and Logistic Regression courses.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:
About 15 hours per week, at times of your choosing.

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:
  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject; in that case, your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those who successfully complete the course, CEUs and a record of course completion will be issued by the Institute upon request.

Course Text:

The course text is Applied Survey Data Analysis by Steve Heeringa, Brady West, and Pat Berglund.



The course centers on learning to use specialized software procedures for the analysis of complex sample survey data, using real data sets; exercises will be selected from the book chapters. Participants can use R, WesVar, or IVEware (free packages) or SAS, Stata, SUDAAN, or SPSS (commercial packages; SPSS users must purchase the Complex Samples Module). If you plan to use other software, check that it can analyze data from complex survey designs (clustered, stratified, multistage, etc.).



October 14, 2016 to November 11, 2016
October 13, 2017 to November 10, 2017

Course Fee: $589

Group rates are available.

First-time students and academics may qualify for an introductory offer on select courses. If you have an academic affiliation, you may be eligible for a discount at checkout.

Add a $50 service fee if you require a prior invoice, need to submit a purchase order or voucher, pay by wire transfer or EFT, or need a prior payment refunded and reprocessed. Please use the printed registration form for these and other special orders.

Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise.

The Institute for Statistics Education is certified to operate by the State Council of Higher Education in Virginia (SCHEV).
