Mixed Models – When to Use

Companies now have a lot of data on their customers at an individual level.  Suppose you are tasked with forecasting customer spending at a grocery chain, and you want to understand how customer attributes, local economic factors, and store issues affect customer spending. You could design your study with hierarchical and mixed linear modeling methods in mind.

These methods had their antecedents as far back as 1861, when the British Astronomer Royal, George Airy, gathered a set of telescopic observations on multiple nights, with multiple observations each night.  Describing the data, he noted the different variance components: within nights and between nights. It wasn’t until 1925, though, that R.A. Fisher presented a general method for dealing with different variance components (in his classic Statistical Methods for Research Workers).

Clustering and Hierarchy

In your grocery project, there are also different components of variance.  Customer attributes operate at the individual customer level. These might include demographic data, prior spending and the like.  Customers are clustered within stores, so factors that vary by store (e.g. store size and employee turnover, which might reflect store quality) should be modeled at the store level, by including store as an explanatory variable.  Further hierarchy is introduced by economic factors such as income levels and unemployment, which might operate at a more regional level, encompassing many stores.

Clustering is one way in which data departs from the simple model of independent observations (where one can think of observations as being picked randomly from a box).  Other forms of grouping occur when you have repeated measurements (at the same time) for each subject, or when you have longitudinal data, where variables are recorded repeatedly over time for each subject.
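To make the departure from independence concrete, here is a minimal sketch in Python (standard library only). It simulates customer spending in which every customer of a store shares a store-level shift, then splits the variance into between-store and within-store components using simple method-of-moments (one-way ANOVA) estimates. The store counts, standard deviations, and function names are illustrative assumptions, not values from any real study.

```python
import random
import statistics


def simulate_spending(n_stores=40, customers_per_store=25,
                      store_sd=5.0, customer_sd=10.0, seed=1):
    """Return {store_id: [spend, ...]} for hypothetical, simulated customers."""
    rng = random.Random(seed)
    data = {}
    for store in range(n_stores):
        # Shift shared by every customer of this store; this is what
        # makes observations within a store correlated.
        store_shift = rng.gauss(0.0, store_sd)
        data[store] = [100.0 + store_shift + rng.gauss(0.0, customer_sd)
                       for _ in range(customers_per_store)]
    return data


def variance_components(data):
    """Method-of-moments estimates, assuming equal-sized (balanced) groups."""
    groups = list(data.values())
    n = len(groups[0])  # customers per store
    # Within-store variance: average of the per-store sample variances.
    within = statistics.mean([statistics.variance(g) for g in groups])
    # The variance of store means estimates between + within / n,
    # so subtract the within-store share (and truncate at zero).
    store_means = [statistics.mean(g) for g in groups]
    between = max(0.0, statistics.variance(store_means) - within / n)
    return between, within


data = simulate_spending()
between, within = variance_components(data)
# Intraclass correlation: share of total variance that sits at the store level.
icc = between / (between + within)
print(f"between-store variance: {between:.1f}")
print(f"within-store variance:  {within:.1f}")
print(f"intraclass correlation: {icc:.2f}")
```

An intraclass correlation well above zero indicates that two customers from the same store are more alike than two customers picked at random, which is exactly the situation the "picked randomly from a box" model does not cover.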

Fixed and Random Effects

You’ve probably run across the terms fixed effects and random effects.  Google these terms and you will see a lot of information, and some definitions that are not consistent with each other.  Here are a few comments, paraphrased, from Linear Mixed Models by West, Welch and Galecki (Brady West and Andrzej Galecki developed our Mixed and Hierarchical Linear Models course):

Fixed factors are categorical variables, typically those being studied (e.g. gender, age group, treatment method).  Data on all categories are included, and the categories are chosen to represent specific conditions that yield useful contrasts in a study.  In the company-wide sales study, for example, if we were specifically interested in the effect that individual stores have on turnover, all stores would need to be included in the study, and store would be a fixed factor.

A random factor is a predictor (categorical or continuous) with levels that can be thought of as being randomly sampled from a population of levels being studied.  Not all possible levels of the factor are present in the data set, but the researcher intends to make inferences about the population of these levels. Individual subjects might be a random factor; so would the factors “store” and “region” in a study of company-wide sales data, where only some regions and stores are modeled.  Variation in the outcome variable across different levels of the random factor is assessed as part of the model fitting.
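One common way to write such a model for the grocery example is sketched below. The notation is an assumption made for illustration (it is not taken from West, Welch and Galecki): i indexes customers, j stores, and k regions; x is a customer attribute, w a store-level factor, and z a regional economic factor.

```latex
y_{ijk} = \beta_0 + \beta_1 x_{ijk} + \beta_2 w_{jk} + \beta_3 z_{k}
          + v_{k} + u_{jk} + \varepsilon_{ijk},
\qquad
v_{k} \sim N(0, \sigma_v^2), \quad
u_{jk} \sim N(0, \sigma_u^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma^2)
```

Here the coefficients \beta are the fixed effects, while v_k and u_{jk} are random effects for region and for store within region. Under these assumptions (with the random terms mutually independent), the variance of the outcome around its fixed-effects mean is \sigma_v^2 + \sigma_u^2 + \sigma^2, and estimating those three components is the "variation across levels" assessed during model fitting.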

Typically, when specifying a mixed model in software, both fixed and random effects are included as explanatory (predictor) variables, using an additional argument to specify that an effect is random, as opposed to fixed (the default).  Additional arguments are used to identify other elements of data structure, e.g. nested effects, where levels of one factor, e.g. individual stores, exist only within a level of another factor (e.g. region).
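As a rough illustration of what this looks like in practice, the sketch below uses the Python package statsmodels, one of several packages that fit mixed models. The data are simulated, and the column names (spend, income, store) and effect sizes are hypothetical. The fixed effect goes in the model formula, and the `groups` argument marks store as the random (grouping) factor, giving each store its own random intercept.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

# Hypothetical data: 30 stores with 20 customers each.
rows = []
for store in range(30):
    store_effect = rng.normal(0.0, 5.0)   # store-level deviation (random effect)
    for _ in range(20):
        income = rng.normal(50.0, 10.0)   # customer-level covariate (fixed effect)
        spend = 20.0 + 0.8 * income + store_effect + rng.normal(0.0, 8.0)
        rows.append({"store": f"store_{store}", "income": income, "spend": spend})
customers = pd.DataFrame(rows)

# Fixed effects are listed in the formula; `groups` identifies the random factor.
model = smf.mixedlm("spend ~ income", data=customers, groups=customers["store"])
result = model.fit()

# The summary reports the fixed-effect coefficients and the estimated
# variance of the store-level random intercepts ("Group Var").
print(result.summary())
```

In statsmodels, further structure is described through additional arguments: `re_formula` for random slopes, and `vc_formula` for variance components such as stores nested within regions. Other packages use different syntax for the same ideas.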

Additional Issues

Given the potential complexity in the structure of the data, it is easy to see that fitting these models can be somewhat involved. Estimation of coefficients, specification of covariance structures, interpretation of residuals, and diagnostics all need to be dealt with.  For a more detailed tour, see our course Mixed and Hierarchical Linear Models.

About Statistics.com

Statistics.com offers academic and professional education in statistics, analytics, and data science at beginner, intermediate, and advanced levels of instruction. Statistics.com is a part of Elder Research, a data science consultancy with 25 years of experience in data analytics.

 The Institute for Statistics Education is certified to operate by the State Council of Higher Education for Virginia (SCHEV)

© Copyright 2023 - Statistics.com, LLC | All Rights Reserved | Privacy Policy | Terms of Use
