THE Secret Weapon for People Analytics: Quasi-Experiments
The most underutilized method in people analytics
Preface: This article serves as a primer on quasi-experiments, which might be the most underutilized method in people analytics. It will focus on providing a mental toolkit for quasi-experiments: how to think about them, how to spot opportunities to use them, and how to leverage them successfully.
In people analytics, causality matters
I have a soft spot for “data mining”: using any and all available data to optimize predictive accuracy for some target variable. There’s nothing like plugging a bunch of data into an algorithm, watching your computer go “brrrrr,” and experiencing the modern-day magic of good predictions.
On the other hand, in order for people analytics projects to be successful, we often need to be able to explain the “why”. Our projects typically seek to influence decisions about how programs, policies, and processes are run, and in order to make these interventions effective, we need to understand the “true relationships” between variables. This is easier said than done. For example, if we found a correlation between X and Y, we still face the following barriers to inferring a causal relationship:
Confounding variables: The apparent relationship between X and Y could actually be caused by the presence of some outside factor Z
Spurious correlations: There are myriad ways that a correlation between X and Y could be an artifact of something else. Simpson’s paradox is a good example of a correlation that disappears (or even reverses) after accounting for another factor
Temporal precedence: In order for X to have caused Y, changes in X should be observed before we see changes in Y
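To make Simpson's paradox concrete, here is a minimal sketch in Python. All of the numbers, departments, and scores are invented for illustration: within each department, trained workers out-score untrained ones, yet the pooled comparison points the other way, because training happens to be concentrated in the lower-scoring department.

```python
# (department, trained, score) rows -- hypothetical data
rows = [
    ("Sales", True, 3.2), ("Sales", True, 3.4), ("Sales", False, 3.0),
    ("Eng", True, 4.6), ("Eng", False, 4.4), ("Eng", False, 4.2),
]

def mean_score(rows, trained, dept=None):
    """Average score for (un)trained workers, optionally within one department."""
    vals = [s for d, t, s in rows if t == trained and (dept is None or d == dept)]
    return sum(vals) / len(vals)

# Within each department, training looks beneficial...
print(mean_score(rows, True, "Sales") - mean_score(rows, False, "Sales"))  # positive
print(mean_score(rows, True, "Eng") - mean_score(rows, False, "Eng"))      # positive
# ...but pooled across departments, trained workers score lower on average.
print(mean_score(rows, True) - mean_score(rows, False))                    # negative
```

The pooled correlation is an artifact of department composition, not an effect of training, which is exactly the trap a naive correlational analysis falls into.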
This list is not exhaustive, and I highly recommend checking out this blog post by Eduardo Valencia Tirapu if you’re interested in diving deeper into the challenges of establishing causal relationships.
Academic researchers can use randomized experiments to get around these barriers, but so-called field experiments are not always feasible in real-life organizations. In many cases, treating a randomized group of employees differently may be unethical or even illegal. The effectiveness of field experiments can also suffer from issues like subject turnover, outside events like organizational changes, or leakage (employees talk to each other, so the control group might find out about the treatment).
So, do we just give up on experiments and run correlational analyses? Absolutely not. Quasi-experiments are our secret weapon.
A quasi-experiment is a research design that shares the same goal as a true experiment: to infer a causal relationship between two variables. The difference is that quasi-experiments are the tool of choice when random assignment of people into experimental groups is not possible. This is particularly useful in people analytics for two reasons. First, it’s often not feasible to assign people to different conditions randomly. Second, it’s often not feasible to manipulate the variables we want to study, such as working remotely or receiving training. For example, we might use a quasi-experiment to understand the impact of a training program on job performance. In doing so, we can accommodate the business imperative to roll out the training to all relevant workers while still evaluating the training rigorously. This ability to balance flexibility with methodological rigor is what makes quasi-experiments such a valuable tool.
Spotting opportunities to use quasi-experiments
Quasi-experiments are an underutilized tool in people analytics, which is a significant missed opportunity. This may be partly due to the limited emphasis on quasi-experiments in I/O psychology and business analytics degree programs. Another barrier is conceptual: the difficulty lies less in the statistical methods themselves than in identifying suitable opportunities to apply them. These methods are as much art as they are science, to reference Adam Grant’s paper on the subject.
Here are some key indicators to help identify situations where a quasi-experiment may be useful:
Have your ears up: When leaders debate the causes and effects of specific decisions or data points, being plugged into those conversations and identifying the right questions is the first step toward spotting opportunities to create insights.
Notice big events or initiatives: Reorganizations, layoffs, or the implementation of new interventions like trainings, programs, or processes can each serve as potential opportunities for quasi-experiments. I mentioned events like these as potential confounding factors, but they also present chances to study impacts, because they often affect some workers more than others.
Look out for natural experiments: When otherwise similar people are inadvertently exposed to different conditions, this is called a natural experiment. These can be tricky to spot but are very powerful for causal inference. In one famous example, a law was passed in one state but not in another, allowing for comparisons between similar individuals across state borders. This situation is commonplace in organizations, where different departments or divisions may subject similar workers to distinct policies or work environments, creating variation that can be utilized to establish quasi-treatment and control groups.
Go-to methods for quasi-experiments
Once you’ve identified an opportunity to use a quasi-experiment to answer a research question, you’ll need to determine an appropriate quasi-experimental design. There are many such designs, so the choice really depends on the nature of the situation, data availability, and potential confounds.
Here are a few of the most popular quasi-experimental methods and why I’ve found them useful:
Matching - one of the most basic methods. The idea is to identify potential confounding variables (often demographic variables) and match subjects in the “treatment group” to otherwise similar subjects who did not receive the treatment, creating a control group. Matching algorithms are simple to use and work well at making the groups as similar as possible on the characteristics you select. Any remaining difference in the outcome variable is then attributed to the treatment effect.
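As an illustration, here is a minimal nearest-neighbor matching sketch in Python. The workers, the single confounder (tenure), and the performance scores are all invented; a real analysis would typically match on multiple covariates or a propensity score.

```python
# Hypothetical data: each worker has a tenure (the confounder), a flag for
# whether they took the training (the treatment), and a performance score.
workers = [
    {"tenure": 2, "trained": True,  "score": 3.9},
    {"tenure": 5, "trained": True,  "score": 4.4},
    {"tenure": 8, "trained": True,  "score": 4.6},
    {"tenure": 2, "trained": False, "score": 3.5},
    {"tenure": 4, "trained": False, "score": 4.0},
    {"tenure": 9, "trained": False, "score": 4.5},
    {"tenure": 1, "trained": False, "score": 3.2},
]

def matched_effect(workers):
    """Nearest-neighbor matching on tenure: pair each treated worker with
    the most similar untreated worker, then average the score gaps."""
    treated = [w for w in workers if w["trained"]]
    control = [w for w in workers if not w["trained"]]
    gaps = []
    for t in treated:
        match = min(control, key=lambda c: abs(c["tenure"] - t["tenure"]))
        gaps.append(t["score"] - match["score"])
    return sum(gaps) / len(gaps)

print(round(matched_effect(workers), 3))  # estimated treatment effect: 0.3
```

The estimate is only as good as the covariates you match on: any confounder you leave out stays unadjusted.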
Discontinuity - an elegant method that uses regression with a twist: in situations where a cutoff point determines who receives a treatment or intervention, a regression discontinuity design compares observations that lie close to either side of the cutoff. In a famous example, researchers compared lifetime earnings between people who had completed 3.5 years of college without graduating and people who completed 4 years and did graduate. Spoiler: the graduates ended up earning far more, despite being only slightly more educated. This is called the “diploma effect”.
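Here is a deliberately naive regression discontinuity sketch in Python, using an invented scenario: a bonus awarded to anyone scoring at or above a cutoff on a performance review. It simply compares mean outcomes within a narrow bandwidth on either side of the cutoff; a production analysis would fit local regressions on each side instead.

```python
CUTOFF = 70      # hypothetical review score at which the bonus kicks in
BANDWIDTH = 5    # only consider scores within +/- 5 points of the cutoff

# (review_score, months_retained) pairs -- invented for illustration
data = [
    (62, 10), (66, 11), (67, 12), (68, 11), (69, 12),
    (70, 15), (71, 16), (73, 15), (74, 17), (80, 20),
]

def rdd_estimate(data, cutoff, bandwidth):
    """Naive regression-discontinuity estimate: difference in mean
    outcomes between observations just above and just below the cutoff."""
    below = [y for x, y in data if cutoff - bandwidth <= x < cutoff]
    above = [y for x, y in data if cutoff <= x < cutoff + bandwidth]
    return sum(above) / len(above) - sum(below) / len(below)

print(rdd_estimate(data, CUTOFF, BANDWIDTH))  # jump at the cutoff: 4.25
```

The logic: workers scoring 69 and 70 are essentially interchangeable except for the bonus, so a jump in the outcome right at the cutoff is plausibly caused by the treatment.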
Difference in differences (DID) - a simple yet powerful technique that lets you compare changes in outcomes over time between two groups: one that receives the treatment and one that does not. The difference between these two changes becomes the estimated causal effect of the treatment. This technique is extremely helpful when you know there is some underlying trend at work. For example, if you want to study the effects of a new TA program on time-to-fill, this method lets you adjust for other events, programs, or outside factors that are affecting the time-to-fill trend.
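The DID calculation itself is just arithmetic. A minimal sketch, using invented time-to-fill numbers for a team that got a new TA program and a comparable team that did not:

```python
# Hypothetical average time-to-fill (days), before and after a new TA
# program rolled out to one recruiting team but not another.
pre_treated, post_treated = 45.0, 38.0   # team with the program
pre_control, post_control = 44.0, 42.0   # comparable team without it

def did(pre_t, post_t, pre_c, post_c):
    """Difference-in-differences: the treated group's change minus the
    control group's change, netting out the shared underlying trend."""
    return (post_t - pre_t) - (post_c - pre_c)

print(did(pre_treated, post_treated, pre_control, post_control))  # -5.0
```

Both teams improved by 2 days thanks to the general trend, so the program's estimated effect is the remaining 5-day reduction, not the raw 7-day drop in the treated team. The key assumption is "parallel trends": absent the program, both teams would have moved together.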
Visualization of the difference-in-differences technique.
This is by no means an exhaustive breakdown of quasi-experimental methods, so I highly encourage you to do some more research. If you’d like to learn how to perform one of these techniques, you can google and find programming tutorials for any of them.
Limitations
Quasi-experiments do have their limitations. Remember: inferring causality is not the same as establishing it definitively. Even if you’ve run a brilliant quasi-experiment, it’s still possible that you missed a confounding factor that would nullify your findings. The results of a quasi-experiment also may not generalize beyond the population that you examined.
Even if executed flawlessly, results can be difficult to explain. When presenting the results of quasi-experiments, it’s best to be clear and concise, and to use visuals where possible. Like a good product, good analytics are complicated behind the scenes but simple on the outside. It's hard to strike this balance, and as mentioned, there is a real art to it, but that's what makes it fun.