Control Variables (Covariates) in DiD
Difference-in-Differences
In DiD, control variables serve a different purpose than you might expect.
Normally, we add control variables to close back-door paths between treatment and outcome. But in a DiD setting, the research design itself is supposed to do that — unit (group) fixed effects absorb time-invariant differences between units, and the comparison against the untreated group handles common time effects.
So what are covariates for? In DiD, they serve one purpose: supporting the parallel trends assumption.
Conditional Parallel Trends
"I don't think parallel trends holds in my setting. But it does hold after you account for the influence of this covariate."
If that's not what you're saying, then you don't want the control. Don't just add controls because it feels like you should.
Source: The Effect: An Introduction to Research Design and Causality
Assumption — Conditional Parallel Trends
The average change in untreated potential outcome from pre-treatment to post-treatment is the same for treated and control units that share the same covariate values. (Source: Difference-in-Differences Designs: A Practitioner's Guide)
Conditional parallel trends, through an example.
Imagine a new study: the effect of adopting novel data-tracking and analytics technology on club performance. Some clubs adopt it and others don't. Say the clubs that adopt tend to be the more ambitious ones, and those clubs also tend to have younger squads, while the non-adopters have older squads. (Note: for the sake of this explanation, assume there were no player transfers, so each club's squad stays the same.)
Now then, that would be a problem if young squads tend to improve season-over-season as players mature into their prime, while older squads plateau or decline. So even if no club had adopted the technology, adopter and non-adopter performance would have drifted apart anyway.
Raw performance — parallel trends is violated
Look at the pre-treatment seasons: adopter clubs are already climbing faster than non-adopters. The two groups weren't on parallel paths to begin with.
So how do we salvage this design? We condition on the initial squad age (baseline), compare adopter clubs to non-adopter clubs of similar age profile, and ask: within those age-matched comparisons, do trends look parallel?
After conditioning on squad age — parallel trends restored
Now the pre-treatment slopes of the two groups look the same. Post-treatment, adopters pull ahead — and that gap is a credible estimate of the technology's effect.
Therefore, when a baseline covariate that differs across groups (initial squad age) predicts the outcome's trend (squad performance), we need conditional parallel trends.
That's conditional parallel trends. We don't claim the raw trends are parallel, because they clearly aren't. We claim they're parallel once we account for squad age. The credibility of the whole DiD estimate now rests on the covariate: did we pick the right one, and did we use it correctly?
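The example can be sketched in a small simulation (all numbers here are hypothetical, and only pre-treatment seasons are generated): trends depend on squad age, adoption correlates with age, and conditioning on age restores parallel pre-trends.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clubs, n_pre = 200, 4
young = rng.random(n_clubs) < 0.5                           # young-squad indicator
adopter = rng.random(n_clubs) < np.where(young, 0.8, 0.2)   # young clubs adopt more often

# Pre-treatment outcome: the trend depends on squad age, not on adoption
slope = np.where(young, 2.0, 0.5)                           # young squads improve faster
periods = np.arange(n_pre)
y = slope[:, None] * periods[None, :] + rng.normal(0, 1, (n_clubs, n_pre))

# Raw comparison: the adopter-minus-nonadopter gap widens even before treatment
raw_gap = y[adopter].mean(0) - y[~adopter].mean(0)
print("raw gap by period        :", raw_gap.round(2))

# Conditioning on squad age: within the young group, the gap stays flat
young_gap = y[adopter & young].mean(0) - y[~adopter & young].mean(0)
print("within-young gap by period:", young_gap.round(2))
```

The raw gap grows across pre-periods (parallel trends violated), while the age-conditional gap does not drift, which is exactly what conditional parallel trends asserts.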
Not every baseline difference needs a control — beware of 'Bad Controls'
A common instinct is: "the two groups differ on X at baseline, so I should control for X." In DiD, we have to be careful. There are baseline differences that are fine, or even better, to leave alone.
Staying with the data-tracking technology adoption scenario:
Fine to ignore — covariate is balanced across adopters and non-adopters
Adopter and non-adopter clubs have roughly the same average wage bill per player. Even though wage bill plausibly predicts how performance evolves, the distribution looks similar in both groups — so it isn't driving a gap in trends between them, and we don't need to condition on it.
Fine to ignore — covariate (its baseline level or its trend) is unrelated to how the outcome evolves
Adopter clubs tend to have newer stadiums than non-adopters. If stadium age doesn't meaningfully shape how on-pitch performance evolves over time, this imbalance isn't relevant for parallel trends and we don't need to control for it.
Advised to ignore — covariate's level changes over time (a time-varying covariate) but is unrelated to the treatment
Fixture congestion — the number of matches a club plays per month — swings season to season depending on cup runs and international tournament schedules. It moves over time, but it isn't influenced by whether a club adopted the data-tracking technology. It might still predict performance, yet conditioning on a time-varying covariate within TWFE quietly changes what the coefficient means, and can introduce its own bias. Safer to leave it out.
Advised to ignore — time-varying covariates related to the treatment
Player injury rates are one of the things the data-tracking technology is designed to reduce — adopter clubs use it to monitor player load and manage fitness. So injury rate moves because of the treatment. Controlling for it absorbs part of the pathway "tech → fewer injuries → better performance" into the covariate, and shrinks the measured treatment effect. Don't condition on something the treatment itself is moving.
To get conditional parallel trends, a covariate is a candidate to control for when:
- Treated and comparison groups have different covariate distributions (imbalance in composition)
- Covariate values predict different outcome trends
- But within groups defined by the same covariate values, trends are parallel
The first step is to check whether your covariates are imbalanced across groups.
Before deciding which covariates to condition on, it helps to see which ones actually differ between the treated and control group. A covariate that is roughly balanced across groups is unlikely to be driving divergent trends, even if it is theoretically related to the outcome.
A useful measure for this is the normalized difference in means. Unlike a t-test, it doesn't depend on sample size — it simply describes how far apart the two groups are on a covariate, in units of the pooled standard deviation.
A common rule of thumb is that a normalized difference above 0.25 in absolute value suggests meaningful imbalance — the kind that may be worth investigating further. Below that, the groups are reasonably similar on that characteristic, though the exact threshold varies case by case, depending on how important the covariate is.
| Covariate | Norm. difference | Reading |
|---|---|---|
| Average squad age | 0.09 | Balanced — likely fine to ignore |
| Stadium capacity | 0.06 | Balanced — likely fine to ignore |
| Average player value | 0.41 | Imbalanced — worth investigating |
| League revenue | 0.34 | Imbalanced — worth investigating |
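The measure itself is simple: the difference in group means divided by the pooled standard deviation, sqrt((s_t² + s_c²)/2). A minimal sketch, with hypothetical club data standing in for the table above:

```python
import numpy as np

def normalized_difference(x_t, x_c):
    """Difference in means in units of the pooled SD (does not grow with sample size)."""
    x_t, x_c = np.asarray(x_t, float), np.asarray(x_c, float)
    pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (x_t.mean() - x_c.mean()) / pooled_sd

# Hypothetical adopter vs non-adopter clubs
rng = np.random.default_rng(1)
age_adopt, age_non = rng.normal(25.0, 2.0, 200), rng.normal(25.0, 2.0, 200)  # balanced
val_adopt, val_non = rng.normal(30.0, 8.0, 200), rng.normal(24.0, 8.0, 200)  # imbalanced

nd_age = normalized_difference(age_adopt, age_non)
nd_val = normalized_difference(val_adopt, val_non)
print(f"squad age   : {nd_age:+.2f}")   # near zero: likely fine to ignore
print(f"player value: {nd_val:+.2f}")   # well past 0.25: worth investigating
```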
Imbalance alone doesn't mean you need to add a control. It just flags which covariates are candidates. The next question — from the previous panel — is whether the imbalanced covariate is also plausibly related to how the outcome would have trended over time. If both are true, conditioning on it is worth considering.
Once you've identified which covariates to include, the next step is to bring them into the regression — which turns out to have a few pitfalls of its own.
How to include covariates? TWFE with an added control only adjusts for the covariate's level — not its trend.
Once you've picked a covariate worth conditioning on, a natural instinct is to drop it into the TWFE regression and move on. But it isn't that straightforward: a baseline (time-invariant) covariate is collinear with the unit fixed effects, and a covariate entering only in levels cannot shift the outcome's trend.
Simplest solution
Use covariates that are fixed over time or measured at baseline, and interact them with a post-treatment dummy. That lets each covariate value have its own pre/post change, so the covariate can influence the outcome's trend, not just its level.
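As a sketch of this fix with statsmodels (the panel, the data-generating process, and all numbers are hypothetical), compare a level-only TWFE regression with one that interacts baseline squad age with a post dummy:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated panel: 40 clubs, 6 seasons, adoption starts at season 3, true effect = 2
rng = np.random.default_rng(2)
n_clubs, n_seasons, true_effect = 40, 6, 2.0
club = np.repeat(np.arange(n_clubs), n_seasons)
season = np.tile(np.arange(n_seasons), n_clubs)
age0 = rng.normal(26, 2, n_clubs)                                        # baseline squad age
adopt = (rng.random(n_clubs) < 1 / (1 + np.exp(age0 - 26))).astype(int)  # young clubs adopt more

df = pd.DataFrame({"club": club, "season": season,
                   "age0": age0[club], "adopter": adopt[club]})
df["post"] = (df["season"] >= 3).astype(int)
df["treated"] = df["adopter"] * df["post"]
# Younger squads improve more from pre to post, regardless of adoption
df["y"] = (0.8 * (26 - df["age0"]) * df["post"]
           + true_effect * df["treated"] + rng.normal(0, 1, len(df)))

# Level-only TWFE: age0 is constant per club, so the fixed effects absorb it entirely
m_level = smf.ols("y ~ treated + C(club) + C(season)", data=df).fit()
# Baseline covariate x post: each age value gets its own pre/post change
m_inter = smf.ols("y ~ treated + age0:post + C(club) + C(season)", data=df).fit()
print(f"level-only TWFE : {m_level.params['treated']:.2f}  (biased upward)")
print(f"age0 x post TWFE: {m_inter.params['treated']:.2f}  (near the true 2.0)")
```

The level-only regression silently leaves the age-driven trend difference inside the treatment coefficient; the interaction version recovers the true effect.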
More advanced approaches
- Outcome regression
- Inverse probability weighting (IPW)
- Doubly robust DiD
Source: Difference-in-Differences Designs: A Practitioner's Guide
After adding controls, check that you actually fixed the problem.
The whole point of adding covariates was to fix a parallel trends violation. So after you've added them, go back and check.
1. Re-examine prior trends
Estimate the dynamic treatment effects (treatment effect in each pre-treatment period) with your controls included. The pre-treatment effects should now be closer to zero. If they're not, your controls didn't fix the problem.
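A minimal sketch of this check on simulated data (all numbers hypothetical; the covariate adjustment is left out here so the event-study mechanics stay visible):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated panel: 60 clubs, 6 seasons, adoption at season 3, true effect = 2
rng = np.random.default_rng(4)
n_clubs, n_seasons, t0 = 60, 6, 3
club = np.repeat(np.arange(n_clubs), n_seasons)
season = np.tile(np.arange(n_seasons), n_clubs)
adopter = np.repeat((rng.random(n_clubs) < 0.5).astype(int), n_seasons)

df = pd.DataFrame({"club": club, "season": season, "adopter": adopter})
df["y"] = 2.0 * df["adopter"] * (df["season"] >= t0) + rng.normal(0, 1, len(df))

# One adopter-x-season dummy per period, with season t0-1 omitted as the baseline
ev_cols = [s for s in range(n_seasons) if s != t0 - 1]
for s in ev_cols:
    df[f"ev_{s}"] = ((df["season"] == s) & (df["adopter"] == 1)).astype(int)
m = smf.ols("y ~ C(club) + C(season) + " + " + ".join(f"ev_{s}" for s in ev_cols),
            data=df).fit()

pre = [m.params[f"ev_{s}"] for s in ev_cols if s < t0]
post = [m.params[f"ev_{s}"] for s in ev_cols if s >= t0]
print("pre-treatment effects :", np.round(pre, 2))   # should hover near zero
print("post-treatment effects:", np.round(post, 2))  # should hover near 2
```

If the pre-treatment coefficients are far from zero (with your controls included), the conditioning did not rescue parallel trends.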
2. Check common support
If you're using matching, make sure your treated and control groups actually overlap on the matching variables. If few control observations have similar covariate values as the treated observations, you're relying heavily on extrapolation.
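One simple way to eyeball overlap, sketched here with hypothetical squad ages and a hypothetical neighbor rule (the radius and neighbor count are illustrative choices, not a standard):

```python
import numpy as np

rng = np.random.default_rng(5)
age_treated = rng.normal(24.5, 1.5, 40)   # hypothetical adopter squad ages
age_control = rng.normal(27.0, 1.5, 40)   # hypothetical non-adopter squad ages

def support_share(x_t, x_c, radius=0.5, min_neighbors=3):
    """Fraction of treated units with at least min_neighbors controls within radius."""
    counts = (np.abs(x_c[None, :] - x_t[:, None]) <= radius).sum(axis=1)
    return (counts >= min_neighbors).mean()

share = support_share(age_treated, age_control)
print(f"treated units with common support: {share:.0%}")
```

When a large fraction of treated units has no comparable controls, the matched estimate for those units is extrapolation dressed up as comparison, so report the share and consider restricting the sample to the overlap region.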
3. Show results with and without controls
It's good practice to present your DiD estimates both with and without covariates. If the results change dramatically, that tells you (and your reader) something important about how much the controls matter.
Covariates in DiD exist to make parallel trends more plausible — not to close back doors. Use them intentionally, check they worked, and present your results transparently.