Student’s Tea Chat: Unsolicited Advice Based on What We Saw in the Job Market Last Year

by Jishnu Das (Georgetown University, CGD & NBER)

Hello, people. Jobs await. ChatGPT's here. Interviews are on the cards. Choices, disappointment, exhilaration. All around the corner.

Last year we worked a lot with students to help them navigate the market. We learned some stuff from doing that.

These posts are a kind of summary. Some technical, some process, some random thoughts. Take what you will.

This one is technical: it relates to discussions I have had over the last 4 years while trying to understand the link between the way we analyze data and policy. That link is subtle, and I am still learning how best to think about it.

It draws heavily on a course I teach on causal inference. I spend 6 lectures on inference and 8 on the Big 4 causal methods. The choice was driven by (a) the fact that our students will have a big role as consumers of research, and (b) the fact that consuming research in a world that produces upwards of 80K papers a day is not trivial. Many of the policy questions the course could help with come up again and again in job interviews for those thinking of going into data/analysis/policy fields.

Here is a summary of big ideas from the inference part of that class. I pulled out multiple email and WhatsApp discussions over the year and asked ChatGPT to make the summary.

If there are parts of that summary you don't understand, it may be worth going back to your notes!!!


Also, if the writing is more formal than usual, blame the GPT, not me. Seriously. In response to my prompt, the GPT had the gall to tell me:

"Got it. Let me strip this down into clean, careful teaching notes, written for students, with zero email drama, no side rants, and no assumed sophistication beyond a standard econometrics course. Think of this as something you could hand out after a lecture and say: “This is the conceptual map.”"

I now know what Trurl and Klapaucius felt like when they started kicking that infernal machine...


What Do Experiments, Estimates, and Confidence Intervals Actually Tell Us?

A short guide to avoiding common statistical mistakes


1. What Question Are We Answering?

Statistical methods answer different kinds of questions, and confusion happens when we mix them up.

Frequentist inference asks:

If the true effect were X, how likely is it that I would observe data like this?

Policy and decision-making ask:

Given the data I observed, what should I believe about the size of the effect, and what should I do?

These are not the same question.


2. What a Hypothesis Test Does (and Does Not Do)

In a randomized experiment, a hypothesis test evaluates a null hypothesis (often “no effect”).

  • A p-value tells us how surprising the data would be if the null were true
  • A small p-value allows us to reject the null

That is all.
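
A quick way to see what "that is all" means is to simulate it. Here is a minimal sketch in Python (my illustration; all numbers are made up): make the null true by construction, run the experiment many times, and watch a 5% test reject about 5% of the time.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_experiments = 100, 10_000

rejections = 0
for _ in range(n_experiments):
    # The null is true by construction: both arms have the same mean.
    treated = rng.normal(loc=0.0, scale=1.0, size=n)
    control = rng.normal(loc=0.0, scale=1.0, size=n)
    _, p_value = stats.ttest_ind(treated, control)
    rejections += p_value < 0.05

# A 5% test rejects a true null about 5% of the time. That is the
# guarantee, and that is all of the guarantee.
print(f"rejection rate under the null: {rejections / n_experiments:.3f}")  # ~0.05
```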

A hypothesis test does not tell us:

  • the probability that the true effect equals the estimate
  • the probability that the effect lies in any interval
  • how large the effect “really is”

3. What a Confidence Interval Really Means

A 95% confidence interval is often misinterpreted.

Correct interpretation (before the experiment):

If we were to repeat this experiment many times and compute a confidence interval each time, 95% of those intervals would contain the true effect.

Important implications:

  • The interval is random before the experiment
  • After the experiment, the realized interval either contains the true effect or it does not; there is no 95% probability left to assign

Incorrect interpretation:

“There is a 95% probability that the true effect lies in this interval.”

That statement is not meaningful in frequentist statistics.

A NOTE: This is the slide that leads to the *most* confusion. The correct interpretation sounds odd: shouldn't it be "compute an effect each time"? The answer is no; it is "compute a confidence interval each time." This is the key idea: the confidence interval is itself a random interval that changes each time a sample is taken. What, then, is "the" confidence interval? There is no single confidence interval; there is a procedure for generating a confidence interval from every sample. The interpretation tells us that if the procedure is correct, the intervals computed across samples will contain the true effect in 95% of the samples. The Wikipedia article on confidence intervals is actually really good if you are still having trouble!
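
If the coverage statement still feels abstract, it helps to simulate the procedure itself. A minimal sketch (the true effect, noise level, and sample sizes are invented): fix a true effect, rerun the experiment many times, compute a 95% CI from each sample, and count how often the intervals contain the truth.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect, sigma, n = 2.0, 5.0, 200     # invented numbers
n_experiments = 10_000

covered = 0
for _ in range(n_experiments):
    sample = rng.normal(loc=true_effect, scale=sigma, size=n)
    estimate = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = estimate - 1.96 * se, estimate + 1.96 * se   # a new CI each sample
    covered += lo <= true_effect <= hi

# The *procedure* covers the truth ~95% of the time; any single realized
# interval either contains 2.0 or it does not.
print(f"coverage: {covered / n_experiments:.3f}")  # ~0.95
```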

A Second Note: You can invoke a Bayesian perspective to justify the 'bad' claim that there is a 95% probability that a given confidence interval contains the true effect from a single sample. This works if the prior is uninformative over its support, that is, if it assigns equal probability to every possible value. Then the 95% frequentist CI and the 95% Bayesian credible interval coincide. That's fine, but my point is that if we are going to use a Bayesian workaround to justify a frequentist procedure, we should also account for the fact that in many cases I should *not* use an uninformative prior. What you can't do is claim a Bayesian antecedent for your policy analysis, but then cry foul when people don't agree with the implicit assumption of an uninformative prior.
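
To make the coincidence concrete, here is a sketch under textbook assumptions that are mine, not the notes': normal data with known variance and a conjugate normal prior. As the prior variance blows up (an almost uninformative prior), the 95% credible interval reproduces the frequentist 95% CI.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, n = 5.0, 100                       # known noise sd, sample size (invented)
sample = rng.normal(loc=2.0, scale=sigma, size=n)
xbar, se = sample.mean(), sigma / np.sqrt(n)

# Frequentist 95% CI for the mean.
freq_ci = (xbar - 1.96 * se, xbar + 1.96 * se)

def credible_interval(prior_mean, prior_sd):
    """Conjugate normal-normal update; 95% Bayesian credible interval."""
    post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + n * xbar / sigma**2)
    post_sd = np.sqrt(post_var)
    return (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)

# An (almost) uninformative prior reproduces the frequentist interval.
bayes_ci = credible_interval(prior_mean=0.0, prior_sd=1e6)
print(np.allclose(freq_ci, bayes_ci))  # True
```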


4. Why the Point Estimate Is Not “the True Effect”

The estimated coefficient (β̂) is:

  • a random variable
  • one realization from a sampling process

The true effect (β) is:

  • fixed
  • unknown

Frequentist theory shows that β̂ has good properties under repeated experiments (e.g., consistency), but:

Once a single experiment is complete, frequentist methods do not assign probabilities to different values of β.
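
A small sketch of the distinction, with arbitrary numbers of my choosing: the true β below is fixed at 2 throughout; only β̂ moves across repeated experiments.

```python
import numpy as np

rng = np.random.default_rng(3)
beta, n = 2.0, 200            # the true beta: fixed (and, in real life, unknown)

estimates = []
for _ in range(5_000):        # 5,000 hypothetical repetitions of the experiment
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)
    estimates.append((x @ y) / (x @ x))   # OLS slope, no intercept

estimates = np.asarray(estimates)
# beta never moved; beta_hat is the thing with a distribution.
print(f"mean of beta_hat: {estimates.mean():.3f}, sd: {estimates.std():.3f}")
```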


5. Repetition Is the Key to Frequentist Logic

All frequentist guarantees rely on repetition:

  • repeated samples
  • repeated estimates
  • repeated confidence intervals

Policy decisions are not repeated experiments. You observe:

  • one study
  • one estimate
  • one confidence interval

Frequentist inference alone does not tell you how to convert that single result into a belief about the true effect size.


6. Why This Matters for Policy Analysis (CEA / CBA)

Cost-effectiveness and cost-benefit analysis require:

  • beliefs about effect sizes
  • expected outcomes
  • trade-offs and losses

Frequentist outputs (p-values, confidence intervals) do not provide these directly.

When economists use estimated effects for policy:

  • they are making additional assumptions
  • often implicitly
  • often without acknowledging them
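
A toy sketch of what those assumptions buy (the program, costs, estimate, and standard error below are entirely invented): a cost-benefit calculation needs a belief about β, and plugging in β̂ as if it were β is itself a choice of belief.

```python
import numpy as np

# Hypothetical program: net benefit per person is value * effect - cost.
value_per_unit, cost = 100.0, 150.0        # invented numbers
beta_hat, se = 1.4, 0.8                    # invented estimate and SE

# Choice 1: plug in the point estimate as if it were the truth.
plug_in = value_per_unit * beta_hat - cost              # -10: "don't fund"

# Choice 2: treat the estimate and SE as a belief about beta (itself an
# assumption!) and compute decision-relevant quantities from it.
rng = np.random.default_rng(4)
draws = rng.normal(beta_hat, se, size=100_000)
net = value_per_unit * draws - cost
# For a linear payoff the expectation matches the plug-in number...
print(f"expected net benefit: {net.mean():.1f}")        # ~ -10
# ...but only a belief distribution can answer this question:
print(f"P(net benefit > 0):   {(net > 0).mean():.2f}")  # ~ 0.45
```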

7. The Bayesian Alternative (Briefly)

Bayesian analysis allows statements like:

“There is a 95% probability the effect lies in this range.”

But only because:

  • effects are treated as random variables
  • beliefs depend on a prior
  • different priors lead to different conclusions

There is no “prior-free” way to do this.
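
A sketch of that prior-dependence, again with invented numbers: the same estimate and standard error produce different 95% credible intervals under a nearly flat prior and a skeptical prior.

```python
import numpy as np

xbar, se = 2.0, 0.5   # the same "data" for both analysts (invented numbers)

def credible_interval(prior_mean, prior_sd):
    """Conjugate normal-normal update; 95% Bayesian credible interval."""
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / se**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + xbar / se**2)
    post_sd = np.sqrt(post_var)
    return (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)

# Nearly flat prior: roughly the frequentist CI, 2.0 +/- 0.98.
print(credible_interval(prior_mean=0.0, prior_sd=1e6))
# Skeptical prior centered at zero: the interval shifts toward 0,
# to about (0.31, 1.69). Same data, different prior, different conclusion.
print(credible_interval(prior_mean=0.0, prior_sd=0.5))
```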


8. The Bottom Line

  • Experiments are powerful tools for rejecting hypotheses
  • Confidence intervals describe procedures, not beliefs
  • Point estimates are not probabilistic statements about truth
  • Using evidence for policy requires beliefs, priors, and assumptions that go beyond the data

A rule of thumb

Frequentist statistics tells us when data are inconsistent with a hypothesis. It does not tell us what to believe instead.

About the Author

Jishnu Das is a distinguished professor of public policy at the McCourt School of Public Policy and the Walsh School of Foreign Service at Georgetown University.

Jishnu’s work focuses on health and education in low- and middle-income countries. He has co-developed one of the largest and longest-running cohort studies on learning outcomes in low-income countries and has led a research agenda on the quality of healthcare in those settings. To learn more about his work, visit his website.
