Populations Considered As Samples From Meta-populations
Hi hi.

I'm still not sure where to look up the idea you were talking about on Skype this week, but I thought I'd check that I've understood what the idea is.

When we have a question about a sample from a population, we can, if we want to, model that population as itself produced by a random process. We then have to specify what that random process is, and the sample we started with becomes a sample from a population which is itself being treated as a sample from a meta-population. (I believe all random processes can be treated as sample/population problems, although that's not obvious.) We might then want the null hypothesis to be a hypothesis about the meta-population, and the question is whether people actually model things in this way.
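
To make the two-stage picture concrete, here is a minimal simulation sketch. Everything specific in it is my own assumption for illustration: the meta-population is taken to be a normal distribution, and the names and numbers (META_MEAN, META_SD, POP_SIZE, SAMPLE_SIZE, draw_population, draw_sample) are made up. It only shows the structure: the population is one realisation of a random process, and the sample is then drawn from that realised population.

```python
import random
import statistics


def draw_population(rng, size, meta_mean, meta_sd):
    """Generate a finite population as one realisation of the meta-population process.

    Assumption: the meta-population is Normal(meta_mean, meta_sd).
    """
    return [rng.gauss(meta_mean, meta_sd) for _ in range(size)]


def draw_sample(rng, population, n):
    """Simple random sample, without replacement, from the finite population."""
    return rng.sample(population, n)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical parameters, chosen only for illustration.
META_MEAN, META_SD = 10.0, 2.0
POP_SIZE, SAMPLE_SIZE = 500, 50

# Stage 1: the population is itself a sample from the meta-population.
population = draw_population(rng, POP_SIZE, META_MEAN, META_SD)

# Stage 2: our data are a sample from that realised population.
sample = draw_sample(rng, population, SAMPLE_SIZE)

pop_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"meta-population mean (a model parameter): {META_MEAN}")
print(f"population mean (a realised quantity):    {pop_mean:.3f}")
print(f"sample mean (what we observe):            {sample_mean:.3f}")
```

In this sketch a null hypothesis could be stated about either level: about pop_mean, which is a fixed fact about the realised population, or about META_MEAN, which is a parameter of the process assumed to have generated it.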

And the answer must surely be yes. I just don't know how frequently people do this or where there's a good discussion of it. The book I was suggesting looking in is Kendall's Advanced Theory of Statistics (http://library.anu.edu.au/record=b1862570), but it might be worth waiting until I can think of which keyword to look under in the index, which I haven't managed yet.

There is plenty on what are called hierarchical models, which give you the maths you need, but I don't think (could be wrong) that the population itself is usually considered a sample from a hypothetical thing.

I think this is a slight generalisation of what you said: I don't think we need to say anything about causality in the meta-population, at this level of generality. The causal structure can be part of our model of how the actual population is related to the meta-population, if we want it to be.

Is that right?


[The technical term for what we were discussing is definitely 'superpopulation.' I think it's a relatively new statistical concept (i.e. theorists started seriously working on it in the early 70s) which is maybe why it's not explicitly covered in Kendall and Stuart. I've tried to read a few articles about superpopulation models. I understand that superpopulation approaches to finite population sampling are meant to assimilate finite population statistics with predictive statistics. This is meant to solve problems with other approaches (e.g. Neymanian) to finite populations. But that's all I've gotten as I find the technical language in these articles impenetrable! Anyway, I don't think it's important that I understand superpopulation modeling inside and out.

And I think everything you've said is basically correct. I know the basics of hierarchical models (specifically multilevel regression) so that's fine. At first blush, I think the 'superpopulation-modeling argument' for significance testing has some issues. I touch on those in my preliminary thoughts on superpopulation modeling. Mitch 17-08-23]