Replying to davidsarah:
> My conclusion at this point is that we need to know why the problem is only occurring in subprocesses, before we can write a reliable test.
The problem occurs only in subprocesses because the entries in __requires__ are in a different order there. With help from abadger1999 (Toshio Kuratomi), I worked out a reliable test that uses only three dists (two versions of one package and one version of another) and is independent of any globally installed packages. I'll prepare a patch for that.
I'm now sure that this is the same issue as #1258, so I'm marking it as a duplicate (but all of the discussion in this ticket from comment:4 on is still relevant).