stat: Mann-Whitney U test and fractional ranks
Created by: kshedden
    	floats.Argsort(sorted, inds)

    	for i := 0; i < n; {
    		var j int
    		f := float64(i)
    		for j = i + 1; j < n && sorted[j] == sorted[i]; j++ {
    			f += float64(j)
    		}
    		f /= float64(j - i)
    		for ; i < j; i++ {
    			ranks[inds[i]] = f
    		}
    	}
    }

    // MannWhitneyU conducts a two-sided Mann-Whitney U test for the null hypothesis
Created by: btracey
We've talked in the past about having a specific package for doing statistical testing. I still think that's a better idea than having everything be in the base stat package. Thoughts, @kortschak, @sbinet?
    	variance = (ss - compensation*compensation/sumWeights) / (sumWeights - 1)
    	return
    }

    // FractionalRank calculates the ranks of the elements in the input slice, using
Created by: kshedden
I think this is the relevant code:
https://github.com/golang/perf/blob/6e6d33e29852650cab301f4dbeac9b7d67c6d542/internal/stats/utest.go
Created by: kshedden
Ranks are estimates of quantiles at specific probability points, but they use a different estimator when there are ties. The gonum empirical Quantile function always returns one of the observed values, whereas fractional ranks are averages over several observed values when there are ties. So there are connections between ranks and quantiles, but there are enough differences that I think it makes more sense to implement them independently. There are other use cases for fractional ranks, but the benchmark implementation doesn't separate the ranking from the U-test calculation.
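To make the separation concrete, here is a minimal sketch of ranking as a standalone step, with the U statistic computed from the ranks afterwards. It is an illustration only, not the code in this PR: the names fractionalRank and mannWhitneyU and their signatures are hypothetical, it uses the standard library sort package instead of floats.Argsort, and it only computes the statistic, not the p-value. It follows the 0-based rank convention of the loop in the diff.

```go
package main

import (
	"fmt"
	"sort"
)

// fractionalRank returns 0-based ranks of x. Each group of tied values
// gets the average of the positions the group occupies in sorted order.
// Hypothetical helper for illustration; not the signature from the PR.
func fractionalRank(x []float64) []float64 {
	n := len(x)
	inds := make([]int, n)
	for i := range inds {
		inds[i] = i
	}
	// Sort the indices by the values they point at.
	sort.SliceStable(inds, func(a, b int) bool { return x[inds[a]] < x[inds[b]] })

	ranks := make([]float64, n)
	for i := 0; i < n; {
		j := i + 1
		f := float64(i)
		// Extend j over the run of values tied with position i.
		for ; j < n && x[inds[j]] == x[inds[i]]; j++ {
			f += float64(j)
		}
		f /= float64(j - i)
		for ; i < j; i++ {
			ranks[inds[i]] = f
		}
	}
	return ranks
}

// mannWhitneyU returns the U statistic for x against y, computed from
// the fractional ranks of the pooled sample. It counts the pairs with
// x[i] > y[j], with each tied pair contributing 1/2.
func mannWhitneyU(x, y []float64) float64 {
	pooled := append(append([]float64{}, x...), y...)
	ranks := fractionalRank(pooled)
	var rx float64
	for i := range x {
		rx += ranks[i]
	}
	n := float64(len(x))
	// With 0-based ranks the smallest possible rank sum is n(n-1)/2.
	return rx - n*(n-1)/2
}

func main() {
	fmt.Println(fractionalRank([]float64{10, 20, 20, 30}))            // [0 1.5 1.5 3]
	fmt.Println(mannWhitneyU([]float64{1, 2, 3}, []float64{2, 4, 5})) // 1.5
}
```

In the first call the two tied values share the average of positions 1 and 2, so neither rank corresponds to a single sorted position, whereas an empirical quantile would return one of the observed values.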