
stat: Mann-Whitney U test and fractional ranks

Open Sebastien Binet requested to merge kshedden:mannwhitney into master

Created by: kshedden


Activity

        floats.Argsort(sorted, inds)

        for i := 0; i < n; {
            var j int
            f := float64(i)
            for j = i + 1; j < n && sorted[j] == sorted[i]; j++ {
                f += float64(j)
            }
            f /= float64(j - i)
            for ; i < j; i++ {
                ranks[inds[i]] = f
            }
        }
    }

    // MannWhitneyU conducts a two-sided Mann-Whitney U test for the null hypothesis
  • Created by: btracey

    We've talked in the past about having a specific package for doing statistical testing. I still think that's a better idea than having everything be in the base stat. Thoughts @kortschak, @sbinet?

  • Sebastien Binet
    Sebastien Binet @sbinet started a thread on commit 3793d641
  •     variance = (ss - compensation*compensation/sumWeights) / (sumWeights - 1)
        return
    }

    // FractionalRank calculates the ranks of the elements in the input slice, using
    • Created by: btracey

      This is the same thing as Quantile, right, except it does it for all of them (and has some other side effects)?

  • Created by: kortschak

    I think a "test" package would be worthwhile.

    Also note that Austin Clements did a lot of work on Mann-Whitney U for benchstat. There's no reason we couldn't use that since we already have a license dependency on Go.
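
    For reference while weighing the options, here is a minimal sketch of the U statistic computed straight from its pairwise definition. It is not the code in this merge request and not the benchstat implementation; the function name mannWhitneyU is hypothetical, and the sketch stops at the statistic without computing a p-value.

```go
package main

import "fmt"

// mannWhitneyU returns the U statistic of sample x against sample y.
// Each pair (a, b) with a from x and b from y contributes 1 when a > b
// and 0.5 when a == b. The name and signature are illustrative only.
func mannWhitneyU(x, y []float64) float64 {
	var u float64
	for _, a := range x {
		for _, b := range y {
			switch {
			case a > b:
				u++
			case a == b:
				u += 0.5
			}
		}
	}
	return u
}

func main() {
	x := []float64{1, 3, 5}
	y := []float64{2, 3, 4, 6}
	ux := mannWhitneyU(x, y)
	uy := mannWhitneyU(y, x)
	// The two statistics always sum to len(x)*len(y).
	fmt.Println(ux, uy, ux+uy == float64(len(x)*len(y)))
	// prints: 4.5 7.5 true
}
```

    The pairwise loop is quadratic in the sample sizes. A rank-based version gets the same number from the sum of the fractional ranks of one sample in the pooled data, which is where a separate ranking function becomes useful.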

  • Sebastien Binet
    Sebastien Binet @sbinet started a thread on commit 3793d641
    • Created by: kshedden

      Ranks are estimates of quantiles at specific probability points, but using a different estimator (in the case of ties). The gonum empirical Quantile function always returns one of the observed values, but fractional ranks are averages over several observed values (in the case of ties). So there are connections between ranks and quantiles, but there are enough differences that I think it makes more sense to implement them independently. There are other use cases for fractional ranks, but the benchmark implementation doesn't separate the ranking from the U-test calculation.
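
      To make the tie handling concrete, here is a minimal standalone sketch of fractional (average) ranking. It uses only the standard library instead of floats.Argsort, and the name fractionalRanks is hypothetical rather than the API proposed here. Ranks are zero-based, as in the loop in the diff.

```go
package main

import (
	"fmt"
	"sort"
)

// fractionalRanks returns the zero-based rank of each element of x.
// Tied values all receive the mean of the sorted positions they occupy.
// The name and signature are illustrative only.
func fractionalRanks(x []float64) []float64 {
	n := len(x)
	inds := make([]int, n)
	for i := range inds {
		inds[i] = i
	}
	// Order the indices by the values they point at.
	sort.SliceStable(inds, func(a, b int) bool { return x[inds[a]] < x[inds[b]] })

	ranks := make([]float64, n)
	for i := 0; i < n; {
		// Find the end of the run of values tied with position i.
		j := i + 1
		for j < n && x[inds[j]] == x[inds[i]] {
			j++
		}
		// Positions i..j-1 are tied; their mean is (i + j - 1) / 2.
		avg := float64(i+j-1) / 2
		for ; i < j; i++ {
			ranks[inds[i]] = avg
		}
	}
	return ranks
}

func main() {
	fmt.Println(fractionalRanks([]float64{10, 20, 20, 30}))
	// prints: [0 1.5 1.5 3]
}
```

      The two tied values get rank 1.5, which is not the position of any single observation. That is the averaging behavior described above.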

  • Sebastien Binet
    Sebastien Binet @sbinet started a thread on commit 3793d641
    • Created by: btracey

      The empirical estimator does, but there are other estimators. One of them is #605 which would have the behavior you suggest.

  • Created by: kshedden

    I don't think the linearly interpolated quantiles can be easily used to get the fractional ranks. But it's a moot point if the preference is to use the benchmark code instead of what we have here.

  • Sebastien Binet
    Sebastien Binet @sbinet started a thread on commit 3793d641
    • I am agnostic about that. I'd err on the "small, focused package" side though. (also b/c it's easier to merge packages than to split them out.)

  • ping?
