Scheduled service maintenance on November 22


On Friday, November 22, 2024, between 06:00 CET and 18:00 CET, GIN services will undergo planned maintenance. Extended service interruptions should be expected. We will try to keep downtimes to a minimum, but recommend that users avoid critical tasks, large data uploads, or DOI requests during this time.

We apologize for any inconvenience.

get_p.m 2.5 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990
  1. function y= get_p(datumM, distM, tails, high)
  2. % p= stats.get_p(datum, dist)
  3. % p= stats.get_p(datum, dist, tails, high)
  4. % p is the prob that datum comes from dist.
  5. % tails is by default 2
  6. % if tails == 1 , then p is the prob that datum
  7. % is higher (if high==1) or lower (if high==0) than dist.
  8. % datum can be a single value or a 1 by n vector
  9. % dist can be a vector or an m by n matrix
  10. % p is the probability of datum in dist
  11. % check that inputs are the right size
  12. if nargin<3
  13. tails=2;
  14. end
  15. if nargin<4
  16. high=1;
  17. end
  18. reuse_flag=0;
  19. if numel(datumM)==1 && isvector(distM)
  20. distM=distM(:);
  21. elseif numel(datumM)>1 && isvector(distM);
  22. reuse_flag=1;
  23. elseif isscalar(datumM) && ~isvector(distM)
  24. datumM=repmat(datumM,1,size(distM,2));
  25. elseif numel(datumM)~=size(distM,2)
  26. error('GET_P:BADINPUTS','Number of columns of distM must equal lenght of datumM or distM must be a vector')
  27. end
  28. y=ones(size(datumM));
  29. for dx=1:numel(datumM)
  30. datum=datumM(dx);
  31. if reuse_flag
  32. dist=distM(:);
  33. else
  34. dist=distM(:,dx);
  35. end
  36. if isnan(datum) || all(isnan(dist))
  37. y(dx)=nan;
  38. continue;
  39. end
  40. dist=dist(~isnan(dist));
  41. ps=linspace(0,100,numel(dist)); % this limits the lowest p value it is possible to return. Maybe this should be relative to the size of dist
  42. sd_ps=prctile(dist,ps);
  43. closest= stats.qfind(sd_ps,datum);
  44. if tails==2
  45. if closest<=0 % datum out of range
  46. others=1;
  47. else
  48. others=find(sd_ps==sd_ps(closest));
  49. end
  50. if ps(others(1))<50 && ps(others(end))>50
  51. % if the datum stradles the mean.
  52. sd_p=1;
  53. elseif datum<sd_ps(1)||datum>sd_ps(end)
  54. % if the datum is outside the range of the bootstrapped distro
  55. sd_p=2/size(distM,1);
  56. elseif ps(others(1))>50
  57. sd_p=ps(others(1))/100;
  58. sd_p=max(2*(1-sd_p),2/size(distM,1));
  59. else
  60. sd_p=ps(others(end))/100;
  61. sd_p=2*sd_p;
  62. end
  63. y(dx)=sd_p;
  64. elseif tails==1
  65. % if there are repeat values in sd_ps, closest returns the max of the indices
  66. % of these. But we actually want the min of these indices so we find the
  67. % others which have the same value and take the min of these.
  68. if closest<=0
  69. if high
  70. y(dx)=1;
  71. else
  72. y(dx)=1/size(distM,1);
  73. end
  74. else
  75. others=find(sd_ps==sd_ps(closest));
  76. y(dx)=abs(high-ps(others(1))/100);
  77. end
  78. else
  79. end
  80. end