Calculating the percentage of my data in different number ranges [closed]
By Sarah Scott
I am trying to find out how to calculate the percentage of my data that falls into different number ranges. I have data that looks like this:
0.81761
0.255319
0.359551
0.210191
0.374046
0.188406
0.179487
0.265152
0.207792
0.202614
0.150943
I also have these ranges:
0-0.3
0.3-0.7
0.7-1
I want to know what percentage of my data falls into each number range. For example:
0-0.3 -> 72.7%
0.3-0.7 -> 18.18%
0.7-1 -> 9.09%
Does anybody know how to do this calculation?
2 Answers
Using awk:
awk '
# Count occurrences in each range
{
    if ($1 < 0.3) a++
    else if ($1 > 0.7) c++
    else b++
}
# Print each count as a percentage of NR (the number of records)
END {
    printf "< 0.3: %.2f%%\n", a/NR*100
    printf ">= 0.3 and <= 0.7: %.2f%%\n", b/NR*100
    printf "> 0.7: %.2f%%\n", c/NR*100
}
' file

You can use the histogram function from NumPy.
Example:
$ python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import numpy as np
>>>
>>> data = np.loadtxt('datafile')
>>> hist = np.histogram(data,[0,0.3,0.7,1.0])
>>> print 100.0 * hist[0]/sum(hist[0])
[ 72.72727273 18.18181818 9.09090909]
>>>
See, for example, NumPy - Histogram Using Matplotlib (of course, you don't have to plot the result).
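If NumPy is not available, the same binning can be sketched in plain Python 3. This is only a minimal sketch and not part of either answer: the function name `range_percentages` is hypothetical, and the data is pasted inline instead of being read from a file. It assumes the same bin convention as `np.histogram`: each range includes its lower edge and excludes its upper edge, except the last range, which includes both.

```python
from bisect import bisect_right


def range_percentages(values, edges):
    """Return the percentage of values that fall into each range.

    Ranges are half-open [lo, hi), except the last one, which also
    includes its upper edge. Values outside [edges[0], edges[-1]] are
    ignored and do not count toward the total.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside every range
        # Index of the last edge <= v, clamped so v == edges[-1] lands in the last range.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]


# Data and ranges copied from the question.
data = [0.81761, 0.255319, 0.359551, 0.210191, 0.374046, 0.188406,
        0.179487, 0.265152, 0.207792, 0.202614, 0.150943]
edges = [0, 0.3, 0.7, 1.0]

for lo, hi, pct in zip(edges, edges[1:], range_percentages(data, edges)):
    print(f"{lo}-{hi} -> {pct:.2f}%")
```

Unlike the awk version, the ranges are not hard-coded, so changing `edges` is enough to try different boundaries.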