Keith Devens .com
Saturday, November 22, 2008

Jeff (http://www.scraprap.com) wrote:
Joseph Scott (http://joseph.randomnetworks.com/) wrote:
To get the list sorted by count:
sort strings | uniq -c | sort -rn
2 foo
2 bar
1 baz
I agree with Jeff: getting the numbers after the string takes more work. A little bit of awk will do the trick:
sort strings | uniq -c | sort -rn | awk '{print $2 " " $1}'
foo 2
bar 2
baz 1
Jeff (http://www.scraprap.com) wrote:
Ah yes, the final sort and awk. I rarely have reason to use awk and should have spent a minute in the man page. Swapping fields is the second example. :-)
Joseph, apparently you don't need to spell out the space between the fields: awk '{print $2, $1}' gives the same output. That's because the comma in print inserts awk's output field separator (OFS), which defaults to a single space.
Keith (http://keithdevens.com/) wrote:
You guys rock.
Keith (http://keithdevens.com/) wrote:
Hmm, something's not quite right.
$ sort strings.txt | uniq -c | sort -rn
2 foo
2 bar
1 baz
$ sort strings.txt | uniq -c | sort -rn | awk '{print $2, $1}'
2o
2r
1z
That's weird. (Note: I'm using Cygwin.)
Also, any clue how to get it to sort reverse numerically, then alphabetically, so it's "bar 2, foo 2, baz 1"? 
Davd wrote:
I'd love to see this in Perl and Python. Why? Because I've done this manually a lot and my brain hurts this morning.
Keith (http://keithdevens.com/) wrote:
Ok, well here's a very short and dirty Python script that does it:
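import sys
# Count how many times each line occurs in the file named on the command line.
result = {}
with open(sys.argv[1]) as f:
    for line in f:
        line = line.rstrip()
        result[line] = result.get(line, 0) + 1
# Sort by count (highest first), then alphabetically; print "string<TAB>count".
for text, count in sorted(result.items(), key=lambda t: (-t[1], t[0])):
    print(text, count, sep="\t")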
Probably fairly straightforward to translate into Perl.
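For comparison, here is a minimal sketch of the same counting done with collections.Counter from the standard library. The function name count_lines and the sample list are made up for this illustration and are not from the thread. Counter.most_common() would return tied strings in first-seen order, so the sketch sorts explicitly to get "bar" ahead of "foo".

```python
from collections import Counter

def count_lines(lines):
    """Return (string, count) pairs, highest count first, ties alphabetical."""
    # rstrip() drops the trailing newline (and any other trailing whitespace).
    counts = Counter(line.rstrip() for line in lines)
    # Negating the count sorts it descending while the string sorts ascending.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample standing in for the contents of strings.txt.
sample = ["foo\n", "bar\n", "baz\n", "foo\n", "bar\n"]
for text, count in count_lines(sample):
    print(text, count)
```

With that sample it prints bar 2, foo 2, baz 1, one pair per line, which is the order Keith asked about.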
Jeff (http://www.scraprap.com) wrote:
I don't know why it chops it that way on Cygwin. You could try Joseph's awk syntax awk '{print $2 " " $1}' and see if that makes a difference.
I tested on OS X, FreeBSD, and RedHat with the same results.
On my systems, this does the sorting the way you want:
sort strings.txt | uniq -c | sort -t " " -k1rn -k2 | awk '{print $2, $1}'
bar 2
foo 2
baz 1
Keith (http://keithdevens.com/) wrote:
> You could try Joseph's awk syntax awk '{print $2 " " $1}' and see if that makes a difference.
Sorry, I should have mentioned that I already had, and it didn't. It's weird.
And thanks for the lesson in shell-fu.
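One possible explanation for the chopped output under Cygwin, offered as a guess and not confirmed anywhere in this thread: if strings.txt was saved with Windows line endings (CRLF), every line ends in a carriage return, and that character travels through sort and uniq untouched. awk then sees "foo\r" as the second field and prints "foo\r 2". The terminal jumps back to the start of the line at the carriage return and writes " 2" over the first two letters. The Python sketch below simulates that; render is a hypothetical helper written only for this illustration.

```python
def render(text):
    """Simulate a terminal line: a carriage return moves the cursor to column 0."""
    screen = []
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0                  # jump back to the start of the line
        elif col < len(screen):
            screen[col] = ch         # overwrite what is already displayed
            col += 1
        else:
            screen.append(ch)        # extend the line
            col += 1
    return "".join(screen)

# Assumed awk output when the second field still carries a trailing "\r".
awk_output = "foo\r" + " " + "2"
print(repr(render(awk_output)))               # ' 2o'
print(repr(render("foo".rstrip("\r") + " 2")))  # 'foo 2'
```

If that is the cause, removing the carriage returns before sorting (for example with tr -d '\r' < strings.txt at the front of the pipeline) should give the expected "foo 2". It would also explain why the Python script above is unaffected, since rstrip() removes the trailing "\r" along with the newline.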

sort file | uniq -c
2 bar
1 baz
2 foo
more work needed to get the line count after...