Use ack instead of grep to parse text files
You know this guy?
grep needle haystack |grep special_needle
or his inverted cousin:
grep needle haystack |grep -v unspecial_needle
These are common staples of searching through text files.
You should stop using them. Now.
You can do much better by writing more concise regular expressions and using ack or one of its relatives (ack-grep or rak).
The primary virtue of these commands is that they use the Perl regular expression engine. Most programmers with experience in any of the major scripting languages will find this more comfortable than grep’s use of the GNU regex syntax.
I recently needed to search through many files with a complex regular expression that required lookahead and lookbehind assertions. I have no idea how that would work in GNU regex land where, honestly, I still have a hard time getting simple capture and alternation right. After learning lookaround syntax, I was glad to find that ack supports it directly.
Given haystack, a text file:
tin needle
silver needle
lead needle
ocelot
monkey
Find the needles (this is where most grep users get to and never leave):
$ ack needle haystack
tin needle
silver needle
lead needle
Look at that: one character shorter than grep and just as easy. Now, if you want the silver needle, the unsophisticated, greppy way of doing this would be:
$ grep needle haystack|grep silver
silver needle
This sucks. Try:
$ ack '(?=silver).*needle' haystack
silver needle
Or, “all things in the haystack that are needles, but not the lead one”:
$ ack '^(?!^lead).*needle.*$' haystack
tin needle
silver needle
I know there’s a lot more to the power of the lookaround assertion, but if I can re-train myself out of this habit I think it’ll be a big win. Granted, for smaller searches the “double-grep” method is probably fine, but any time you’re doing a recursive search and looking for true needles in the haystack, ack’s approach is superior.
The case I was working on was one where I needed to search all my Rails models for all named scopes that did not use the lambda form, and I wanted five lines of context around the matches so that I could understand the behavior.
$ ack -C5 'scope(?!.*lambda)' app/models