[SLL] grep and wc help needed

(Ted Harding) Ted.Harding at manchester.ac.uk
Sat Feb 21 13:58:32 PST 2009


On 21-Feb-09 21:25:57, Ralph Sims wrote:
> (or perl, or ...)
> 
> I have a text file that contains alpha characters as well as
> punctuation marks.  I tried grep [:alpha:] filename |wc but
> still get the punctuation marks counted. I've also used [aZ-zZ]
> and still get the same result.  
> What I'm looking for is a way to count the letters and words in a file 
> without punctuation, spaces, etc.
> 
> Thanks in advance.

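Note first that grep selects whole lines, not individual characters,
so every line containing a match is passed on to wc complete with
its punctuation.

To count words, you will have to leave spaces (and newlines) in,
otherwise there's no way for separate words to be recognised.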

To start with, if you just want to count the characters which are
neither whitespace (of any kind), punctuation, nor numerals, you can
get rid of all of these with 'tr -d'.

Example (counting the characters left after deleting whitespace,
punctuation and numerals), using a file testfile.txt containing:

This is 1 text file, with
46 non-numeric characters: and 9 words.

$ cat testfile.txt | tr -d "[:punct:][:space:]0-9" | wc -c
46
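
A variation on the same idea, offered as a minimal sketch: rather than
listing everything to delete, delete the complement of [:alpha:] with
'tr -cd', so that only letters survive. The printf line merely recreates
the sample file; the variable name 'letters' and the trailing 'tr -d'
(which strips the padding some wc implementations add) are illustrative
additions, not part of the commands above. On the sample file this
should again give 46.

```shell
# Recreate the two-line sample file from the example.
printf '%s\n' 'This is 1 text file, with' \
              '46 non-numeric characters: and 9 words.' > testfile.txt

# -c takes the complement of the set, -d deletes it:
# everything that is NOT a letter is removed before counting.
letters=$(tr -cd '[:alpha:]' < testfile.txt | wc -c | tr -d '[:space:]')
echo "$letters"
```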

You can count the words by leaving the spaces in, with 'wc -w':

$ cat testfile.txt | tr -d "[:punct:]0-9" | wc -w
9

(Note that "non-numeric" was counted as 1 word, since "-" is
punctuation and so was deleted, leaving "nonnumeric".)
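
If the two halves of a hyphenated word should instead count separately,
one possible approach (a sketch, assuming a tr that accepts the POSIX
'[c*]' repeat notation) is to turn each run of non-letters into a single
newline rather than deleting the punctuation. The variable name 'words'
is illustrative only. On the sample file this should give 10 rather
than 9, because "non" and "numeric" become separate words.

```shell
# Recreate the two-line sample file from the example.
printf '%s\n' 'This is 1 text file, with' \
              '46 non-numeric characters: and 9 words.' > testfile.txt

# -c: act on everything that is NOT a letter; -s: squeeze repeats.
# Each run of non-letters becomes one newline, giving one word per line.
words=$(tr -cs '[:alpha:]' '[\n*]' < testfile.txt | wc -w | tr -d '[:space:]')
echo "$words"
```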

Hoping this helps,
Ted.





--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 21-Feb-09                                       Time: 21:58:27
------------------------------ XFMail ------------------------------

