Unix Technical Forum

OT How to count occurrences in a text file

#1  buck, 02-19-2008, 09:20 PM

Can you please help me determine where there are unmatched quotes (")
in my .mailfilterrc file?

grep -c returns the number of lines where the " appears, but I need
something that counts the number of instances of " per line.

Something like

while read LINE; do
count how many times " occurs in this line
if count=0 then ignore # Skip zero and
if count<>2 then echo $LINE # two; show all others
done <.mailfilterrc

TIA, I'm crosseyed looking at 377 lines :<
buck
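
For anyone who wants the loop itself: the pseudocode above could be filled in with plain sh roughly as follows. This is only a sketch. The function name show_odd_quote_lines is made up for illustration, and it counts the quotes on a line by deleting every other character with tr and measuring what is left.

```shell
#!/bin/sh
# Hypothetical helper: print each line of file $1 whose number of double
# quotes is neither 0 nor 2, prefixed by its line number.
show_odd_quote_lines() {
    n=0
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # Delete everything that is not a quote, then measure the rest.
        quotes=$(printf '%s' "$line" | tr -cd '"')
        count=${#quotes}
        case $count in
            0|2) ;;                               # skip zero and two
            *) printf '%s: %s\n' "$n" "$line" ;;  # show all others
        esac
    done < "$1"
}
```

It would be called as: show_odd_quote_lines .mailfilterrc. Starting tr once per line is slow, but that hardly matters for a few hundred lines.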
#2  Robert Komar, 02-19-2008, 09:20 PM

buck <buck@private.mil> wrote:
>
>
> Can you please help me determine where there are unmatched quotes (")
> in my .mailfilterrc file?
>
> grep -c returns the number of lines where the " appears, but I need
> something that counts the number of instances of " per line.
>
> Something like
>
> while read LINE; do
> count how many times " occurs in this line
> if count=0 then ignore # Skip zero and
> if count<>2 then echo $LINE # two; show all others
> done <.mailfilterrc
>
> TIA, I'm crosseyed looking at 377 lines :<
> buck


Hi,
maybe the thing to do is to look for lines with a single " first
using regular expressions with grep, and if that doesn't uncover
your problem, then look for 3 or more.

Maybe something like the following:

grep -e '^[^\"]*\"[^\"]*$' filename
grep -e '^[^\"]*\"[^\"]*\"[^\"]*\"' filename

Cheers,
Rob Komar
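
A note on the two patterns above, offered as a sketch rather than a correction of the approach. Inside single quotes the shell leaves " alone, so the backslashes are not needed. Inside a bracket expression a backslash is an ordinary character, so a class written with \" in it also excludes backslashes, and a line that contains a backslash before its lone quote would be missed. The same searches without the backslashes might look like this (the function names are made up for illustration, and -n is added to show line numbers):

```shell
#!/bin/sh
# Hypothetical wrappers around the two searches; the file name is passed as $1.

# Lines containing exactly one double quote.
one_quote() {
    grep -n -e '^[^"]*"[^"]*$' "$1"
}

# Lines containing three or more double quotes.
three_or_more_quotes() {
    grep -n -e '^[^"]*"[^"]*"[^"]*"' "$1"
}
```

Note that the second search also reports lines with four quotes, which are balanced, so its output still has to be checked by eye.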
#3  Loki Harfagr, 02-19-2008, 09:20 PM

On Mon, 24 Jan 2005 23:24:45 -0800, buck wrote:

> Can you please help me determine where there are unmatched quotes (")
> in my .mailfilterrc file?
> TIA, I'm crosseyed looking at 377 lines :<


I bet you are :)

Here's a funny way to do it. It's probably overkill, but that seems
to be the mood these days ...

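cat << '_EOF_' > GRUMPF
#!/bin/sh
# Usage: sh GRUMPF file
# Prints a " and the line number for each line with an odd number of ".
mkfifo TOTOPIPE TOTOPIPE2
TOTO=$1
# Keep only the quotes, then delete them in pairs: odd lines keep one ".
sed 's/[^"]//g' "$TOTO" | sed 's/""//g' > TOTOPIPE &
# A matching stream of line numbers.
seq 1 $(wc -l < "$TOTO") > TOTOPIPE2 &
paste TOTOPIPE TOTOPIPE2 | grep '^"'
rm -f TOTOPIPE TOTOPIPE2
_EOF_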


Have fun :)
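
The same idea can probably be had without the named pipes, since grep -n can supply the line numbers. A minimal sketch, assuming a made-up function name odd_quote_line_numbers and the file name passed as $1:

```shell
#!/bin/sh
# Hypothetical helper: print the number of every line of file $1 that
# holds an odd number of double quotes.
odd_quote_line_numbers() {
    sed 's/[^"]//g' "$1" |    # keep only the quote characters
        sed 's/""//g' |       # delete them in pairs; odd lines keep one "
        grep -n '"' |         # number the lines that still hold a quote
        cut -d: -f1           # keep just the line number
}
```

Called as odd_quote_line_numbers .mailfilterrc, it prints one line number per suspect line.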

#4  buck, 02-19-2008, 09:20 PM

On Tue, 25 Jan 2005 09:26:40 GMT, Robert Komar <robk@robpc2.home.org>
wrote:

>buck <buck@private.mil> wrote:
>>
>>
>> Can you please help me determine where there are unmatched quotes (")
>> in my .mailfilterrc file?
>> TIA, I'm crosseyed looking at 377 lines :<
>> buck

>
>Hi,
>maybe the thing to do is to look for lines with a single " first
>using regular expressions with grep, and if that doesn't uncover
>your problem, then look for 3 or more.
>
>Maybe something like the following:
>
>grep -e '^[^\"]*\"[^\"]*$' filename
>grep -e '^[^\"]*\"[^\"]*\"[^\"]*\"' filename
>
>Cheers,
>Rob Komar


Rob,

Thank you. These say I have no unmatched quotes. The first returns
nothing and the second returns lines where there are 4 " characters.

Please "teach me to fish". Say in words what one or the other of the
above lines does.

What does that first ^[ mean?

Doesn't the ^\" right after that mean to find a line that _begins_
with a quote?

What is the purpose of the first ] ?

Perhaps a better question is "where do I find documentation for
(extended?) regular expressions so I can understand these two lines?"

Color me perplexed but grateful. Thx again
buck
#5  Loki Harfagr, 02-19-2008, 09:20 PM

On Mon, 24 Jan 2005 23:24:45 -0800, buck wrote:

> Can you please help me determine where there are unmatched quotes (")
> in my .mailfilterrc file?
> if count=0 then ignore # Skip zero and
> if count<>2 then echo $LINE # two; show all others
> done <.mailfilterrc


Now another one, funny but simpler:

#:> awk '{if(NF){if ((NF+1)%2){print NR "\t" NF-1} }}' FS=\" .mailfilterrc

which will give output like this (line number, then number of quotes):

1 1
4 1
5 1
10 1
23 13


Cheers :-)
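
Why this works, as far as I can tell: with " as the field separator, a line holding k quotes splits into k+1 fields, so NF-1 is the quote count and an even NF means an odd number of quotes. The same test spelled out a little more plainly might look like the sketch below; the function name odd_quotes_awk is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each line of file $1 with an odd number of
# double quotes, print the line number, a tab, and the quote count.
odd_quotes_awk() {
    # NF > 0 skips empty lines; an even NF means NF-1 quotes, which is odd.
    awk -F'"' 'NF > 0 && NF % 2 == 0 { print NR "\t" (NF - 1) }' "$1"
}
```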
#6  Loki Harfagr, 02-19-2008, 09:20 PM

On Tue, 25 Jan 2005 16:11:07 +0100, Loki Harfagr wrote:

> On Mon, 24 Jan 2005 23:24:45 -0800, buck wrote:
>
>> Can you please help me determine where there are unmatched quotes (")
>> in my .mailfilterrc file?
>> if count=0 then ignore # Skip zero and
>> if count<>2 then echo $LINE # two; show all others
>> done <.mailfilterrc

>
> Now another one, funny but simpler:


and better, since it also prints the offending line:
# awk '{if(NF){if ((NF+1)%2){print NR "\t" NF-1 "\t" $0} }; }' FS=\" file
#7  Jeffrey Froman, 02-19-2008, 09:20 PM

buck wrote:

> Can you please help me determine where there are unmatched quotes (")
> in my .mailfilterrc file?
>
> grep -c returns the number of lines where the " appears, but I need
> something that counts the number of instances of " per line.


(Message sent with lines unwrapped)

Here's a Python one-liner that can be run from the command line. It
prints a list of the line numbers in "myfile" (counting from 1) that
contain an odd number of double quotation marks:

python -c 'print([n for n, line in enumerate(open("myfile"), 1) if line.count("\"") % 2])'


Hope that works for you,
Jeffrey
#8  Robert Komar, 02-19-2008, 09:20 PM

Loki Harfagr <lars.hummigeret@yahuu.no> wrote:
>
>
> On Tue, 25 Jan 2005 16:11:07 +0100, Loki Harfagr wrote:
>
>> On Mon, 24 Jan 2005 23:24:45 -0800, buck wrote:
>>
>>> Can you please help me determine where there are unmatched quotes (")
>>> in my .mailfilterrc file?
>>> if count=0 then ignore # Skip zero and
>>> if count<>2 then echo $LINE # two; show all others
>>> done <.mailfilterrc

>>
>> Now another one, funny but simpler:

>
> and better:
> # awk '{if(NF){if ((NF+1)%2){print NR "\t" NF-1 "\t" $0} }; }' FS=\" file


Using the " as a field separator for awk is a good idea. I wish I had
thought of it.

Cheers,
Rob Komar
#9  Robert Komar, 02-19-2008, 09:20 PM

buck <buck@private.mil> wrote:
>
>>grep -e '^[^\"]*\"[^\"]*$' filename
>>grep -e '^[^\"]*\"[^\"]*\"[^\"]*\"' filename

>
> Thank you. These say I have no unmatched quotes. The first returns
> nothing and the second returns lines where there are 4 " characters.
>
> Please "teach me to fish". Say in words what one or the other of the
> above lines does.
>
> What does that first ^[ mean?
>
> Doesn't the ^\" right after that mean to find a line that _begins_
> with a quote?
>
> What is the purpose of the first ] ?
>
> Perhaps a better question is "where do I find documentation for
> (extended?) regular expressions so I can understand these two lines?"
>
> Color me perplexed but grateful. Thx again
> buck


A pair of brackets denotes a set of characters, any one of which may
match at that position. Adding a circumflex "^" right after the opening
bracket inverts the logic, so the set matches any character except the
ones listed. So [^\"] means any character other than " (the \ in front
of the " was meant to keep the command-line shell from interpreting it;
inside single quotes that isn't needed, and inside brackets the \ counts
as a literal character, so strictly speaking the set excludes backslashes
as well). Adding the "*" immediately after means any number of the
preceding characters, including zero of them.

Outside of the brackets, "^" means the start of the line. "$" means the
end of the line.


^       [^\"]*        \"     [^\"]*        $

start   some number   1 "    some number   end
of      of not-"             of not-"      of
line    characters           characters    line


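So only a line with exactly one " somewhere on it will match. The regexp
for 3 or more " is only a little more complicated: it repeats the run of
not-" characters and the " three times and drops the "$", so anything may
follow the third ".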

I don't know a good on-line reference for regular expressions, but I'm sure
google will turn something up. I learned about them in Sobell's book "A
Practical Guide to the Unix System", and in an earlier edition of O'Reilly's
camel book on Perl. There's a nice overview in Kernighan and Pike's book
"The Unix Programming Environment", and I seem to recall that the small
book on awk and sed programming had a section on regular expressions, as
well. They are common and confusing enough that they have been covered
in a lot of books on Unix.

Cheers,
Rob Komar
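
Building on the pieces explained in this post, the two searches can probably be folded into a single pattern that matches any odd number of quotes. This is a sketch using extended regular expressions (grep -E) so that the group can be written with plain parentheses; the function name odd_quotes_grep is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list, with line numbers, the lines of file $1 that
# contain an odd number of double quotes.
#   ^[^"]*"            the first quote, with only non-quotes before it
#   ([^"]*"[^"]*")*    then any number of further quotes, two at a time
#   [^"]*$             then only non-quotes up to the end of the line
odd_quotes_grep() {
    grep -n -E '^[^"]*"([^"]*"[^"]*")*[^"]*$' "$1"
}
```

Unlike the 3-or-more search, this one does not report lines with four quotes.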
#10  buck, 02-19-2008, 09:21 PM

On Tue, 25 Jan 2005 18:20:02 GMT, Robert Komar <robk@robpc2.home.org>
wrote:


>[snip]


Rob,

Thank you SO MUCH!!

I have the Sobell book but I never thought to look for regex there.

buck
