Unix Technical Forum

regex match multiple words in a line

This is a discussion on regex match multiple words in a line within the MySQL forums, part of the Database Server Software category; --> Hi, I am trying to construct a regexp in mysql that looks for a multiple word match in one ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 11:29 AM
carlbernardi@gmail.com
 
Posts: n/a
Default regex match multiple words in a line

Hi,

I am trying to construct a regexp in mysql that looks for a multiple
word match in one line of text within other lines of text. When the
text is inserted into a field, all lines are already separated by
'\n'.

I can only get it work when it looks for one word in a line as such:
select * from sp where lines regexp('\n.*[[:<:]]word1[[:>:]]+.*[^\n]+
\n');


When I look for two words in a line it only returns lines that have
the search words in that order. For example:

select * from sp where lines
regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+\n');
works when words are in the order of 'word1 ... word2' but not when
words are in the order of 'word2 ... word1'.

Also I am having trouble searching only the first line of text. I have
been working with the following but it still doesn't work properly:
select * from sp where lines
regexp('^.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+\n');

Thanks,

Carl

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 11:30 AM
Michael Martinek
 
Posts: n/a
Default Re: regex match multiple words in a line

Overall, MySQL regular expression support is pretty weak. You may end
up needing to something like:

select * from sp where
lines regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+
\n') OR
lines regexp('\n.*[[:<:]]word2[[:>:]].*[[:<:]]word1[[:>:]]+.*[^\n]+
\n');


Which you may want to compare against:

select * from sp where lines LIKE "%word1%word2%" OR lines LIKE
"%word2%word1%"

To see which one is faster.

I ended up recompiling MySQL with plugin support and compiling UDF-
regexp 1.0 from http://udf-regexp.php-baustelle.de/trac/.. which
provides great regular expression support for more than just checking
for simple matches.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 11:30 AM
Captain Paralytic
 
Posts: n/a
Default Re: regex match multiple words in a line

On 26 Nov, 12:43, Michael Martinek <michael.marti...@gmail.com> wrote:
> Overall, MySQL regular expression support is pretty weak. You may end
> up needing to something like:
>
> select * from sp where
> lines regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+
> \n') OR
> lines regexp('\n.*[[:<:]]word2[[:>:]].*[[:<:]]word1[[:>:]]+.*[^\n]+
> \n');
>
> Which you may want to compare against:
>
> select * from sp where lines LIKE "%word1%word2%" OR lines LIKE
> "%word2%word1%"
>
> To see which one is faster.
>
> I ended up recompiling MySQL with plugin support and compiling UDF-
> regexp 1.0 fromhttp://udf-regexp.php-baustelle.de/trac/.. which
> provides great regular expression support for more than just checking
> for simple matches.


However the "words in any order" problem is one that can only be
solved in very advanced REGEX implementations ISTR.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:10 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com