Unix Technical Forum

Lack of regular expression manipulation.. speed up function?

This is a discussion on Lack of regular expression manipulation.. speed up function? within the MySQL forums, part of the Database Server Software category; --> Unfortunately, MySQL doesn't provide any routines for regular expression manipulation. To do something similiar to: preg_replace('/[^A-Z]+/i', '', sValue), I've ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 11:29 AM
Michael Martinek
 
Posts: n/a
Default Lack of regular expression manipulation.. speed up function?

Unfortunately, MySQL doesn't provide any routines for regular
expression manipulation. To do something similiar to:

preg_replace('/[^A-Z]+/i', '', sValue), I've put together the routine
at the bottom of this post. I'm also researching the ability to
execute library functions in external shared objects/DLL files.
Meanwhile, if anyone has suggestions on how to improve this function
further and/or details on executing external shared objects, I've love
to hear your input.

DROP FUNCTION IF EXISTS utilCleanString;
DELIMITER |
CREATE FUNCTION utilCleanString(
sValue VARCHAR(255)
) RETURNS VARCHAR(255)
BEGIN
DECLARE sKeepChars CHAR(52) DEFAULT
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVW XYZ';
DECLARE sRetval VARCHAR(255) DEFAULT '';
DECLARE sChar CHAR(1) DEFAULT '';

DECLARE i INT UNSIGNED DEFAULT 1;
DECLARE nLength INT UNSIGNED DEFAULT LENGTH(sValue);

MAIN_LOOP: LOOP
IF (i > nLength) THEN
LEAVE MAIN_LOOP;
END IF;

SET sChar = SUBSTR(sValue, i, 1);

IF INSTR(sKeepChars, sChar) > 0 THEN
SET sRetval = CONCAT(sRetval, sChar);
END IF;

SET i = i + 1;
END LOOP MAIN_LOOP;

RETURN UPPER(sRetval);
END; |
DELIMITER ;

Regards,
Michael Martinek

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 11:29 AM
Michael Martinek
 
Posts: n/a
Default Re: Lack of regular expression manipulation.. speed up function?

Found http://www.xcdsql.org/MySQL/UDF/lib_...ludf_preg.html,
which has some perl regular expressions built for MySQL.. compiled in
using CREATE FUNCTION ... SONAME 'lib_mysqludf_preg.so'.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 11:29 AM
Michael Martinek
 
Posts: n/a
Default Re: Lack of regular expression manipulation.. speed up function?

SELECT BENCHMARK(1000000,utilCleanString('ABC 123 ABCDEF 99727'));
1 row in set (2 min 49.09 sec)

New version, after downloading the MySQL Source, recompiling it, and
compiling RegEx support into it:
DROP FUNCTION IF EXISTS utilCleanString2;
DELIMITER |
CREATE FUNCTION utilCleanString2(
sValue VARCHAR(255)
) RETURNS VARCHAR(255)
BEGIN
RETURN UPPER(regexp_replace(sValue, '[^A-Za-z]+', ''));
END; |
DELIMITER ;


SELECT BENCHMARK(1000000,utilCleanString2('ABC 123 ABCDEF 99727'));
1 row in set (22.28 sec)


Ended up going with http://udf-regexp.php-baustelle.de/t...REGEXP_REPLACE

Works like a charm. Compile flags, to mimic the old RPM-style set up I
was using before:

CFLAGS="-O3 -DHAVE_DLOPEN" CXX=gcc CXXFLAGS="-O3 -felide-constructors
\
-fno-exceptions -fno-rtti" ./configure \
--prefix=/var/lib/mysql \
--localstatedir=/var/lib/mysql/data \
--with-unix-socket-path=/var/run/mysqld/mysql.sock \
--enable-assembler \
--disable-static \
--with-mysqld-user=mysql \
--without-plugin-ndb \
--with-zlib-dir=bundled \
--with-plugins=max-no-ndb \
--with-csv-storage-engine \
--enable-local-infile \
--with-innodb \
--with-isam
make
make install


Hope this helps someone who ran into the same problems.

Cheers!
Michael Martinek

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 11:30 AM
=?ISO-8859-1?B?QW5kcukgSORuc2Vs?=
 
Posts: n/a
Default Re: Lack of regular expression manipulation.. speed up function?

Michael Martinek wrote:
> SELECT BENCHMARK(1000000,utilCleanString('ABC 123 ABCDEF 99727'));
> 1 row in set (2 min 49.09 sec)
>
> New version, after downloading the MySQL Source, recompiling it, and
> compiling RegEx support into it:
> [...]
> SELECT BENCHMARK(1000000,utilCleanString2('ABC 123 ABCDEF 99727'));
> 1 row in set (22.28 sec)


Hi Michael,
thanks for the research.

Is the first benchmark (2:49.09) done with the "workaround" function
or with the loaded lib_mysqludf_preg?
If the former, have you compared lib_mysqludf_preg to the udf-
regexp.php-baustelle.de solution?

Regards, André
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:48 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com