Unix Technical Forum

Index usage when bitwise operator is used

This is a discussion on Index usage when bitwise operator is used within the Pgsql Performance forums, part of the PostgreSQL category; --> Hello, My question is about index usage when bitwise operations are invoked. Situation Context: -------------------------- Lets suppose we have ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:33 AM
W.Alphonse HAROUNY
 
Posts: n/a
Default Index usage when bitwise operator is used

Hello,

My question is about index usage when bitwise operations are invoked.
Situation Context:
--------------------------

Lets suppose we have 2 tables TBL1 and TBL2 as the following:
TBL1 {
......... ;
integer categoryGroup; // categoryGroup is declared as an index on TABL1
......... ;
}

TBL2 {
......... ;
integer categoryGroup; // categoryGroup is declared as an index on TABL2
......... ;
}

By conception, I suppose that:
- [categoryGroup] may hold a limited number of values, less than 32 values.
- [categoryGroup] is of type integer => it means 4 bytes => 32 bits
=> 32 places available to hold binary '0' or binary '1' values.
- [categoryGroup] is the result of an "OR bitwise operation" among a
predefined set of variables [variableCategory].
We suppose that [variableCategory] is of type integer (=>32 bits)
and each binary value of [variableCategory] may only hold a single binary
'1'.


Ex: variableCategory1 = 00000000000000000000000000000010
variableCategory2 = 00000000000000000000000000100000
variableCategory3 = 00000000000000000000000000001000

If [categoryGroup] = variableCategory1 | variableCategory2 |
variableCategory3
=>[categoryGroup] = 00000000000000000000000000101010



Question:
--------------
I have an SQL request similar to:

SELECT ..... FROM TBL1, TBL2 WHERE
<inner join between TBL1 and TBL2 is True> AND
TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise
operator

Qst:
1/ IS the above SQL request will use the INDEX [categoryGroup] defined on
TBL1 and TBL2 ?
2/ What should I do or How should I modify my SQL request in order
to force the query engine to use an index ? (the already defined index or
another useful index)



Thx a lot

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:34 AM
Kevin Grittner
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

>>> On Thu, Sep 13, 2007 at 7:30 AM, in message
<f97d4e240709130530ve3cc522g61ccca23335618f4@mail. gmail.com>, "W.Alphonse
HAROUNY" <wharouny@gmail.com> wrote:

> and each binary value of [variableCategory] may only hold a single binary
> '1'.


> TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise
> operator


What about saying?:

TBL1.CATEGORY = TBL2.CATEGORY

If your indexes include this and the other columns which cause the tables
to be related, one or both of them stand a pretty good chance of evaluating
to the lowest-cost method to join the tables. Forcing a query to use an
index outside of it being the cheapest path is rarely productive.

-Kevin


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 11:34 AM
valgog
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

Hi,

I could not find and normal solution for that issue. But I am using
some workarounds for that issue.

The solution, that I am using now is to create an index for every bit
of your bitmap field.

So something like

CREATE INDEX idx_hobbybit_0_limited
ON "versionA".user_fast_index
USING btree
(gender, dateofbirth) -- here the gender and dateofbirth fields are
the fields that we usually ORDER BY in the select statements, but you
can play with the needed fields
WHERE (hobby_bitmap & 1) > 0;

by creating such an index for every used bit and combining WHERE
(hobby_bitmap & 1 ) > 0 like statements the planner will be choosing
the right index to use.

Another workaround, that will be more applicable in your case I think,
is to create a functional GIN index on your bitmap field using a
static function to create an array of bitmap keys from your bitmap
field.

CREATE OR REPLACE FUNCTION
"versionA".bitmap_to_bit_array(source_bitmap integer)
RETURNS integer[] AS
'select ARRAY( select (1 << s.i) from generate_series(0, 32) as s(i)
where ( 1 << s.i ) & $1 > 0 )'
LANGUAGE 'sql' IMMUTABLE STRICT;

And than create a GIN index on the needed field using this stored
procedure. After that, it would be possible to use intarray set
operators on the result of that function. This will also make it
possible to use that GIN index.

Actually it would be much much better if it were possible to build GIN
indexes directly on the bitmap fields. But this is to be implemented
by GIN and GiST index development team. Probably would be not a bad
idea to make a feature request on them.


With best regards,

Valentine Gogichashvili

On Sep 13, 2:30 pm, wharo...@gmail.com ("W.Alphonse HAROUNY") wrote:
> Hello,
>
> My question is about index usage when bitwise operations are invoked.
> Situation Context:
> --------------------------
>
> Lets suppose we have 2 tables TBL1 and TBL2 as the following:
> TBL1 {
> ......... ;
> integer categoryGroup; // categoryGroup is declared as an index on TABL1
> ......... ;
>
> }
>
> TBL2 {
> ......... ;
> integer categoryGroup; // categoryGroup is declared as an index on TABL2
> ......... ;
>
> }
>
> By conception, I suppose that:
> - [categoryGroup] may hold a limited number of values, less than 32 values.
> - [categoryGroup] is of type integer => it means 4 bytes => 32 bits
> => 32 places available to hold binary '0' or binary '1' values.
> - [categoryGroup] is the result of an "OR bitwise operation" among a
> predefined set of variables [variableCategory].
> We suppose that [variableCategory] is of type integer (=>32 bits)
> and each binary value of [variableCategory] may only hold a single binary
> '1'.
>
> Ex: variableCategory1 = 00000000000000000000000000000010
> variableCategory2 = 00000000000000000000000000100000
> variableCategory3 = 00000000000000000000000000001000
>
> If [categoryGroup] = variableCategory1 | variableCategory2 |
> variableCategory3
> =>[categoryGroup] = 00000000000000000000000000101010
>
> Question:
> --------------
> I have an SQL request similar to:
>
> SELECT ..... FROM TBL1, TBL2 WHERE
> <inner join between TBL1 and TBL2 is True> AND
> TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise
> operator
>
> Qst:
> 1/ IS the above SQL request will use the INDEX [categoryGroup] defined on
> TBL1 and TBL2 ?
> 2/ What should I do or How should I modify my SQL request in order
> to force the query engine to use an index ? (the already defined index or
> another useful index)
>
> Thx a lot



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 11:34 AM
valgog
 
Posts: n/a
Default Re: Index usage when bitwise operator is used


> What about saying?:
>
> TBL1.CATEGORY = TBL2.CATEGORY
>


Are you sure you understood what was the question?

Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
TBL2.CATEGORY > 0?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 11:34 AM
Kevin Grittner
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

>>> On Mon, Sep 17, 2007 at 2:49 AM, in message
<1190015368.148293.56830@y42g2000hsy.googlegroups. com>, valgog
<valgog@gmail.com> wrote:

>> What about saying?:
>>
>> TBL1.CATEGORY = TBL2.CATEGORY
>>

>
> Are you sure you understood what was the question?
>
> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
> TBL2.CATEGORY > 0?


Yes, given that he stipulated that one and only one bit would be set.

-Kevin




---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 11:34 AM
Tom Lane
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> On Mon, Sep 17, 2007 at 2:49 AM, in message
> <1190015368.148293.56830@y42g2000hsy.googlegroups. com>, valgog
> <valgog@gmail.com> wrote:=20
>> Are you sure you understood what was the question?
>>
>> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
>> TBL2.CATEGORY > 0?


> Yes, given that he stipulated that one and only one bit would be set.


Really? In that case, isn't this bit-field just a bad implementation of
an enum-style field?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 11:34 AM
Kevin Grittner
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

>>> On Mon, Sep 17, 2007 at 8:37 AM, in message <21403.1190036229@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> On Mon, Sep 17, 2007 at 2:49 AM, in message
>> <1190015368.148293.56830@y42g2000hsy.googlegroups. com>, valgog
>> <valgog@gmail.com> wrote:=20
>>> Are you sure you understood what was the question?
>>>
>>> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
>>> TBL2.CATEGORY > 0?

>
>> Yes, given that he stipulated that one and only one bit would be set.

>
> Really? In that case, isn't this bit-field just a bad implementation of
> an enum-style field?


My bad. I did misread it. Sorry, all.

-Kevin




---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 11:34 AM
W.Alphonse HAROUNY
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

Hi,

A little clarification. Actually, TBL1.CATEGORY and/or TBL2.CATEGORY may
hold a binary value having multiple binary(ies) '1'.
Each binary value column represent an business attribute.
If a binary value column is equal to '1', it means that the business
attribute is True,
otherwise it is false.
I adopted this avoid defining a detail table to table TBL1. Idem to TBL2.

If TBL1.CATEGORY | TBL2.CATEGORY > 0
=> it means that we have at least one common business attribute that is TRUE
for TBL1 and TBL2.

Regards
W.Alf


On 9/17/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> > On Mon, Sep 17, 2007 at 2:49 AM, in message
> > <1190015368.148293.56830@y42g2000hsy.googlegroups. com>, valgog
> > <valgog@gmail.com> wrote:=20
> >> Are you sure you understood what was the question?
> >>
> >> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
> >> TBL2.CATEGORY > 0?

>
> > Yes, given that he stipulated that one and only one bit would be set.

>
> Really? In that case, isn't this bit-field just a bad implementation of
> an enum-style field?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 11:34 AM
valgog
 
Posts: n/a
Default Re: Index usage when bitwise operator is used

Hi Tom,

do you think it would be a good idea to ask GIN index team to
implement an int-based bitmap set indexing operator for GIN/GiST based
indexes? Or there will be a possibility to somehow optimally index
arrays of enumerations to implement such bitmap structures in 8.3 or
later postgresql versions?

With best regards,

-- Valentine

On Sep 17, 3:37 pm, t...@sss.pgh.pa.us (Tom Lane) wrote:
> "Kevin Grittner" <Kevin.Gritt...@wicourts.gov> writes:
> > On Mon, Sep 17, 2007 at 2:49 AM, in message
> > <1190015368.148293.56...@y42g2000hsy.googlegroups. com>, valgog
> > <val...@gmail.com> wrote:=20
> >> Are you sure you understood what was the question?

>
> >> Is the TBL1.CATEGORY = TBL2.CATEGORY the same as TBL1.CATEGORY &
> >> TBL2.CATEGORY > 0?

> > Yes, given that he stipulated that one and only one bit would be set.

>
> Really? In that case, isn't this bit-field just a bad implementation of
> an enum-style field?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majord...@postgresql.org so that your
> message can get through to the mailing list cleanly



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:16 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com