Unix Technical Forum

about multiple fact tables and star schema

This is a discussion on about multiple fact tables and star schema within the DB2 forums, part of the Database Server Software category; --> HI, I had a special senario here.. I need to build a database which only around 50-70GB finally. But ...


Go Back   Unix Technical Forum > Database Server Software > DB2

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-27-2008, 08:57 AM
jane
 
Posts: n/a
Default about multiple fact tables and star schema

HI,

I had a special senario here..

I need to build a database which only around 50-70GB finally.

But one of table would be 40GB finally. ( grow every year until to the
retension period)

rest of the tables are very small . (MB size level)

I'm thinking put the database on one partition, but had some query
performance concern on
that big table.

I'm going to break down the table by the time. But had one concern. It
is data warehouse environoment, Currently, this big table is a fact
table. it is a star-schema model. If I build a view based on the break
down tables, is that any impact on the optimizer to use star schema
model to choose access plan?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-27-2008, 08:57 AM
Serge Rielau
 
Posts: n/a
Default Re: about multiple fact tables and star schema

Jane,

Which version of DB2, which platform?
In DB2 9 you can easily use range partitioning.
In both DB2 V8 and DB2 multi dimensional clustering may apply.

UNION ALL view would be my last choice.
Even 4KB Pages give you up to 64GB on DB2 V8.
How much headroom do you need?

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-27-2008, 08:57 AM
jane
 
Posts: n/a
Default Re: about multiple fact tables and star schema


Serge Rielau wrote:
> Jane,
>
> Which version of DB2, which platform?
> In DB2 9 you can easily use range partitioning.
> In both DB2 V8 and DB2 multi dimensional clustering may apply.
>
> UNION ALL view would be my last choice.
> Even 4KB Pages give you up to 64GB on DB2 V8.
> How much headroom do you need?
>
> Cheers
> Serge
> --
> Serge Rielau
> DB2 Solutions Development
> IBM Toronto Lab
>
> IOD Conference
> http://www.ibm.com/software/data/ond...ness/conf2006/


Thanks, Serge.

Probably I did not explain my question clearly.

My environment is UDB V8 on AIX.

My concern is not table size limit. Because now I know the Max size
for the big table is 40GB. It is OK on single partition, even for 4K
page.

My concern is on query performance. I know if using multi-partition, it
must be resolved.
but the other tables are all very small, and the total DB size is not
big (50GB), based on the DB size, I want to use single partition
database. Only because of this big table to use multi-partiton , seems
a little bit waste..

That's why I want to break down the big table to relatively smaller
ones. Use union all view.

One of my colleague remind me this would change optimizer to not use
star schema to get good access plan..
(because currently , the big table is fact table, it is in star schema
model)
I'm not sure if this is the case..

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-27-2008, 08:57 AM
Serge Rielau
 
Posts: n/a
Default Re: about multiple fact tables and star schema

Make your fact table a multi-dimensional-cluster.
That's ll do. I agree that DPF would be overkill.
If you don't have a low cardinality column that offers itself up for
MDC, just add another column e.g.
yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
Then cluster on the yearmonth column.
If you have 3 years worth of data this would partition the table into
3 * 12 => 36 slices.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-27-2008, 08:57 AM
jane
 
Posts: n/a
Default Re: about multiple fact tables and star schema


Serge Rielau wrote:
> Make your fact table a multi-dimensional-cluster.
> That's ll do. I agree that DPF would be overkill.
> If you don't have a low cardinality column that offers itself up for
> MDC, just add another column e.g.
> yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
> Then cluster on the yearmonth column.
> If you have 3 years worth of data this would partition the table into
> 3 * 12 => 36 slices.
>
> Cheers
> Serge
>
> --
> Serge Rielau
> DB2 Solutions Development
> IBM Toronto Lab
>
> IOD Conference
> http://www.ibm.com/software/data/ond...ness/conf2006/


Serge

Thanks a lot!

I did not realize I could use MDC...

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-27-2008, 08:58 AM
sethwai@yahoo.com
 
Posts: n/a
Default Re: about multiple fact tables and star schema

Serge,

Your response brings up a question. How will the advantages of MDC
tables be leveraged when the clustering column is not related to the
actual data relationships and therefore will not be referenced in any
predicates?

Thanks.

Lew

Serge Rielau wrote:
> Make your fact table a multi-dimensional-cluster.
> That's ll do. I agree that DPF would be overkill.
> If you don't have a low cardinality column that offers itself up for
> MDC, just add another column e.g.
> yearmonth GENERATED ALWAYS AS (INTEGER(date)/100)
> Then cluster on the yearmonth column.
> If you have 3 years worth of data this would partition the table into
> 3 * 12 => 36 slices.
>
> Cheers
> Serge
>
> --
> Serge Rielau
> DB2 Solutions Development
> IBM Toronto Lab
>
> IOD Conference
> http://www.ibm.com/software/data/ond...ness/conf2006/


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 02-27-2008, 08:58 AM
Serge Rielau
 
Posts: n/a
Default Re: about multiple fact tables and star schema

sethwai@yahoo.com wrote:
> Your response brings up a question. How will the advantages of MDC
> tables be leveraged when the clustering column is not related to the
> actual data relationships and therefore will not be referenced in any
> predicates?

It won't. That's the same for UNION ALL, or range clustering.
If there is no way for the compiler or the run time engine to eliminate
ranges then its' useless.
If you are looking at parallelism note that SMP doesn't operate on a per
range level. Instead it will start reading all the ranges in all the
threads and sub divide within the range.
This is different from DPF where each node will read it's database
partition exclusively.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:58 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com