View Single Post

   
  #6 (permalink)  
Old 02-28-2008, 06:41 PM
Steve Kass
 
Posts: n/a
Default Re: Need help with query

When I ran John's query on your sample data, It
returned the price of $25.04 for Product One, a price
that was not effective on the day I ran the query.

Make sure your sample data has some examples where
the lowest price is a future price that has been
recorded but is not yet effective.

As far as efficiency goes, all I did was look at query
plans. What I found was that John's query and mine
had virtually identical plans.

What I did was this:

Removed the part of product name in parentheses. If that data
is important, it should be a separate column. I removed it,
since to make this efficient there needed (I think) to be an
index with the ProductName value you want in the output, without
the extra comment, as the first column.

Removed all cases where the same (ProductName, Distributor,
EffectiveDate) appeared in more than one row (these probably
appeared when I removed the extra part of the product name).

Declared (ProductName, Distributor, EffectiveDate) to be the
primary key. Maybe I missed something and this does not work
as a key, but it seemed like a good choice for an index to
make the query fast. If it's not a key, add price as
a fourth column.

I realized now that there was extra nesting in my query.
I don't think it should have slowed it down tenfold, but the
outer two selects can be combined (if the data has no duplicates
in my key columns).


-- Select the lowest price
SELECT ProductName, MIN(Price)

-- from the table
FROM ProductList P1

-- that is not a future price
WHERE P1.EffectiveDate <= CURRENT_TIMESTAMP

-- and that has not been superseded
AND NOT EXISTS (
-- by another price
SELECT * FROM ProductList P2
-- from the same distributor
WHERE P2.Distributor = P1.Distributor
-- for the same product
AND P2.ProductName = P1.ProductName
-- which other price has gone into effect
AND P2.EffectiveDate <= CURRENT_TIMESTAMP
-- more recently
AND P2.EffectiveDate > P1.EffectiveDate
)

-- for each product
GROUP BY ProductName

Steve


netpurpose wrote:
> Hi Steve,
>
> Thanks for your response. When I copied John's query, it works fine
> and gives me the correct results. I did run both your query and his
> against 330K of records, and it seems that his query runs much faster
> (more than 10 times faster). I ran your query against clean products
> names without the codes.
>
> I am too novice to even know why the big difference. Both query
> yields the same results. Can you tell why his statement is much more
> efficient? LEFT OUTER JOIN vs EXISTS? I do want to learn from this.
>
> Also, thanks for showing me the command to import the excel file to
> db. Good to know
>
> Thanks,
> Netpurpose
>
> Steve Kass <skass@drew.edu> wrote in message news:<xWV3b.3268$Om1.1287@newsread2.news.atl.earth link.net>...
>
>>Did I miscopy John's query? It gives me the future price of $25.04
>>for Product One, but is easy to fix by adding
>>
>> AND P1.EffectiveDate <= CURRENT_TIMESTAMP
>>
>>to the where clause of the derived table LatestPrices
>>
>>John's initial idea of NOT EXISTS still works, too, in part. (Note
>>that I blew off the ProductName parsing by taking a hatchet to that
>>column with an update...
>>
>>
>>-- Be lazy
>>UPDATE ProductList SET
>> ProductName =
>> SUBSTRING(ProductName, 1,CHARINDEX('(', ProductName) - 2)
>>GO
>>
>>-- For each product, select the lowest non-future
>>-- price that has not been superseded.
>>SELECT ProductName, MIN(Price) as Price -- get the lowest price
>>FROM (
>> SELECT ProductName, Price -- from all non-future prices
>> FROM ProductList P1
>> WHERE P1.EffectiveDate <= CURRENT_TIMESTAMP
>> AND NOT EXISTS ( -- that have not been superseded
>> SELECT * FROM ProductList P2
>> WHERE P2.Distributor = P1.Distributor
>> AND P2.ProductName = P1.ProductName
>> AND P2.EffectiveDate <= CURRENT_TIMESTAMP
>> AND P2.EffectiveDate > P1.EffectiveDate
>> )
>>) T
>>GROUP BY ProductName -- for each product
>>go
>>
>>By the way, the delimited data was no trouble. I pasted it into
>>Excel, did Text To Columns, named the data region, and inserted it with:
>>
>>insert into [ProductList]
>>select Manufacturer, ProductName, Price, EffectiveDate
>>from OpenRowset(
>> 'Microsoft.Jet.OLEDB.4.0',
>> 'Excel 8.0;Database=e:\excel\ProductList.xls',
>> 'select * from ProductList')
>>
>>Be sure to close the Excel file before inserting.
>>
>>-- Steve Kass
>>-- Drew University
>>-- Ref: 2498EA13-1756-4C81-9307-25AFD5BB451D
>>
>>
>>
>>netpurpose wrote:
>>
>>>Hi Jag,
>>>
>>>Thanks a lot!! It works like a charm! Your query saves me a lot of headache.
>>>
>>>I have struggled with this for quite a while now and I am glad I find help here.
>>>
>>>Again, thank you very much! I really appreciate your help in such a short time!
>>>
>>>Regards,
>>>Netpurpose
>>>
>>>
>>>"John Gilson" <jag@acm.org> wrote in message news:<amO3b.52163$ev.12155054@twister.nyc.rr.com>. ..
>>>
>>>
>>>>"netpurpose" <netpurpose@hotmail.com> wrote in message
>>>>news:31c57b38.0308291024.6bd2a29c@posting.goog le.com...
>>>>
>>>>
>>>>>Hi Jag,
>>>>>
>>>>>Thank you for your response. I tried the query, but it gives me a
>>>>>different pricing result:
>>>>>
>>>>>ProductName Price
>>>>>---------------------------------------- ---------------------
>>>>>Product One (MXT234) 28.1500
>>>>>Product Two (DGT6789) 44.3500
>>>>>Product Three (QAT556) 40.1500
>>>>>
>>>>>The correct result should be:
>>>>>
>>>>>ProductName Price
>>>>>---------------------------------------- ---------------------
>>>>>Product One 33.5000
>>>>>Product Two 44.7500
>>>>>Product Three 40.4500
>>>>>
>>>>>The reason is because for Product One, Manufacturer "BBB" has a price
>>>>>of 28.15 on 2003-07-30, this price is no longer valid because "BBB"
>>>>>has more recent price of 33.5 starting on 2003-08-27. Same case with
>>>>>Product Two and Three (manufacturer "JJJ" overrides the 40.15 on
>>>>>2003-08-01 with 40.45 on 2003-08-04).
>>>>>
>>>>>I have to take the most recent prices from each manufacturer for a
>>>>>particular product (price at the latest date as of today), and then
>>>>>take the minimum price from the result.
>>>>>
>>>>>Again, thank you for your help. Sorry I might not explain it well in
>>>>>the beginning. Any further help is appreciated. Thank you very much
>>>>>in advance!
>>>>>
>>>>>Regards,
>>>>>Netpurpose
>>>>
>>>>I was a little too quick on the draw and misread your requirements.
>>>>
>>>>SELECT ProductName, MIN(Price)
>>>
>>>>FROM (SELECT P1.Distributor,
>>>
>>>> SUBSTRING(P1.ProductName, 1,
>>>> CHARINDEX('(', P1.ProductName) - 2) AS
>>>> ProductName,
>>>> P1.EffectiveDate AS LatestDate,
>>>> P1.price
>>>> FROM ProductList AS P1
>>>> LEFT OUTER JOIN
>>>> ProductList AS P2
>>>> ON P1.Distributor = P2.Distributor AND
>>>> P1.ProductName = P2.ProductName AND
>>>> P1.EffectiveDate <= CURRENT_TIMESTAMP AND
>>>> P2.EffectiveDate <= CURRENT_TIMESTAMP AND
>>>> P2.EffectiveDate > P1.EffectiveDate
>>>> WHERE P2.EffectiveDate IS NULL) AS LatestPrices
>>>>GROUP BY ProductName
>>>>
>>>>Regards,
>>>>jag
>>>>
>>>>
>>>>
>>>>>"John Gilson" <jag@acm.org> wrote in message news:<_uI3b.51441$ev.11935402@twister.nyc.rr.com>. ..
>>>>>
>>>>>
>>>>>>"netpurpose" <netpurpose@hotmail.com> wrote in message
>>>>>>news:31c57b38.0308290040.6a560bd6@posting.go ogle.com...
>>>>>>
>>>>>>
>>>>>>>I need to extract data from this table to find the lowest prices of
>>>>>>>each product as of today. The product will be listed/grouped by the
>>>>>>>name only, discarding the product code - I use SUBSTRING(ProductName,
>>>>>>>1, CHARINDEX('(', ProductName)-2).
>>>>>>>
>>>>>>>I can get this result, but I had to use several views (totally
>>>>>>>inefficient). I think this can be done in one efficient/fast query,
>>>>>>>but I can't think of one.
>>>>>>>
>>>>>>>In the case that one query is not possible, is there other efficient
>>>>>>>way to get the results? There are about 300K records in the table.
>>>>>>>
>>>>>>>Any help is appreciated! Thanks a lot in advance!
>>>>>>>
>>>>>>>Thanks,
>>>>>>>Netpurpose
>>>>>>
>>>>>>SELECT P1.ProductName, P1.Price
>>>>>
>>>>>>FROM ProductList AS P1
>>>>>
>>>>>>WHERE EffectiveDate <= CURRENT_TIMESTAMP AND
>>>>>> NOT EXISTS
>>>>>> (SELECT *
>>>>>> FROM ProductList AS P2
>>>>>> WHERE SUBSTRING(P1.ProductName, 1,
>>>>>> CHARINDEX('(', P1.ProductName) - 2) =
>>>>>> SUBSTRING(P2.ProductName, 1,
>>>>>> CHARINDEX('(', P2.ProductName) - 2) AND
>>>>>> P2.EffectiveDate <= CURRENT_TIMESTAMP AND
>>>>>> P2.Price < P1.Price)
>>>>>>
>>>>>>Regards,
>>>>>>jag
>>>>>>
>>>>>>


Reply With Quote