Re: Need help with query Hi John and Steve,
Thank you very much for your help and the quick responses! I really
appreciate them.
Steve, thanks for explaining and commenting your query. I get a much
better understanding now.
John, I think in this last query, P2.EffectiveDate <=
CURRENT_TIMESTAMP is missing from the OUTER JOIN. But as you
mentioned, it works fine when I just take the previous query and move
out the P1.EffectiveDate <= CURRENT_TIMESTAMP. I didn't realize there
was a problem before because the live data as of yesterday has no
future pricing. Sorry, I should have tested it better.
Again, thanks to you both. The system works great and fast now!
Sincerely,
netpurpose
"John Gilson" <jag@acm.org> wrote in message news:<Q5m4b.60119$ev.13564013@twister.nyc.rr.com>. ..
> "Steve Kass" <skass@drew.edu> wrote in message
> news:u5f4b.9688$Om1.4472@newsread2.news.atl.earthl ink.net...
> > When I ran John's query on your sample data, It
> > returned the price of $25.04 for Product One, a price
> > that was not effective on the day I ran the query.
> >
> > Make sure your sample data has some examples where
> > the lowest price is a future price that has been
> > recorded but is not yet effective.
>
> Steve, you're absolutely right and stems from the (mis)placement
> of the selection criteria for the outer join. The constraint
> P1.EffectiveDate <= CURRENT_TIMESTAMP
> should be applied *after* the outer join condition, that is, in the
> WHERE clause. So this should do the right thing:
>
> SELECT ProductName, MIN(Price)
> FROM (SELECT P1.Distributor,
> SUBSTRING(P1.ProductName, 1,
> CHARINDEX('(', P1.ProductName) - 2) AS
> ProductName,
> P1.EffectiveDate AS LatestDate,
> P1.price
> FROM ProductList AS P1
> LEFT OUTER JOIN
> ProductList AS P2
> ON P1.Distributor = P2.Distributor AND
> P1.ProductName = P2.ProductName AND
> P2.EffectiveDate > P1.EffectiveDate
> WHERE P2.EffectiveDate IS NULL AND
> P1.EffectiveDate <= CURRENT_TIMESTAMP)
> AS LatestPrices
> GROUP BY ProductName
>
> Regards,
> jag
>
> > As far as efficiency goes, all I did was look at query
> > plans. What I found was that John's query and mine
> > had virtually identical plans.
> >
> > What I did was this:
> >
> > Removed the part of product name in parentheses. If that data
> > is important, it should be a separate column. I removed it,
> > since to make this efficient there needed (I think) to be an
> > index with the ProductName value you want in the output, without
> > the extra comment, as the first column.
> >
> > Removed all cases where the same (ProductName, Distributor,
> > EffectiveDate) appeared in more than one row (these probably
> > appeared when I removed the extra part of the product name).
> >
> > Declared (ProductName, Distributor, EffectiveDate) to be the
> > primary key. Maybe I missed something and this does not work
> > as a key, but it seemed like a good choice for an index to
> > make the query fast. If it's not a key, add price as
> > a fourth column.
> >
> > I realized now that there was extra nesting in my query.
> > I don't think it should have slowed it down tenfold, but the
> > outer two selects can be combined (if the data has no duplicates
> > in my key columns).
> >
> >
> > -- Select the lowest price
> > SELECT ProductName, MIN(Price)
> >
> > -- from the table
> > FROM ProductList P1
> >
> > -- that is not a future price
> > WHERE P1.EffectiveDate <= CURRENT_TIMESTAMP
> >
> > -- and that has not been superseded
> > AND NOT EXISTS (
> > -- by another price
> > SELECT * FROM ProductList P2
> > -- from the same distributor
> > WHERE P2.Distributor = P1.Distributor
> > -- for the same product
> > AND P2.ProductName = P1.ProductName
> > -- which other price has gone into effect
> > AND P2.EffectiveDate <= CURRENT_TIMESTAMP
> > -- more recently
> > AND P2.EffectiveDate > P1.EffectiveDate
> > )
> >
> > -- for each product
> > GROUP BY ProductName
> >
> > Steve
> >
> >
> > netpurpose wrote:
> > > Hi Steve,
> > >
> > > Thanks for your response. When I copied John's query, it works fine
> > > and gives me the correct results. I did run both your query and his
> > > against 330K of records, and it seems that his query runs much faster
> > > (more than 10 times faster). I ran your query against clean products
> > > names without the codes.
> > >
> > > I am too novice to even know why the big difference. Both query
> > > yields the same results. Can you tell why his statement is much more
> > > efficient? LEFT OUTER JOIN vs EXISTS? I do want to learn from this.
> > >
> > > Also, thanks for showing me the command to import the excel file to
> > > db. Good to know 
> > >
> > > Thanks,
> > > Netpurpose
> > >
> > > Steve Kass <skass@drew.edu> wrote in message
> news:<xWV3b.3268$Om1.1287@newsread2.news.atl.earth link.net>...
> > >
> > >>Did I miscopy John's query? It gives me the future price of $25.04
> > >>for Product One, but is easy to fix by adding
> > >>
> > >> AND P1.EffectiveDate <= CURRENT_TIMESTAMP
> > >>
> > >>to the where clause of the derived table LatestPrices
> > >>
> > >>John's initial idea of NOT EXISTS still works, too, in part. (Note
> > >>that I blew off the ProductName parsing by taking a hatchet to that
> > >>column with an update...
> > >>
> > >>
> > >>-- Be lazy
> > >>UPDATE ProductList SET
> > >> ProductName =
> > >> SUBSTRING(ProductName, 1,CHARINDEX('(', ProductName) - 2)
> > >>GO
> > >>
> > >>-- For each product, select the lowest non-future
> > >>-- price that has not been superseded.
> > >>SELECT ProductName, MIN(Price) as Price -- get the lowest price
> > >>FROM (
> > >> SELECT ProductName, Price -- from all non-future prices
> > >> FROM ProductList P1
> > >> WHERE P1.EffectiveDate <= CURRENT_TIMESTAMP
> > >> AND NOT EXISTS ( -- that have not been superseded
> > >> SELECT * FROM ProductList P2
> > >> WHERE P2.Distributor = P1.Distributor
> > >> AND P2.ProductName = P1.ProductName
> > >> AND P2.EffectiveDate <= CURRENT_TIMESTAMP
> > >> AND P2.EffectiveDate > P1.EffectiveDate
> > >> )
> > >>) T
> > >>GROUP BY ProductName -- for each product
> > >>go
> > >>
> > >>By the way, the delimited data was no trouble. I pasted it into
> > >>Excel, did Text To Columns, named the data region, and inserted it with:
> > >>
> > >>insert into [ProductList]
> > >>select Manufacturer, ProductName, Price, EffectiveDate
> > >>from OpenRowset(
> > >> 'Microsoft.Jet.OLEDB.4.0',
> > >> 'Excel 8.0;Database=e:\excel\ProductList.xls',
> > >> 'select * from ProductList')
> > >>
> > >>Be sure to close the Excel file before inserting.
> > >>
> > >>-- Steve Kass
> > >>-- Drew University
> > >>-- Ref: 2498EA13-1756-4C81-9307-25AFD5BB451D
> > >>
> > >>
> > >>
> > >>netpurpose wrote:
> > >>
> > >>>Hi Jag,
> > >>>
> > >>>Thanks a lot!! It works like a charm! Your query saves me a lot of headache.
> > >>>
> > >>>I have struggled with this for quite a while now and I am glad I find help here.
> > >>>
> > >>>Again, thank you very much! I really appreciate your help in such a short time!
> > >>>
> > >>>Regards,
> > >>>Netpurpose
> > >>>
> > >>>
> > >>>"John Gilson" <jag@acm.org> wrote in message
> news:<amO3b.52163$ev.12155054@twister.nyc.rr.com>. ..
> > >>>
> > >>>
> > >>>>"netpurpose" <netpurpose@hotmail.com> wrote in message
> > >>>>news:31c57b38.0308291024.6bd2a29c@posting.goog le.com...
> > >>>>
> > >>>>
> > >>>>>Hi Jag,
> > >>>>>
> > >>>>>Thank you for your response. I tried the query, but it gives me a
> > >>>>>different pricing result:
> > >>>>>
> > >>>>>ProductName Price
> > >>>>>---------------------------------------- ---------------------
> > >>>>>Product One (MXT234) 28.1500
> > >>>>>Product Two (DGT6789) 44.3500
> > >>>>>Product Three (QAT556) 40.1500
> > >>>>>
> > >>>>>The correct result should be:
> > >>>>>
> > >>>>>ProductName Price
> > >>>>>---------------------------------------- ---------------------
> > >>>>>Product One 33.5000
> > >>>>>Product Two 44.7500
> > >>>>>Product Three 40.4500
> > >>>>>
> > >>>>>The reason is because for Product One, Manufacturer "BBB" has a price
> > >>>>>of 28.15 on 2003-07-30, this price is no longer valid because "BBB"
> > >>>>>has more recent price of 33.5 starting on 2003-08-27. Same case with
> > >>>>>Product Two and Three (manufacturer "JJJ" overrides the 40.15 on
> > >>>>>2003-08-01 with 40.45 on 2003-08-04).
> > >>>>>
> > >>>>>I have to take the most recent prices from each manufacturer for a
> > >>>>>particular product (price at the latest date as of today), and then
> > >>>>>take the minimum price from the result.
> > >>>>>
> > >>>>>Again, thank you for your help. Sorry I might not explain it well in
> > >>>>>the beginning. Any further help is appreciated. Thank you very much
> > >>>>>in advance!
> > >>>>>
> > >>>>>Regards,
> > >>>>>Netpurpose
> > >>>>
> > >>>>I was a little too quick on the draw and misread your requirements.
> > >>>>
> > >>>>SELECT ProductName, MIN(Price)
>
> > >>>>FROM (SELECT P1.Distributor,
>
> > >>>> SUBSTRING(P1.ProductName, 1,
> > >>>> CHARINDEX('(', P1.ProductName) - 2) AS
> > >>>> ProductName,
> > >>>> P1.EffectiveDate AS LatestDate,
> > >>>> P1.price
> > >>>> FROM ProductList AS P1
> > >>>> LEFT OUTER JOIN
> > >>>> ProductList AS P2
> > >>>> ON P1.Distributor = P2.Distributor AND
> > >>>> P1.ProductName = P2.ProductName AND
> > >>>> P1.EffectiveDate <= CURRENT_TIMESTAMP AND
> > >>>> P2.EffectiveDate <= CURRENT_TIMESTAMP AND
> > >>>> P2.EffectiveDate > P1.EffectiveDate
> > >>>> WHERE P2.EffectiveDate IS NULL) AS LatestPrices
> > >>>>GROUP BY ProductName
> > >>>>
> > >>>>Regards,
> > >>>>jag
> > >>>>
> > >>>>
> > >>>>
> > >>>>>"John Gilson" <jag@acm.org> wrote in message
> news:<_uI3b.51441$ev.11935402@twister.nyc.rr.com>. ..
> > >>>>>
> > >>>>>
> > >>>>>>"netpurpose" <netpurpose@hotmail.com> wrote in message
> > >>>>>>news:31c57b38.0308290040.6a560bd6@posting.go ogle.com...
> > >>>>>>
> > >>>>>>
> > >>>>>>>I need to extract data from this table to find the lowest prices of
> > >>>>>>>each product as of today. The product will be listed/grouped by the
> > >>>>>>>name only, discarding the product code - I use SUBSTRING(ProductName,
> > >>>>>>>1, CHARINDEX('(', ProductName)-2).
> > >>>>>>>
> > >>>>>>>I can get this result, but I had to use several views (totally
> > >>>>>>>inefficient). I think this can be done in one efficient/fast query,
> > >>>>>>>but I can't think of one.
> > >>>>>>>
> > >>>>>>>In the case that one query is not possible, is there other efficient
> > >>>>>>>way to get the results? There are about 300K records in the table.
> > >>>>>>>
> > >>>>>>>Any help is appreciated! Thanks a lot in advance!
> > >>>>>>>
> > >>>>>>>Thanks,
> > >>>>>>>Netpurpose
> > >>>>>>
> > >>>>>>SELECT P1.ProductName, P1.Price
>
> > >>>>>>FROM ProductList AS P1
>
> > >>>>>>WHERE EffectiveDate <= CURRENT_TIMESTAMP AND
> > >>>>>> NOT EXISTS
> > >>>>>> (SELECT *
> > >>>>>> FROM ProductList AS P2
> > >>>>>> WHERE SUBSTRING(P1.ProductName, 1,
> > >>>>>> CHARINDEX('(', P1.ProductName) - 2) =
> > >>>>>> SUBSTRING(P2.ProductName, 1,
> > >>>>>> CHARINDEX('(', P2.ProductName) - 2)
> AND
> > >>>>>> P2.EffectiveDate <= CURRENT_TIMESTAMP AND
> > >>>>>> P2.Price < P1.Price)
> > >>>>>>
> > >>>>>>Regards,
> > >>>>>>jag
> > >>>>>>
> > >>>>>>
> > |