1 Vote Vote

Poorly performing Mysql subquery -- can I turn it into a Join?

Posted by topdog 569 days ago Questions| emailaddress null product All

I have a subquery problem that is causing poor performance... I was thinking that the subquery could be re-written using a join, but I'm having a hard time wrapping my head around it.

The gist of the query is this: For a given combination of EmailAddress and Product, I need to get a list of the IDs that are NOT the latest.... these orders are going to be marked as 'obsolete' in the table which would leave only that latest order for a a given combination of EmailAddress and Product... (does that make sense?)

Table Definition

CREATE TABLE  `sandbox`.`OrderHistoryTable` (
 `id` INT( 11 ) NOT NULL AUTO_INCREMENT ,
 `EmailAddress` VARCHAR( 100 ) NOT NULL ,
 `Product` VARCHAR( 100 ) NOT NULL ,
 `OrderDate` DATE NOT NULL ,
 `rowlastupdated` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ,
PRIMARY KEY (  `id` ) ,
KEY  `EmailAddress` (  `EmailAddress` ) ,
KEY  `Product` (  `Product` ) ,
KEY  `OrderDate` (  `OrderDate` )
) ENGINE = MYISAM DEFAULT CHARSET = latin1;

Query

SELECT id
FROM
OrderHistoryTable AS EMP1
WHERE
OrderDate not in 
   (
   Select max(OrderDate)
   FROM OrderHistoryTable AS EMP2
   WHERE 
       EMP1.EmailAddress =  EMP2.EmailAddress
   AND EMP1.Product IN ('ProductA','ProductB','ProductC','ProductD')
   AND EMP2.Product IN ('ProductA','ProductB','ProductC','ProductD')
   )

Explanation of duplicate 'IN' statements

13   bob@aol.com  ProductA  2010-10-01
15   bob@aol.com  ProductB  2010-20-02
46   bob@aol.com  ProductD  2010-20-03
57   bob@aol.com  ProductC  2010-20-04
158  bob@aol.com  ProductE  2010-20-05
206  bob@aol.com  ProductB  2010-20-06
501  bob@aol.com  ProductZ  2010-20-07

The results of my query should be | 13 | | 15 | | 46 | | 57 |

This is because, in the orders listed, those 4 have been 'superceded' by a newer order for a product in the same category. This 'category' contains prodcts A, B, C & D.

Order ids 158 and 501 show no other orders in their respective categories based on the query.

Originally asked by: BrianAdkins on Stack Overflow

Discuss Bury


Who Voted for this Question