Problems with SELF Join

  • I am trying to do a self join that will take the value of the col called TO_DEPT of the PREVIOUS record and put it in the FROM_DEPT.   I have tried may different queries with frustrating results. Can some on help me out?  I need to only update the records from the same O_ID.

    What it looks like now

    31175, '2003/11/24 14:17','',''                

    31175, '2003/11/24 14:36','',''                

    31175, '2003/11/25 13:08','',''                

    31175, '2003/11/25 13:08','',''                

    31175, '2003/11/25 13:09','','Pending'    

    31175, '2003/11/30 23:36','','Quality'

    31175, '2003/12/01 15:08','','Technical'       

    31175, '2003/12/02 10:45','','OM'

    31175, '2003/12/03 15:22','',''                

    31175, '2003/12/03 15:23','',''                

    31175, '2003/12/03 15:23','',''                

    31190, '2003/11/25 13:08','',''                

    31190, '2003/11/25 13:09','','Pending'    

    31190, '2003/11/30 23:36','','Quality'

    31190, '2003/12/01 15:08','','Technical'       

    31190, '2003/12/02 10:45','','OM'

    What I want it to look like

    31175, '2003/11/24 14:17','',''                  

    31175, '2003/11/24 14:36','',''                  

    31175, '2003/11/25 13:08','',''                  

    31175, '2003/11/25 13:08','',''                  

    31175, '2003/11/25 13:09','','Pending'           

    31175, '2003/11/30 23:36','Pending','Quality' 

    31175, '2003/12/01 15:08','Qualitye','Technical'         

    31175, '2003/12/02 10:45','Technical','OM'  

    31175, '2003/12/03 15:22','OM',''                

    31175, '2003/12/03 15:23','',''                  

    31175, '2003/12/03 15:23','',''

    31190, '2003/11/25 13:09','','Pending'           

    31190, '2003/11/30 23:36','Pending','Quality' 

    31190, '2003/12/01 15:08','Quality','Technical'         

    31190, '2003/12/02 10:45','Technical','OM'  

    31190, '2003/12/03 15:22','OM',''                

    -----------------------------------------------------------------------

    CREATE TABLE [Dummy] (

     [O_ID] [decimal](9, 0) NOT NULL ,

     [O_TMST] [datetime] NOT NULL ,

     [From_Dept] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     [To_Dept] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

     CONSTRAINT [PK_FSA_ORDER_CHNG_HIST] PRIMARY KEY  CLUSTERED

     (

      [O_ID],

      [O_TMST]

    &nbsp WITH  FILLFACTOR = 90  ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/24 14:17','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/24 14:36','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/25 13:08','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/25 13:08','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/25 13:09','','Pending')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/11/30 23:36','','Quality')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/12/01 15:08','','Technical')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/12/02 10:45','','OM')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/12/03 15:22','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31175, '2003/12/03 15:23','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31190, '2003/12/03 15:23','','')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31190, '2003/11/25 13:09','','Pending')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31190, '2003/11/30 23:36','','Quality')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31190, '2003/12/01 15:08','','Technical')

    INSERT dummy (o_id, o_tmst, from_dept, to_dept) values (31190, '2003/12/02 10:45','','OM')

                                                     

  • SELECT a.o_id

     , a.o_tmst

     , a.from_dept

     , a.to_dept

     , b.to_dept

     from dummy a

     inner join dummy b

      on a.o_id = b.o_id

     where b.o_tmst = ( select max( o_tmst)

           from dummy c

           where c.o_tmst < a.o_tmst

         &nbsp

  • Steve,

    That did not quite work.  All the data got pushed down one record.

    the records with a timestamp of 13:09 should have data, but they did not, instead the data that should have been on the 13:09 record showed up on the 23:36 record.

    You have given me something to work with and that gets me a little closer.  Thanks.

  • try:

    SELECT a.o_id

     , a.o_tmst

     , a.from_dept

     , isnull(b.to_dept, '')

     , a.to_dept

     from dummy a

     left join dummy b

      on a.o_id = b.o_id and b.o_tmst = ( select max( o_tmst)

           from dummy c

           where c.o_tmst < a.o_tmst and c.o_id = a.o_id)

    ORDER BY a.o_id, a.o_tmst

    Regards

    Peter

  • Sometimes subselecting can do the trick:

    select

     outd.O_ID,

     outd.O_TMST,

     isnull(

     ( select ind.To_Dept from dummy ind where ind.O_ID = outd.O_ID

     and ind.O_TMST = (  select max(inind.O_TMST) from dummy inind

        where inind.O_TMST < outd.O_TMST ) ), outd.From_Dept ),

     outd.To_Dept

    from

     dummy outd

    By the way, I think that isnull(b.to_dept, '') should be isnull(b.to_dept, a.from_dept) in Peter's left join approach..

    /rocmoose


    You must unlearn what You have learnt

  • Rockmoose,  The subquery returns more than one record which causes it to fail.

    I have spent many hours on this already and the the self join is the only way to go so far.

    Peter your solution is working about 90%, I am still trying to figure out under what conditions it fails.   I will post again when I do.

    Thanks everyone for there help!   Keep suggestions coming all have been helpful and got me closer to the solution.  Thanks again.  

  • Yeah, see this.

    Try:

    select

     outd.O_ID,

     outd.O_TMST,

     isnull(

     ( select ind.To_Dept from dummy ind where ind.O_ID = outd.O_ID

     and ind.O_TMST = (  select max(inind.O_TMST) from dummy inind

        where inind.O_TMST < outd.O_TMST and inind.O_ID = outd.O_ID&nbsp ), outd.From_Dept ),

     outd.To_Dept

    from

     dummy outd


    You must unlearn what You have learnt

  • I know my query will fail of there are any records with duplicate o_id and o_tmst, as per your sample data, but the primary key should stop the duplicates being added. If you want to have records with duplicate o_id and o_tmst values you will have to add an identity field to separately identify the records. In which case use :

      on a.o_id = b.o_id and b.ID = ( select max(ID)

    if ID is the new field and the records are added to the table in datetime order. Instead of

      on a.o_id = b.o_id and b.o_tmst = ( select max( o_tmst)

    regards

    Peter

  • Should be no duplicate data. The sample data was corrupt.

    ( 31175, '2003/11/25 13:08' ) is duplicate in sample.

    CONSTRAINT [PK_FSA_ORDER_CHNG_HIST] PRIMARY KEY  CLUSTERED

     ( [O_ID], [O_TMST] )

    You do have this on your table, ghughes ?

    /rockmoose


    You must unlearn what You have learnt

  • There are no dups in the data, I just cut and pasted one record too many.   I have 300,000 records in this table.   Sorry about that.

    The table is indexed in O_id and O_TMST.

    Rockmoose, your subquery still returns more than one value and causes it to fail. 

    I appreciate this help alot. 

  • If you could add an identity column, this query would be simple.

      select *

      from table as t1 join table as t2 on t1.id = t2.id + 1

    dw

  • Or if you can not add a identity column to the base table, insert the data into a temp table with an identity column and then use the query from above.

    dw

  • Ok, Identity columns are only necessary if you don't have a primary key or unique index.

    If you have that there is no point in adding an identity column.

    This got me a bit perplexed , so try this sql:

    -- yet another sql...

    select

     a.O_ID,

     a.O_TMST,

     isnull( b.To_Dept, a.From_Dept ),

     a.To_Dept

    from

     dummy a

     join

     ( select a.O_ID, a.O_TMST, max(b.O_TMST) as my_prev_O_TMST

     from dummy a left outer join dummy b

      on a.O_ID = b.O_ID

      and a.O_TMST > b.O_TMST

     group by a.O_ID, a.O_TMST ) deedum

     on a.O_ID = deedum.O_ID and a.O_TMST = deedum.O_TMST

     left join dummy b

     on b.O_ID = deedum.O_ID and b.O_TMST = deedum.my_prev_O_TMST

    And also, please double check that O_ID and O_TMST is Unique!,

    Provided are 2 methods.

    -- double check dups...

    select O_ID, O_TMST from dummy group by O_ID, O_TMST having count(*) > 1

    -- assert duplicates

    if( select count(*) from dummy ) <>

    ( select count(*) from dummy a join dummy b on a.O_ID = b.O_ID and a.O_TMST = b.O_TMST )

    print 'duplicates'

    else

    print 'no duplicates'

    --/rockmoose


    You must unlearn what You have learnt

  • Assuming correct ordering of the rows !

    And also not taking into account that the data is grouped on O_ID, which must be included in the join criteria.

    /rockmoose


    You must unlearn what You have learnt

  • Try this:

     

    update a

    set a.from_dept = b.to_dept

    from dummy a join dummy b on

    a.o_id = b.o_id and b.o_tmst = (select max(o_tmst) from dummy c where c.o_id = a.o_id and c.o_tmst < a.o_tmst)

    Kathi

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply