SQL query help

  • id Name

    1 xxxx
    2 yyyy
    3 zzzz
    1 pppp
    3 dfgf
    2 dhfl

    I have this data in my SQL table and I am trying to write a query that returns the data like shown below.

    id name

    1 xxxx,pppp
    2 yyyy,dhfl
    3 zzzz,dfgf

    Can anyone help with this?

    Thanks.
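A minimal sketch of the transformation being asked for, using Python's built-in sqlite3 module as a stand-in (its GROUP_CONCAT aggregate does per-group string concatenation); the table and column names follow the post:

```python
import sqlite3

# In-memory table holding the sample data from the question.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (id INTEGER, Name TEXT)")
con.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "xxxx"), (2, "yyyy"), (3, "zzzz"),
     (1, "pppp"), (3, "dfgf"), (2, "dhfl")],
)

# One row per id, with the names collapsed into a comma-separated list.
rows = con.execute(
    "SELECT id, GROUP_CONCAT(Name, ',') FROM MyTable GROUP BY id ORDER BY id"
).fetchall()
for id_, names in rows:
    print(id_, names)
```

Note that GROUP_CONCAT makes no ordering promise within a group, so the names may come out in either order.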

  • The usual answer is 'do it on the client, SQL is no good at this sort of thing'.

    However, if you really need to do it in SQL anyway, there are a few different ways to achieve it.

    Search the forum for 'PIVOT' and you'll get quite a few suggestions.

    /Kenneth
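The 'do it on the client' approach described above can be sketched like this: pull the rows back as a plain result set and do the concatenation in application code (Python here purely for illustration; the data is the sample from the question):

```python
# Rows as they would come back from a plain "SELECT id, Name" query.
rows = [(1, "xxxx"), (2, "yyyy"), (3, "zzzz"),
        (1, "pppp"), (3, "dfgf"), (2, "dhfl")]

# Group the names by id on the client, preserving first-seen order per id.
grouped = {}
for id_, name in rows:
    grouped.setdefault(id_, []).append(name)

result = [(id_, ",".join(names)) for id_, names in sorted(grouped.items())]
print(result)
```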

  • For reference, here is a post I just made with the same basic question:

    http://qa.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=252864&post=true

  • Try this:

    CREATE FUNCTION dbo.MyFunction (@ID int)
    RETURNS varchar(2000)
    AS
    BEGIN
        DECLARE @rtrn varchar(2000)
        DECLARE @temp varchar(500)
        DECLARE @rowcount int

        SET @temp = ''
        SET @rowcount = 1
        SET @rtrn = ''

        -- Walk the names for this ID one at a time, in varbinary order,
        -- appending each one until no more rows are found.
        WHILE (@rowcount > 0)
        BEGIN
            SELECT TOP 1 @temp = [Name]
              FROM MyTable
             WHERE [id] = @ID
               AND CAST(CAST([Name] AS varbinary) AS int) >
                   CAST(CAST(@temp AS varbinary) AS int)
             ORDER BY [id], CAST(CAST([Name] AS varbinary) AS int)

            SET @rowcount = @@ROWCOUNT

            IF (@rowcount > 0)
            BEGIN
                -- Append with a comma separator, but only while the
                -- result stays under the 2000-character limit.
                SELECT @rtrn = @rtrn + CASE WHEN LEN(@rtrn) > 0
                           THEN ',' + @temp ELSE @temp END
                 WHERE LEN(@rtrn + CASE WHEN LEN(@rtrn) > 0
                           THEN ',' + @temp ELSE @temp END) < 2000

                SET @rowcount = @@ROWCOUNT
            END
        END

        RETURN @rtrn
    END

    SELECT DISTINCT id, dbo.MyFunction(id) AS [name]
      FROM MyTable
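The control flow of the function above, restated as a Python sketch to make the loop easier to follow: repeatedly take the "next" name for the id, append it comma-separated, and stop once the result would reach 2000 characters. (The T-SQL orders by a varbinary cast of Name; plain string ordering is used here, so the exact order may differ.)

```python
# Sample data from the question, standing in for MyTable (id, Name).
ROWS = [(1, "xxxx"), (2, "yyyy"), (3, "zzzz"),
        (1, "pppp"), (3, "dfgf"), (2, "dhfl")]

def my_function(id_, limit=2000):
    """Python restatement of dbo.MyFunction's WHILE loop."""
    out = ""
    for name in sorted(n for i, n in ROWS if i == id_):
        candidate = out + ("," + name if out else name)
        if len(candidate) >= limit:   # mirrors the LEN(...) < 2000 guard
            break
        out = candidate
    return out

print(my_function(1))
```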

  • I really wish people would stop saying that... sometimes SQL is the only client application you have.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

To add to my previous post, RayM's answer is absolutely correct.  Further, if you have to do such a thing on a hundred thousand rows to send as a file, the "client" code will be comparatively dog slow.  RayM's answer is simple and quick... doing such a thing in Java or VB or whatever would certainly be at least as complex, if not more so, and will always be slower.

    --Jeff Moden



  • Well, 'sometimes' doesn't make it the standard, so why stop saying how something 'should' be done..?

    The fact remains, SQL sucks at row-by-row operations, and hiding a loop in a UDF doesn't make that fact go away. The language just isn't designed for that type of operation. So, the 'best' is always to rip out the data as a set and do the 'stuff' elsewhere.

    The above is the starting point when no other details are known, right? (The poster didn't provide any details beyond asking 'how can I do x'.) Thus, the generic recommendation for this generic question is: 'it is best to do it on the client, not the server'. (not involving 2005 and CLR etc here )

    Enter real life.. Sure, sometimes SQL is the only 'client', but that doesn't change what should be said when the client is unknown. Relative to 'best' is 'good enough', which in the end is the deciding factor anyway, regardless of what's conceived to be 'best'. An example of 'good enough' is often to do exactly what Ray's UDF does.

    Still, from a SQL perspective, it's not 'optimal' in any way, but it may for each and everyone's purpose be 'good enough'. It depends. Rules are made to be broken; the thing is just that it's better to break them intentionally than to be unaware of doing it.

    So, I disagree with the idea that we should stop saying, as a first response to questions on 'how do I write SQL to do a pivot', that it's 'best' done on the 'client'.

    just my .02, though

    /Kenneth

  • Do you have precisely two occurrences of each Id in your data (as in your test example)? Then you could try the following:

    select Id, max(Name) + ',' + min(Name) from yourtable group by Id

    If this is not the case, please state what the max number of occurrences of each Id is (or is it unbounded?).
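The min/max trick above, checked against the sample data in a quick Python sketch (names are taken from the original question):

```python
rows = [(1, "xxxx"), (2, "yyyy"), (3, "zzzz"),
        (1, "pppp"), (3, "dfgf"), (2, "dhfl")]

by_id = {}
for id_, name in rows:
    by_id.setdefault(id_, []).append(name)

# max(Name) + ',' + min(Name) -- valid only when each id occurs exactly twice.
result = {id_: max(names) + "," + min(names) for id_, names in by_id.items()}
print(result)
```

It happens to reproduce the poster's desired output exactly here, though in general the pair comes out max-first rather than in insertion order.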

     

  • Hmmmm... Ken, I can't say I agree... here's some test code to demonstrate why...

    The first part of the code makes a test table of a million rows.  After all, if I'm to demonstrate the value and power of doing such a task in SQL instead of in an application, I might as well make it a good one... the code takes about 40 seconds or less to run on a properly configured server...

    --===== Create and populate a million row test table to demonstrate
         -- the power of SQL as compared to an "application".  A "real life"
         -- example would be much wider but this will suffice for test data
         -- and is easy to write a comparison test for in an "application".
     SELECT TOP 1000000
            IDENTITY(INT,1,1) AS RowNum,
            CAST(RAND(CAST(NEWID() AS VARBINARY))*9999+1 AS INT) AS UserID,
            CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))
          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS SomeValue
       INTO dbo.CSVTest
       FROM Master.dbo.SysColumns sc1,
            Master.dbo.SysColumns sc2

    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.CSVTest
            ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== ...and, we'll do a little code optimization by adding an index to
         -- support the upcoming query.
     CREATE NONCLUSTERED INDEX CSVTest_UserID_SomeValue
         ON dbo.CSVTest (UserID,SomeValue)

    The code that forms the UserID column creates random numbers from 1 to 9999 throughout the million rows.  The code that forms the SomeValue column creates random occurrences of the values "AA" through "AZ" throughout the million rows.  To keep the test uncomplicated and true, dupes are allowed and will be present.

    Ok, now for a function very similar to the one RayM wrote...

    --===== Create a function to produce a CSV column from a set of data
         -- identified by the input parameter @UserID
     CREATE FUNCTION dbo.fTestCSV (@UserID INT)
    RETURNS VARCHAR(8000)
         AS
      BEGIN
            --===== Declare a working variable
            DECLARE @CSV VARCHAR(8000)

            --===== Concatenate the "SomeValue" column into a single CSV value
                 -- for a given user ID using a set based "loop"
             SELECT @CSV = COALESCE(@CSV+',','') + CAST(SomeValue AS VARCHAR(10))
               FROM dbo.CSVTest
              WHERE UserID = @UserID

             RETURN @CSV
        END

    We're all set for the test... the goal is to create a single CSV column for each UserID that contains all of the values of the SomeValue column in a million row table.  Run this in Query Analyzer in grid mode to simulate something like a "grid object" in an application.  The actual code to run this test is surrounded by other code to measure the duration and performance... the code under test looks like this...

     SELECT UserID, dbo.fTestCSV(UserID)
       FROM dbo.CSVTest
      GROUP BY UserID

    ...and here's that rascal embedded in the measurement code... to keep from having to calculate the differences between the start and end readings by hand, open a new window in Query Analyzer, paste the following code in, and run it...

    --===== Ensure that the data is NOT cached so we can get a
         -- true measure of performance of the code being tested.
       DBCC DROPCLEANBUFFERS
       DBCC FREEPROCCACHE

    --===== Declare a couple of variables to measure duration
    DECLARE @StartTime DATETIME
    DECLARE @EndTime   DATETIME

    --===== Show the condition of the resources used so far
         -- Should be very close to zero if you run this in
         -- a new window of Query Analyzer
     SELECT CPU/1000.0 AS [CPU Seconds],
            Physical_IO AS [Disk Reads/Writes],
            MemUsage * 8.096 AS [KBytes of Memory Used]
       FROM Master.dbo.SysProcesses
      WHERE SPID = @@SPID

    --===== Start the duration "timer"
        SET @StartTime = GETDATE()

    --===== Run the code being tested
     SELECT UserID, dbo.fTestCSV(UserID)
       FROM dbo.CSVTest
      GROUP BY UserID

    --===== Stop the duration "timer"
        SET @EndTime = GETDATE()

    --===== Again, show the condition of the resources used.
         -- To get the true measure, subtract the previous
         -- values of resources displayed from these.
     SELECT CPU/1000.0 AS [CPU Seconds],
            Physical_IO AS [Disk Reads/Writes],
            MemUsage * 8.096 AS [KBytes of Memory Used]
       FROM Master.dbo.SysProcesses
      WHERE SPID = @@SPID

    --===== Finally, display the total duration of the test
         -- code in decimal seconds
     SELECT DATEDIFF(ms,@StartTime,@EndTime)/1000.0

    On my humble 1.8GHz single processor box, I get the following stats...

    3.516 seconds  Total duration including display time by the "application"
    3.062 seconds  Total CPU time used (3.109 - 0.047)
    90             Total disk reads/writes (97 - 7)
    0.000 KBytes   Total memory used (0.000 - 0.000)
    9,999          Total number of records transmitted to the "application"

    Basically, SQL Server processed more than 320,000 rows each second... I'm thinking that even a very well written application might not be able to keep up with that.  Certainly, the network folks are going to love it because instead of transmitting a million rows to an app, this transmits less than 1/100th of that.
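A quick arithmetic check on the two throughput claims above, using the figures quoted:

```python
rows = 1_000_000            # rows in the test table
cpu_seconds = 3.062         # total CPU time measured above
rows_transmitted = 9_999    # distinct UserIDs sent to the "application"

print(rows / cpu_seconds)          # rows processed per CPU second (~326,000)
print(rows_transmitted / rows)     # fraction of rows sent over the wire
```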

    --Jeff Moden



  • Very interesting - and surprising, at least to me...

    I have removed the rest of my comments now that Jeff has rewritten his post

  • Jesper, thanks for pulling me off the ceiling... you're the best.

    --Jeff Moden



No problem, Jeff (just had to google 'recipro....', that weird word.. )

    If you feel like blowing off steam offline you're more than welcome to pm me.

    I do sense an amount of misunderstanding on the matter.

    If not, that's fine too

    /Kenneth

  • Thanks for the offer, Ken... we're all set.

     

    --Jeff Moden


