iterate over rows of qroups from sub-query

  • I'm trying to emulate a historian database which is just several thousand rows

    with 3 columns:

    - "Time" of type DateTime

    - "Tag" of type varchar(whatever)

    - "TagValue" of type float.

    From online posts I've been pulling together tidbits of code (and learning) but could use some direction.

    The historian only needs a few tags but I'd like to repeat the same sets of code for each tag.

    So that I can adjust the "trend" values of the TagValue.

    For example, say I generate 4k of rows for Tag A.

    The DateTime values will be unique for each Tag.

    I'll use RAND with limits for the TagValue values - but I want to adjust those limits per group of consecutive DateTimes.

    Like so:

    rows 1 to 1000 => rand vals between 4.0 and 5.0

    rows 1000 to 2000 => rand vals between 5.0 and 6.0

    rows 2000 to 3000 => rand vals between 6.0 and 4.0

    rows 3000 to remaining => rand vals between 7.0 and 6.0

    It seems like I should be able to use an inner SELECT for each of those groups and use ROW_COUNT and a function call to iterate through the rows within the group.

    As I select a row, call the function to get a new rand value within the same limits.

    Then I could repeat the code for the next group, but tweak the rand-val limits.

    But have no idea how to make that sub-query/group/iteration.

    Then repeat the whole process for Tags B and C. For my purposes 3 or 4 tags would be sufficient.

    Here's what I got so far:

    USE [SIM_MyHistorianDB]

    IF OBJECT_ID('dbo.historian', 'U') IS NOT NULL

    DROP TABLE [dbo].[historian]

    -- from http://stackoverflow.com/questions/7149271/sql-server-2008-generate-a-series-of-date-times

    -- create table with DateTime columns StartRange, EndRange

    CREATE TABLE [dbo].[historian]

    (

    StartRange DateTime NULL

    ,EndRange DateTime NULL

    )

    declare @StartTime datetime = '2011-07-20 11:00:33',

    @EndTime datetime = '2011-08-20 15:37:34',

    @Interval int = 555

    ;WITH cSequence AS

    (

    SELECT

    @StartTime AS StartRange,

    DATEADD(SECOND, @Interval, @StartTime) AS EndRange

    UNION ALL

    SELECT

    EndRange,

    DATEADD(SECOND, @Interval, EndRange)

    FROM cSequence

    WHERE DATEADD(SECOND, @Interval, EndRange) < @EndTime

    )

    INSERT INTO [dbo].[historian](

    StartRange

    ,EndRange

    )

    SELECT * FROM cSequence OPTION (MAXRECURSION 0);

    -- Drop the EndRange column

    ALTER TABLE [dbo].[historian]

    DROP COLUMN EndRange;

    -- Rename 'StartRange' to 'Time'

    EXEC sp_rename 'historian.StartRange', 'Time', 'COLUMN';

    GO

    -- I know the above 2 steps were a funky way to get to this point (still learning)

    -- Add the Tag and Tag's Value column

    ALTER TABLE [dbo].[historian]

    ADD Tag varchar(10), TagValue float

    GO

    -- Set the Tag name for *this* tag

    UPDATE [dbo].[historian]

    SET Tag = 'P_MAN_CHG'

    -- the function for getting a rand value with limits

    -- from http://blog.sqlauthority.com/2012/11/20/sql-server-using-rand-in-user-defined-functions-udf/

    IF OBJECT_ID (N'dbo.RandFn', N'FN') IS NOT NULL

    DROP FUNCTION RandFn;

    GO

    CREATE FUNCTION RandFn(@RAND FLOAT, @min-2 int, @max-2 int)

    RETURNS FLOAT

    AS -- Returns a random float value in range [min max]

    BEGIN

    DECLARE @rndValue FLOAT

    SET @rndValue = @RAND*(@max - @min-2) + @min-2

    RETURN @rndValue

    END

    GO

    Thanks for any help.

  • Quick suggestion (if I understood your question correctly)

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_SIZE BIGINT = 4000;

    DECLARE @StartTime DATETIME = CONVERT(DATETIME,'2011-07-20 11:00:33',120);

    DECLARE @EndTime DATETIME = CONVERT(DATETIME,'2011-08-20 15:37:34',120);

    DECLARE @TIME_RANGE INT = 2 * (DATEDIFF(SECOND,@StartTime,@EndTIME) / @SAMPLE_SIZE);

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    ,BASE_DATA AS

    (

    SELECT

    NM.N

    ,CONVERT(BIGINT,ABS(CHECKSUM(NEWID())),0) % @TIME_RANGE AS TIME_VAL

    ,CASE

    WHEN NM.N < 1001 THEN 4

    WHEN NM.N BETWEEN 1001 AND 2000 THEN 5

    WHEN NM.N BETWEEN 2001 AND 3000 THEN 6

    WHEN NM.N > 3000 THEN 7

    END AS BASE_VALUE

    ,ABS(CHECKSUM(NEWID())) AS R_SEED

    FROM NUMS NM

    )

    SELECT

    BD.N

    ,DATEADD(SECOND,SUM(BD.TIME_VAL) OVER

    (

    ORDER BY BD.N

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND CURRENT ROW

    ),@StartTime) AS [Time]

    ,CASE

    WHEN BD.BASE_VALUE < 8 THEN 'P_MAN_CHG'

    ELSE 'OTHER_TAG'

    END AS Tag

    ,BD.BASE_VALUE + RAND(BD.R_SEED) AS TagValue

    FROM BASE_DATA BD;

    First few rows of the output

    N Time Tag TagValue

    ----- ----------------------- --------- ----------------------

    1 2011-07-20 11:16:31.000 P_MAN_CHG 4.98761653192708

    2 2011-07-20 11:29:11.000 P_MAN_CHG 4.75271534782763

    3 2011-07-20 11:34:02.000 P_MAN_CHG 4.53856566898698

    4 2011-07-20 11:40:38.000 P_MAN_CHG 4.72297868147395

    5 2011-07-20 11:51:20.000 P_MAN_CHG 4.20620937695261

    6 2011-07-20 11:54:20.000 P_MAN_CHG 4.88951905633601

    7 2011-07-20 11:54:32.000 P_MAN_CHG 4.39667228572923

    8 2011-07-20 11:57:47.000 P_MAN_CHG 4.25596765660796

    9 2011-07-20 12:12:16.000 P_MAN_CHG 4.24687268000568

    10 2011-07-20 12:21:16.000 P_MAN_CHG 4.96680488930536

    11 2011-07-20 12:31:04.000 P_MAN_CHG 4.95324994986705

    12 2011-07-20 12:46:20.000 P_MAN_CHG 4.13457461207966

    13 2011-07-20 12:52:55.000 P_MAN_CHG 4.16643477498641

    14 2011-07-20 13:10:55.000 P_MAN_CHG 4.42943506135419

    15 2011-07-20 13:32:32.000 P_MAN_CHG 4.46022334598849

    16 2011-07-20 13:36:32.000 P_MAN_CHG 4.6216370828385

    17 2011-07-20 13:42:22.000 P_MAN_CHG 4.12497315104819

    18 2011-07-20 13:55:52.000 P_MAN_CHG 4.98936973787402

    19 2011-07-20 14:11:56.000 P_MAN_CHG 4.75868663303926

    20 2011-07-20 14:24:34.000 P_MAN_CHG 4.17822013303251

    21 2011-07-20 14:40:38.000 P_MAN_CHG 4.26549995762316

    22 2011-07-20 14:44:03.000 P_MAN_CHG 4.16613587118889

    23 2011-07-20 14:56:55.000 P_MAN_CHG 4.18974123713843

    24 2011-07-20 15:06:45.000 P_MAN_CHG 4.35508370063455

    25 2011-07-20 15:21:39.000 P_MAN_CHG 4.20015389645265

    26 2011-07-20 15:28:25.000 P_MAN_CHG 4.27800895472645

    27 2011-07-20 15:38:28.000 P_MAN_CHG 4.70747701004639

    28 2011-07-20 15:46:20.000 P_MAN_CHG 4.56643514761468

    29 2011-07-20 16:05:20.000 P_MAN_CHG 4.19048080134641

    30 2011-07-20 16:06:36.000 P_MAN_CHG 4.29610962436167

  • @Eirikur, Much thanks! The output is nearly there. Now I need to understand it.

    I got as far as

    (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    I figure you chose 10 values arbitrarily but not sure why 10. Is 10 values processed at a time?

    Can you write some pseudo-code for this code?

  • You are very welcome.

    The code uses an inline tally table (numbers table) to generate the initial rows, the iTally works on the principal of cross joining a seed table of 10 rows in order to generate a far greater number of rows.

    😎

    -- SEED FOR AN INLINE TALLY TABLE

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    SELECT

    XT.N

    ,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN

    FROM T XT;

    -- CROSS JOIN SHORTCUT "," RETURNS 10 x 10 ROWS

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    SELECT

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN

    FROM T T1,T T2;

    -- COMBINED AS AN INLINE TALLY TABLE THIS CAN GENERATE UP TO 10^9 ROWS

    DECLARE @SAMPLE_SIZE BIGINT = 4000;

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    SELECT

    NM.N

    FROM NUMS NM;

  • Further explanation (comments in the code)

    😎

    -- Number of rows generated

    DECLARE @SAMPLE_SIZE BIGINT = 4000;

    -- First date or Start Date

    DECLARE @StartTime DATETIME = CONVERT(DATETIME,'2011-07-20 11:00:33',120);

    -- Last date or End Date

    DECLARE @EndTime DATETIME = CONVERT(DATETIME,'2011-08-20 15:37:34',120);

    -- Date increments in second calculated as the number of seconds between the Start Date and End Date

    -- divided by the number of rows and then multiplied by 2 as the average value is close to 0.5 times

    -- the difference.

    DECLARE @TIME_RANGE INT = 2 * (DATEDIFF(SECOND,@StartTime,@EndTIME) / @SAMPLE_SIZE);

    -- Inline tally table (iTally)

    ;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))

    , NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)

    ,BASE_DATA AS

    (

    SELECT

    NM.N

    -- Pseudo random time interval

    ,CONVERT(BIGINT,ABS(CHECKSUM(NEWID())),0) % @TIME_RANGE AS TIME_VAL

    -- Base group assignement

    ,CASE

    WHEN NM.N < 1001 THEN 4

    WHEN NM.N BETWEEN 1001 AND 2000 THEN 5

    WHEN NM.N BETWEEN 2001 AND 3000 THEN 6

    WHEN NM.N > 3000 THEN 7

    END AS BASE_VALUE

    -- Seed for the RAND() function

    ,ABS(CHECKSUM(NEWID())) AS R_SEED

    FROM NUMS NM

    )

    SELECT

    BD.N

    -- To guarantee the unique time values the query uses a

    -- running total of the time interval values and adds it

    -- to the Start Time

    ,DATEADD(SECOND,SUM(BD.TIME_VAL) OVER

    (

    ORDER BY BD.N

    ROWS BETWEEN UNBOUNDED PRECEDING

    AND CURRENT ROW

    ),@StartTime) AS [Time]

    ,CASE

    WHEN BD.BASE_VALUE < 8 THEN 'P_MAN_CHG'

    ELSE 'OTHER_TAG'

    END AS Tag

    -- The RAND() generates values between 0 and 1, adding the

    -- output to the BASE_VALUE generates values between the

    -- desired group values, i.e. 4 - 5 etc.

    ,BD.BASE_VALUE + RAND(BD.R_SEED) AS TagValue

    FROM BASE_DATA BD;

  • Excellent write-up! Thanks again.

    So if I wanted to adjust the code and expand on the portion that builds the random value added to the base value, I could just use the BD.BASE_VALUE as a selector in another case statement. To call my rand-function with a wider range for a specific case, e.g. '7' for example. Right?

  • skippyV (9/3/2016)


    Excellent write-up! Thanks again.

    So if I wanted to adjust the code and expand on the portion that builds the random value added to the base value, I could just use the BD.BASE_VALUE as a selector in another case statement. To call my rand-function with a wider range for a specific case, e.g. '7' for example. Right?

    That's correct, just do a multiplication of the range desired

    😎

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply