How should I use SQL to normalize my tables?

• This is coming from a novice in SQL with only a few months' experience. I have a table that I would like to normalize, and I plan to split it into four tables. I have a plan of how I want to do this. I'll outline how I think I should go about it; please correct me if I am wrong or going in the wrong direction.

1. Import the table in SSMS.
    2. Use SQL to create a table using columns for table 1.
    3. Use SQL to create a table using columns for table 2.
    4. Use SQL to create a table using columns for table 3.
    5. Use SQL to create a table using columns for table 4.
    6. Drop the original table.
7. Create relationships for tables 1, 2, 3 and 4.
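
    Sketched in T-SQL, the middle steps might look something like this (dbo.StagingImport and the column names are hypothetical placeholders, not my real table):

    ```sql
    -- Step 2: build table 1 from its subset of the staged columns
    SELECT DISTINCT CustomerName, CustomerEmail
    INTO dbo.Customer            -- SELECT ... INTO creates the new table
    FROM dbo.StagingImport;

    -- Steps 3-5 repeat the same pattern for tables 2, 3 and 4.

    -- Step 6: remove the original table
    DROP TABLE dbo.StagingImport;

    -- Step 7: add the relationships
    ALTER TABLE dbo.[Order]
        ADD CONSTRAINT FK_Order_Customer
        FOREIGN KEY (CustomerID) REFERENCES dbo.Customer (CustomerID);
    ```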

    Thanks in advance

• That's an ELT (Extract, Load, & Transform) strategy using a staging table, as opposed to an ETL strategy (in which you would do the transformation before importing, loading directly into the destination tables). Both are valid & common. The ELT strategy takes advantage of the power/features of SQL Server & T-SQL. ETL strategies tend to use tools like SSIS or similar 3rd-party tools to transform the data.

    One recommendation: Swap steps 6 & 7 -- do not drop your original table until you have successfully created foreign keys (relationships) & other constraints to ensure your data and data model are valid.
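
    A minimal sketch of that ordering, using hypothetical table names (an Order parent and an OrderLine child split out of a staging table):

    ```sql
    -- Step 7 first: create the relationship and let SQL Server validate it
    -- against the data already loaded into the new tables
    ALTER TABLE dbo.OrderLine
        ADD CONSTRAINT FK_OrderLine_Order
        FOREIGN KEY (OrderID) REFERENCES dbo.[Order] (OrderID);

    -- Step 6 last: drop the staging table only once the constraints exist
    -- and the data model has been verified
    DROP TABLE dbo.StagingImport;
    ```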

• The tricky part of normalization is populating the tables, in my opinion. A lot of the time it's an attempt to establish foreign key referential integrity a posteriori (after the fact). You could divide your tables into two categories: first, lookup tables (without any FKs), and second, tables with FKs. To populate the lookup tables, assign a unique primary key on INSERT of distinct values. To INSERT into foreign key columns, JOIN on the lookup value(s) to SELECT the primary key of the lookup table.
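
    As a sketch (all object names here are made up for illustration), the two categories might look like this:

    ```sql
    -- Category 1: a lookup table with no FKs; a surrogate primary key
    -- is assigned on INSERT of the distinct values
    CREATE TABLE dbo.Color (
        ColorID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        ColorName varchar(50) NOT NULL UNIQUE
    );

    INSERT INTO dbo.Color (ColorName)
    SELECT DISTINCT ColorName
    FROM dbo.Staging;

    -- Category 2: a table with an FK column, populated by JOINing on the
    -- lookup value to SELECT the lookup table's primary key
    INSERT INTO dbo.Product (ProductName, ColorID)
    SELECT DISTINCT s.ProductName, c.ColorID
    FROM dbo.Staging AS s
    JOIN dbo.Color   AS c ON c.ColorName = s.ColorName;
    ```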

No one shall expel us from the paradise that Cantor has created for us

• @ratbak Just curious: if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create new table(s) to insert the values into? Good advice; I'll be sure to drop the table at the very end. Thanks.

Steve Collins, I'm embarrassed to say I've not used lookup tables in SQL yet, but I'll be sure to learn about them and use them. It's definitely a method I should know. Thanks.

    • This reply was modified 2 years ago by  Mr_X.
  • I would do the normalization on paper first. That will give you a chance to see if you are going through the normalization process correctly without wasting effort in building your tables on the fly. Doing this will also allow you to create the DRI while creating the tables and help you see the steps for importing the data from the source table to the destination tables.

• My fault, I should have been more specific. The steps outlined above assume I know what the model will look like (i.e. after I've done the normalization on paper). Out of curiosity, what does DRI stand for?

Also, if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create new table(s) to insert the values into?

  • Mr_X wrote:

My fault, I should have been more specific. The steps outlined above assume I know what the model will look like (i.e. after I've done the normalization on paper). Out of curiosity, what does DRI stand for?

Also, if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create new table(s) to insert the values into?

    As suggested above by Lynn Pettis, the normalisation should be done on paper first. Then you build your (initially empty) tables based on what you have decided.

    After that, build a process which moves the data into the new structures. The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )

    DRI stands for declarative referential integrity. It refers to the use of primary and foreign keys to maintain data integrity.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I appreciate your help. You mentioned:

    "The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )"

Does that mean that the transformation usually takes place outside of SQL? If it were to take place in SQL, does that mean the table is imported, transformed, and then inserted into the normalized tables?

    Thank you

    • This reply was modified 2 years ago by  Mr_X.
  • Mr_X wrote:

    I appreciate your help. You mentioned:

    "The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )"

Does that mean that the transformation usually takes place outside of SQL? If it were to take place in SQL, does that mean the table is imported, transformed, and then inserted into the normalized tables?

    Thank you

It all depends on how you set up the import. If you stage the data in a staging table, then you are using an ELT process and you will do your transformations using SQL.

    Either way, you want to be sure to populate the parent table(s) first, then the child tables moving down each level in order. That is why you want to do the normalization on paper so you can determine the order of the imports.
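
    For example, with a hypothetical three-level model (Customer as the parent of Order, Order as the parent of OrderLine), the loads would run in this order:

    ```sql
    -- Level 1: the parent table, loaded first (no FKs to satisfy)
    INSERT INTO dbo.Customer (CustomerName)
    SELECT DISTINCT CustomerName
    FROM dbo.Staging;

    -- Level 2: the child, JOINing back to resolve its FK
    INSERT INTO dbo.[Order] (CustomerID, OrderDate)
    SELECT DISTINCT c.CustomerID, s.OrderDate
    FROM dbo.Staging  AS s
    JOIN dbo.Customer AS c ON c.CustomerName = s.CustomerName;

    -- Level 3: the grandchild, resolving both levels above it
    INSERT INTO dbo.OrderLine (OrderID, ProductName, Quantity)
    SELECT o.OrderID, s.ProductName, s.Quantity
    FROM dbo.Staging  AS s
    JOIN dbo.Customer AS c ON c.CustomerName = s.CustomerName
    JOIN dbo.[Order]  AS o ON o.CustomerID = c.CustomerID
                          AND o.OrderDate  = s.OrderDate;
    ```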


