How to remove null COLUMNS

  • below is my input csv file

    col1-----col2------col3-----col4-----col5

    a--------null-------1-------null------null

    b--------w---------2-------null------null

    c--------null-------3--------null------null

    d--------x---------null------null------null

    e--------y---------4--------null------null

    i want to import this file to sql database where there is no table created. the new table created by the ssis package should look like this

    col1------col2------col3

    a---------null--------1

    b---------w---------2

    c---------null--------3

    d---------x----------null

    e---------y----------4

    i want to dynamically get rid of the COLUMNS that have all null values.

    i cannot manually go select the output columns for the tables because there are a lot of columns in the input csv file with a lot of data

    thank you for your responses

  • I would input to a temporary table that allows nulls, then pass data into a new table handling the Nulls with a query using say COALESCE.

    Facts are stubborn things, but statistics are more pliable - Mark Twain
    Carolyn
    SQLServerSpecialists[/url]

  • If it is just Column 4 + 5 that you want to get rid of, then just un-tick them in the columns tab of your data source and then they will not get used in the data flow.

    Failing this, I would use the staging table suggestion from Carolyn.

  • i cannot untick them because there are around 40 columns with 100's of rows in the csv file. so checking if a column is null or not is not appropriate.

    is there some way to write a query in select which says

    select (non null columns)

    from tablename

  • ar-727381 (9/23/2009)


    i cannot untick them because there are around 40 columns with 100's of rows in the csv file. so checking if a column is null or not is not appropriate.

    is there some way to write a query in select which says

    select (non null columns)

    from tablename

    Not possible.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • So what kind of data flow transformation in SSIS will help achieve the removing of ALL NULL columns before loading it into database

  • This is an unusual requirement - let's make sure I understand.

    Say your input file has 10 columns and, out of those, 8 columns contain only nulls.

    You want to load only the two columns which contain some non-null values?

    Should these columns be loaded into columns 1 and 2 of your destination database? If so, can you confirm that all columns in the destination table have the same datatype (presumably char(1))? Is there a primary key that you have not mentioned? The data looks meaningless otherwise.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • this is basically a personal requirement.

    I have my contacts imported in csv file. now the csv file contains around 40 columns like

    first name, last name, phone, fax, web, business fax, business phone1, business address, etc etc

    of which not all are important to me like web, business fax, business address, etc, etc

    i want to import all this contact info into my database without sending the NULL columns

    Thus i am creating the package to import the csv file after eliminating the null columns

    what is the best approach for this

  • You have not answered all of my questions, so it's difficult for me to answer.

    Does your destination table already exist? If so, what's the problem? Just go ahead and import. If not, are you expecting the package to create the table as well as import to it? Or are you going to do the analysis before you do the import?

    You can do just about anything with SSIS, if you are prepared to invest the time. But in those case where meta data for either the source or the target is required to be dynamic, it is usually not easy.

    Especially if this is a one-off, I'd recommend doing the initial analysis of which columns are required in Excel (or whatever) and then building your destination table in SQL Server based on this analysis and then doing your import as a standard data flow in SSIS.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thanks for your reply, I will give you more detailed information here.

    Currently my table does not exist. I was looking to create the table in the package. the import datatypes of the columns are

    varchar (names, address, etc and also for phone because all numbers are not saved in specific format)

    datetime (birthdays, anniversary, ect)

    now my concern is if i create a table after analysis of all the columns in the csv file and just import the required columns things would work just fine and it would be a simple import package.

    My concerns is I would be importing this contacts csv file every couple of months with this package, so i will have a lookup transformation to just update the table without importing the old contacts which would already be there in the table.

    So now, this is the first time i am importing this data, so should i first create the table in database or create it in the package because I would not need to create the table next time when I run this package right, because it will then only be an insert of new contacts which are added to the csv file.

    Also there is one more issue.

    What if the table does not have Fax column because no contacts of mine in csv file have fax number. and suddenly there is this new contact of mine who gives me a Fax number. is there some way in ssis that it will create a new column in the table and insert the fax number for this contact and the new table thus mad will have fax column.

    Does this all make sense to you, or is it some random thing I am trying to do. but i really want to give time to this and work on it to improve my ssis knowledge with the help of you fellow developers and dba's

  • OK, now we're really getting somewhere 🙂

    I will tell you what I would do - others would certainly have their own ideas.

    1) Build the destination table now, including all of the fields that include data now, or may include data that you want to store in the future. Do not worry about all the nulls, untidy as they may look.

    2) Build your SSIS package to import all of these fields, null or not.

    That takes care of your initial import and future-proofs your SQL Server table. I can imagine a possible problem with the subsequent update jobs: what will you use to do the lookup - do you have a unique key? Looking up on name may work initially, but when you record yet another contact named Mike Litoris, your lookups will fail.

    Also - just checking - I am assuming that this contact info in SQL Server will never be updated, except by the SSIS package? Ie, that this is a one-way sync.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Sounds good, that could certainly be one way to do it.

    I am thinking my having a composite primary key of firstname and primary phone which would help me for looking up rows during update.

    will this work for future updates?

  • Yes, that should work. If you decide to do that, I would suggest putting a uniqueness constraint on that combination in SQL Server, to avoid any possibility that duplicates can be created.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Hi,

    If you try little hard you can achieve this.

    In sql server DTS you can write script like this.

    Find all cloumn name of that table.

    say you get fname,lname,roll,age.

    Then you can query on each column seprately,i.e.

    select fname from table

    If its return not null then store the column name in variable.

    In this manner query thru all column .

    then with the help of stored variable(all not null columns) create table

    and insert.

    [font="Verdana"]Regards
    Kumar Harsh[/font]

    :unsure:

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply