• Try the following for a case-insensitive query against c:\temp\example.txt (assuming your SQL server is case-insensitive):

     
    
    SELECT
    COUNT(*)-1 AS Duplications,
    TextLine
    FROM
    (
    SELECT
    F1 AS TextLine
    FROM OpenDataSource(
    'Microsoft.Jet.OLEDB.4.0',
    'Data Source="C:\TEMP\";Extended properties="Text;HDR=NO"')...Example#Txt
    ) TextInfo
    GROUP BY TextLine
    HAVING COUNT(*) > 1
    COMPUTE SUM(COUNT(*)-1)

    Or, for a case-sensitive query, try:

     
    
    SELECT
    COUNT(*)-1 AS Duplications,
    TextLine,
    Checksum
    FROM
    (
    SELECT
    F1 AS TextLine,
    BINARY_CHECKSUM (*) AS Checksum
    FROM OpenDataSource(
    'Microsoft.Jet.OLEDB.4.0',
    'Data Source="C:\TEMP\";Extended properties="Text;HDR=NO"')...Example#Txt
    ) TextInfo
    GROUP BY TextLine, Checksum
    HAVING COUNT(*) > 1
    COMPUTE SUM(COUNT(*)-1)

    For my testing, c:\temp\example.txt contained:

    This line appears twice (with different CASE)

    This line appears 4 times

    This line appears 3 times

    This line appears 2 times

    This line appears 1 time

    This line appears 4 times

    This line appears 3 times

    This line appears 2 times

    This line appears 4 times

    This line appears 3 times

    This line appears 4 times

    This line appears twice (with different case)


    Cheers,
    - Mark