What's in your CLR?

  • I don't know whether the DoD "Ada Mandate" still exists, as I got out of defense work in 2003. But from the 1980s till then, there was always the option to use "the best language for the job" instead of Ada, provided you wrote a sufficiently convincing argument that it was actually the best language.

    In the early days, it was rather difficult to do so, because the people you had to convince know absolutely nothing about programming.

    By the time I got out of that field, those people had been replaced by youngsters who still knew little about programming, but had played with the most error-prone language I have ever seen (C) on their home computers. Consequently, it was too easy for other clueless youngsters to sell them on C or Java.

    There was a group in the early days who believed that LISP was the best language for their job. So instead of seeking a waiver, they wrote a preprocessor that translated their LISP into Ada. Impressive accomplishment, but it was illegal. Writing LISP and auto-translating it into Ada is NOT developing in Ada. Of course, the clueless bureaucrats I mentioned earlier didn't understand that.

    Not very different from writing SQL that looks like Pascal .....

  • I use a set of CLR based functions to detect injection. Helps that its done right at the database level, with no possible client or network hacking possibilities.

    Handwritten, they can't be perfect, but I considered very many tricks to get code to run. It seems their are a awful lot of ways to encode plain text. Once assured its simple plain text with no encoded characters, then just make sure none of the SQL keywords or abbr. are in the text.

    I use a similar function to "clean" a string that I use for free text (keyword search) fields. And another that uses REGEX to validate text as a valid email address (and not injection).

    I suppose there isn't a need to have used CLR over SQLs string functions, but I have several other CLRs that cannot be done in SQL alone and there was simply no reason not to add a few extra functions using a preferred language.

  • Hi
    It seems this topic has come around again, so I will add my 2 cents worth:-)

    I come from an Oracle background where PL/SQL has a much broad set of functionality than T-SQL - probably because you can write CLR for anything missing.  In the Oracle databases we developed it was mostly "stage and then process" within the database, this meant a common set of logic was used for all sources of the information (multiple file formats, directly storage from web pages, and others).  Given this philosophy when we started moving to SQL Server we followed the same development processes - which was naive, as they are different even though they are both database systems.

    So we built CLRS for the following tasks:
    1) calling out of the database to remote web sites (secure pages only)
    2) logging to file
    3) binary file generation - so the file looked like it was generated from a 1990's C application
    4) communication back to the Oracle database

    While these all worked, some better than others, we have been moving away from this to using tools better suited to the job e.g. custom applications/services or SSIS tasks.

    So would I write a CLR again - "maybe" as that would depend on the reason/task, rather than the current philosophy.

  • Since the original editorial we've had Solomun Rutzky's excellent Stairway to SQL CLR series which pretty much drove a stake through the heart of a lot of the myths and assumptions surrounding SQLCLR.  Solomon's own SQL# product demonstrates the art of the possible and is a useful toolkit.

    I found that user defined aggregates, while mechanically producing the result I wanted, created expensive execution plans.  This wouldn't matter on a DB Server for a small team of data scientists but on a machine with a much wider audience with SLAs to adhere to it very much did matter.  My experience with other analytical databases is that they are very good at what they do but I would doubt that they could handle a large number of users.

    An evolution I have seen is the adoption of a broader spectrum of tooling for data processing activity.  Whether this be NOSQL, search engine technology, languages, compute frameworks like Apache Spark and Pandas or something else entirely.  SQL Server does have an incredibly broad range of capabilities and with that comes an equally broad range of possibilities.  However, I've seen a marked preference for choosing a broader range of technologies, perhaps too much so.

    There are things that T-SQL is brilliant at.  There are things that it can be made to do such as complex string manipulation, which are much better handled in SQLCLR.  However, as soon as you decide that T-SQL might not be the answer you also find that the idea that using SQL Server as the platform for what you intend to do starts to gain resistance.  

    If you ask people to pin down what their objection to using SQL Server then, in my experience, aversion to monolithic architectures will be mentioned, specific open-source tooling will be mentioned, licencing costs together with over-dependence on a vendors etc.

    I have seen technologies fall out of favour, not for their intrinsic faults, but because by the time people had figured out how to use them properly the world had moved on.

  • Personally I find it difficult to motivate using sqlclr. Typically you have an application sitting on top that can take care of it for a fraction of the operational cost.

  • I once looked at CLR for doing some external file management (specifically for deleting files older than a supplied date/time parameter).
    Had too much problems with user permissions (since by default it authenticates with the same user as the SQL Server Service).

    I am sure there are ways around this, but it was not worth pursuing it any further.
    PowerShell and custom .NET services works just as well.

  • When SQLCLR first came on the scene we used it to proof of concept some generic auditing and encryption / decryption. However, we went different directions with both. The only thing I have really used it for in a production environment so far is to handle some generic regex replacement processing during certain ETL solutions.

  • As far as I know, the only CLR based functions we're using across the enterprise is a 3rd party library called Voltage for encryption and decryption on personal identifier columns. It's something that's been around before I joined the organization, and it's OK for singleton SQL queries, but any ETL or aggregate reporting process that references the functions take a huge performance hit. If the business chose to replace the Voltage CLR solution with SQL Server's own symmetric key functions, there would be no objection from me.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • We use CLR as a last resort and usually on an analysis instance -- rarely on production.  Main functions we use are very simple and for reporting -- Group_Concat, and CollapseWhiteSpace (which trims, removes invisible characters from imported data, removes double spaces, etc. ), some Regex functions, and some business specific string functions.  
    Overall, glad to use it.  The difficulty with debugging is kind of a deal breaker for anything too complicated though.

    The three biggest mistakes in life...thinking that power = freedom, sex = love, and data = information.

  • It seems like there is a bit of consensus that CLR is not a go-to tool, especially for high transaction databases. When it is use, it is mostly for text manipulation. (I suspect there might be some great uses for geom/geo objects too).

    Some seem to think it inherently lacks efficiency and elegance, but you can write the most inefficient program (an infinite loop that does nothing) in TSQL or the CLR language of choice. It is only fair to compare well written, thoroughly tested code. I haven't found the overhead in calling CLR to be too much of an issue, despite using it for injection protection from website users input (many thousands of calls per hour). Since injection occurs within the database, I like to have that code as close to the database as possible (as opposed to putting the code in the web interface or middle-tier). It is a minor headache that CLR code has to be backed-up and maintained separately from the database.

    CLRs probably work well with particular API (someone mentioned a third-party encryption/decryption tool). But I would be really hesitant to use them with long-running programs or with applications that require network communication.   I have not tried aggregate functions, I just haven't found a good use-case for this yer.

Viewing 10 posts - 46 through 54 (of 54 total)

You must be logged in to reply to this topic. Login to reply