Does a check constraint with UDF has performance impact on Query?

Question

Post reply

Does a check constraint with UDF has performance impact on Query?

krishnaroopa

SSChasing Mays

Points: 650
More actions
September 11, 2018 at 11:57 pm

#372709

Hi all

We have a Master table that has a primary key with two columns ( Code char(4), location int ). We have child tables referring this table, with only 1 column Code. So i cannot create a Foreign key constraint. So a work around identified was, to have a check constraint with an UDF to check the column alone. Now i have some clarifications
1. Can we implement this. How it will impact the performance on Query on the tables .
2. If this is a bad idea, can we implement it in any other means..
regards
KRS

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 1

So I understand, your first table has a unique constraint consisting of two columns, but you're trying to set up a FK relationship to only one of the columns? You do realize that's probably going to lead to all sorts of data problems. Let's look at this:

Table 1
Code    Location
'Boom' 1
'Loss'   2
'Bad'     1
'Boom' 2
'Boom' 3
'Boom' 4

Table 2
Code
'Boom'

If I write the following query, what comes back?

SELECT t1.Location, t2.Code FROM dbo.Table1 as t1 JOIN dbo.Table2 as t2 ON t1.Code = t2.Code;
It's going to be a hot mess. If Code is actually unique, put a unique constraint on it. If the only way this works is that Code and Location have to be the primary key, then you need both columns in the other table to join to a unique row in the first one. If you can only add one column to the other table (for some odd reason that I'd need a lot of explanation for), then create a third column in your first table, make it a GUID or an IDENTITY or a SEQUENCE column, put a unique constraint on it, and then use that in the second table.

What you have currently not only doesn't work from a structural stand point as you're finding out, it doesn't make any sense from a data stand point.

Another approach would be to create a Code table. Then that maps to the Table1 and it can map to the Table2, it then enforces that only codes that can exist in either Table1 or Table2 actually exist.

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

Phil Parkin SSC Guru Points: 245813 More actions · Answer 2

Another approach would be to create a Code table. Then that maps to the Table1 and it can map to the Table2, it then enforces that only codes that can exist in either Table1 or Table2 actually exist.

This is what I would do.

If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

sgmunson SSC Guru Points: 110588 More actions · Answer 3

krishnaroopa - Tuesday, September 11, 2018 11:57 PM

Hi all

We have a Master table that has a primary key with two columns ( Code char(4), location int ). We have child tables referring this table, with only 1 column Code. So i cannot create a Foreign key constraint. So a work around identified was, to have a check constraint with an UDF to check the column alone. Now i have some clarifications
1. Can we implement this. How it will impact the performance on Query on the tables .
2. If this is a bad idea, can we implement it in any other means..
regards
KRS

Let me pile on with the following:

To answer question one, the answer to "Can we implement this" is Nope! with a capital N. Not because it's not physically possible, but because that leads to the "hot mess" referred to by previous posters. As to other means to implement, it's rather dependent on the nature of this tables data, and the data in the other tables. You may only need to check for the existence of a record in the master table with the Code value, but then, if you need to refer to any other value in the master table, you have a problem choosing which one of what could be a varying number of rows, depending on the Code value from the other table. How you fix that part of the problem depends on why you need to join to the master table in the first place. If it's only to validate the Code value, then an existence check is sufficient, but if you need to refer to any other column in the master table, then you are going to have to create a relationship table instead, that relates a specific row in the master table to any given Code value in this other table. The only other alternatives involve somehow adding the other part of the primary key to the other table. Finally, what does a check constraint or a UDF have to do with this? If you were going to use a check constraint to ensure that Code was unique in the master table, you wouldn't need that other column as part of the primary key. What a UDF would do for you in any of this is anyone's guess - what would such a UDF be intended to accomplish?

Steve?(aka sgmunson)?:) 🙂 :)?
Health & Nutrition
Rent Servers for Income (picks and shovels strategy)

ScottPletcher SSC Guru Points: 100262 More actions · Answer 4

You do need a separate codes table to do what you want. But that table could be fully maintained automatically by triggers on the "master table". That is, you wouldn't have to add code in any proc to load or maintain the table.

SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

Alan Burstein SSC Guru Points: 61136 More actions · Answer 5

In addition to what everyone else said -
it's worth noting that T-SQL scalar user defined functions (UDF) are generally horrible and cause all kinds of problems even when they are not referenced directly in a query.
Many people (not enough) know that scalar UDF's make queries slower, in part, because they kill parallelism. A lesser-known evil of scalar UDFs is that, when you use them for check constraints, or computed columns they kill parallelism when that table is referenced in a table, even when the column that uses them is not referenced in the query. I believe it was SQL Server 2014 (or 2012) that allowed for building indexes with a parallel plan... But not when the table contains a scalar UDF! Just the scent of a scalar UDF causes the optimizer to get all serial about life. Here's a couple great eye-opening articles by Erik Darling about this topic:

Another reason why scalar functions in computed columns is a bad idea
Another Hidden Parallelism Killer: Scalar UDFs In Check Constraints

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

jcelko212 32090 SSCrazy Eights Points: 9220 More actions · Answer 6

krishnaroopa - Tuesday, September 11, 2018 11:57 PM

>> We have a Master [sic] table that has a primary key with two columns (something_code CHAR(4), something_location INTEGER). We have child tables [sic] referring to this table, with only 1 column something_code. So I cannot create a Foreign Key constraint. <<

There are some serious problems here. The terms "master" and "child" are completely nonrelational. They applied to the old network databases and to magnetic tape files. We have referenced and referencing tables and that is a very different concept.

Given a compound key in the referenced table, the referencing tables have to refer to a not null, unique column (these are properties of a primary key, by definition).

>> So a workaround identified was, to have a CHECK() constraint with a UDF to CHECK() the column alone. <<

You've already been told in some detail why this is a really bad way to do it from a performance viewpoint. But the real problems are conceptual; you don't have a reference!

Please read a book on basic data modeling. There is no such thing as a universal generic "code"; it has to be a code for something (postal_code, area_code, etc.). Likewise, there is no universal generic "location"; it has to be the location of something in particular. This is a result of the most fundamental law in logic, the law of identity.

Codes use what is called a nominal scale, which can be character strings. However, a location shouldn't be an integer; what math were you going to do with it?

If the number of available codes or locations is "small" (whatever that means these days), then you can use a CHECK() constraint with an IN() predicate
that looks at the list to see that the code or the location is valid.

Please post DDL and follow ANSI/ISO standards when asking for help.