Archive

Posts Tagged ‘UDF’

Sorting Strings Containing Numbers

January 20th, 2011 No comments

Sorting strings does not require a lot of coding in .NET or SQL. However when the strings contain related numbers, the sort order is not ‘logical’ according to the numeric sequence. E.g.

label-8-x-99
label-8-x-100
label-9-x-100
label-10-x-9

Would be sorted as:

label-10-x-9
label-8-x-100
label-8-x-99
label-9-x-100

Which is not according to the numeric sequence. Here is a method in C# which can be used for converting strings to they can be sorted in numeric order. Note that the numbers considered can have up to 10 digits in this case:

public static string NumericSortableString( string s )
{
    if ( String.IsNullOrEmpty( s ) )
        return s;
 
    const int MAX_DIGITS = 10;
    StringBuilder sb = new StringBuilder( s.Length );
    int iDigits = 0;
 
    for ( int i = s.Length - 1; i >= 0; --i )
    {
        if ( s[ i ] >= '0' && s[ i ] <= '9' )
            ++iDigits;
        else if ( iDigits > 0 )
        {
            for ( int j = iDigits; j < MAX_DIGITS; ++j )
                sb.Insert( 0, '0' );
                    iDigits = 0;
        }
        sb.Insert( 0, s[ i ] );
    }
    if ( iDigits > 0 )
        for ( int j = iDigits; j < MAX_DIGITS; ++j )
            sb.Insert( 0, '0' );
    return sb.ToString();
}

The same can be achieved on SQL Server using the following T-SQL Scalar valued function:

ALTER FUNCTION [dbo].[NumericSortableString]( @s NVARCHAR(MAX) ) RETURNS NVARCHAR(MAX) AS 
BEGIN
    DECLARE @o NVARCHAR(MAX);
    DECLARE @c NCHAR(1);
    DECLARE @Digits INT, @MaxDigits INT, @i INT;
 
    IF @s IS NULL OR @s = ''
        RETURN @s;
 
    SET @Digits = 0;
    SET @MaxDigits = 10;
    SET @i = LEN( @s );
    SET @o = '';
 
    WHILE @i > 0
    BEGIN
        SET @c = SUBSTRING( @s, @i, 1 );
        IF @c >= '0' AND @c <= '9'
            SET @Digits = @Digits + 1;
        ELSE IF @Digits > 0
        BEGIN
            IF @Digits < @MaxDigits
                SET @o = REPLICATE( '0', @MaxDigits - @Digits ) + @o;
            SET @Digits = 0;
        END
        SET @o = @c + @o;
        SET @i = @i - 1;
    END
    IF @Digits > 0 AND @Digits < @MaxDigits
        SET @o = REPLICATE( '0', @MaxDigits - @Digits ) + @o;
    RETURN @o;
END

SQL Server 2005/2008 CLR support by example: crypt() UDF in C#

.NET programmability support has been one of the big novelties of SQL Server 2005. Sometimes I had the impression that CLR support was also one of the most misunderstood features of the new platform. Some people considered moving their existing pre-2005 T-SQL code to .NET in an attempt to gain more efficiency by using a ‘compiled’ programming language.

This was a bit of a common misbelief, since stored T-SQL procedures and functions are also optimized/compiled prior to execution. Performance gains by using .NET code instead of T-SQL are usually limited to highly iterative/procedural code, and not for optimizing anything which can be easily done within native SQL.

Let us have a look at a simple user defined function written in C#. This function takes a clear text string and a 2 character salt parameter and calculates a Unix crypt hash based on both.

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction( IsDeterministic=true )]
    public static SqlString Crypt( SqlString sPlainText, SqlChars cSalt )
    {
        if ( sPlainText.IsNull || cSalt.IsNull || cSalt.Length < 2 )
            return SqlString.Null;
        string sSalt = new string( cSalt.Value );
        sSalt = sSalt.Substring( 0, 2 );
        return new SqlString( UnixCrypt.Crypt( sSalt, sPlainText.Value ) );
    }
}

We just declare a class with a static function qualified as a SqlFunction that accepts and returns classes/structs of the System.Data.SqlTypes namespace. The UnixCrypt.Crypt method contains the actual code which I am excluding here. If you are interested in building this example, you can find a working implementation of the crypt class in C# at The Code Project.

After compiling the above code into a dll, you may load it into SQL Server and run it with the following commands:

-- CLR support is disabled by default
EXEC sp_configure 'clr enabled' , '1'
GO
reconfigure;
GO
-- load dll and create assembly
CREATE ASSEMBLY asmYourAssembly FROM 'D:\Your\Path\to\CryptUDF.dll' WITH PERMISSION_SET = SAFE
GO
-- add a function that references the CLR UDF
CREATE FUNCTION dbo.Crypt( @plain AS NVARCHAR(200), @salt AS NCHAR(2) )
    RETURNS NVARCHAR(200) AS EXTERNAL NAME asmYourAssembly.UserDefinedFunctions.Crypt
GO
SELECT dbo.Crypt( 'password', 'xy' );

Note PERMISSION_SET = SAFE, which reflects the external dependencies on framework libraries. Since we are doing all our calculations within the library, there is nothing additional we need to reference, which permits us to be running within the safest permission set. You can learn more about the CLR integration permission sets on MSDN.

When doing authentication, you could use the UDF as follows:

SELECT @UserID=[UserID] FROM dbo.tblAccounts 
    WHERE [UserName]=@Username COLLATE Latin1_General_CI_AS
    AND dbo.Crypt( @Password, [Password] )=[Password] COLLATE Latin1_General_CS_AS

Note that we are doing a case insensitive compare on the UserName column and a case sensitive compare on the crypt hash of the crypt hash column.

Why does it make sense to do authentication like this inside the database instead of calculating the password hash outside of the database and then just pass it to a WHERE Username=@Username AND Password=@Password query?

The reason lies in the nature of the crypt function: the hash it creates is based on a two character prefix or salt, which has to be used for hashing the candidate password prior to comparing it with the original password hash. Since the above approach can be completely encapsulated within a stored procedure, no hashes or salts need to be returned to the calling application for verifying the correctness of the password.

Where is this used today? While Unix and Linux user authentication is the classic usage of the crypt function, also some internet services are relying on it. E.g. the PayPal IPN feature for selling subscription accounts is transmitting passwords as crypt hashes only.

Extending SQLite functionality using .NET (UDF)

April 13th, 2010 No comments

Although SQLite does not contain builtin support for directly writing user defined functions (UDF) in PL/SQL, it features a very fast and convenient way for extended functionality using your programming language of choice.

Let me illustrate this by example: In this case we are looking at a string transformation algorithm that allows for sorting a Url string according to the top level domain, domain name and host name. E.g. http://www.wikipedia.de/favicon.ico should be sorted according to http://de.wikipedia.www/favicon.ico.

Using PL/SQL with its limited string processing functions, this can result in a lot of coding. Using .NET, it’s easy to write and surprisingly fast.

[SQLiteFunction( Name = "ReverseDomain", Arguments = 1, FuncType = FunctionType.Scalar )]
public class ReverseDomain: SQLiteFunction
{
    public override object Invoke( object[] args )
    {
        if ( args == null || args.Length < 1 || args[ 0 ] == null )
            return null;
        string[] ss = args[ 0 ].ToString().Split( new string[] { "://", "/" }, 3 );
        if ( ss.Length > 1 )
        {
            string[] sss = ss[ 1 ].Split( new char[] { '.' } );
            Array.Reverse( sss );
            return ss[ 0 ] + "://" + String.Join( ".", sss ) + "/" + ( ss.Length == 3 ? ss[ 2 ] : String.Empty );
        }
        return args[ 0 ];
    }
}

Alternatively, we could use the UriBuilder class:

[SQLiteFunction( Name = "ReverseDomain", Arguments = 1, FuncType = FunctionType.Scalar )]
public class ReverseDomain: SQLiteFunction
{
    public override object Invoke( object[] args )
    {
        if ( args == null || args.Length < 1 || args[ 0 ] == null )
            return null;
        UriBuilder u = new UriBuilder( args[ 0 ] as string );
        string[] ss = u.Host.Split( new char[] { '.' } );
        Array.Reverse( ss );
        u.Host = String.Join( ".", ss );
        return u.ToString();
    }
}

Where is the catch with UriBuilder? Compared to the custom string splitting solution it is about 5 times slower. It can transform only about 65000 urls per second on my machine, while the prior method can convert 350000 urls in the same time.

How is this new UDF called from SQL?

SELECT * FROM UrlTable ORDER BY ReverseDomain( Url )