Storing IP Addresses in SQL Server
Efficient storage of IP addresses can save a lot of disk and memory space while at the same time increasing query performance. Also it enables you to further process data based on subnets and network ranges.
Any IPv4 address can be represented by 32 bits. The “dotted-decimal notation” – which is the most widely known format – is just one form of writing the four bytes separated by dots. In this string representation, an IP address will consist of 7 (x.x.x.x) to 15 (xxx.xxx.xxx.xxx) characters. When encoded with an 8-bit character set, the storage space required is therefore at least 15 bytes, in contrast to 4 bytes for the binary representation. In SQL Server 2008, storing a table with a single INT column requires about 13 bytes of storage per row, whereas storing an IP address in VARCHAR(15) format requires about 26 bytes per row. Storing the address in four TINYINT columns or in a single 4 byte BINARY column also results in the same storage size as INT.
Besides saving space, storing in binary format will also allow for more efficient and smaller indexing, and hence result in faster lookup times.
As an added benefit, the binary representation enables us to perform searches based on subnets and arbitrary network ranges which is very difficult and slow when using dotted-decimal notation. You can find a real-life example in this post: Geolocation of IP Addresses.
Storing IP addresses in binary (integer) format usually requires conversion from/to the dotted-decimal notation. So the key question is how to transparently handle conversion for external applications. For us, using user defined scalar functions (UDF) has proven to be the most efficient way. Since there is no unsigned 32-bit integer format in SQL Server, we are converting all addresses into a signed integer format which only requires 4 bytes of storage and allows for efficient indexing.
CREATE FUNCTION [dbo].[GetLongIP] ( @StringIP AS VARCHAR(15) ) RETURNS INT BEGIN DECLARE @LongIP AS INT DECLARE @StartPos AS SMALLINT SET @LongIP = 0 SET @StartPos = 1 IF LEN( @StringIP ) > 15 OR LEN( @StringIP ) < 7 RETURN 0 ELSE BEGIN IF CHARINDEX( '.', @StringIP, @StartPos ) > 1 IF ISNUMERIC( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) ) = 1 IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT) <= 0xFF IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) >= 0x00 BEGIN SET @LongIP = (CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) - 0x80 ) * 0x01000000 SET @StartPos = CHARINDEX( '.', @StringIP, @StartPos ) + 1 END IF CHARINDEX( '.', @StringIP, @StartPos ) > 1 IF ISNUMERIC( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) ) = 1 IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT) <= 0xFF IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) >= 0x00 BEGIN SET @LongIP = @LongIP + CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) * 0x00010000 SET @StartPos = CHARINDEX( '.', @StringIP, @StartPos ) + 1 END IF CHARINDEX( '.', @StringIP, @StartPos ) > 1 IF ISNUMERIC( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) ) = 1 IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT) <= 0xFF IF CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) >= 0x00 BEGIN SET @LongIP = @LongIP + CAST( SUBSTRING( @StringIP, @StartPos, CHARINDEX( '.', @StringIP, @StartPos ) - @StartPos ) AS INT ) * 0x00000100 SET @StartPos = CHARINDEX( '.', @StringIP, @StartPos ) + 1 END IF ISNUMERIC( SUBSTRING( @StringIP, @StartPos, LEN( @StringIP ) - @StartPos + 1 ) ) = 1 IF CAST( SUBSTRING( @StringIP, @StartPos, LEN( @StringIP ) - @StartPos + 1 ) AS INT) <= 0xFF IF CAST( SUBSTRING( @StringIP, @StartPos, LEN( @StringIP ) - @StartPos + 1 ) AS INT ) >= 0x00 SET @LongIP = @LongIP + CAST( SUBSTRING( @StringIP, @StartPos, LEN( @StringIP ) - @StartPos + 1 ) AS INT ) END RETURN @LongIP END
The above looks a bit funky, but it does the job. There is also a hack using the PARSENAME system function, which should be mentioned as well. As an added benefit, the following UDF is more than 3 times faster than the above.
CREATE FUNCTION [dbo].[GetLongIP] ( @StringIP AS VARCHAR(15) ) RETURNS INT BEGIN RETURN ( ( CAST( PARSENAME( @StringIP, 4 ) AS INT ) - 0x80 ) * 0x01000000 + CAST( PARSENAME( @StringIP, 3 ) AS INT ) * 0x00010000 + CAST( PARSENAME( @StringIP, 2 ) AS INT ) * 0x00000100 + CAST( PARSENAME( @StringIP, 1 ) AS INT ) ) END
Converting IP addresses from signed integer format to dotted-decimal notation is straightforward:
CREATE FUNCTION [dbo].[GetStringIP] ( @IP AS INT ) RETURNS VARCHAR(15) BEGIN RETURN ( CAST( (@IP & 0xff000000)/0x01000000 + 0x80 AS VARCHAR ) + '.' + CAST( (@IP & 0x00ff0000)/0x00010000 AS VARCHAR ) + '.' + CAST( (@IP & 0x0000ff00)/0x00000100 AS VARCHAR ) + '.' + CAST( (@IP & 0x000000ff) AS VARCHAR ) ) END
Both UDFs are compatible with SQL Server 2000 and newer, and have been used in a production environment for almost a decade.
When using SQL Server 2000, you may also consider to use an extended stored procedure (XP) for this purpose. An XP is a dynamic link library written in native C. This dll could make use of the itoa() and atoi() APIs or use some custom parsing code. But unless you need to pull the last bit of performance out of your system, using an XP for this purpose is discouraged due to potential security implications.
With SQL Server 2005/2008 extending functionality has become much easier and less of a security concern due to the introduction of managed CLR functions that can be invoked in the same way as a UDF. In case you are interested, here is some C# code to do the job:
private static int GetLongIP( string s ) { string[] ss = s.Split( new char[] { '.' }, 4 ); return ( ( Int32.Parse( ss[ 0 ] ) - 0x80 ) << 24 ) | ( Int32.Parse( ss[ 1 ] ) << 16 ) | ( Int32.Parse( ss[ 2 ] ) << 8 ) | Int32.Parse( ss[ 3 ] ); } private static string GetStringIP( int i ) { return String.Format( "{0}.{1}.{2}.{3}", ( i >> 24 ) + 0x80, ( i >> 16 ) & 0xFF, ( i >> 8 ) & 0xFF, i & 0xFF ); }
This will run within a SAFE (default) permission set, since it does not reference any additional name spaces or libraries. You might be tempted to save even more work by calling member methods of the IPAddress class in the System.Net namespace, but that would require your code to run with EXTERNAL_ACCESS or UNSAFE permission set.