Thursday, September 2, 2010

Making Test SQL Data

Today I had the need to make a lot of dummy rows to test performance of a system. Often the case is when you design a query it performs great with a few thousand rows, but over the years as your data grows, things slow down. So, today I wanted to test my schema with a larger dataset. The first question was how to do that. SQL performs much better with sets, not iterations. The thought of running an INSERT statement millions of times did not appeal to me. Here's some tricks to make a lot of data "relatively" fast.  I say relatively because this will still take many minutes, but not many hours.  Remember, SQL is SET based!  Use sets whenever possible!


-- Don't want 25M "1 Row inserted" messages!
SET nocount ON

-- Some timing stuff, see how fast it is.
DECLARE @a DATETIME,
        @b DATETIME

SET @a = current_timestamp

-- Create a temporary table to create a multiplier factor.
DECLARE @myTable TABLE (id INT)

-- Add 5000 rows to our dummy table.
DECLARE @counter INT

SET @counter = 0
WHILE @counter < 5000
  BEGIN
      SET @counter = @counter + 1
      INSERT INTO @myTable VALUES (@counter)
  END

-- This is the big multiplier.. a cartesian product is 5000*5000=25M rows!  We 
-- don't even need to select from the tables, but if we want we could use the
-- numbers for something..
INSERT INTO uw_containers (type_id)
SELECT 1 FROM   @myTable a, @myTable b

-- Now create a history event for each of the new containers.
INSERT INTO uw_container_history
            (containerid,label,effectivestart,effectiveend,eventstartid)
SELECT  

    id,'C' + CAST(id AS varchar(50)),Getdate(),'12/31/9999',139059
FROM 

    uw_containers


SET @b = current_timestamp
SELECT Datediff(ms, @a, @b)