Saturday, May 10, 2008

Efficiently Copying Items from One Array to Another in C# (Also generally in .NET)

I've been searching recently for an efficient way in C# to copy items from an array to another. At first it seemed that efficient way simply didn't exist. In C++ you can do this very easily by copying the memory block using memcpy() or memmove() but in C# you can't do this in a safe way (sure you can write some unsafe code but this is not really what I wanted, I just wanted to do it the 'clean and safe way'). Anyway, my options were to use Array.Copy() or a for loop (iterate through the source array and copy the items to the destination array). Unfortunately the documentation for Array.Copy() clearly states that it's an O(n) operation where n is length, so, it seemed that the for loop and Array.Copy() were almost equivalent so I decided I could use any of them as it wouldn't make any real difference (my decision was based on what's written in the documentation about Array.Copy() being an O(n) operation)

Anyway, today I was using .NET Refelctor to view the code for List<T>.ToArray(), I found that the Array.Copy() was used in the code, see below the code for List<T>.ToArray() copied from .NET Reflector:

public T[] ToArray()
{
    T[] destinationArray = new T[this._size];
    Array.Copy(this._items, 0, destinationArray, 0, this._size);
    return destinationArray;
}

So, I wanted to have a look on Array.Copy(), this is how it shows in .NET Refelctor:

[ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
public static void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length)
{
    Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length, false);
}

The code for Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length, false); can't be viewed using .NET Reflector, here's what you get:

[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);

To make a long story short I just was curious to see whether using Array.Copy() is really equivalent to using a simple for loop, so, motivated by my inability to view the code for Array.Copy(),I decided to do my own testing (see the code and results below). Well, despite what's written in the documentation about Array.Copy() being an O(n) operation, Array.Copy() was 10 times faster on my computer! So my guess is that it internally uses some memory copying so it's not really an O(n) operation.

Update:

Just to be clear, when I said the operation is not O(n), I was referring to the implementation not using a for loop to copy the items (ie. from the performance point of view). I realize it's technically incorrect to say it's not O(n). My apology for the confusion.

More details:

The release build was used in the test
Software used in the test:
  • Windows XP Pro SP2
  • .NET Framework 2.0

The specs of my computer (where the test was done) are:
  • CPU Intel 1.6GHz Dual Core
  • 1024 MB RAM


Used Code
using System;
using System.Collections.Generic;
using System.Text;

namespace array_copy
{
    class Program
    {
        static void Main(string[] args)
        {
            const int ArraySize = 5000;
            const int LoopCount = 100000;

            MyCustomObject[] customObjects1 = new MyCustomObject[ArraySize];
            MyCustomObject[] customObjects2 = new MyCustomObject[ArraySize];
            for (int i = 0; i < ArraySize; i++)
            {
                customObjects1[i] = new MyCustomObject(i);
                customObjects2[i] = new MyCustomObject(i);
            }

            // Array.Copy()
            //
            DateTime beforeArrayCopy = DateTime.Now;
            for (int j = 0; j < LoopCount; j++)
            {
                Array.Copy(customObjects1, 0, customObjects2, 1, customObjects1.Length - 1);
            }
            DateTime afterArrayCopy = DateTime.Now;

            // For Loop Copy
            //
            DateTime beforeForLoopCopy = DateTime.Now;
            for (int j = 0; j < LoopCount; j++)
            {
                int length = customObjects1.Length - 1;
                for (int i = 0; i < length; i++)
                {
                    customObjects2[i + 1] = customObjects1[i];
                }
            }
            DateTime afterForLoopCopy = DateTime.Now;

            Console.WriteLine("{0} Array.Copy()", afterArrayCopy.Subtract(beforeArrayCopy));
            Console.WriteLine("{0} For Loop Copy", afterForLoopCopy.Subtract(beforeForLoopCopy));

            Console.Read();
        }
    }

    public class MyCustomObject
    {
        public MyCustomObject(int var)
        {
            _var = var;
        }

        private int _var;
        public int Var
        {
            get
            {
                return _var;
            }
        }
    }
}
Results: 00:00:00.5937500 Array.Copy()
00:00:04.9062500 For Loop Copy

5 comments:

  1. Interesting And why is that? Why the difference?

    ReplyDelete
  2. Well, as I mentioned in my post the implementation of Array.Copy() can't be viewed using Reflector so it's not very clear why the difference, my guess is that it internally uses memory copying.

    ReplyDelete
  3. It is still O(n). You want to get n bytes from one memory block to another? Well, you have to look at and copy each of the n bytes, which sounds like O(n) to me.

    The speed difference is probably because Array.Copy is really just memcpy or some equivalent low-level implementation (note the extern).

    ReplyDelete
  4. I am afraid that O(n) is not what you think it is

    ReplyDelete
  5. @Anonymous, thanks for the comment and sorry for the confusion. I updated the post to explain what I meant.

    ReplyDelete