.Net, C#, Coding, GetHashTable, VS2005, VS2008

Explaining GetHashCode method.

Every object you ever created or used in .NET has GetHashCode method along with Equals, GetType, and ToString methods.  This method is an instance method of an Object class from which any other class derives.

GetHashCodeIntelisense

GetHashCode() method returns an integer that identifies an object instance.  I also can repeat MSDN documentation but you can read it on your own.

You would want to override this method along with Equals() method in your class if the object of your class is going to be used as a key in Hashtable.

The best way to explain it is to show an example. In I was using class Point as an example.  I’m going to continue with this class.  So let assume that we have class like this:

   1: public class Point
   2: {
   3:   private readonly int _x;
   4:   private readonly int _y;
   5:
   6:   public Point(int x, int y)
   7:   {
   8:     _x = x;
   9:     _y = y;
  10:   }
  11: }

Next we want to use Point as a key in a Hashtable.  Remember that the key of the Hasthable must be unique.  In the code bellow we have two identical keys (line 2 & 4). We should expect ArgumentException.

   1: Hashtable hashtable = new Hashtable();
   2: hashtable.Add(new Point(2, 3), "Point 1");
   3: hashtable.Add(new Point(5, 3), "Point 2");
   4: hashtable.Add(new Point(2, 3), "Point 3");

However, no exception was thrown.  There are two reasons why exception is not thrown:

  1. Equals() method that is also an instance method of Object class will always indicate that Point in line 2 and line 4 are different. This expected because two objects have different references.
  2. GetHashCode() method returns different number for these objects.

Try to execute code bellow:

   1: static void Main(string[] args)
   2:  {
   3:    var point1 = new Point(2, 3);
   4:    var point2 = new Point(2, 3);
   5:    Console.WriteLine("point1 Hash: {0}"
   6:         , point1.GetHashCode());
   7:    Console.WriteLine("point2 Hash: {0}"
   8:         , point2.GetHashCode());
   9:    Console.WriteLine("point1 equal to point2: {0}"
  10:         , point1.Equals(point2));
  11:  }

You will get output similar to this one:

point1 Hash: 58225482
point2 Hash: 54267293
point1 equal to point2: False

You can see that we got different hash codes.

To solve this problem we override Equal() and GetHashCode() in our Point class.

First we override Equals() method:

   1: public override bool Equals(object obj)
   2: {
   3:   if (ReferenceEquals(null, obj)) return false;
   4:   if (ReferenceEquals(this, obj)) return true;
   5:   return ((Point)obj)._x == _x && ((Point)obj)._y == _y;
   6: }

You can see that I use ReferenceEquals() method that is static method of Object class.  It checks if the specified objects are the same.

Next we override GetHashCode() method:

   1: public override int GetHashCode()
   2: {
   3:   unchecked
   4:   {
   5:     return (_x * 397) ^ _y;
   6:   }
   7: }

The whole Point would look like this:

   1: public class Point
   2: {
   3:   private readonly int _x;
   4:   private readonly int _y;
   5:
   6:   public Point(int x, int y)
   7:   {
   8:     _x = x;
   9:     _y = y;
  10:   }
  11:
  12:   public override bool Equals(object obj)
  13:   {
  14:     if (ReferenceEquals(null, obj)) return false;
  15:     if (ReferenceEquals(this, obj)) return true;
  16:     return ((Point)obj)._x == _x && ((Point)obj)._y == _y;
  17:   }
  18:
  19:   public override int GetHashCode()
  20:   {
  21:     unchecked
  22:     {
  23:       return (_x * 397) ^ _y;
  24:     }
  25:   }
  26: }

Now if you try to execute the application the output should look exactly like this:

point1 Hash: 793
point2 Hash: 793
point1 equal to point2: True

Also if we try to add new Point(2, 3) twice, we get an ArgumentException as expected.

kick it on DotNetKicks.com

15 thoughts on “Explaining GetHashCode method.

  1. How did you arrive at (x_ * 397) ^ _y for the value of Point.GetHashCode()? Why not (x_ * 396) ^ _y, for example, or (_x + _y).GetHashCode()?

  2. Chris,

    (_x + _y).GetHashCode() is not a good algorithm for a Hash Code. If _x = 2 and _y = 3, you will get the same hash code for:

    (2 + 3).GetHashCode() and (3 + 2).GetHashCode().

    Unfortunately I cannot answer your question about 397. I’ve seen that people smarter than I’m were using this magical number so I shamelessly copied it.

    If you or someone else will figure out the mystery behind 397, please drop me note.

  3. Aside from the flawed implementation of the hashcode, the implementation of Equals is incorrect – it will throw an exception if the object being compared to is not an instance of Point. You should check for this as well, e.g.
    if (ReferenceEquals(null, obj) && !(obj is Point)) return false;

  4. GetHashCode() is assumed to be cheap compared to Equals().

    Guess you have may many other classes with 2 numbers and they all implement
    GetHashCode as _x + _y

    In this case GetHashCode would be a bad indicator for equality and Equals() would be called many times when the objects are in fact totally different.

    That is you use *your prime* (397 here) as the random part to make sure that your class generates not the same hashcode as all the other classes with 2 properties.

  5. Thanks for your post I’ve found it very useful. I may be way off but I believe the explanation of (x_ * 397) ^ _y
    is as follow. ^ is a bitwise operator (XOR) which will return true in the following formula x_ ^ _y x=2,y=3 and x=3,y=2 As you can see the GetHashCode() function will return the same number for two points that only are different in the order of their value. Theres where the magic number comes into play if you multiple only one side of the equation by the number the result will be different if you interchange the order of value[(x=2,y=3) (x_ *397)^ _y] [(x=3,y=2) (x_ *397)^ _y] Does this make sense?

  6. Thank you for writing this article. It was helpful for me. Just a note, I think you meant to say override the GetHashCode() method, but you typed GetHashTable().
    – Lev

  7. Thank you for taking the time to write this article in such clear terms.

    Thank you also to the folks who took time out to write the follow-up comments.

  8. Thanks for this post. Just a quick question – what about Eric Lippert’s guideline (referred to below as The Guidline) that “the integer returned by GetHashCode() should never change”? http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx

    To me, this seems to be contradictory to his first rule (referred to later as The Rule), that two objects considered equal should return the same hash code. In the sense that using your Point class above, I could define Point p(1,1) and Point q(2,2) –> which would most likely have different hash codes. but then if i say p._x = 2; p._y = 2 –> then the hashcode of at least one of them should change according The Rule, but this directly contradicts The Guideline.

    Any light you can shed on this would be great.

  9. The idea of hashcoding is to split a collection to a set of small buckets. The hach code is the index of the bucket that contain the object.
    The hash coded collection will be represented like this:

    hash code 1 ===> ———————————————————————–
    | Object 1 | Object 2 | Object 3 | Object 4 |
    ———————————————————————–

    hash code 2 ===> ———————————————————————–
    | Object 5 | Object 6 | Object 7 | Object 8 |
    ———————————————————————–

    ….

    hash code n ===> ————————————————————————-
    | Object n-3 | Object n-2 | Object n-1 | Object n |
    ————————————————————————–

    We have to make the right choice of hash code to dispatch the data by the right way.
    So the hash code will help to organize this structure but it is not mandatory to be unique for each object. The equatibility will be calculate using the Equals method.

Leave a reply to ElLoco Cancel reply