Vadim's Weblog

Never stop learning.

Explaining GetHashCode method.

Posted by Vadim on July 5, 2008

Every object you ever created or used in .NET has GetHashCode method along with Equals, GetType, and ToString methods.  This method is an instance method of an Object class from which any other class derives.

GetHashCodeIntelisense

GetHashCode() method returns an integer that identifies an object instance.  I also can repeat MSDN documentation but you can read it on your own.

You would want to override this method along with Equals() method in your class if the object of your class is going to be used as a key in Hashtable.

The best way to explain it is to show an example. In I was using class Point as an example.  I’m going to continue with this class.  So let assume that we have class like this:

   1: public class Point
   2: {
   3:   private readonly int _x;
   4:   private readonly int _y;
   5:
   6:   public Point(int x, int y)
   7:   {
   8:     _x = x;
   9:     _y = y;
  10:   }
  11: }

Next we want to use Point as a key in a Hashtable.  Remember that the key of the Hasthable must be unique.  In the code bellow we have two identical keys (line 2 & 4). We should expect ArgumentException.

   1: Hashtable hashtable = new Hashtable();
   2: hashtable.Add(new Point(2, 3), "Point 1");
   3: hashtable.Add(new Point(5, 3), "Point 2");
   4: hashtable.Add(new Point(2, 3), "Point 3");

However, no exception was thrown.  There are two reasons why exception is not thrown:

  1. Equals() method that is also an instance method of Object class will always indicate that Point in line 2 and line 4 are different. This expected because two objects have different references.
  2. GetHashCode() method returns different number for these objects.

Try to execute code bellow:

   1: static void Main(string[] args)
   2:  {
   3:    var point1 = new Point(2, 3);
   4:    var point2 = new Point(2, 3);
   5:    Console.WriteLine("point1 Hash: {0}"
   6:         , point1.GetHashCode());
   7:    Console.WriteLine("point2 Hash: {0}"
   8:         , point2.GetHashCode());
   9:    Console.WriteLine("point1 equal to point2: {0}"
  10:         , point1.Equals(point2));
  11:  }

You will get output similar to this one:

point1 Hash: 58225482
point2 Hash: 54267293
point1 equal to point2: False

You can see that we got different hash codes.

To solve this problem we override Equal() and GetHashCode() in our Point class.

First we override Equals() method:

   1: public override bool Equals(object obj)
   2: {
   3:   if (ReferenceEquals(null, obj)) return false;
   4:   if (ReferenceEquals(this, obj)) return true;
   5:   return ((Point)obj)._x == _x && ((Point)obj)._y == _y;
   6: }

You can see that I use ReferenceEquals() method that is static method of Object class.  It checks if the specified objects are the same.

Next we override GetHashCode() method:

   1: public override int GetHashCode()
   2: {
   3:   unchecked
   4:   {
   5:     return (_x * 397) ^ _y;
   6:   }
   7: }

The whole Point would look like this:

   1: public class Point
   2: {
   3:   private readonly int _x;
   4:   private readonly int _y;
   5:
   6:   public Point(int x, int y)
   7:   {
   8:     _x = x;
   9:     _y = y;
  10:   }
  11:
  12:   public override bool Equals(object obj)
  13:   {
  14:     if (ReferenceEquals(null, obj)) return false;
  15:     if (ReferenceEquals(this, obj)) return true;
  16:     return ((Point)obj)._x == _x && ((Point)obj)._y == _y;
  17:   }
  18:
  19:   public override int GetHashCode()
  20:   {
  21:     unchecked
  22:     {
  23:       return (_x * 397) ^ _y;
  24:     }
  25:   }
  26: }

Now if you try to execute the application the output should look exactly like this:

point1 Hash: 793
point2 Hash: 793
point1 equal to point2: True

Also if we try to add new Point(2, 3) twice, we get an ArgumentException as expected.

kick it on DotNetKicks.com

About these ads

14 Responses to “Explaining GetHashCode method.”

  1. Chris said

    How did you arrive at (x_ * 397) ^ _y for the value of Point.GetHashCode()? Why not (x_ * 396) ^ _y, for example, or (_x + _y).GetHashCode()?

  2. Chris said

    Err, (_x.ToString() + _y.ToString()).GetHashCode()

  3. Vadim said

    Chris,

    (_x + _y).GetHashCode() is not a good algorithm for a Hash Code. If _x = 2 and _y = 3, you will get the same hash code for:

    (2 + 3).GetHashCode() and (3 + 2).GetHashCode().

    Unfortunately I cannot answer your question about 397. I’ve seen that people smarter than I’m were using this magical number so I shamelessly copied it.

    If you or someone else will figure out the mystery behind 397, please drop me note.

  4. abc_d said

    It’s just a prime that’s big enough to overflow.

  5. Vlad said

    This will be the best Hash for a point
    ((_x % Int16.MaxValue) << 16) + (_y % Int16.MaxValue)

  6. NBA said

    Aside from the flawed implementation of the hashcode, the implementation of Equals is incorrect – it will throw an exception if the object being compared to is not an instance of Point. You should check for this as well, e.g.
    if (ReferenceEquals(null, obj) && !(obj is Point)) return false;

  7. Diego Deberdt said

    The .NET framework documentation for GetHashCode(http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx) has some examples on how to calculate the hashcode in various classes. There does not seem to be a need to use a magic number…

  8. ElLoco said

    GetHashCode() is assumed to be cheap compared to Equals().

    Guess you have may many other classes with 2 numbers and they all implement
    GetHashCode as _x + _y

    In this case GetHashCode would be a bad indicator for equality and Equals() would be called many times when the objects are in fact totally different.

    That is you use *your prime* (397 here) as the random part to make sure that your class generates not the same hashcode as all the other classes with 2 properties.

  9. Julio said

    Thanks for your post I’ve found it very useful. I may be way off but I believe the explanation of (x_ * 397) ^ _y
    is as follow. ^ is a bitwise operator (XOR) which will return true in the following formula x_ ^ _y x=2,y=3 and x=3,y=2 As you can see the GetHashCode() function will return the same number for two points that only are different in the order of their value. Theres where the magic number comes into play if you multiple only one side of the equation by the number the result will be different if you interchange the order of value[(x=2,y=3) (x_ *397)^ _y] [(x=3,y=2) (x_ *397)^ _y] Does this make sense?

  10. Julio said

    what i meant was [(x=2,y=3) (x_ *397)^ _y] != [(x=3,y=2) (x_ *397)^ _y]

  11. Lev said

    Thank you for writing this article. It was helpful for me. Just a note, I think you meant to say override the GetHashCode() method, but you typed GetHashTable().
    - Lev

  12. Rahul said

    Thank you for taking the time to write this article in such clear terms.

    Thank you also to the folks who took time out to write the follow-up comments.

  13. Thanks for this post. Just a quick question – what about Eric Lippert’s guideline (referred to below as The Guidline) that “the integer returned by GetHashCode() should never change”? http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx

    To me, this seems to be contradictory to his first rule (referred to later as The Rule), that two objects considered equal should return the same hash code. In the sense that using your Point class above, I could define Point p(1,1) and Point q(2,2) –> which would most likely have different hash codes. but then if i say p._x = 2; p._y = 2 –> then the hashcode of at least one of them should change according The Rule, but this directly contradicts The Guideline.

    Any light you can shed on this would be great.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

%d bloggers like this: