Hashing
Today I spent some time thinking about hashing in SymPy. Hashing is a central part of sympy and is implemented in the core. Now you might be thinking why is hashing so important in a CAS (Computer Algebra System) ... I'll try to explain: when you do things like "x+1 == 1+x" actualy you are comparing the hashes of these two objects (1+x and x+1). Now you might be wondering why does this return True and not False, because obviously these two expressions (x+1 and 1+x) are different so their hash should be different. The answer is because first the expressions is evaluated and the hashed, so the path that these objects follow is: x+1 -> evaluated -> 1+x -> hashed -> returns an int, and on the other side x+1 -> evaluated -> x+1 -> hashed -> returns an int. Now these two int's are equal, so we say that the two objects are equal. Now before revision 803, we had a .hash() method on every single class that inherits from Basic, which was really cumbersome, so I submitted a patch that implemented a generic .hash() method based on the string representation in class Basic, so all you have to do now is inherit from Basic and you'll get a hashing method that 'just works' across sympy. But unfortunately, it has some drawbacks:
- it isn't very strong
- it's a bit slower (like a 5%)
- It doesn't scale (neither did the previous algorithm)
The two first are can be easily corrected, but the third one is a bit more difficult to solve, due to way Add.eval() and Mul.eval() work. I'll put an example: the following code
In [7]: f = 0 In [8]: for i in range(1, 100): ... f += x**i
(executed from isympy) creates a polynomial with 100 terms, and takes 7 seconds to execute in my 2.0Ghz AMD. The answer to this slowness might be in the fact that each time you add a term to f, f is completely re-evaluated, re-ordered, and re-hashed (and not updated, which should be much faster). In fact, the time involved to sum more terms grows exponentially, to the point that trying to create a polynomial in 1000 variables takes at least 10 minutes (i just stopped it beyond that point). For more info, see issue 52. That was all for now (my first post in English!), must go to sleep, tomorrow we play with my band in jail (it is not a joke, we play for some non-profit organization)