unsigned longand another version which has
hash(unsigned char *str)
unsigned long hash = 5381;
while (c = *str++)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
hash = ((hash << 5) + hash) ^ c;
The function itself is not bad for it's original use where very few lowest bits are used for bucketing in a hash table; but as a 32-bit hash, it stinks.
What's stupid is that if you search for djb2 on google, you see all sorts of people 'personally recommending' it as best and fastest simple hash, people trying to explain why it is good (when the answer is: it is not particularly good), people wondering why 5381 is better (it's not), people tracking the history of this "excellent" function, etc. All in all people presuming that 5381 and 33 got some special significance and are much better than e.g. 0 and 31.
What is so bad about it? For starters, even though the output of this function is 32 bits, not even for the 2 char alphanumeric ASCII strings do you have a guarantee for lack of collisions. In fact "cb" collides with "bC", in the version with addition, and "bA" collides with "ab" when using xor, just to name two examples out of hundreds. Each character except first provides only about 5 bits because that's how much you get out with *33.
That's not good. From a 32-bit hash, you would normally expect to get no collisions at all between 2 character strings, especially restricted to alphanumeric.
Most primes work no worse; you can use 257 and then your function at least will not collide on 2-character strings (it will still be crap though, especially if you use parts of hash; this doesn't need to be a prime, only needs to be odd and you ought to run code to select best for hashing some real data like list of all file names if you want a good number. I think 293 should be pretty good here). Furthermore, there are a lot of collisions between strings that differ by 2 characters, because 2 consecutive characters can be altered to keep same hash.
Got to give some credit though. In some very limited original usage (hash table of specific size, with specific key statistics, e.g. English words), which I do not know, and which you are highly unlikely to replicate, it may have been excellent. Or not too bad.
What is the significance of 5381 ? Apart from low 8 bits of 5381*33 (in the variation which has xor instead of add), it is pretty much totally irrelevant to collision resistance, it is just multiplied by 33n and added in. This function is pretty much as crap with start value of 5381 as with start value of 42 or 100 or 12345 - the only difference is that unexplained 5381 hints at some deep wisdom whereas 12345 does not.
All in all, you should not trust magical looking code. The best magical constants were selected for some very particular case which you know nothing about, by a method which you know nothing about, and are still most likely than not bad for whatever you want to do.
Do not trust internet advice or consensus either. Keep in mind that majority of acclaimed programming experts are experts at posting a lot of stuff online, being out to be noticed.
Keep in mind that majority of people in 'consensus' are simply repeating each other, and haven't devoted much brain time to thinking about the question (or the question they thought about may be a different question).
This is why science does not and cannot function by reference to authority, but only by reference to argument, to actual reasons, and why if no reasons are given you shouldn't assume that any exist.
edit: also, don't even get me started on "fast". If you want fast, you'd better do 4 chars at once, on a 32-bit machine.
edit: clarified on the version with + and version with ^, even though those have very similar properties.
edit: god damn that article sucked (I wrote it something like 8 years ago if not longer), rewrote a few bits.