Mostly Polynomial: November 2009

Monday, November 30, 2009

Chicken feather hydrogen storage.

If this actually works, it is totally freaking awesome. Many fancy carbon structures can be formed by charring various organic wastes... even those that we don't know how to manufacture, e.g. fractal carbon structures.
Someone should try various bio materials for supercapacitors, in particular different kinds of charcoal.

Saturday, November 28, 2009

Horrible Hashes

Let's talk about djb2 hash function (which was a subject of topcoder contest, where it's choice rendered the contest far too trivial).

unsigned long
 hash(unsigned char *str)
    {
        unsigned long hash = 5381;
        int c;

        while (c = *str++)
            hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

        return hash;
    }

and another version which has
hash = ((hash << 5) + hash) ^ c;
(reference: http://www.cse.yorku.ca/~oz/hash.html)
The function itself is not bad for it's original use where very few lowest bits are used for bucketing in a hash table; but as a 32-bit hash, it stinks.

What's stupid is that if you search for djb2 on google, you see all sorts of people 'personally recommending' it as best and fastest simple hash, people trying to explain why it is good (when the answer is: it is not particularly good), people wondering why 5381 is better (it's not), people tracking the history of this "excellent" function, etc. All in all people presuming that 5381 and 33 got some special significance and are much better than e.g. 0 and 31.

What is so bad about it? For starters, even though the output of this function is 32 bits, not even for the 2 char alphanumeric ASCII strings do you have a guarantee for lack of collisions. In fact "cb" collides with "bC", in the version with addition, and "bA" collides with "ab" when using xor, just to name two examples out of hundreds. Each character except first provides only about 5 bits because that's how much you get out with *33.
That's not good. From a 32-bit hash, you would normally expect to get no collisions at all between 2 character strings, especially restricted to alphanumeric.
Most primes work no worse; you can use 257 and then your function at least will not collide on 2-character strings (it will still be crap though, especially if you use parts of hash; this doesn't need to be a prime, only needs to be odd and you ought to run code to select best for hashing some real data like list of all file names if you want a good number. I think 293 should be pretty good here). Furthermore, there are a lot of collisions between strings that differ by 2 characters, because 2 consecutive characters can be altered to keep same hash.

Got to give some credit though. In some very limited original usage (hash table of specific size, with specific key statistics, e.g. English words), which I do not know, and which you are highly unlikely to replicate, it may have been excellent. Or not too bad.

What is the significance of 5381 ? Apart from low 8 bits of 5381*33 (in the variation which has xor instead of add), it is pretty much totally irrelevant to collision resistance, it is just multiplied by 33ⁿ and added in. This function is pretty much as crap with start value of 5381 as with start value of 42 or 100 or 12345 - the only difference is that unexplained 5381 hints at some deep wisdom whereas 12345 does not.

All in all, you should not trust magical looking code. The best magical constants were selected for some very particular case which you know nothing about, by a method which you know nothing about, and are still most likely than not bad for whatever you want to do.
Do not trust internet advice or consensus either. Keep in mind that majority of acclaimed programming experts are experts at posting a lot of stuff online, being out to be noticed.
Keep in mind that majority of people in 'consensus' are simply repeating each other, and haven't devoted much brain time to thinking about the question (or the question they thought about may be a different question).

This is why science does not and cannot function by reference to authority, but only by reference to argument, to actual reasons, and why if no reasons are given you shouldn't assume that any exist.

edit: also, don't even get me started on "fast". If you want fast, you'd better do 4 chars at once, on a 32-bit machine.

edit: clarified on the version with + and version with ^, even though those have very similar properties.
edit: god damn that article sucked (I wrote it something like 8 years ago if not longer), rewrote a few bits.

Thursday, November 26, 2009

What's with all those design concepts?!

What's with all the infestation of internet with "green" design concepts that cannot work but win awards?
I mean designs such as
The amazing plaane of the future . (With a freaking wind turbine on the tail! You use wind turbine to power the engines with the energy of motion of airplane through air!)
Gravity lamp (debunked)
mp3 player powered by spinning finger in a hole.

The mp3 player one has not been debunked properly yet, but it is very similar to this lamp in that it utilizes common lack of intuitive relation of mechanical to electrical to sonic or light power.
Assuming hundred percent efficiency, headphones consume 20 milliwatts at max power, or 1.2 Joules per minute. 1 Joule is about the energy of 1kg lifted to height 10 centimeters. Spinning a finger in a hole is about the least ergonomic way to generate power; the smallest possible leverage in the least convenient way (and you can't spin player around the finger 'cause of headphone cable). Lifting 1kg to weight 10cm every minute by spinning finger in the hole is obviously out of question, unless you're doing it constantly. (Furthermore, the efficiency of mp3 player is far below 100% due to the power consumed by mp3 decoder).

All in all, a hand powered mp3 player, lamp, or other 'low power' appliance will need occasional but fairly vigorous spinning of a crank, squeezing, or vigorous shaking (e.g. during exercise). It would of course be very cool to power mp3 player by occasional turn of a finger, absolutely amazing in fact because such player would have to include a perpetual motion device, that's why this sort of stuff seems cool and amazing.

And while you're at it, forget about powering laptop by opening a lid or with power of keypresses - it is possible but such laptop a: won't have backlight, b: won't run any modern applications (think of having 1..10 mhz cpu , with the computational power of pc from 20 years ago), and c: but it would work for months on regular battery and could recharge by solar, rendering the whole keystroke power issue nill because solar panel is going to take less space.

On topic of energy saving measures, I limit ThePolynomial's fps to your display's refresh rate by default; this OpenGL feature doesn't seem to be supported on ATI under Windows (according to user reports, didn't test myself), which if true is really downright despicable behavior on AMD's part (they probably do it because of the few gamers whom would think ATI having 1000 fps and NVidia having 70 fps in games that syncs to refresh by default would make ATI look better, or some other silly marketing related reason for not implementing refresh syncing).

Wednesday, November 25, 2009

Multiplayer progress.

I've been working on multiplayer for my game for past few weeks, hence the gap in updates. Turns out good multiplayer is kind of difficult, and I'm somewhat behind my schedule, but expect updates soon.

Sunday, November 15, 2009

TopCoder: lying again.

background: TopCoder is a programing spec work business (spec work is called "crowdsourcing" nowadays). They also run some regular programming competitions (which are not work for hire), sometimes with okay problems, sometimes with so-so problems, sometimes with problems that 1/3 of participants can solve exactly. I competed there a little once in a while purely for self evaluation purposes - they do somehow have a big community, and there are a few good programmers on top to compete with.

Anyways, where was I... yes, TopCoder lying in their press releases.
It's interesting how a company can't learn a lesson that lying in public releases is not always a great idea. A while ago, they had hired some girl in china - she may have been a good choice for the job - I've no information about this - and then lied a shitton about her qualifications and achievements [see original TC's press release which was then echoed by girl's university] resulting in massive PR success followed by even more massive PR fail in the china, totally ruining girl's reputation. The lying, for a public release, was not very outstanding - just massive exaggerations, pretty standard for small company's public release, a small company has to look big, but it did ruin the girl's reputation 'cause of cultural misunderstanding, its not everywhere customary for a company to exaggerate how great their new hire is. On darker side, I bet they got her to sign their "affidavit" beforehand which explicitly forbids you from suing TopCoder for damages arising from this sort of misrepresentation of you. [you need to sign this at notary if you participate in competition and get a prize; that's quite serious. I won a prize at TopCoder once and asked for legal advice on their affidavit, a friend told me of that girl's story, which I remembered 'cause its really scary how individual could get chewed up by gears of commerce and spit out]

Recently, there had been a "NASA-TopCoder" contest with '25 000 $ in prizes'. It seemed a little strange.

The NASA-TopCoder Challenge will be the first time the TopCoder community of more than 220,000 software enthusiasts is utilized by the world's leading aerospace organization. Long-term human space missions such as those being planned for Mars, will require higher levels of pre-planning and more analysis of available data than ever before. Biometric modeling and simulation programs are algorithmically-intensive as flight surgeons explore and evaluate every possible medical scenario that might occur on long-term missions. In this experiment, competitors will develop algorithms to help NASA's flight surgeons make decisions involved with optimizing the contents of the medical supplies kit that may one day be carried onboard long-term space missions. The submissions will be compared with the results of an existing computer model that has simulated the expected medical occurrences and outcomes for various mission scenarios.

Under closer examination (I registered for the contest because I was rather curious and because invitation email didn't quite made it clear who funded the experiment), it turned out that it indeed was a business research experiment (25000$ from research grant from some business university were used to run 24 tiny contests in parallel for some sort of business research). Needless to say, there were no NASA representatives on contest forum answering the questions about problem or asking questions about solutions [correct me if any did show up since I lost the interest]. Nothing of this sort. Typical programming competition, with a typical competition problem that has only superficial resemblance to real requirements for real software. Very simple model - much simpler than your 'model' when you visit pharmacy and decide what to buy. In real life if you get a splinter under your skin, you will need tweezers to remove it. Then you can use hydrogen peroxide or you can use iodine, or other antiseptic, and if you don't treat the cut with antiseptic you might need to use topical antibiotic later to treat inflammation. That is not simulated in contest - the supplies are not ever interchangeable and medical conditions are not dependent on prior conditions and treatment (worse than that, them are totally statistically independent from prior conditions). It's absurd to think that contents of medical kit for a Mars mission would be based on such simplistic assumptions, so much more simplistic than the ones you'd make when you visit pharmacy. Yet participants would believe it because it's happier to believe you contribute something to space exploration.

Furthermore, interestingly enough, in the "community of more than 220,000 software enthusiasts", only about 1700 registered and only 400 participated in the contest.

As NASA source indicates, the truth boils down to this:

The competition originated when professor Karim R. Lakhani of Harvard Business School and professor Kevin Boudreau of London Business School invited NASA to provide a compelling technical challenge to monitor and analyze the results from an open innovation management perspective. Their research project is funded by grants from the London Business School M-Lab and the Harvard Business School.

, and naturally "topcoder asks for and gets a simple contest-style problem from NASA for use in their business experiment" is a whole bit less impressive than "NASA employs topcoder to solve something for human spaceflight".

Tuesday, November 10, 2009

Version 00h

Minor changes for more compatibility with various Linux distros, should work on ubuntu karmic now.

Sunday, November 8, 2009

Cutting a disk from aluminium...

So, I need to cut a disk from aluminium plate...

Various parts: pieces from a wooden box, motor and gears from canon i250 printer which I picked up for parts other day, outdoors plastic chair, transformer from very old tv for 7v power (printer's powersupply was missing), a paperclip, piece of file, drill bit.

It features a very advanced and hi tech design, with automatic pressure adjustment (when this thing jams, pressure on cutter decreases 'cause of pull on that rope). It's going to take a long while to cut through tho, motor isn't very powerful, but as long as I'm not spinning this by hand, I don't mind.

Side idea: with 3 old printers, you can make a neat CNC machine or even reprap machine. You'll only need to cut a thread onto the paper feeding rod, and use it to move the table. There's even a very precise position encoder here. If I find 2 more broken printers, I might make this.

Mostly Polynomial