Sunday, July 26, 2015

Marathon Writer's Memoir. How it all began

Last week I had an anniversary to celebrate: 8 years ago TopCoder accepted my first Marathon problem. This marked a huge milestone in my life; I was approved as an SRM problem writer several weeks before that, but I dare say I've become a much better Marathon writer than an Algorithm one. Certainly a more prolific one: over the years I've written over 30 Marathon problems but only a dozen Single Round Matches (and half of those were not even adult SRMs but so-called "High School SRMs", targeted at a younger and less experienced audience).

Of course, the most memorable of all these problems are the first ones I've created...

#1 DeepMining (TCCC07 Round 2)

My first accepted Marathon problem was based on a Flash game called Motherload. The very first idea I sent Lars Backstrom (Lars was the Marathon problem coordinator until 2010) was based on a different game called Feeding Frenzy. It is a fun game to play, but impersonating a fish trying to avoid being eaten is probably not how most people want to spend a whole week (at that point matches lasted a week), and besides, the concept was similar to a recently used pacman problem. Fortunately, the fishy idea never saw the light of day. We agreed on plan B instead: a mining problem in which the player operates a machine digging through the area gathering minerals, and the objective is to maximize the value of the minerals hauled to the surface before running out of fuel.

As a lot of first-ever things do, this match was not exactly a success.

First of all, it had a long and involved problem statement with lots of implementation details to be clarified. It involved a randomly generated map, and back in 2007 problems included only a visualizer (a .jar file that allowed testing solutions locally) but not the visualizer source code, so people were unable to use it to understand the inner workings of the judging system. That meant that every little detail had to be clarified by the writer in the forums! On the night the match was launched I stayed up for five hours straight answering questions, and new ones were arriving faster than I could answer the old ones. Unfortunately, this was not an indication of a big success, quite the contrary. This match was not just any contest but the second elimination round of the TopCoder Collegiate Challenge (TCCC used to be an annual tournament similar to the TopCoder Open, but for college students only), and the frenzy in the forums was caused by participants, desperate to advance to the next round, struggling to make sense of an underspecified statement.

But even worse, the implementation of the visualizer (and of the judging code as well) left much to be desired. It had several small glitches which we had to fix during the contest; fortunately, Marathon matches are more forgiving of such glitches than SRMs, as they are longer. It was not user-friendly (that week I learned to never use " " as a meaningful character in the input data - it can be worked around, but it's frustrating and adds no value to the problem). Worst of all, it had a major bug, which I'm still ashamed of years later. The map of the potentially reachable area was too large to be generated in 2 seconds, so I was generating it on the go whenever the machine discovered previously unseen regions. To make sure that the map stayed the same regardless of the traversal order, I called SecureRandom.setSeed with the coordinates of the map cell to be generated before generating the cell's contents. It turned out that in Java this method doesn't reset the generator's entire state (the given seed supplements, rather than replaces, the existing one), so different routes produced different maps after all. We didn't realize this until the end of the match, when participants presented specific examples proving it, and the test case generation ended up being rewritten before system testing.
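The safe pattern - shown here as a Python sketch rather than the original Java code, with all names my own invention - is to derive a seed from each cell's coordinates and use a fresh generator per cell, so the output cannot depend on the order of previous calls:

```python
import random

def cell_contents(x, y, base_seed=12345):
    """Generate a cell's contents from a fresh RNG seeded only by its
    coordinates, so the result is independent of traversal order."""
    # Combine the coordinates into a single deterministic seed.
    seed = (base_seed * 1_000_003 + x) * 1_000_003 + y
    rng = random.Random(seed)  # fresh generator: no state shared between cells
    return rng.choice(["empty", "rock", "coal", "silver", "gold"])

def generate_map(cells):
    return {(x, y): cell_contents(x, y) for (x, y) in cells}

# Two different traversal orders must produce identical maps.
cells = [(x, y) for x in range(20) for y in range(20)]
forward = generate_map(cells)
backward = generate_map(list(reversed(cells)))
print(forward == backward)  # True
```

Creating a throwaway generator per cell is slightly slower than reseeding one shared instance, but it makes order independence true by construction instead of relying on a library's reseeding semantics.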

However, the match was still rated (one of the greatest fears for any problem writer is an issue that makes the competition unrated), and we moved on to...

#2 PolygonArea (MM 23)

My second Marathon match went a lot better than the first one. I'd love to attribute this to the lessons I learned from the first one, but in fact I came up with the idea for PolygonArea a month before DeepMining went live. The problem was inspired by the Monte Carlo method: given an API method which allows you to check whether a point is inside the polygon or outside it, calculate the polygon's area as precisely as possible with a limited number of calls to the API. I still consider this problem the best I've ever written: the elegant concept leads to a short and clear statement, very few clarifications during the match, and a lot of participants.
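As a baseline, the Monte Carlo approach samples random points in a bounding box and scales the hit ratio by the box's area. This is a hypothetical sketch of that idea, not the actual contest harness; the polygon test here stands in for the contest's membership API:

```python
import random

def point_in_polygon(p, poly):
    """Ray-casting point-in-polygon test, standing in for the contest API."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

def monte_carlo_area(poly, box, queries, rng):
    """Estimate the polygon's area from `queries` random membership tests."""
    (xmin, ymin), (xmax, ymax) = box
    hits = sum(
        point_in_polygon((rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)), poly)
        for _ in range(queries)
    )
    box_area = (xmax - xmin) * (ymax - ymin)
    return box_area * hits / queries

rng = random.Random(42)
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # true area: 1
estimate = monte_carlo_area(square, ((-1, -1), (2, 2)), 100_000, rng)
print(round(estimate, 3))
```

With only 300 queries allowed, though, the standard deviation of a pure sampler like this is far too large, which is exactly why the match rewarded more algorithmic approaches.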

Besides, this problem is a perfect illustration of the KISS principle applied to Marathons: the more minimalistic the problem is, the better. Originally I was thinking about generating a variety of concave polygons: star-shaped, C-shaped, even some that looked like a sawtooth, and allowing a lot more measurements (several thousand) to encourage more Monte Carlo-ish solutions. But after some testing Lars suggested using only convex polygons and only up to 300 measurements. It worked out beautifully, allowing the competitors to do lots of algorithmic stuff. The easiest thing one can do is create a wrapper for the provided API: if the queried point is inside the convex hull of points already known to be inside the polygon, it is guaranteed to also be inside the polygon, and there is no need to waste an API call to confirm this. You can read participants' writeups of more advanced solutions in the match discussion.
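The wrapper trick can be sketched like this. This is a simplified illustration under my own assumptions: `api_query` and the point format are stand-ins, not the contest's actual interface, and a real solution would update the hull incrementally rather than rebuild it per query:

```python
def cross(o, a, b):
    """Cross product of OA x OB; positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(p, hull):
    """True if p lies inside (or on the boundary of) the convex hull."""
    if len(hull) < 3:
        return p in hull
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= 0
               for i in range(len(hull)))

class Wrapper:
    """Caches known-inside points; skips API calls the hull already answers."""
    def __init__(self, api_query):
        self.api_query = api_query  # the expensive membership test
        self.inside = []            # points confirmed inside the polygon
        self.saved = 0              # API calls avoided

    def query(self, p):
        if in_hull(p, convex_hull(self.inside)):
            self.saved += 1
            return True             # convexity guarantees membership
        if self.api_query(p):
            self.inside.append(p)
            return True
        return False

# Demo against an axis-aligned square standing in for the real API.
square_api = lambda p: 0 <= p[0] <= 10 and 0 <= p[1] <= 10
w = Wrapper(square_api)
for p in [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5), (2, 7)]:
    w.query(p)
print(w.saved)  # 2 calls saved: (5, 5) and (2, 7) lie inside the cached hull
```

The same caching idea works in the other direction too: a point outside the polygon rules out everything on the far side of some separating line, which stronger solutions exploited to shrink the uncertain region from both sides.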

This match's success strengthened my resolve to continue problem writing. At that point I already had a bunch of new problem ideas, and while some of them were impossible to implement within the Marathon match framework or too boring to be worth doing, others blossomed into good matches in time, and yet others are still waiting to be implemented. So, I guess, to be continued :-)

Wednesday, July 1, 2015

Declaratively solving Google Code Jam problems

This spring my hobby of writing articles about competitive programming became more serious: I teamed up with Sergii Dymchenko to write two papers for computer science conferences about solving Google Code Jam problems using the declarative paradigm.

The first one - "Declaratively solving tricky Google Code Jam problems with Prolog-based ECLiPSe CLP system" - we presented at the 30th ACM/SIGAPP Symposium On Applied Computing in Salamanca, Spain. (preprint)

The second one - "Declaratively solving Google Code Jam problems with Picat" - we presented at the Sixteenth International Symposium on Practical Aspects of Declarative Languages in Portland, OR, USA. (proceedings, preprint)

Google Code Jam problems are convenient for demonstrating the capabilities of various programming languages because:
* they are available online, so the correctness of a solution is easy to verify by submitting an output file in practice mode (unlike the problems used in the Facebook Hacker Cup, which uses a similar competition format but doesn't allow practicing problems after the contest)
* any language can be used to solve the problems, including the one we're writing about this time
* the problems are well-written and tested (have you seen Google's problem preparation guide?)
* the problems lend themselves to a variety of different approaches and techniques, so whatever you're writing about, chances are you'll be able to find a problem which can be solved that way.

We certainly plan to continue writing about competitive programming - there are tons of topics there which have never been covered before. Well, it probably won't be Code Jam for the next article :-)

Friday, May 15, 2015

ImageIdentify recognizes most food items as emmental

Today I've found this beautiful photo by Lernert & Sander on Twitter.

It got me thinking about AI and image recognition. Wouldn't this image be a great test of how much your system actually "understands" about the objects it processes, and how much it just relies on some random features of previously seen images? Each piece in the photo is a small cube of some kind of unprocessed food. The cubes look fairly different from the food you typically see in real life, and yet humans shouldn't have a problem recognizing most of the foods: some kind of meat in the top left corner, avocado to the right of it, a mandarin or an orange next to it and so on. How would an image recognition system do?

I've decided to try and feed some of the cubes to the recently announced Wolfram Language Image Identification Project. Here are some of my favorite findings:

1. At least 8 different food types are recognized as emmental (a kind of cheese). Surprisingly, none of them really looks like cheese - I didn't try the yellowish-white cubes which could be anything, just the ones which have some distinctive features: corn, kiwi, dragon fruit, banana, yellow bell pepper, onion, pineapple, and this weird thing that looks nothing like any fruit I know (any guesses?).

2. At least 4 different food types are recognized as soap: avocado, watermelon, melon (I think I see a pattern emerging here...), and meat (...nope).

3. Some of the results are completely wrong but still beautiful:

mushroom (likely an Agaricus bisporus cultivar specimen) labeled as "chocolate truffle"

romanesco broccoli as "nougat" (ok, I didn't know the exact name of this thing, but I was one query away from it - just google "fractal cabbage")

purple cabbage as "container" (and a pretty one it would be!)

4. Two cubes of pomegranate and mandarin got recognized perfectly:

5. Another several cubes got "recognized" conceptually: the system classified tomato, orange bell pepper, cabbage, and a citrus as some kind of food but had no better idea. You could put together a salad :-)

6. The picture of all the cubes together? Jelly bean.

7. And finally, the gem of the collection: a fruit which the system identified perfectly while a human (me) failed miserably. In my defense, I have to say that I've never dealt with a papaya and I don't expect to deal with one in the near future :-)

Monday, March 2, 2015

Cognitive Biases in Competitive Programming

Cognitive biases are frequently observed systematic errors in judgment which follow certain patterns. They manifest in all areas of human life, including competitive programming, even though it is supposed to be practiced by some of the most rational individuals in the world. Let me present several biases which I feel are the strongest in competitive programming.

How to become red in 3 months? (a.k.a. survivorship bias)

The number of red members at both TopCoder and Codeforces is approximately 1% of all people ever rated in the respective tracks: 590/63k for the TopCoder Algorithm track, 40/6k for the TopCoder Marathon track, 654/54k for Codeforces. Even if we add members who were red at some point in the past but are not now, the share of red might grow to 2% but not beyond. And yet a request for instructions on joining this elite squad is one of the most frequently asked questions in competitive programming, with slightly varying time intervals allotted.

Survivorship bias is the error of focusing on people who "survived" some process (in this case, became red) and overlooking those who didn't, due to their lack of visibility. Everybody in the world of competitive programming focuses on survivors: you see red members in "top rated" and "top contributors" lists, in "record books", in contest leaderboards and discussions, in onsite finals of tournaments, in interviews and feature articles... You just don't get to see the people stuck in Div 2 after competing forever, and there are a lot more of the latter than of the former.

Of course, this effect can also be attributed to illusory superiority (overestimating one's desirable qualities compared to other people), depending on whether people seeking to become red in 3 months believe that becoming red is generally easy and available to anyone, or that becoming red is hard but they can do it easily. That is a topic for a separate study.

It's easy to prove... (a.k.a. curse of knowledge)

Have you ever seen an editorial which frustratingly replaces an actual proof of a fact with a brief "it is easy to prove" / "it is evident that"? Or an editorial which says "It's a simple DP solution" without explaining the solution? Probably yes. Likely this is another bias at work: the curse of knowledge leads better-informed parties (the problem writer) to find it genuinely difficult to think about problems from the perspective of less-informed parties (people who can't solve the problem on their own and have to read the editorial). The omitted proof is really evident to the writer, and the DP with 3 parameters is really simple, and how could anyone not see this? Of course, we can't rule out the possibility that the editorial writer was just feeling lazy :-)

The same goes for problems solved by far fewer participants than the writer anticipated: when you arrange problems into a set, you already know their solutions, so you tend to underestimate their complexity for people who solve them from scratch.

Ratingism (a.k.a. ad hominem fallacy)

First observations of this bias date back to 2009, but it was widespread much earlier than that. The ad hominem fallacy means that people judge others' arguments based on their personal traits - in this case, their rating. Similar statements made by a red and a gray member will yield a lot more upvotes for the former.

The fall of TopCoder (a.k.a. confirmation bias)

"TopCoder is dying" is a popular idea nowadays; a lot of things, both related (like some issues with an SRM) and completely unrelated (like 5th anniversary of Codeforces), trigger an outburst of eulogies. At the same time strictly positive news (like return of for-fun Marathon matches) get hardly any reaction at all. That's confirmation bias: the tendency to filter and interpret information in a way that confirms one's existing beliefs. It also works the other way round: if you don't believe in the fall of TopCoder, you're not going to be swayed by one SRM gone wrong.

There are more biases involved in this effect, to name just a few:
  • illusory truth effect: people tend to believe things which are repeated often, even if they are not true;
  • backfire effect: given evidence against their beliefs, people can reject it and strengthen the beliefs.

I leave it as an exercise to the reader to name biases I exhibit in this article :-)