Last week I had an anniversary to celebrate: 8 years ago TopCoder accepted my first Marathon problem. This marked a huge milestone in my life; I was approved as an SRM problem writer several weeks before that, but I dare say I've become a much better Marathon writer than an Algorithm one. Certainly a more prolific one: over the years I've written over 30 Marathon problems but only a dozen Single Round Matches (and half of them were not even adult SRMs but so called "High School SRMs", targeted at a younger and less experienced audience).
Of course, the most memorable of all these problems are the first ones I've created...
My first accepted Marathon problem was based on a Flash game called Motherload. The very first idea I sent Lars Backstrom (Lars was Marathon problem coordinator till 2010) was based on a different game called Feeding Frenzy. It is a fun game to play, but impersonating a fish trying to avoid being eaten is probably not how most people want to spend a whole week (at that point matches used to last a week), and besides, the concept was similar to recently used pacman problem. Fortunately, the fishy idea has never seen light. We agreed on plan B, a mining machine-operating problem instead: the player operates a machine digging through the area gathering minerals, and the objective is to maximize the value of the minerals hauled to the surface before running out of fuel.
As a lot of first-ever things do, this match was not exactly a success.
First of all, it had a long and evolved problem statement with lots of implementation details to be clarified. It involved a randomly generated map, and back in 2007 problems included only a visualizer (a .jar file that allowed to test solutions locally) but not the visualizer source code, so people were unable to use it to understand the inner workings of the judging system. That meant that every little detail had to be clarified by the writer in the forums! On the night the match was launched I stayed up for five hours straight answering questions, and the new ones were arriving faster than I answered the old ones. Unfortunately, this was not an indication of a big success, quite the contrary. This match was not just any contest but the second elimination round of TopCoder Collegiate Challenge (TCCC used to be an annual tournament similar to TopCoder Open, but for college students only), and the frenzy in the forums was caused by participants desperate to try and advance to the next round struggling to make sense of an underspecified statement.
But even worse, the implementation of the visualizer (and the judging code as well) left much to be desired. It had several small glitches which we had to fix during the contest; fortunately, Marathon matches are more forgiving to such glitches than SRMs, as they are longer. It was not user-friendly (that week I learned to never use " " as a meaningful character in the input data - it can be worked around but it's frustrating and it doesn't add any value to the problem). Worst of all, it had a major bug, which I'm still ashamed of years later. The map of the potentially reachable area was too large to be generated in 2 sec, so I was generating it on the go whenever the machine discovered previously unseen regions. To make sure that the map stayed the same regardless of the traversal order, I called SecureRandom.setSeed with coordinates of the map cell to be generated before generating the cell's contents. It turned out that in Java this method doesn't reset all state of random generator, so different routes produced different maps after all. We didn't realize this until the end of the match, when participants could present specific examples to prove this, and the test case generation ended up being rewritten before system testing.
However, the match was still rated (one of the greatest fears for any problem writer is any issue which makes the competition unrated), and we've moved on to...
My second Marathon match went a lot better than the first one. I'd love to attribute this to the lessons I've learned from the first one, but in fact I came up with the idea for PolygonArea a month before DeepMining went live. The problem was inspired by Monte Carlo method: given an API method which allows you to check whether a point is inside the polygon or outside it, calculate the polygon area as precisely as possible with limited number of calls to the API. I still consider this problem to be the best I've ever written: the elegant concept leads to a short and clear statement, very few clarifications during the match, and a lot of participants.
Besides, this problem is a perfect illustration of the KISS principle applied to Marathons: the more minimalistic the problem is, the better. Originally I was thinking about generating a variety of concave polygons: star-shaped, C-shaped, even some that looked like sawtooth, and allowing a lot more measurements (several thousands) to encourage more Monte Carlo-ish solutions. But after some testing Lars suggested to use only convex polygons and only up to 300 measurements. It worked out beautifully, allowing the competitors to do lots of algorithmic stuff. The easiest thing one can do is create a wrapper for the provided API: if the queried point is inside convex hull of points which are already known to be inside the polygon, it is guaranteed to also be inside the polygon, no need to waste an API call to confirm this. You can read participants' writeups of more advanced solutions in the match discussion.
This match's success strengthened my resolve to continue problemwriting. At that point I already had a bunch of new problem ideas, and while some of them were impossible to implement within Marathon matches framework or too boring to be done, others blossomed into good matches in time, and yet others are still waiting to be implemented. So, I guess, to be continued :-)
Of course, the most memorable of all these problems are the first ones I've created...
#1 DeepMining (TCCC07 Round 2)
My first accepted Marathon problem was based on a Flash game called Motherload. The very first idea I sent Lars Backstrom (Lars was Marathon problem coordinator till 2010) was based on a different game called Feeding Frenzy. It is a fun game to play, but impersonating a fish trying to avoid being eaten is probably not how most people want to spend a whole week (at that point matches used to last a week), and besides, the concept was similar to recently used pacman problem. Fortunately, the fishy idea has never seen light. We agreed on plan B, a mining machine-operating problem instead: the player operates a machine digging through the area gathering minerals, and the objective is to maximize the value of the minerals hauled to the surface before running out of fuel.
As a lot of first-ever things do, this match was not exactly a success.
First of all, it had a long and evolved problem statement with lots of implementation details to be clarified. It involved a randomly generated map, and back in 2007 problems included only a visualizer (a .jar file that allowed to test solutions locally) but not the visualizer source code, so people were unable to use it to understand the inner workings of the judging system. That meant that every little detail had to be clarified by the writer in the forums! On the night the match was launched I stayed up for five hours straight answering questions, and the new ones were arriving faster than I answered the old ones. Unfortunately, this was not an indication of a big success, quite the contrary. This match was not just any contest but the second elimination round of TopCoder Collegiate Challenge (TCCC used to be an annual tournament similar to TopCoder Open, but for college students only), and the frenzy in the forums was caused by participants desperate to try and advance to the next round struggling to make sense of an underspecified statement.
But even worse, the implementation of the visualizer (and the judging code as well) left much to be desired. It had several small glitches which we had to fix during the contest; fortunately, Marathon matches are more forgiving to such glitches than SRMs, as they are longer. It was not user-friendly (that week I learned to never use " " as a meaningful character in the input data - it can be worked around but it's frustrating and it doesn't add any value to the problem). Worst of all, it had a major bug, which I'm still ashamed of years later. The map of the potentially reachable area was too large to be generated in 2 sec, so I was generating it on the go whenever the machine discovered previously unseen regions. To make sure that the map stayed the same regardless of the traversal order, I called SecureRandom.setSeed with coordinates of the map cell to be generated before generating the cell's contents. It turned out that in Java this method doesn't reset all state of random generator, so different routes produced different maps after all. We didn't realize this until the end of the match, when participants could present specific examples to prove this, and the test case generation ended up being rewritten before system testing.
However, the match was still rated (one of the greatest fears for any problem writer is any issue which makes the competition unrated), and we've moved on to...
#2 PolygonArea (MM 23)
My second Marathon match went a lot better than the first one. I'd love to attribute this to the lessons I've learned from the first one, but in fact I came up with the idea for PolygonArea a month before DeepMining went live. The problem was inspired by Monte Carlo method: given an API method which allows you to check whether a point is inside the polygon or outside it, calculate the polygon area as precisely as possible with limited number of calls to the API. I still consider this problem to be the best I've ever written: the elegant concept leads to a short and clear statement, very few clarifications during the match, and a lot of participants.
Besides, this problem is a perfect illustration of the KISS principle applied to Marathons: the more minimalistic the problem is, the better. Originally I was thinking about generating a variety of concave polygons: star-shaped, C-shaped, even some that looked like sawtooth, and allowing a lot more measurements (several thousands) to encourage more Monte Carlo-ish solutions. But after some testing Lars suggested to use only convex polygons and only up to 300 measurements. It worked out beautifully, allowing the competitors to do lots of algorithmic stuff. The easiest thing one can do is create a wrapper for the provided API: if the queried point is inside convex hull of points which are already known to be inside the polygon, it is guaranteed to also be inside the polygon, no need to waste an API call to confirm this. You can read participants' writeups of more advanced solutions in the match discussion.
This match's success strengthened my resolve to continue problemwriting. At that point I already had a bunch of new problem ideas, and while some of them were impossible to implement within Marathon matches framework or too boring to be done, others blossomed into good matches in time, and yet others are still waiting to be implemented. So, I guess, to be continued :-)
No comments:
Post a Comment