From: timin@netcom.com (Mitchell E. Timin) Subject: TSP CONTEST RESULTS - Final Writeup, part 1 Summary: Final Report fo the TSP contest Keywords: TSP contest Organization: NETCOM On-line Communication Services (408 261-4700 guest) Results of the TSP Competition (Part 1 of Final Writeup) This announcement concerns the TSP contest which was announced this past spring and summer. (For the curious reader who is not familiar with this contest, e-mail to timin@netcom.com asking for the original contest announcement.) First Place in this Competition goes to Henning Klaskala of Germany! Second Place goes to Jukka Honkanen of Finland! Third Place goes to Frank Sven Nestel of Germany. Fourth Place is for Randy Saint of Texas, U.S.A. There is a three-way tie for fifth place between: Chad Hurwitz Juergen Klose The Koeln group The above will receive a total of about $1500 retail value of software, (Street Wizard city/county maps of the U.S.A, by Adept Computer Solutions) Part 2 of this writeup, to appear soon, will have information about the contestants, their methods, and their prizes. A total of fifteen testable programs were received. (Several other programs were received but they were not testable for a variety of reasons, i.e. they didn't work right!) The programs were each tested with three sets of data, for 47, 110 and 111 points. (Details of the method of generating these files is given below, as is the method of ranking the results to determine the winners.) They were each run several times to find out if their results were randomly variable. Mr. Klaskala's program was of this type; its execution time was constant for a given problem file, but the routes differed from run to run. I ran his program seven times for each input file and used the simple average of the route cost for evaluation. All of the other leading programs were deterministic, giving the same route and execution time for every run. Timing was done with a stop watch, manually. The same computer was used for all runs; it is a 486DX66 with 8 megabytes of RAM and VESA local bus. Microsoft Smartdrv was active so that files could be read from RAM. All programs ran under DOS. As examples of the performance of some of the better programs, Mr. Honkanen's program required only 20.3 seconds to find a route with 111 points, and the cost of that route was 88370. Here is the route: 30 49 21 53 90 89 72 13 10 105 96 37 107 12 104 52 60 7 25 62 97 46 45 67 64 20 86 26 78 33 103 63 18 75 73 76 38 40 41 5 19 99 39 24 4 87 70 102 61 35 59 11 17 77 98 0 71 28 94 56 85 83 51 47 92 81 55 2 16 36 43 48 23 110 54 101 82 22 27 80 57 93 32 34 68 108 15 91 79 29 84 14 100 95 8 1 50 6 69 88 42 44 9 74 3 65 109 106 58 66 31 Mr. Klaskala's program only required 1.5 seconds to find the following route, with a cost of 88106: 30 104 52 60 7 62 25 37 107 12 21 49 53 90 89 72 13 79 91 29 34 32 15 108 68 84 14 100 95 8 1 69 6 50 58 44 42 88 93 57 80 27 106 22 65 109 71 28 74 3 101 82 54 110 92 47 51 85 83 94 56 9 102 61 35 59 11 77 0 98 17 87 70 39 24 4 99 19 5 41 38 40 76 73 81 55 2 16 36 18 75 43 48 63 103 20 86 26 33 78 64 67 23 45 46 97 66 105 96 10 31 (The seven runs of the program had costs ranging from 87445 to 88480.) The lowest cost route found for the 111 point case was done by Mr. Klose's program in 83.3 seconds; its cost was 86800. The route is: 30 104 10 13 79 90 89 72 53 49 21 52 60 7 37 107 12 25 62 97 46 45 67 64 20 86 26 78 33 103 63 18 75 36 16 55 2 92 81 73 76 38 40 41 51 47 85 83 94 56 9 102 61 35 50 6 69 8 1 95 100 14 59 11 17 87 70 39 24 4 99 19 5 0 98 77 71 28 74 3 65 109 44 42 88 58 57 93 32 34 91 29 84 68 108 15 80 27 106 22 101 82 54 110 43 48 23 66 105 96 31 Here are the lowest cost routes found for the other two problems: For the 47 point case the lowest cost found was by Mr. Honkanen's program in 2.9 seconds with this route: 39 31 25 43 18 4 20 42 5 15 12 17 33 21 35 0 14 37 13 27 6 46 22 41 38 11 29 40 32 2 1 16 23 26 10 8 19 36 28 9 34 3 45 44 7 24 30 The cost is 45750. For the 110 point case the lowest cost found was by Mr. Klose's program in 89.5 seconds with this route: 107 58 35 87 76 12 79 67 52 77 60 98 5 39 97 84 85 0 103 105 53 99 74 86 93 55 40 16 81 57 8 78 19 42 96 88 29 65 100 32 30 13 46 1 73 4 51 82 25 10 34 20 69 2 43 27 56 37 14 15 64 61 28 11 92 47 54 33 91 75 26 68 3 71 104 36 63 41 22 23 90 50 21 24 72 9 80 108 45 6 89 17 109 31 95 66 49 59 106 83 94 38 101 7 70 18 48 44 62 102 The cost is 87882 The three problem files, for 47, 110 and 111 points, are available by return e-mail, as is the initial contest announcement explaining how to interpret these files. Problem Generation Each problem, i.e. cost matrix, was generated by a program which takes N (the number of points enroute) as a command line parameter. (The program was implemented in Borland Turbo C++, ver. 3.0 for DOS.) The program outputs an ASCII file with N on the first line, followed by a line of costs from the starting point to each point enroute, then a line of costs from each point enroute to the ending point, and finally the N X N matrix of point-to-point costs. Here is an output of the program when N = 7: (The numbers will change with each run.) 7 3305 1515 2358 773 2255 3660 2710 3259 3231 3824 2478 1503 3594 2345 0 4132 3307 3315 2781 832 1331 3965 0 2225 1338 2860 4174 3488 3178 2253 0 2389 3319 3093 3292 3442 1436 2513 0 2204 3469 2671 2693 3011 3453 2192 0 2892 1832 703 4192 3069 3677 3041 0 1687 1504 3637 3329 2744 1650 1776 0 This program begins by choosing random points in three dimensional space. (The Borland random() function was used.) The points have integer valued coordinates and are within a rectangular region of dimensions 4000 X 4000 X 1000. The values of x and y range from -2000 to +2000; z goes from 0 to 1000. N points enroute are chosen, plus the starting point and ending point, hence N + 2 sets of three coordinate values are so chosen. These values are converted to double floating point so that the following calculations are done in floating point: Then for each pair of points the following steps are performed: 1. The three dimensional Euclidean distance is computed. (call this cost1) 2. cost2 = 2000 * (dist/2000)^.85 is computed. This has the effect of making travel between nearby points somewhat more expensive per unit distance, and travel between distant points somewhat less expensive, per unit distance. 3. A small random cost is added, producing cost3. The amount added is uniformly distributed between 0 and 295. 4. An amount is added that is a function of the x and y coordinates of the two points, as follows: Imagine a kind of a "mountain" sitting on the x - y plane. This mountain is a tetrahedron with a peak height of 1000. Its triangular base, with zero altitude, spans the domain of the x and y values. It reaches zero where x = 2000, and along the line from (-2000,0) to (2000,-2000), and along the line from (-2000,0) to (2000, 2000). The amount added depends on the (x,y) location of the midpoint of a line segment joining the two points. If this midoint is not under the mountain then nothing is added; if the midpoint is under the mountain then the height of the mountain over the midpoint is the amount added. 5. The double floating value is truncated to produce an integer cost. Evaluating and Ranking the Programs The procedure described below was repeated for each the the three problem files (47, 110 and 111 points). Each program was run several times to determine the elapsed time, the route found, and the cost of the route. A separate program, was run on each output file to make sure that the cost of the route was the same as that calculated by the program being tested. Timing, with a stopwatch, was done on the second and later runs, so that the problem file would be read from RAM rather than disk. (Microsoft Smartdrv was assumed to handle that.) Since we were interested in both short run times and low route cost, a quadratic penalty function was applied to both of these. In the case of the cost it was a quadratic function of the excess cost over that of the lowest cost found by any program. Hence the combined penalty function was: P(C,T) = K * T^2 + (C - C0)^2 Where C is cost, T is time, and C0 is the lowest known cost. K determines the relative importance of time compared to cost. When K is large this favors very fast programs that may only achieve mediocre costs. If K is small this favors programs that achieve very low cost, even if they spend considerable time searching for the best route. It was decided to use three different values of K. Our intermediate value was chosen such that a cost 5% greater than the lowest known cost would have the same penalty as a running time of 120 seconds. (That was for the 110 and 111 point cases; 60 seconds was used for the 47 point case.) Another value of K was obtained by multiplying the first value by 5, and the third value was obtained by dividing the first value by 5. Three sets of three K values were thus calculated, one set each for 47, 110 and 111 points. The three penalty functions were applied to each competitor, and for each of the three problem files. The best programs are of course those with the smallest calculated values of the penalty functions. The programs were ranked using each of the three penalty functions. In order to produce one ranking list instead of three, a single combined merit function was defined by normalizing and inverting the function values for a given K, so that the best program would have a score of 1000. This is the formula: score = 1000 * (P - med(P)) / (min(P)-med(P)) Where P is the penalty function for a fixed K, med(P) is a median value of P from the group of ten best performing programs, and min(P) is the lowest value of P. The best program has a score of 1000, the others have lower scores. Each program has three scores, one for each K. Finally the three scores for each program are simply summed to arrive at a total, and this total is the score used to rank the programs. (The "score" is negative for programs with P < med(P); such scores are converted to zero before summing.) This procedure was applied separately for the three problems. Here are the run times and route costs for the top three contestants: 47 pt. 110 pt. 111 pt. cost time cost time cost time Klaskala 45980 0.5 88689 1.5 87808 1.5 Honkanen 45750 2.9 88753 21.3 88370 20.3 Nestel 46235 1.2 88829 32.5 86835 32.1