### Intro

Sometime in early 2023 there was a tweet going around on social media about how inelegant the optimal packing of 17 squares is. The idea is that, given 17 squares of equal size, one has to find the arrangement that makes them fit in the smallest possible square. Here's a drawing to better understand:

Side Length = **4.6755389909332266**

Here the 17 squares, which without loss of generality we can assume have edge length and area of 1, have been awkwardly rotated and squeezed together to make the overall square that surrounds them as small as possible. In particular, the side length of that square is **4.6755389909332266** according to my computations.

These computations of mine match what's documented as the best known solution to this problem, which has been given as an upper bound of 4.6756. So, I might have been able to improve the best known result by a few nanounits, but in essence I landed in the same local minimum as previous researchers. It is an open question whether this local minimum is also the global minimum, or if a better solution exists.

For completeness, here are the coordinates of the centers of the 17 squares and their rotation angles. Naturally the origin of the coordinates is arbitrary, so any offset of the **X** or **Y** coordinates is still a valid solution:

| X | Y | α |
|---|---|---|
| 0.0000000000000000 | 0.0000000000000000 | 0.0000000000000000 |
| 0.0000000000000000 | 2.6755315227781047 | 0.0000000000000000 |
| 0.0000000000000000 | 3.6755389909332270 | 0.0000000000000000 |
| 1.0000008688617834 | 3.6755389909332270 | 0.0000000000000000 |
| 1.8473248398366033 | 0.0000000000000000 | 0.0000000000000000 |
| 2.6617145214217546 | 3.6755389909332270 | 0.0000000000000000 |
| 3.6755380924375394 | 3.6755389909332270 | 0.0000000000000000 |
| 3.6755380924375394 | 2.6755378679362884 | 0.0000000000000000 |
| 3.6755380924375394 | 1.5620779956940087 | 0.0000000000000000 |
| 3.6755380924375394 | 0.0000000000000000 | 0.0000000000000000 |
| 2.7450846468924719 | 0.8030208635110467 | 0.6393848787276846 |
| 2.5271311108445103 | 2.0650137825521684 | 0.8760638127044440 |
| 1.8176733839134585 | 2.7757155957220658 | 0.8760638127044440 |
| 1.7886247981373979 | 1.3787193621490645 | 0.8760638127044440 |
| 1.0755090294321699 | 2.0851007078799766 | 0.8760638127044440 |
| 0.9360434551231871 | 0.7874870545544742 | 0.8760638127044440 |
| 0.2042017457501646 | 1.4713285708853723 | 0.8760638127044440 |

Note that the solution requires **three** distinct angles. One might wonder, though: could we land on an equally good solution if we constrained the system to only have **two** free angles, rather than three?

I ran some optimization for that case too, and arrived at the following solution:

Side Length = **4.6776523612755243**

Here we can see that this new solution is just 0.045% larger than the best known solution. Not too bad; maybe there's hope for an optimal two-angle solution?

These are the coordinates of the squares for the two angle configuration:

| X | Y | α |
|---|---|---|
| -1.1116242844068009 | -1.8402399406767236 | 0.0000000000000000 |
| -1.1116255254306753 | 0.8374107457162734 | 0.0000000000000000 |
| -1.1116243397978638 | 1.8374110383983384 | 0.0000000000000000 |
| -1.0942047718814971 | -0.4350705241431123 | 0.0000000000000000 |
| -0.1116237220571261 | 1.8374108411706447 | 0.0000000000000000 |
| 1.4763873860040919 | 1.8275484912641486 | 0.0000000000000000 |
| 2.5102600567909605 | 0.8374112886238265 | 0.0000000000000000 |
| 2.5660268358448488 | -0.1625897417734073 | 0.0000000000000000 |
| 2.5442201183853879 | 1.8374113153307905 | 0.0000000000000000 |
| 2.5660259331613458 | -1.8402407716431486 | 0.0000000000000000 |
| -0.0939054021162802 | -1.1814632713418958 | 0.6917096374308753 |
| -0.0308912432391038 | 0.1691640699375276 | 0.6917096374308753 |
| 0.6762516872783623 | -0.5436084389241539 | 0.6917096374308753 |
| 0.6650312953345429 | 0.9173199125052561 | 0.6917096374308753 |
| 0.9247541951530051 | -1.6362329699087730 | 0.6917096374308753 |
| 1.6949150276322920 | -0.9983813156063773 | 0.6917096374308753 |
| 1.3620209788782482 | 0.1961385532745990 | 0.6917096374308753 |

In order to find these results I wrote a small C program, which I let run for almost a full 24-hour day. The code itself is probably not worth sharing since it wasn't a proper gradient descent optimizer, but more of a stochastic coordinate descent one: for the current state in parameter space, I'd pick one of the 37 (17x2+3) coordinates at random and evaluate the side length of the bounding square again for a small delta variation of that parameter. If the new side length was smaller than the previous one, I'd accept the delta and continue. The delta started large and decreased exponentially over time, and every now and then I also applied really large deltas in the hope of escaping local minima and exploring other regions of the parameter space. I also did quite a few manual tweaks during those 24 hours to get unstuck, or to refine around potentially interesting areas of the parameter space when I sensed I was getting close to a minimum.
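The loop described above might be sketched roughly as follows. This is not the original program: `objective_fn`, `toy_objective`, `rand_signed` and `coordinate_descent` are names made up for illustration, the once-every-1000-iterations schedule and the 100x factor for the large steps are arbitrary assumptions, and in this sketch a large step is still only kept if it improves the result. The objective is expected to return a negative value for invalid states, as the side length function further down does for overlapping squares.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_PARAMS 37  /* 17 centers (x, y) + 3 angles, as in the text */

/* Hypothetical objective type: returns the value to minimize (the side
   length in the article), or a negative number for an invalid state. */
typedef double (*objective_fn)(const double *params, int n);

/* Stand-in objective for trying the loop out: 1 + sum of squares. */
static double toy_objective(const double *params, int n)
{
    double sum = 1.0;
    for (int i = 0; i < n; i++) sum += params[i] * params[i];
    return sum;
}

/* Uniform random number in [-1, 1]. */
static double rand_signed(void)
{
    return 2.0 * ((double)rand() / (double)RAND_MAX) - 1.0;
}

/* Minimizes f in place, starting from the state in params.
   Returns the best (smallest valid) value found. */
static double coordinate_descent(objective_fn f, double *params, int n,
                                 long iterations, double delta, double decay)
{
    double best = f(params, n);
    assert(best >= 0.0);  /* the starting state must be valid */
    for (long it = 0; it < iterations; it++)
    {
        int k = rand() % n;  /* pick one coordinate at random */
        double old = params[k];
        /* assumption: every 1000th step is 100x larger than usual */
        double scale = (it % 1000 == 999) ? 100.0 : 1.0;
        params[k] = old + scale * delta * rand_signed();
        double value = f(params, n);
        if (value >= 0.0 && value < best)
            best = value;      /* valid and smaller: accept the change */
        else
            params[k] = old;   /* invalid or not better: revert */
        delta *= decay;        /* exponential decrease of the step size */
    }
    return best;
}
```

With the stand-in objective, a call such as `coordinate_descent(toy_objective, p, NUM_PARAMS, 200000, 0.5, 0.9999)` shrinks every entry of `p` towards zero. For the packing problem the objective would instead unpack the 37 parameters into squares and return the side length.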

The only interesting part of the code is perhaps the side length calculation, which is invoked millions of times in order to guide the coordinate descent. It looked something like this, in pseudo-C-code:

    [double, bbox2d] = compute_side_length_and_bounding_box( const square *squares )
    {
        bbox2d box = { vec2d(1e20), -vec2d(1e20) };
        for( int i=0; i<17; i++ )
        {
            vec2d ce = squares[i].get_center();
            vec2d ve[4] = squares[i].get_four_vertices();
            // expand bounding box
            box = include( box, ve[0] );
            box = include( box, ve[1] );
            box = include( box, ve[2] );
            box = include( box, ve[3] );
            // if overlaps exist, return a negative side length to signal no valid solution
            for( int j=0; j<17; j++ )
            {
                if( i != j )
                {
                    // early skip if squares further than a full diagonal apart
                    if( distance_squared( ce, squares[j].get_center() ) > 2.0 ) continue;
                    if( inside( ve[0], squares[j] ) ) return [ -1.0, box ];
                    if( inside( ve[1], squares[j] ) ) return [ -1.0, box ];
                    if( inside( ve[2], squares[j] ) ) return [ -1.0, box ];
                    if( inside( ve[3], squares[j] ) ) return [ -1.0, box ];
                    if( inside( ce   , squares[j] ) ) return [ -1.0, box ];
                }
            }
        }
        // make square from box and return its side length
        double dx = box.mMaxX - box.mMinX;
        double dy = box.mMaxY - box.mMinY;
        double side_length = (dx > dy) ? dx : dy;
        return [ side_length, box ];
    }

So, here include() expands the input box to contain the passed vertex. The function inside() computes whether a vertex is inside the square. Computing whether a point **v** is inside a unit square at the origin is as simple as doing:

    return max(abs(v.x),abs(v.y)) < 0.5;

Now, before we can use this line of code we need to transform the point **v** by the inverse rotation and translation of our square, so that we can do this test in canonical space. However, computing the sine and cosine of the rotation can be expensive, so it's a good idea to do it only once and reuse it in the 5 calls to inside(). A similar optimization can be done inside the get_four_vertices() function.