Greg Kuperberg’s calculus problem

“How good are you at calculus?”

This was the opening sentence of Greg Kuperberg’s Facebook status on July 4th, 2016.

“I have a joint paper (on isoperimetric inequalities in differential geometry) in which we need to know that

is non-negative for x and y non-negative and theta between 0 and pi. Also, the minimum only occurs for x=y=1/tan(theta/2).”

Let’s take a moment to appreciate the complexity of the mathematical statement above. It is a non-linear inequality in three variables, mixing trigonometry with algebra and throwing in some arc-tangents for good measure. Greg continued:

“We proved it, but only with the aid of symbolic algebra to factor an algebraic variety into irreducible components. The human part of our proof is also not really a cake walk.

A simpler proof would be way cool.”

I was hooked. The cubic terms looked a little intimidating, but if I converted $x$ and $y$ into $\tan(\theta_x)$ and $\tan(\theta_y)$, respectively, as one of the comments on Facebook promptly suggested, I could at least get rid of the annoying arc-tangents, and then calculus and trigonometry would take me the rest of the way. Greg replied to my initial comment outlining a quick route to the proof: “Let me just caution that we found the problem unyielding.” Hmm… Then Greg revealed that the paper containing the original proof was over three years old (had he been thinking about this since then? That’s what true love must be like). The paper is titled “The Cartan-Hadamard Conjecture and The Little Prince”, and the above inequality makes its appearance there as Lemma 7.1 on page 45 (of 63). To quote the paper: “Although the lemma is evident from contour plots, the authors found it surprisingly tricky to prove rigorously.”

As I filled pages with calculations and memorized every trigonometric identity known to man, I realized that Greg was right: the problem was highly intractable. The quick solution that was supposed to take me two to three days turned into two weeks of hell, until I decided to drop the original approach and stick to doing calculus with the known unknowns, $x$ and $y$. The next week led me to a set of three non-linear equations mixing trigonometric functions with fourth powers of $x$ and $y$, at which point I thought of giving up. I knew what I needed to do to finish the proof, but it looked freaking insane. Still, like the masochist that I am, I continued calculating away until my brain was mush. And then, yesterday, during a moment of clarity, I decided to go back to one of the three equations and rewrite it in a different way. That is when I noticed the error. I had solved for $\cos\theta$ in terms of $x$ and $y$, but I had made a mistake that had cost me 10 days of intense work with no end in sight. Once I found the mistake, the whole proof came together within about an hour. At that moment, I felt a mix of happiness (duh) and sadness, as if someone I had grown fond of no longer had a reason to spend time with me and, at the same time, I had run out of made-up reasons to hang out with them. But, yeah, I mostly felt happiness.

Greg Kuperberg pondering about the universe of mathematics.

Before I present the proof below, I want to take a moment to say a few words about Greg, whom I consider to be the John Preskill of mathematics: a lodestar of sanity in a sea of hyperbole (to paraphrase Scott Aaronson). When I started grad school at UC Davis back in 2003, quantum information theory and quantum computing were becoming “a thing” at some of the top universities around the US. So, I went to several of the mathematics faculty in the department asking if there was a course on quantum information theory I could take. The answer was to “read Nielsen and Chuang and then go talk to Professor Kuperberg”. Being a foolish young man, I skipped the first part and went straight to Greg to ask him to teach me (and four other brave souls) quantum “stuff”. Greg obliged with a course on… quantum probability and quantum groups. Not what I had in mind. This guy was hardcore. Needless to say, the five brave souls taking the class (mostly fourth year graduate students and me, the noob) quickly became three, then two gluttons for punishment (the other masochist became one of my best friends in grad school). I could not drop the class, not because I had asked Greg to do this as a favor to me, but because I knew that I was in the presence of greatness (or maybe it was Stockholm syndrome). My goal then, as an aspiring mathematician, became to one day have a conversation with Greg where, for some brief moment, I would not sound stupid. A man of incredible intelligence, Greg is that rare individual whose character matches his intellect. Much like the anti-heroes portrayed by Humphrey Bogart in Casablanca and The Maltese Falcon, Greg keeps a low profile, seems almost cynical at times, but in the end, he works harder than everyone else to help those in need. For example, on MathOverflow, a question and answer website for professional mathematicians around the world, Greg is listed as one of the top contributors of all time.

But, back to the problem. The past four weeks thinking about it have oscillated between phases of “this is the most fun I’ve had in years!” and “this is Greg’s way of telling me I should drop math and become a go-go dancer”. Now that the ordeal is over, I can confidently say that the problem is anything but “dull” (which is how Greg felt others on MathOverflow would perceive it, so he never posted it there). In fact, if I ever have to teach Calculus, I will subject my students to the step-by-step proof of this problem. OK, here is the proof. This one is for you, Greg. Thanks for being such a great role model. Sorry I didn’t get to tell you until now. And you are right not to offer a “bounty” for the solution. The journey (more like, a trip to Mordor and back) was all the money.

The proof: The first thing to note (and if I had read Greg’s paper earlier than today, I would have known as much weeks ago) is that the following equality holds (which can be verified quickly by differentiating both sides):

$$4x - 6\arctan(x) + \frac{2x}{1+x^2} = 4\int_0^x \frac{s^4}{(1+s^2)^2}\,ds.$$
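The equality above can also be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `closed_form` and `integral_form` are made up for illustration, and the integral is approximated with a plain composite Simpson rule, so agreement is only expected up to a small discretization error.

```python
import math

def closed_form(x):
    """Left-hand side: 4x - 6*arctan(x) + 2x/(1+x^2)."""
    return 4 * x - 6 * math.atan(x) + 2 * x / (1 + x * x)

def integral_form(x, n=2000):
    """Right-hand side: 4 * int_0^x s^4/(1+s^2)^2 ds, via composite Simpson (n even)."""
    if x == 0:
        return 0.0
    h = x / n
    f = lambda s: s**4 / (1 + s * s) ** 2
    total = f(0.0) + f(x)
    for k in range(1, n):
        # Simpson weights: 4 on odd nodes, 2 on even interior nodes.
        total += (4 if k % 2 else 2) * f(k * h)
    return 4 * total * h / 3

# Largest disagreement between the two sides over a few sample points.
max_gap = max(abs(closed_form(x) - integral_form(x)) for x in (0.0, 0.5, 1.0, 3.0, 10.0))
print(max_gap)
```

Since the integrand is non-negative, this form also makes it obvious that the left-hand side is non-negative for non-negative $x$, which is what the rest of the proof relies on.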

Using the above equality (and the equivalent one for y), we get:

$$\begin{aligned} F(\theta,x,y) = {} & (\sin\theta)^3\, xy + \left((\cos\theta)^3 - 3\cos\theta - 2\right)(x+y) - (\sin\theta)^3 - 6\sin\theta - 6\theta + 6\pi \\ & + 4\int_0^x \frac{s^4}{(1+s^2)^2}\,ds + 4\int_0^y \frac{s^4}{(1+s^2)^2}\,ds. \end{aligned}$$
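Before doing any calculus, it is reassuring to evaluate this expression numerically. The sketch below is only a sanity check under my own assumptions (a coarse grid, and the integrals replaced by their closed form $4t - 6\arctan(t) + 2t/(1+t^2)$); it proves nothing, but it should show a non-negative grid minimum and values close to zero along $x=y=1/\tan(\theta/2)$.

```python
import math

def F(theta, x, y):
    """F(theta, x, y), with each integral replaced by its closed form."""
    s, c = math.sin(theta), math.cos(theta)
    g = lambda t: 4 * t - 6 * math.atan(t) + 2 * t / (1 + t * t)
    return (s**3 * x * y + (c**3 - 3 * c - 2) * (x + y)
            - s**3 - 6 * s - 6 * theta + 6 * math.pi + g(x) + g(y))

# Illustrative grid: theta strictly inside (0, pi), x and y in [0, 8].
thetas = [math.pi * k / 40 for k in range(1, 40)]
xs = [0.2 * k for k in range(41)]

grid_min = min(F(t, x, y) for t in thetas for x in xs for y in xs)

# Largest |F| along the conjectured minimizing curve x = y = 1/tan(theta/2).
on_curve = max(abs(F(t, 1 / math.tan(t / 2), 1 / math.tan(t / 2))) for t in thetas)

print(grid_min, on_curve)
```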

Now comes the fun part. We differentiate with respect to $\theta$, $x$ and $y$, and set the derivatives to zero to find all the maxima and minima of $F(\theta,x,y)$ (though we are only interested in the global minimum, which is supposed to be at $x=y=\tan^{-1}(\theta/2)$; here and below, $\tan^{-1}(\theta/2)$ denotes the reciprocal $1/\tan(\theta/2)$, not the arctangent). Some high-school level calculus yields:

$$\partial_\theta F(\theta,x,y) = 0 \implies \sin^2(\theta)\left(\cos(\theta)\,xy + \sin(\theta)(x+y)\right) = 2(1+\cos(\theta)) + \sin^2(\theta)\cos(\theta).$$

At this point, the most well-known trigonometric identity of all time, $\sin^2(\theta)+\cos^2(\theta)=1$, can be used to show that the right-hand side can be re-written as:

$$2(1+\cos(\theta)) + \sin^2(\theta)\cos(\theta) = \sin^2(\theta)\left(\cos\theta\,\tan^{-2}(\theta/2) + 2\sin\theta\,\tan^{-1}(\theta/2)\right),$$

where I used (my now favorite) trigonometric identity: $\tan^{-1}(\theta/2) = (1+\cos\theta)/\sin(\theta)$. Putting it all together, we now have the very suggestive condition:

$$\sin^2(\theta)\left(\cos(\theta)\left(xy - \tan^{-2}(\theta/2)\right) + \sin(\theta)\left(x + y - 2\tan^{-1}(\theta/2)\right)\right) = 0,$$

noting that, despite appearances, $\theta = 0$ is not a solution (as can be checked from the original form of this equality, unless $x$ and $y$ are infinite, in which case the expression is clearly non-negative, as we show towards the end of this post). This leaves us with $\theta = \pi$ and

$$\cos(\theta)\left(\tan^{-2}(\theta/2) - xy\right) = \sin(\theta)\left(x + y - 2\tan^{-1}(\theta/2)\right),$$

as candidates for where the minimum may be. A quick check shows that:

$$F(\pi,x,y) = 4\int_0^x \frac{s^4}{(1+s^2)^2}\,ds + 4\int_0^y \frac{s^4}{(1+s^2)^2}\,ds \ge 0,$$

since x and y are non-negative. The following obvious substitution becomes our greatest ally for the rest of the proof:

$$x = \alpha\tan^{-1}(\theta/2), \quad y = \beta\tan^{-1}(\theta/2).$$

Substituting the above in the remaining condition for $\partial_\theta F(\theta,x,y) = 0$, and using again that $\tan^{-1}(\theta/2) = (1+\cos\theta)/\sin\theta$, we get:

$$\cos\theta\,(1-\alpha\beta) = (1-\cos\theta)\left((\alpha-1) + (\beta-1)\right),$$

which can be further simplified to (if you are paying attention to minus signs and don’t waste a week on a wild-goose chase like I did):

$$\cos\theta = \frac{1}{1-\beta} + \frac{1}{1-\alpha}.$$
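Given how easy it is to drop a minus sign here, a numerical check of this formula is cheap insurance. The sketch below is my own illustration (the function names and the sample values of $\alpha$, $\beta$ are arbitrary, chosen so that the right-hand side lies strictly between $-1$ and $1$): it recovers $\theta$ from the formula, sets $x=\alpha/\tan(\theta/2)$ and $y=\beta/\tan(\theta/2)$, and evaluates $\partial_\theta F$ directly, which should vanish up to rounding.

```python
import math

def dF_dtheta(theta, x, y):
    """Derivative of F with respect to theta (the integral terms do not depend on theta)."""
    s, c = math.sin(theta), math.cos(theta)
    return 3 * s * s * c * x * y + 3 * s**3 * (x + y) - 3 * s * s * c - 6 * c - 6

def theta_from(alpha, beta):
    """Solve cos(theta) = 1/(1-beta) + 1/(1-alpha) for theta in (0, pi)."""
    return math.acos(1 / (1 - beta) + 1 / (1 - alpha))

residuals = []
for alpha, beta in [(0.5, 1.5), (3.0, 5.0), (2.0, 0.4), (4.0, 0.2)]:
    theta = theta_from(alpha, beta)
    cot_half = 1 / math.tan(theta / 2)
    residuals.append(abs(dF_dtheta(theta, alpha * cot_half, beta * cot_half)))

worst_residual = max(residuals)
print(worst_residual)
```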

As Greg loves to say, we are finally cooking with gas. Note that the expression is symmetric in $\alpha$ and $\beta$, which should be obvious from the symmetry of $F(\theta,x,y)$ in $x$ and $y$. That observation will come in handy when we take derivatives with respect to $x$ and $y$ now. Factoring $(\cos\theta)^3 - 3\cos\theta - 2 = -(1+\cos\theta)^2(2-\cos\theta)$, we get:

$$\partial_x F(\theta,x,y) = 0 \implies \sin^3(\theta)\, y + \frac{4x^4}{(1+x^2)^2} = (1+\cos\theta)^2 + \sin^2\theta\,(1+\cos\theta).$$

Substituting $x$ and $y$ with $\alpha\tan^{-1}(\theta/2)$ and $\beta\tan^{-1}(\theta/2)$, respectively, and using the identities $\tan^{-1}(\theta/2) = (1+\cos\theta)/\sin\theta$ and $\tan^{-2}(\theta/2) = (1+\cos\theta)/(1-\cos\theta)$, the above expression simplifies significantly to:

$$4\alpha^4 = \left((\alpha^2-1)\cos\theta + \alpha^2 + 1\right)^2 \left(1 + (1-\beta)(1-\cos\theta)\right).$$
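The simplification from the $\partial_x F = 0$ condition to this equation in $\alpha$, $\beta$ and $\cos\theta$ amounts to multiplying through by $\left((\alpha^2-1)\cos\theta+\alpha^2+1\right)^2/(1+\cos\theta)^2$. Here is a hedged numerical sketch of that claim; the function name and sample points are my own, and it only compares the two forms at a handful of arbitrary values.

```python
import math

def stationarity_gap(theta, alpha, beta):
    """Difference between the rescaled raw condition and the reduced one (expected ~0)."""
    s, c = math.sin(theta), math.cos(theta)
    cot_half = (1 + c) / s
    x, y = alpha * cot_half, beta * cot_half
    # Left side minus right side of the condition obtained from dF/dx = 0.
    raw = s**3 * y + 4 * x**4 / (1 + x * x) ** 2 - (1 + c) ** 2 - s * s * (1 + c)
    d = (alpha**2 - 1) * c + alpha**2 + 1
    # Left side minus right side of the reduced equation in alpha, beta, cos(theta).
    reduced = 4 * alpha**4 - d * d * (1 + (1 - beta) * (1 - c))
    # Clearing denominators: raw * d^2 / (1 + cos(theta))^2 should equal reduced.
    return raw * d * d / (1 + c) ** 2 - reduced

worst = max(
    abs(stationarity_gap(t, a, b))
    for t in (0.3, 1.0, 2.0, 2.9)
    for a in (0.2, 1.0, 3.0)
    for b in (0.0, 0.7, 2.5)
)
print(worst)
```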

Using $\cos\theta = \frac{1}{1-\beta} + \frac{1}{1-\alpha}$, which we derived earlier by looking at the extrema of $F(\theta,x,y)$ with respect to $\theta$, and noting that the global minimum would have to be an extremum with respect to all three variables, we get:

$$4\alpha^4 (1-\beta) = \alpha(\alpha-1)\left(1 + \alpha + \alpha(1-\beta)\right)^2,$$

where we used $1 + (1-\beta)(1-\cos\theta) = \alpha(1-\beta)(\alpha-1)^{-1}$ and

$$(\alpha^2-1)\cos\theta + \alpha^2 + 1 = (\alpha+1)\left((\alpha-1)\cos\theta + 1\right) + \alpha(\alpha-1) = (\alpha-1)(1-\beta)^{-1}(2\alpha + 1 - \alpha\beta).$$

We may assume, without loss of generality, that $x \ge y$. If $\alpha = 0$, then $\alpha = \beta = 0$, which leads to the contradiction $\cos\theta = 2$, unless the other condition, $\theta = \pi$, holds, which leads to $F(\pi,0,0) = 0$. Dividing through by $\alpha$ and re-writing $4\alpha^3(1-\beta) = 4\alpha(1+\alpha)(\alpha-1)(1-\beta) + 4\alpha(1-\beta)$, yields:

$$4\alpha(1-\beta) = (\alpha-1)\left(1 + \alpha - \alpha(1-\beta)\right)^2 = (\alpha-1)(1+\alpha\beta)^2,$$

which can be further modified to:

$$4\alpha + (1-\alpha\beta)^2 = \alpha(1+\alpha\beta)^2,$$

and, similarly for $\beta$ (due to symmetry):

$$4\beta + (1-\alpha\beta)^2 = \beta(1+\alpha\beta)^2.$$

Subtracting the two equations from each other, we get:

$$4(\alpha-\beta) = (\alpha-\beta)(1+\alpha\beta)^2,$$

which implies that $\alpha = \beta$ and/or $\alpha\beta = 1$ (recall that $\alpha, \beta \ge 0$, so $(1+\alpha\beta)^2 = 4$ forces $\alpha\beta = 1$). The first leads to $4\alpha(1-\alpha) = (\alpha-1)(1+\alpha^2)^2$, which immediately implies $\alpha = 1 = \beta$ (since the left and right side of the equality have opposite signs otherwise). The second one implies that either $\alpha+\beta = 2$, or $\cos\theta = 1$, which follows from the earlier equation $\cos\theta\,(1-\alpha\beta) = (1-\cos\theta)\left((\alpha-1) + (\beta-1)\right)$. If $\alpha+\beta = 2$ and $\alpha\beta = 1$, it is easy to see that $\alpha=\beta=1$ is the only solution by expanding $(\sqrt{\alpha}-\sqrt{\beta})^2 = 0$. If, on the other hand, $\cos\theta = 1$, then looking at the original form of $F(\theta,x,y)$, we see that $F(0,x,y) = 6\pi - 6\arctan(x) + \frac{2x}{1+x^2} - 6\arctan(y) + \frac{2y}{1+y^2} \ge 0$, since $x,y \ge 0$ implies $\arctan(x)+\arctan(y) \le \pi$.
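The two purely algebraic rewrites used on the way to the symmetric pair of equations (completing the square to reach $(\alpha-1)(1+\alpha\beta)^2$, and then trading $4\alpha(1-\beta)$ for $4\alpha + (1-\alpha\beta)^2$) are easy to get wrong by a sign. The sketch below spot-checks them in exact rational arithmetic; the function names and sample points are my own, and a finite set of samples is a check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def step_one(a, b):
    """(a-1)(1+a+a(1-b))^2 - 4a(1+a)(a-1)(1-b) - (a-1)(1+ab)^2; expected 0."""
    return ((a - 1) * (1 + a + a * (1 - b)) ** 2
            - 4 * a * (1 + a) * (a - 1) * (1 - b)
            - (a - 1) * (1 + a * b) ** 2)

def step_two(a, b):
    """[4a(1-b) - (a-1)(1+ab)^2] - [4a + (1-ab)^2 - a(1+ab)^2]; expected 0."""
    first = 4 * a * (1 - b) - (a - 1) * (1 + a * b) ** 2
    second = 4 * a + (1 - a * b) ** 2 - a * (1 + a * b) ** 2
    return first - second

# Exact rational sample points 0, 3/7, 6/7, ..., 27/7 for both alpha and beta.
samples = [Fraction(n, 7) for n in range(0, 29, 3)]
all_zero = all(
    step_one(a, b) == 0 and step_two(a, b) == 0
    for a, b in product(samples, repeat=2)
)
print(all_zero)
```

Using `Fraction` instead of floats avoids any question of rounding: each residual is either exactly zero or it is not.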

And that concludes the proof, since the only cases for which all three conditions are met lead to $\alpha = \beta = 1$ and, hence, $x=y=\tan^{-1}(\theta/2)$. The minimum of $F(\theta,x,y)$ at these values is always zero. That’s right, all this work to end up with “nothing”. But, at least, the last four weeks have been anything but dull.