Any 3D engine draws it’s power and speed from the mathematical models and implementations within, and trust John Carmack of ID software for using really good hacks. As it turns out, a very interesting hack is used in Quake III to calculate an inverse square root.

Preface

ID software has recently released the source code of Quake III engine with a GPL license. In this article we’ll see Carmack work his black magic to calculate the square root of a floating point number blazingly fast.

Carmack’s Unusual Inverse Square Root

A fast glance at the file game/code/q_math.c reveals many interesting performance hacks. The first thing that pops out is the use of the number 0x5f3759df in the function Q_rsqrt, which calculates the inverse square root of a float and why does this function actually work?

Observe the original function from q_math.c:

float Q_rsqrt( float number )
{
  long i;
  float x2, y;
  const float threehalfs = 1.5F;

  x2 = number * 0.5F;
  y  = number;
  i  = * ( long * ) &y;  // evil floating point bit level hacking
  i  = 0x5f3759df - ( i >> 1 ); // what the fuck?
  y  = * ( float * ) &i;
  y  = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
  // y  = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

  #ifndef Q3_VM
  #ifdef __linux__
    assert( !isnan(y) ); // bk010122 - FPE?
  #endif
  #endif
  return y;
}

Not only does it work, on some CPU’s Carmack’s Q_rsqrt runs up to 4 times faster than (float)(1.0/sqrt(x), eventhough sqrt() is usually implemented using the FSQRT assembley instruction!

In another file, code/common/cm_trace.c , a neater implementation of the same hack can be found. This time it is used for calculating the square root of a float – sqrt(x). Note that the only real difference is in the return value – instead of returning y, return number*y as the square root:

/*
================
SquareRootFloat
================
*/
float SquareRootFloat(float number) {
    long i;
    float x, y;
    const float f = 1.5F;

    x = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
    y  = y * ( f - ( x * y * y ) );
    return number * y;
}

Newton’s Approximation of Roots

The code above implements the well known Newton Approximation of roots [3]. As in most other iterative approximation calculations, The Newton approximation is supposed to be ran in iterations; each iteration enhances the accuracy until enough iterations have been made for reaching the desired accuracy.

The general idea behind Newton’s approximation is that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better guess (one closer to the actual square root) by averaging y with x/y. For example we can compute the square root of 2 as follows; Suppose the initial guess is 1:

2/1 = 2 ;  (2 + 1) / 2 = 1.5
2/1.5 = 1.3333; ( 1.5 + 1.3333 ) / 2 = 1.4167
2/1.4167 = 1.4117;  ( 1.4167 + 1.4117 ) / 2 = 1.4142
And so on...

As mentioned before, Newton’s approximation is a well known method of fast root calculations. However, Carmack picked an unusual first guess for the first iteration that enhances the approximation drastically and lowers the number of required iterations in Quake III’s calculations to only one iteration!

A Witchcraft Number

The really interesting aspect of this function is the magic constant 0x5f3759df, used to calculate the initial guess, in :

i  = 0x5f3759df - ( i >> 1 );

Hence, dividing the input by 2 and subtracting it from the magic constant. This constant works almost perfectly – Only one Newton approximation iteration is required for a low relative error of 10^-3. . As the code illustrates in the commented out second iteration, this approximation is good enough for the Quake III engine.

It turns out that the magic constant 0x5f3759df is a real mystery. In the article “Fast Inverse Square Root” [2] , mathematician Chris Lomont of Purdue University researches the meaning of this magic constant. Using several elaborate techniques, Lomont tries to mathematically calculate this constant himself. The results are very surprising – Lomont’s well calculated mathematically optimal constant turns out to be slighltly different (0x5f37642f) , and in spite of being theoretically better, it yields worse results then the original constant used in the source code!! Indeed, John Carmack must have used genuine black magic to find this number.

Only by numerically searching for another better constant, Lomont found one, which is slightly better that the original. However, in practice the two constants are likely to yield similar results. Lomont proposes this function with the better constant:

float InvSqrt(float x)
{
  float xhalf = 0.5f*x;
  int i = *(int*)&x; // get bits for floating value
  i = 0x5f375a86- (i>>1); // gives initial guess y0
  x = *(float*)&i; // convert bits back to float
  x = x*(1.5f-xhalf*x*x); // Newton step, repeating increases accuracy
  return x;
}

References

Posted by Tomer Margolin | | [34] comments

34 Comments »

  1. For a better explanation about this, check out http://www.mceniry.net/papers/Fast%20Inverse%20Square%20Root.pdf

    Comment by JDT — November 15, 2007 @ 11:40 pm

  2. [...] Root Implementation In Quake III Carmack is a genius. A real one. (tags: programming games code) Last Modified : November 17th, 2007 Filed under : Del.icio.us Navigate : Previous post / Share: [...]

    Pingback by links for 2007-11-16 -- A Tempest of Thoughts — November 16, 2007 @ 11:30 pm

  3. Could the “magic” number have been found also using an iterative, monte-carlo approach?

    Comment by Dean — November 18, 2007 @ 5:53 pm

  4. Nice article, Tomer! FYI, “(i>>1)” is mostly there to divide the floating-point *exponent* by 2, since sqrt(x) is just x^(1/2). We then take “-(i>>1)” to negate the exponent, to compute 1.0/sqrt(x), or x^(-1/2). Finally, the funky constant’s high bits correct the IEEE exponent bias (where to store 2^n, the exponent field actually holds n-127). Chris Lomont’s article gives the gory details on funky constant’s low bits, which improve the Newton iteration’s guess for the mantissa.

    Comment by Dr_Lawlor — November 19, 2007 @ 3:36 am

  5. Carmack is really smart but he didn’t come up with this..
    Take a look at:
    http://www.beyond3d.com/content/articles/8/

    Comment by John S. — November 19, 2007 @ 6:17 am

  6. This is incredibly clever, I was reading about this the other day, but Id never even heard of this magic number, I will need to look into this when I get the time!

    Comment by peter — November 20, 2007 @ 4:06 pm

  7. Actually, it was not John Carmack who wrote the code. Maybe he wrote the comments ;-) A nice article at http://www.beyond3d.com/content/articles/8/ reveals it’s author is probably Gary Tarolli of 3dfx fame.

    Comment by Roman Kopac — November 20, 2007 @ 8:09 pm

  8. [...] rynmonar wrote an interesting post today onHere’s a quick excerpt [...]

    Pingback by CodeMaestro » Magical Square Root Implementation In Quake III — November 20, 2007 @ 11:51 pm

  9. This neat little function originated from Greg Walsh while at Ardent Computer, not from John Carmack.

    Comment by archie4oz — November 21, 2007 @ 6:03 am

  10. I would love to hear Mr Carmacks thoughts on this, how did he come by this, why does it work ?

    How could a genetic algorithim improve this constant ?

    Comment by mordain — November 21, 2007 @ 6:32 am

  11. AFAIR this is not Carmack’s invention, but rather a guy from MIT or somewhere else. And Quake 3 is not the first Quake to have this code. So check the history, and (as Carmack did) you may find some interesting stuff.

    Comment by jdz — November 21, 2007 @ 8:47 am

  12. [...] [link][more] [...]

    Pingback by pepemosca » Blog Archive » Magical Square Root Implementation In Quake III — November 21, 2007 @ 9:08 am

  13. And just to complete the References, here’s another excellent paper on the topic (probably more interesting than the one by Chris Lomont) http://www.mceniry.net/papers/Fast%20Inverse%20Square%20Root.pdf

    Comment by Daniel Mueller — November 21, 2007 @ 10:01 am

  14. The code is elegant but not written by Carmack.
    Here is a short piece of the email

    >Hi John,
    >
    >There’s a discussion on Beyond3D.com’s forums about who the author of
    >the following is:

    Not me, and I don’t think it is Michael. Terje Matheson perhaps?

    John Carmack

    And here you can find the rest of the search for the origin of this piece of code:

    http://www.beyond3d.com/content/articles/8/

    Comment by Mythor — November 21, 2007 @ 1:09 pm

  15. links for 2007-11-22…

    CodeMaestro » Magical Square Root Implementation In Quake III (tags: programming math) Mac Buyer’s Guide: Know When to Buy Your Mac, iPod or iPhone (tags: mac shopping)……

    Trackback by Primordial Ooze — November 22, 2007 @ 2:26 am

  16. Carmack didn’t invent this implementation. It’s an algorithm that has been around for about 20 years; Greg Walsh is suspected of inventing it. See the paper entitled, “Fast Inverse Square Root” by Chris Lomont for details. I think there’s a link to it on this blog; if not, you’ll find it on google for sure.

    Comment by Greg Dolley — November 22, 2007 @ 4:37 am

  17. [...] Article link [...]

    Pingback by Quest3D|Log » Blog Archive » Carmack’ magic constant — November 22, 2007 @ 10:09 am

  18. Is anyone aware of a double precision implementation? Presumably the magic number at the heart of the speed-up would need to be different.

    Comment by Graeme E. Smith — November 23, 2007 @ 10:26 am

  19. [...] CodeMaestro: Magical Square Root Implementation in Quake III [...]

    Pingback by Cornell CS 322 - Intro to Scientific Computing » Blog Archive » Quake3’s fast invSqrt() — January 30, 2008 @ 4:01 am

  20. What the ???

    Why can’t someone ask Carmack how he conjured the magic number :)

    And why isn’t Newton’s Approximation used in common mathematics?

    Comment by ATOzTOA — February 5, 2008 @ 9:26 am

  21. [...] After spending a lot of years writing C code it is sometimes hard to get some of those “tricks” out of your mind. The thing I always felt about C was it seemed to encourage you figure out or use cool tricks. Couple that with graphics coding you can get some strange stuff like the infamous 0×5f3759df q_sqrt function. Even though some of them are fairly worthless in modern computing, they are fun to look at and think about why they work! [...]

    Pingback by Those old C bit tricks | Oh Null! — March 5, 2008 @ 1:34 am

  22. [...] For those who don’t know, Ken Silverman is the programmer that created the game-engine for Duke Nukem 3D. Silverman received lots of praise from John Carmack – which says a lot because Carmack is co-founder of ID Software and lead developer for games like Doom and Quake (he also wrote what is likely the fastest inverse-square-root function, but that’s a topic for another day). [...]

    Pingback by PNGout : Ramin Hossaini (blog) — April 10, 2008 @ 10:45 pm

  23. What I don’t understand is what those long/float casts do:

    i = * ( long * ) &y; // evil floating point bit level hacking
    i = 0x5f3759df – ( i >> 1 ); // what the fuck?
    y = * ( float * ) &i;

    Is there a float value you could use in place of the integer value 0x5f3759df (without doing the casts) that would give you the same result (albeit not as quickly)?

    Comment by Paul Reiners — May 29, 2009 @ 7:19 pm

  24. In a word – no.
    Notice the (i >> 1) indicates that a bit shifting operation is in play and unless you’re into messy…

    Here’s the equivalent float for you to play with:
    13211836172961054000

    (evil chuckle)

    Comment by Brian Paterson — October 26, 2009 @ 11:37 pm

  25. I remember reading an article where somebody tried to track this down, it wasn’t Carmack who created it, apparently it was on some 3dfx SDK, but they were unable to pin it down. Probably one of the programmers who did the bring up for those early 3D cards.

    Comment by pete — November 18, 2009 @ 3:24 pm

  26. Maybe. But you’d have a chance of mistake between 4-bytes, 8-bytes and 10-bytes float types. You would not have shift-right “>>” operation no more. And subtraction would have very different meaning and probably a bit less of speed. Also you don’t have straight-forward way to specify each and very bits of the number in float form.

    Though i’d prefer C++ way of doing this:
    int& i = &f;

    Since it is bit-hackery in levels deeper than assembler, then little trick of “treat this as integer” is much more important than replacing blackmagic integer constant with anothe blackmagic human-unreadable float one.

    Comment by Arioch — December 22, 2009 @ 11:23 am

  27. CodeMaestro » Magical Square Root Implementation In Quake III…

    Trackback by wobbleing soup — March 1, 2010 @ 10:27 pm

  28. You’re right Pete. There’s the article in question:
    http://www.beyond3d.com/content/articles/8/

    Comment by Nico — March 29, 2010 @ 8:32 pm

  29. [...] your life. It’s the code that will not rot. How many years have passed since the invention of Quake 3′s fast square root function? That, my friends, is elegance in its purest form. You won’t find that in your [...]

    Pingback by Too Much BS in CS « # Victor Petrov — September 27, 2010 @ 2:22 am

  30. [...] John Carmack hack was faster, but it gave incorrect results starting at n=410881. However, as suggested by [...]

    Pingback by Fastest way to determine if an integer’s square root is an integer — HTMLCoderHelper.com — October 4, 2010 @ 6:54 pm

  31. Who is the best, most effective ‘PR guy’ of the art of programming?…

    From Wikipedia article: John D. Carmack II (born August 20, 1970) is an American game programmer and the co-founder of id Software. Carmack was the lead programmer of the id computer games Wolfenstein 3D, Doom, Quake, their sequels and the Commander Ke…

    Trackback by Quora — January 17, 2011 @ 10:25 pm

  32. [...] a tal proposito quando, appena passato l’esame di analisi/calcolo numerico, lessi e compresi questo post in cui si parlava dell’approssimazione del reciproco della radice quadrata, algoritmo basato [...]

    Pingback by Il ruolo della matematica nell’informatica… « JP's Web Place — March 14, 2011 @ 6:08 am

  33. [...] a tal proposito quando, appena passato l’esame di analisi/calcolo numerico, lessi e compresi questo post in cui si parlava dell’approssimazione del reciproco della radice quadrata, algoritmo basato [...]

    Pingback by Il ruolo della matematica nell’istruzione informatica… « JP's Web Place — March 14, 2011 @ 7:56 am

  34. Oh, well that’s easy. sqrt(2^127) is 13043817825332782212, reasonably close to your 13211836172961054000. The 127 is because IEEE 32-bit float uses an excess-127 exponent. The small 1.3% discrepancy brings the two straight lines closer on average to the desired parabola.

    > Here’s the equivalent float for you to play with:
    >
    > 13211836172961054000
    >
    > (evil chuckle)

    Comment by Robert Munafo — April 14, 2011 @ 8:35 pm

RSS feed for comments on this post. TrackBack URI

Leave a comment