To challenge myself, I'm going to write a blog post every day in October. These might not be particularly long or good, and I might queue up a few blog posts in advance. “Blogtober” has been done before, and the general premise is to blog every day in October. Sort of like NaNoWriMo, except it seems like a more useful endeavor for my purposes. Each post will take about an hour - judge accordingly.
In light of SB 1047 flopping, I want to talk about the use of FLOP in AI policy. Specifically, 10²⁶ FLOP. Where does that number come from?
Executive Order 14110 specifies that reporting applies to “any model that was trained using a quantity of computing power greater than 10²⁶ integer or floating-point operations.”¹ SB 1047 would have applied to models trained with over 10²⁶ FLOP at a cost of over $100,000,000. The EU AI Act applies to models trained with 10²⁵ FLOP. Where are these numbers coming from?
Jack Clark did a great writeup highlighting the difference between 10²⁵ and 10²⁶ in financial terms: a 10²⁵ FLOP training run would cost ~$10.4m, while a 10²⁶ run would cost ~$104m. That threshold seems suspiciously round. Did 10²⁶ come from someone leaning back in their office chair and just going one order of magnitude higher than the current state of the art?
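For the curious, the arithmetic behind those figures is simple. This is just a back-of-the-envelope sketch: the dollars-per-FLOP rate below is an assumption reverse-engineered from Clark's ~$10.4m figure for 10²⁵ FLOP, not an official number.

```python
# Assumed effective price per FLOP, implied by ~$10.4m for 1e25 FLOP.
USD_PER_FLOP = 10.4e6 / 1e25  # ~$1.04e-18 per FLOP

def training_cost_usd(flop: float) -> float:
    """Rough dollar cost of a training run of `flop` total FLOP."""
    return flop * USD_PER_FLOP

for flop in (1e25, 1e26):
    print(f"{flop:.0e} FLOP -> ~${training_cost_usd(flop) / 1e6:.1f}m")
```

Since cost scales linearly with compute at a fixed rate, each order of magnitude in the threshold is an order of magnitude in dollars.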
My best guess as to where 10²⁶ came from: 2023’s Frontier AI Regulation paper. 10²⁶ FLOP as a threshold for regulation seemed to start appearing in 2023, and it was a notable paper. So I did a little Ctrl+F snooping to see the context in which the number first appears, and I got this:
[50] links to an Our World in Data table listing the number of petaFLOP used to train some of the most notable AI systems. According to Our World in Data, the largest training run (at the time of publication) used 2.1 x 10²⁵ FLOP. Currently, the largest is 5 x 10²⁵ FLOP.
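Those figures make it easy to see how much headroom each threshold leaves. A quick sketch, using the Our World in Data numbers quoted above (which will go stale as larger runs appear):

```python
# Compare the policy thresholds against the largest known training run,
# using the Our World in Data figure quoted above (5e25 FLOP).
LARGEST_RUN_FLOP = 5e25   # largest known training run at time of writing
EO_THRESHOLD = 1e26       # Executive Order 14110 reporting threshold
EU_THRESHOLD = 1e25       # EU AI Act threshold

print(f"EO 14110: threshold is {EO_THRESHOLD / LARGEST_RUN_FLOP:.0f}x the largest run")
print(f"EU AI Act: largest run is {LARGEST_RUN_FLOP / EU_THRESHOLD:.0f}x the threshold")
```

In other words, the largest run already exceeds the EU threshold fivefold, and is only a factor of two short of the EO’s.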
So, yes - 10²⁶ may well have been chosen because it was a bit beyond the state of the art at the time. Whether it’s still beyond the state of the art is harder to say. But it’s interesting how a number that perhaps started as a ballpark became codified in policy.
PS: for those curious, the correct nomenclature is FLOP when referring to quantity, and FLOP/s when referring to performance. h/t Lennart Heim
¹ There are exceptions - e.g., for models using primarily biological sequence data, reporting starts at 10²³ FLOP.
This is a nice guess at what happened. Another interesting codification of a seemingly arbitrary number in policy is central banks targeting 2% inflation, which traces back to a fairly offhand figure that New Zealand started shooting for. https://www.reuters.com/markets/mouse-that-roared-new-zealand-worlds-2-inflation-target-2023-01-30/