In their paper "Bootstrapping the Battle of Britain", published in The Journal of Military History (84(1), 151-186), Fagan, BT, Horwood, I, MacKay, NJ, Price, C, Richards, E & Wood, AJ discussed using the statistical technique of weighted bootstrapping to model a hypothesis about the battle. I started by reproducing the original analysis but, as I have my own data set that starts earlier, I have since begun using bootstrap simulations to investigate other aspects of the battle.

What is Bootstrapping?

Bootstrapping uses sampling with replacement to generate an alternative data set based on the original. For example, if we have 30 data points then, to create an equivalent bootstrap sample, we would randomly select one of those data points 30 times, picking from the full data set each time. This means each original data point could appear more than once (or not at all) in the bootstrap sample. This is what's meant by "replacement": each time we pick a data point we place it back into the sample pool we are drawing from. By running this sampling process a large number of times, the aggregate bootstrap data sets should be representative of the original data set.
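As a minimal sketch of the sampling step described above, the following uses Python's standard library. The function name bootstrap_sample and the 30 placeholder values are my own assumptions for illustration; they are not taken from the paper or from my data set.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data, with replacement."""
    return rng.choices(data, k=len(data))

# Hypothetical values standing in for 30 real data points.
original = list(range(1, 31))
rng = random.Random(1940)  # fixed seed so the draw is repeatable
sample = bootstrap_sample(original, rng)
```

The sample is the same length as the original, but some values will typically appear more than once and others not at all.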

Block bootstrapping

The Battle of Britain is typically broken up into phases (e.g. Kanalkampf, Adlerangriff, the Blitz) so, for bootstrapping, the whole data set can be broken down into sub-sets that correspond to these phases. For example, if we define Adlerangriff as lasting from 12 August to 17 September then we can create bootstrap data sets from this block to represent a hypothetical Adlerangriff.


Using blocks enables the creation of "what if?" scenarios where we model what would happen if Adlerangriff had been started earlier and run for longer by creating bootstrap data sets that are weighted differently to the original.
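One way to sketch a block bootstrap with altered weighting is to sample each phase only from its own days, but for a different number of days than it historically lasted. The function name block_bootstrap, the two-phase split and the daily values below are all assumptions made for illustration, not the method or data from the paper.

```python
import random

def block_bootstrap(blocks, lengths, rng):
    """Build one simulated sequence, phase by phase.

    blocks  maps phase name -> historical daily values for that phase.
    lengths maps phase name -> number of days to simulate for it; a length
            different from the historical one is what reweights the scenario.
    """
    sequence = []
    for phase, days in lengths.items():
        # Sample with replacement from this phase's own days only.
        sequence.extend(rng.choices(blocks[phase], k=days))
    return sequence

# Hypothetical daily values, for illustration only (not historical data).
blocks = {"Kanalkampf": [1, 0, 2, 1], "Adlerangriff": [5, 7, 6]}
# "What if?": a shorter Kanalkampf and a longer Adlerangriff.
scenario = block_bootstrap(blocks, {"Kanalkampf": 2, "Adlerangriff": 5},
                           random.Random(1940))
```

Because each phase is sampled separately, the order of the phases is preserved even though the order of days within a phase is random.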


For my modelling I have taken the approach of using a starting value (S0) and walking through the data set applying the value for each day (Lt) to create a simulated run of the battle.

The value on each day is thus: St = S(t-1) + Lt
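As a rough illustration, this walk is just a running total. The sketch below assumes the daily values are held in a list; the name walk and the numbers are hypothetical.

```python
from itertools import accumulate

def walk(s0, daily_values):
    """Return [S1, S2, ...] where St = S(t-1) + Lt, starting from S0."""
    return list(accumulate(daily_values, initial=s0))[1:]

# Hypothetical daily changes, for illustration only.
print(walk(100, [-3, 5, -1]))  # [97, 102, 101]
```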

Back testing

In order to test the validity of the model we need to run it against the historical data to check that it at least approximates the historical results. If it doesn't resemble the historical data then it suggests that there's something wrong with our model or assumptions.

For example if we look at the classic example of R.A.F. fighter pilot strength we can use the historical values for this during the battle given in Appendix 11 of The Narrow Margin:

Date Strength
1940-06-15 1,094
1940-06-30 1,200
1940-07-06 1,259
1940-07-13 1,341
1940-07-20 1,365
1940-07-27 1,377
1940-08-03 1,434
1940-08-10 1,396
1940-08-17 1,379
1940-08-24 1,377
1940-08-31 1,422
1940-09-07 1,381
1940-09-14 1,492
1940-09-21 1,509
1940-09-28 1,581

To run our model we start with the value for S0 of 1,094 on 15 June 1940 (as that's the first day that is in both our data set and the historical values). Then, for each day in our data set, we subtract the number of pilots lost or wounded (Lt) and add on the number of replacement pilots per day that Fighter Command was receiving (r). This gives us a figure for the R.A.F.'s pilot strength on each day:

St = S(t-1) - Lt + r

Before creating our bootstrapped data sets we run this on our data set in chronological order to establish a value for r that fits the historical data. The initial runs with a single value for r didn't fit the correct shape of the recorded numbers so, by means of experimentation, I discovered that multiplying r by 1.6 prior to 14 July and 2.25 after 30 August made for a better fit.
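The back test with its date-dependent multipliers might look something like the sketch below. Treating "prior to 14 July" and "after 30 August" as excluding those two dates is my assumption, and the function names and the three days of losses are hypothetical.

```python
from datetime import date

def replacement_rate(day, r):
    """Daily replacements: r scaled up early and late in the battle."""
    if day < date(1940, 7, 14):   # assumed: 14 July itself is not scaled
        return r * 1.6
    if day > date(1940, 8, 30):   # assumed: 30 August itself is not scaled
        return r * 2.25
    return r

def back_test(s0, losses_by_day, r):
    """Apply St = S(t-1) - Lt + r day by day, in chronological order."""
    strength, series = s0, {}
    for day in sorted(losses_by_day):
        strength = strength - losses_by_day[day] + replacement_rate(day, r)
        series[day] = strength
    return series

# Hypothetical losses for three days, for illustration only.
losses = {date(1940, 7, 13): 4, date(1940, 7, 14): 4, date(1940, 8, 31): 4}
series = back_test(1094, losses, r=6)
```

Fitting r then becomes a matter of re-running back_test with different values and comparing the series against the recorded strengths.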

As the shape of these graphs matches the recorded pilot numbers it indicates that the model is valid and a value for r of six provides a reasonable fit for this data set.

The value of r used here is not meant to suggest that this was the actual pilot replacement rate; it is merely the value required to make our model "work" - it's a fudge factor. (An actual value for r derived from a primary source would be about 9 per day: 280 new pilots for the month of August.)


Now we have some confidence in our model it's time to put the bootstrap to work in order to investigate some aspects of the battle.

Initially it's helpful to understand what the bootstrap simulation does. For the back test we used the values of Lt in chronological order. For the bootstrapped model we create new, randomly selected, sequences of days to run through the model and then average (rounded mean) the values for St from each day of all the runs to give us a sequence of values. Unless otherwise stated, the simulations here use 10,000 runs.
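A minimal sketch of that simulation loop follows. It assumes a constant r (no date multipliers) to keep the example short, and the function name simulate and the five daily loss values are hypothetical rather than taken from my data set.

```python
import random

def simulate(s0, losses, r, runs, seed=None):
    """Average St over many bootstrap runs of St = S(t-1) - Lt + r."""
    rng = random.Random(seed)
    n = len(losses)
    totals = [0.0] * n
    for _ in range(runs):
        strength = s0
        # A new, randomly selected sequence of days for this run.
        for t, lost in enumerate(rng.choices(losses, k=n)):
            strength = strength - lost + r
            totals[t] += strength
    # Rounded mean of St for each day across all runs.
    return [round(total / runs) for total in totals]

# Hypothetical daily losses, for illustration only.
averaged = simulate(s0=1094, losses=[3, 0, 8, 2, 5], r=6, runs=10_000, seed=1940)
```

Each run sees the same days in a different random order (with repeats), so any single run is noisy; it is only the per-day average across runs that is reported.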

If we treat the period we're examining as a single phase the bootstrapped result gives us a straight line between the starting strength and the ending strength. This intuitively makes sense as we're taking random samples of each day and averaging them out so we would expect it to produce a straight line average across all days.
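The straight line can also be seen from the expected value of the model. This is a sketch that assumes a constant r and that each simulated day is drawn uniformly, with replacement, from the N historical days; the symbols N and \bar{L} (the mean daily loss) are introduced here for illustration.

```latex
\mathbb{E}[S_t] = S_0 + \sum_{i=1}^{t} \left( r - \mathbb{E}[L_i] \right)
                = S_0 + t \left( r - \bar{L} \right),
\qquad \bar{L} = \frac{1}{N} \sum_{j=1}^{N} L_j
```

The expected strength changes by the same amount, r - \bar{L}, on every day, which is a straight line in t. At t = N it equals S_0 + N r - \sum_j L_j, the same end point that the chronological run reaches.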

I take two things away from this. The first is that viewing this data as a single phase isn't particularly helpful as the simulated result isn't a good match for the shape of the historical data. The second is that picking the start and end point of phases is going to be very important, as the simulated result for each phase will be a straight line between its start and end points.


The Battle of Britain is typically viewed as a series of distinct phases, with those established in 1943 by the official R.A.F. history of the battle being the most common definition.

Dates Phase
1940-07-10 - 1940-08-07 Phase I: Channel Battles
1940-08-08 - 1940-08-18 Phase II
1940-08-19 - 1940-08-23 Pause
1940-08-24 - 1940-09-06 Phase III
1940-09-07 - 1940-10-21 Phase IV

Running a bootstrap using these phases gives a good fit for the model.

Whilst the R.A.F. phases match up well with R.A.F. pilot numbers they don't cover the initial, lower intensity, fighting and they don't match up with the phases of strategy employed by the Luftwaffe which are outlined as follows in "To Defeat the Few":

Dates Phase
1940-07-02 - 1940-08-10 Kanalkampf
1940-08-12 - 1940-08-16 Adlerangriff Stage 1
1940-08-18 - 1940-08-22 Adlerangriff Stage 2
1940-08-24 - 1940-09-06 Adlerangriff Stage 3
1940-09-07 - 1940-09-17 Adlerangriff Stage 4

This ends on 17 September as this was when Seelöwe was officially cancelled so, whilst fighting continued, it marks the end of the Luftwaffe's attempt to obtain air superiority for an invasion.

Notably the following dates are not assigned to a phase: 1940-08-11, 1940-08-17 & 1940-08-23. For the purposes of modelling these have been assigned to a "Pause" phase which is used for these three dates in the sequence.
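Assigning each day to a phase, with "Pause" as the fallback, can be sketched as a simple date-range lookup. The names PHASES and phase_for are hypothetical. Note that this sketch would also label any date outside 2 July to 17 September as "Pause", which is a simplification.

```python
from datetime import date

# Luftwaffe phases as listed above (inclusive date ranges).
PHASES = [
    (date(1940, 7, 2), date(1940, 8, 10), "Kanalkampf"),
    (date(1940, 8, 12), date(1940, 8, 16), "Adlerangriff Stage 1"),
    (date(1940, 8, 18), date(1940, 8, 22), "Adlerangriff Stage 2"),
    (date(1940, 8, 24), date(1940, 9, 6), "Adlerangriff Stage 3"),
    (date(1940, 9, 7), date(1940, 9, 17), "Adlerangriff Stage 4"),
]

def phase_for(day):
    """Return the phase containing day, or "Pause" if none does."""
    for start, end, name in PHASES:
        if start <= day <= end:
            return name
    return "Pause"

print(phase_for(date(1940, 8, 11)))  # Pause
```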

Running a bootstrap for these phases doesn't give as good a fit as the R.A.F. phases - although it's still in the ballpark of the historically recorded figures.

Digging deeper into this, there is one day that stands out as having a disproportionate effect on the result: 11 August. This is not assigned to a phase but has a high number of pilot losses (Lt). Looking at the events of that day, there was a large attack by Luftflotte 3 on Portland docks so, in terms of tactics employed, it's probably a better fit for Kanalkampf. Running a bootstrap for this revised phasing gives a better fit for the first and third stages of Adlerangriff.

There is still one outlier here and that's 18th August which, with attacks on satellite airfields, is more closely aligned to stage two of Adlerangriff. Assigning the two remaining "Pause" dates (17 and 23 August) to the nearest applicable stage gives us a phasing that is similar to the R.A.F. assignment.

Dates Phase
1940-07-02 - 1940-08-11 Kanalkampf
1940-08-12 - 1940-08-18 Adlerangriff Stage 1
1940-08-19 - 1940-08-23 Adlerangriff Stage 2
1940-08-24 - 1940-09-06 Adlerangriff Stage 3
1940-09-07 - 1940-09-17 Adlerangriff Stage 4

Running a bootstrap for this gives a good fit for the stages of Luftwaffe operations whilst matching their strategic phasing.

This now gives us a reasonable basis for examining alternate scenarios of Luftwaffe strategy.