In their paper "Bootstrapping the Battle of Britain", published in The Journal of Military History (84(1), 151-186), Fagan, BT, Horwood, I, MacKay, NJ, Price, C, Richards, E & Wood, AJ discussed using the statistical technique of weighted bootstrapping to model a hypothesis about the battle. As I have my own data set, I have explored using bootstrap simulations to investigate aspects of the battle.

What is Bootstrapping?

Bootstrapping uses sampling with replacement to generate an alternative data set based on the original. For example, if we have 30 data points then, to create an equivalent bootstrap sample, we randomly select one of those data points 30 times, picking from the full data set each time, so each original data point could appear more than once (or not at all) in the bootstrap sample. This is what's meant by "replacement": each time we pick a data point we place it back into the pool we are drawing from. By running this sampling process a large number of times, the aggregate of the bootstrap data sets should be representative of the original data set.
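The 30-point example above can be sketched in a few lines of Python. This is only an illustration: the function name `bootstrap_sample` and the placeholder values 1 to 30 are assumptions made for the example, not part of my data set.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data, putting each one back after it is picked."""
    return rng.choices(data, k=len(data))


rng = random.Random(1940)        # fixed seed so the example is repeatable
original = list(range(1, 31))    # 30 placeholder data points (not real data)
sample = bootstrap_sample(original, rng)

# Some points will usually appear more than once and others not at all.
missing = sorted(set(original) - set(sample))
```

The sample is always the same size as the original, but `missing` will typically not be empty, which is the effect of sampling with replacement.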

Block bootstrapping

The Battle of Britain is typically broken up into phases (e.g. Kanalkampf, Adlerangriff, the Blitz) so, for bootstrapping, the whole data set can be broken down into sub-sets that correspond to these phases. For example, if we define Adlerangriff as lasting from 12 August to 17 September then we can create bootstrap data sets from this block to represent a hypothetical Adlerangriff.


Using blocks enables the creation of "what if?" scenarios where we model what would happen if Adlerangriff had been started earlier and run for longer by creating bootstrap data sets that are weighted differently to the original.
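One way to sketch the block approach in Python is shown below. It assumes that days are resampled only within their own block, that blocks stay in chronological order, and that a "what if?" scenario is expressed by drawing a different number of days from a block. The function name `block_bootstrap`, the `lengths` parameter and the numbers are all hypothetical.

```python
import random


def block_bootstrap(blocks, rng, lengths=None):
    """Resample each block with replacement and join the blocks in order.

    blocks  -- list of (name, daily_values) pairs in chronological order
    lengths -- optional {name: number_of_days} to lengthen or shorten a block
    """
    lengths = lengths or {}
    sequence = []
    for name, values in blocks:
        days = lengths.get(name, len(values))   # default: historical length
        sequence.extend(rng.choices(values, k=days))
    return sequence


rng = random.Random(1940)
# Placeholder daily values, NOT historical figures.
blocks = [("Kanalkampf", [1, 2, 3]), ("Adlerangriff", [10, 20])]

same_shape = block_bootstrap(blocks, rng)
longer_adlerangriff = block_bootstrap(blocks, rng, lengths={"Adlerangriff": 4})
```

With the default lengths the bootstrap sequence has the same shape as the original; passing `lengths` changes the weighting of a block relative to the original.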


For my modelling I have taken the approach of using a starting value (S0) and walking through the data set applying the value for each day (Lt) to create a simulated run of the battle.

The value on each day is thus: St = S(t-1) + Lt
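As a minimal sketch, the walk can be written as a running total. The function name `walk` and the sample numbers are illustrative only.

```python
def walk(s0, daily_values):
    """Return [S_1, S_2, ...] where S_t = S_(t-1) + L_t and S_0 = s0."""
    strengths = []
    current = s0
    for value in daily_values:
        current = current + value
        strengths.append(current)
    return strengths


# Illustrative values only: start at 100 and apply three daily changes.
example = walk(100, [-3, 2, -5])   # [97, 99, 94]
```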


In order to test the validity of the model we need to run it against the historical data to check that it at least approximates the historical results. If it doesn't resemble the historical data, that suggests there's something wrong with our model or assumptions.

For example, taking the classic case of R.A.F. fighter pilot strength, we can use the historical values for this during the battle given in Appendix 11 of The Narrow Margin:

Date Strength
1940-07-06 1,259
1940-07-13 1,341
1940-07-20 1,365
1940-07-27 1,377
1940-08-03 1,434
1940-08-10 1,396
1940-08-17 1,379
1940-08-24 1,377
1940-08-31 1,422

To run our model we start with a value for S0 of 1,341 on 13 July 1940, as that's the first day that is in both our data set and the historical values. Then, for each day in our data set, we subtract the number of pilots lost or wounded (Lt) and add the number of replacement pilots per day that Fighter Command was receiving (r). This gives us a figure for the R.A.F.'s pilot strength on each day:

St = S(t-1) - Lt + r

Before creating our bootstrapped data sets we run this on our data set in chronological order. As r is not a figure for which I have a source, runs were done with different values to establish a best fit.
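The search for a best-fit r can be sketched as below. This is not my actual fitting code: the loss figures are synthetic placeholders, the "recorded" strengths are generated from a known r of 6 purely so the search has something to recover, and the function names and the least-squares measure of fit are assumptions for illustration.

```python
def simulate_strength(s0, losses, r):
    """Apply S_t = S_(t-1) - L_t + r for each day, returning S_1..S_n."""
    strengths = []
    current = s0
    for loss in losses:
        current = current - loss + r
        strengths.append(current)
    return strengths


def squared_error(simulated, observed):
    """Sum of squared differences on the days that have a recorded value.

    observed maps a 0-based day index to the recorded strength on that day.
    """
    return sum((simulated[day] - value) ** 2 for day, value in observed.items())


def best_fit_r(s0, losses, observed, candidates):
    """Return the candidate r whose chronological run is closest to observed."""
    return min(
        candidates,
        key=lambda r: squared_error(simulate_strength(s0, losses, r), observed),
    )


# Synthetic placeholder losses for 14 days (NOT historical figures).
losses = [4, 7, 5, 9, 6, 8, 3, 10, 2, 6, 7, 5, 8, 4]
# Synthetic weekly "records" generated with r = 6, standing in for a table
# of recorded strengths.
truth = simulate_strength(1341, losses, 6)
observed = {6: truth[6], 13: truth[13]}

best_r = best_fit_r(1341, losses, observed, range(0, 13))   # recovers 6
```

Because the recorded strengths are weekly, only the days that appear in `observed` contribute to the error; the days in between are unconstrained.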

Bootstrapped R.A.F. pilot strength (simulated with replacement rate)

As the shape of these graphs matches the recorded pilot numbers, it suggests that the model is valid and that a value for r of six replacement pilots per day provides a reasonable fit for this data set.

This value of r is not to suggest that this was the actual pilot replacement value but merely a value that is required to make our model "work" - it's a fudge factor.

(There is a notable divergence after 31 August but, as that's the limit of my data set, I'm ignoring it for now. That is something that will need more data to investigate but my initial guess is that the value for r will need to change at this point if the model is to follow the historical data.)


Now we have some confidence in our model it's time to put the bootstrap to work in order to investigate some aspects of the battle.

Initially it's helpful to understand what the bootstrap simulation does. For the backtest we used the values of Lt in chronological order. For the bootstrapped model we create new, randomly selected sequences of days to run through the model, then average (rounded mean) the values of St for each day across all the runs to give us a single sequence of values. Unless otherwise stated, the simulations here use 10,000 runs.
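A minimal single-phase sketch of this averaging step is given below. The function name `bootstrap_mean_strength` and its parameters are illustrative, and Python's built-in `round` (which rounds exact halves to the nearest even number) is assumed to be an acceptable "rounded mean".

```python
import random


def bootstrap_mean_strength(s0, losses, r, runs, rng):
    """Average S_t over many runs, each using a resampled sequence of days."""
    n = len(losses)
    totals = [0] * n
    for _ in range(runs):
        current = s0
        resampled = rng.choices(losses, k=n)    # new random sequence of days
        for day, loss in enumerate(resampled):
            current = current - loss + r        # S_t = S_(t-1) - L_t + r
            totals[day] += current
    return [round(total / runs) for total in totals]


rng = random.Random(1940)
# Placeholder losses (NOT historical figures); 1,000 runs to keep it quick.
mean_strength = bootstrap_mean_strength(1341, [4, 7, 5, 9, 6, 8, 3], 6, 1000, rng)
```

Each entry of `mean_strength` is the rounded mean of St on that day across all the runs, so the result has one value per day just like the chronological backtest.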

If we treat the period we're examining as a single phase, the bootstrapped result gives us a straight line between the starting strength and the ending strength. This makes intuitive sense: every simulated day is a random draw from the same pool of days, so once the runs are averaged each day changes the strength by the same average amount, which produces a straight line across all days.
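The straight line can also be seen from the update rule. As a short sketch, assume each simulated day's loss is drawn uniformly at random (with replacement) from the n days in the phase, and write \bar{L} for the mean daily loss over those days (a symbol introduced here for illustration):

```latex
\begin{align*}
\mathbb{E}[S_t] &= \mathbb{E}[S_{t-1}] - \mathbb{E}[L_t] + r \\
                &= \mathbb{E}[S_{t-1}] - \bar{L} + r \\
                &= S_0 + t\,(r - \bar{L}),
\qquad \bar{L} = \frac{1}{n}\sum_{i=1}^{n} L_i .
\end{align*}
```

The expected strength therefore changes by the same amount, r - \bar{L}, on every day. At t = n it equals S_0 - \sum L_i + nr, which is the value the chronological run finishes on, so the averaged line joins the same start and end points.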

Bootstrapped R.A.F. pilot strength (single phase)

I take two things away from this. The first is that viewing this data as a single phase isn't particularly helpful, as the simulated result isn't a good match for the shape of the historical data. The second is that picking the start and end points of phases is going to be very important, as the result for each phase will be a straight line between its start and end points.


The Battle of Britain is typically viewed as a series of distinct phases, with those established in 1943 by the official R.A.F. history of the battle being the most common definition.

Dates Phase
1940-07-10 - 1940-08-07 Phase I: Channel Battles
1940-08-08 - 1940-08-18 Phase II
1940-08-19 - 1940-08-23 Pause
1940-08-24 - 1940-09-06 Phase III
1940-09-07 - 1940-10-21 Phase IV

Running a bootstrap using these phases gives a good fit for the model.

Bootstrapped R.A.F. pilot strength (R.A.F. phase assignment)

Whilst the R.A.F. phases match up well with R.A.F. pilot numbers, they don't match up with the phases of strategy employed by the Luftwaffe, which are outlined as follows in "To Defeat the Few":

Dates Phase
1940-07-02 - 1940-08-10 Kanalkampf
1940-08-12 - 1940-08-16 Adlerangriff stage 1
1940-08-18 - 1940-08-22 Adlerangriff stage 2
1940-08-24 - 1940-09-06 Adlerangriff stage 3
1940-09-07 - 1940-09-17 Adlerangriff stage 4

This ends on 17 September as this was when Seelöwe was officially cancelled so, whilst fighting continued, it marks the end of the Luftwaffe's attempt to obtain air superiority for an invasion.

Notably the following dates are not assigned to a phase: 1940-08-11, 1940-08-17 & 1940-08-23. For the purposes of modelling these have been assigned to a "Pause" phase which is used for these three dates in the sequence.
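The assignment of dates to phases, including the fallback to "Pause", can be sketched as a simple lookup. The names `LUFTWAFFE_PHASES` and `phase_for` are hypothetical; the date ranges are those listed above.

```python
from datetime import date

# (first day, last day, phase name), both ends inclusive.
LUFTWAFFE_PHASES = [
    (date(1940, 7, 2), date(1940, 8, 10), "Kanalkampf"),
    (date(1940, 8, 12), date(1940, 8, 16), "Adlerangriff stage 1"),
    (date(1940, 8, 18), date(1940, 8, 22), "Adlerangriff stage 2"),
    (date(1940, 8, 24), date(1940, 9, 6), "Adlerangriff stage 3"),
    (date(1940, 9, 7), date(1940, 9, 17), "Adlerangriff stage 4"),
]


def phase_for(day, phases=LUFTWAFFE_PHASES, default="Pause"):
    """Return the phase containing day, or default if no phase covers it."""
    for start, end, name in phases:
        if start <= day <= end:
            return name
    return default


gap_days = [date(1940, 8, 11), date(1940, 8, 17), date(1940, 8, 23)]
gap_phases = [phase_for(day) for day in gap_days]   # all three are "Pause"
```

Keeping the phase table as data makes it easy to try a revised phasing: only the date ranges change, not the lookup.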

Running a bootstrap for these phases doesn't give as good a fit as the R.A.F. phases - although it's still in the ballpark of the historically recorded figures.

Bootstrapped R.A.F. pilot strength (Luftwaffe phase assignment)

Digging deeper into this, there is one day that stands out as having a disproportionate effect on the result: 11 August. This is not assigned to a phase but has a high number of pilot losses (Lt). Looking at the events of that day, there was a large attack by Luftflotte 3 on Portland docks so, in terms of tactics employed, it's probably a better fit for Kanalkampf. Running a bootstrap for this revised phasing gives a better fit for the first and third stages of Adlerangriff.

Bootstrapped R.A.F. pilot strength (Revised Kanalkampf assignment)

There is still one outlier here and that's 18 August which, with attacks on satellite airfields, is more closely aligned to stage two of Adlerangriff. Assigning the two remaining "Pause" dates (17 and 23 August) to the nearest applicable stage gives us a phasing that is similar to the R.A.F. assignment.

Dates Phase
1940-07-02 - 1940-08-11 Kanalkampf
1940-08-12 - 1940-08-18 Adlerangriff stage 1
1940-08-19 - 1940-08-23 Adlerangriff stage 2
1940-08-24 - 1940-09-06 Adlerangriff stage 3
1940-09-07 - 1940-09-17 Adlerangriff stage 4

Running a bootstrap for this gives a good fit for the stages of Luftwaffe operations whilst matching their strategic phasing.

Bootstrapped R.A.F. pilot strength (Fitted Adlerangriff phase assignment)

This now gives us a reasonable basis for examining alternate scenarios of Luftwaffe strategy.