To determine the threshold, they divide the historical data into two periods: first a learning phase, and then a prediction phase.

The learning phase is used to find the value of the threshold that gives the "best" performance, in terms of minimizing the false alarm rate and maximizing the hit rate.

(Note that these are two conflicting goals, so I think there must be some subjective judgement about what gives the best performance, no?)

This threshold is then used in the prediction phase.
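
To make the procedure concrete, here is a minimal Python sketch of how I understand it. Everything specific in it is my own assumption, not something taken from the method being described: the data are made up, an alarm is raised when the signal is at or above the threshold, the false alarm rate is taken as false alarms divided by the number of non-events, and "best" is scored as hit rate minus false alarm rate. That last choice is exactly where the subjective judgement enters; a different score would generally pick a different threshold.

```python
def rates(signal, events, threshold):
    """Return (hit_rate, false_alarm_rate) when an alarm is raised
    whenever signal >= threshold (an assumed alarm rule)."""
    hits = misses = false_alarms = correct_negatives = 0
    for s, e in zip(signal, events):
        alarm = s >= threshold
        if alarm and e:
            hits += 1
        elif e:
            misses += 1
        elif alarm:
            false_alarms += 1
        else:
            correct_negatives += 1
    n_events = hits + misses
    n_non_events = false_alarms + correct_negatives
    hit_rate = hits / n_events if n_events else 0.0
    false_alarm_rate = false_alarms / n_non_events if n_non_events else 0.0
    return hit_rate, false_alarm_rate


def learn_threshold(signal, events, candidates):
    """Pick the candidate that maximizes hit rate minus false alarm rate.
    This scoring rule is an assumption; it encodes one possible trade-off."""
    def score(t):
        hit_rate, false_alarm_rate = rates(signal, events, t)
        return hit_rate - false_alarm_rate
    return max(candidates, key=score)


# Made-up historical data: a signal value and whether an event occurred (1/0).
signal = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.85, 0.15, 0.6]
events = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0]

# First part is the learning phase, the remainder is the prediction phase.
split = 6
learn_signal, learn_events = signal[:split], events[:split]
predict_signal, predict_events = signal[split:], events[split:]

# Learning phase: search the observed signal values for the best threshold.
threshold = learn_threshold(learn_signal, learn_events, sorted(set(learn_signal)))

# Prediction phase: the threshold is now fixed and only evaluated.
hit_rate, false_alarm_rate = rates(predict_signal, predict_events, threshold)
print(threshold, hit_rate, false_alarm_rate)
```

The point of the split is that the threshold is tuned only on the learning data, so the rates measured in the prediction phase are not flattered by having been optimized on the same observations.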