My fitness center after the lockdown: I haven’t seen my friend Norbert in weeks — Calculations and Monte Carlo Simulations for that Probability

Gerhard Svolba
7 min read · Jul 31, 2020

Starting on May 29th, fitness centers in Austria were allowed to re-open under certain restrictions. As my break from physical exercise during the lockdown left some visible traces, I am back in my fitness center twice a week.

However, I have not met my friend Norbert there for the last 9 weeks. Just bad luck? Non-overlapping schedules? Did Norbert cut back on his training? The statistician in me wants to know the probability of not meeting Norbert. This article shows two approaches to finding the probability of that event, and a solution to overcome the probability ;-)

  • Probability Calculations (using paper and pencil)
  • Monte Carlo Simulations (using SAS Viya)

Defining the Analysis Problem

Let’s assume that Norbert and I each go to the fitness center twice a week, on days from Monday to Friday. There is also the recommendation that for better recovery you should not go to the fitness center on two consecutive days. Tuesday+Wednesday, for example, is not an option. Tuesday+Thursday would work.

Our fitness center is open from 6:30 am to 9:30 pm. You can book the slots on a 15-minute schedule. We usually spend 1 hour there, including check-in, exercises and check-out.

  • Thus we would meet on the same day when our time slots overlap
  • or when the slots adjoin each other (e.g. Gerhard books a slot at 8:30 am and Norbert books a slot at 9:30 am: we would meet as I leave and he enters the fitness center).

Calculating the Probability

To get the probability that we did not meet during the last nine weeks, we first calculate the probability of meeting in one week.

Here we assume that the selection of the weekdays and the selection of the time slots are independent of each other. There is no preference in the selection of the time slots (like “I came early on Monday, so I book a late slot on Thursday”), and there is no preference in the selection of the weekdays. We only observe the recommendation to pause after a training day.

1. The probability to select the same weekday for our training

For simplicity we number the weekdays from 1 to 5 (Monday=1, … Friday=5). The possible combinations of non-consecutive days are 13, 14, 15, 24, 25 and 35, and we assume that Gerhard and Norbert each randomly select one of these combinations every week, independently of each other.

This means that we would have one day in common if, for example, Gerhard selects Monday+Wednesday (13) and Norbert selects Wednesday+Friday (35). Arranging the combinations in a 6 x 6 matrix gives 36 equally likely pairs.

You see that there are

  • 6 combinations where we select two identical days (13+13, 14+14, …). Prob=6/36=0.1667
  • 18 combinations where we have one overlapping day in our selection. Prob=18/36=0.5
  • 12 combinations where we do not overlap at all. Prob=12/36=0.3333
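The counts above can be double-checked by enumerating all 36 pairs. The following is a minimal Python sketch, not the SAS code used for the simulation later in this article; the names `COMBOS` and `overlap_counts` are made up for this illustration.

```python
from collections import Counter
from itertools import combinations, product

# All pairs of weekdays (Mon=1 ... Fri=5) that are not consecutive.
COMBOS = [c for c in combinations(range(1, 6), 2) if c[1] - c[0] > 1]

def overlap_counts():
    """Count how many (Gerhard, Norbert) pairs share 0, 1 or 2 days."""
    return Counter(len(set(g) & set(n)) for g, n in product(COMBOS, repeat=2))

if __name__ == "__main__":
    # Print the 6 x 6 matrix of shared days, one row per Gerhard combination.
    print("    " + "  ".join(f"{a}{b}" for a, b in COMBOS))
    for g in COMBOS:
        row = "   ".join(str(len(set(g) & set(n))) for n in COMBOS)
        print(f"{g[0]}{g[1]}   {row}")
    counts = overlap_counts()
    for k in (2, 1, 0):
        print(f"{k} shared days: {counts[k]} of 36 = {counts[k] / 36:.4f}")
```

The printed matrix shows the number of shared days for every pair of combinations, with 2 on the diagonal.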

Distinguishing between one and two overlapping days per week matters, because two shared days give us two chances to select overlapping time slots and meet each other.

2. The probability for selecting the same time-slot

Next consider the time slots. As mentioned above, we meet when our time slots overlap or adjoin each other. The opening hours are 6:30 am to 9:30 pm and a visit takes one hour, thus the bookable slots range from 6:30 am to 8:30 pm.

Gerhard and Norbert can each select from 57 time slots (14 hours from 6:30 am to 8:30 pm multiplied by 4 slots per hour = 56, plus 1 slot at 8:30 pm). As they make their choices independently of each other, we have 57 x 57 = 3249 possible combinations of time slots.

How many of them are “G meets N” events? If Gerhard selects 9:30 am, for example, Norbert has 9 possible slots (8:30 am/8:45 am/…/10:30 am) in which they would meet. This is true for all the slots “during” the day, but not for the slots at the beginning or the end of the day.

  • If Gerhard selects 6:30 am, Norbert only has 5 slots (6:30/6:45/…/7:30) available to overlap.
  • If Gerhard selects 6:45 am, Norbert has 6 slots (6:30/6:45/…/7:45) available to overlap.

The same pattern exists at the end of the day. Going through Gerhard’s possible slots:

  • we have 441 overlapping slots from 7:30 am to 7:30 pm (49 x 9)
  • we have 26 overlapping slots from 6:30 am to 7:15 am (5+6+7+8=26)
  • we have 26 overlapping slots from 7:45 pm to 8:30 pm (8+7+6+5=26)

In total there are 493 slot combinations in which we overlap and meet. Out of 3249 possible combinations this gives a probability of P(TimeSlotOverlap) = 493/3249 = 0.1517.

In case Norbert and I selected the same day for a visit to our fitness center, there would be a probability of 15.17 % that we met there during our training.
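The slot count can also be checked by brute force. This is a small Python sketch under the assumptions stated above (one-hour visits, 15-minute booking grid, adjoining slots count as a meeting); the constant and function names are invented for the example.

```python
SLOT_MINUTES = 15
VISIT_MINUTES = 60
FIRST_SLOT = 6 * 60 + 30   # 6:30 am, in minutes after midnight
LAST_SLOT = 20 * 60 + 30   # 8:30 pm, the last start time that ends by 9:30 pm

def slot_starts():
    """All bookable start times, in minutes after midnight."""
    return list(range(FIRST_SLOT, LAST_SLOT + 1, SLOT_MINUTES))

def meets(start_g, start_n):
    """Overlapping or adjoining one-hour visits: starts at most 60 minutes apart."""
    return abs(start_g - start_n) <= VISIT_MINUTES

def count_meetings():
    """Return (number of meeting combinations, number of all combinations)."""
    slots = slot_starts()
    hits = sum(meets(g, n) for g in slots for n in slots)
    return hits, len(slots) ** 2

if __name__ == "__main__":
    hits, total = count_meetings()
    print(f"{hits} of {total} combinations, P = {hits / total:.4f}")
```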

3. Calculating the probability to meet in one week

Let’s now combine this with the probability for the selection of the same day(s):

  • 0 overlapping days/week: 0.3333 x 0 = 0
  • 1 overlapping day/week: 0.5 x 0.15174 = 0.07587
  • 2 overlapping days/week: 0.16667 x (1 - (1 - 0.15174)²) = 0.16667 x 0.28045 = 0.04674. Note that here the probability of meeting once or twice is calculated as the counter-probability of not meeting on either of the two occasions.

Adding up the weighted probabilities gives P(G_meets_N) = 0 + 0.07587 + 0.04674 = 0.1226. This is the probability that Gerhard meets Norbert in any one week. In other words, in each of these 9 weeks I had only a 12.26 % chance of seeing Norbert.
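The weighted sum can be reproduced with exact fractions, which avoids the small rounding differences in the intermediate values. This is a Python sketch for illustration; `P_SLOT`, `P_DAYS` and `p_meet_in_week` are names chosen here, and the inputs are the results of steps 1 and 2.

```python
from fractions import Fraction

P_SLOT = Fraction(493, 3249)      # P(TimeSlotOverlap) from step 2
P_DAYS = {0: Fraction(12, 36),    # no shared day
          1: Fraction(18, 36),    # one shared day
          2: Fraction(6, 36)}     # two shared days

def p_meet_in_week(p_slot=P_SLOT, p_days=P_DAYS):
    """Sum over k shared days of P(k) * P(at least one meeting on k days)."""
    return sum(p * (1 - (1 - p_slot) ** k) for k, p in p_days.items())

if __name__ == "__main__":
    print(f"P(G_meets_N) = {float(p_meet_in_week()):.5f}")
```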

4. Probability for not meeting Norbert in 9 weeks

To calculate the probability of not meeting Norbert in 9 weeks, the counter-probability of meeting him in one week is raised to the power of 9:

(1 - 0.12261)**9 = 0.30812

The probability of the event that happened to me (not meeting Norbert in 9 weeks) is 30.81 %.
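As a quick check of this last step, here is a short Python sketch. It assumes, as the calculation above does, that the weeks are independent of each other; `p_no_meeting` and `P_WEEK` are illustrative names.

```python
P_WEEK = 0.12261  # weekly meeting probability from step 3

def p_no_meeting(p_week, weeks):
    """Probability of not meeting in any of `weeks` independent weeks."""
    return (1 - p_week) ** weeks

if __name__ == "__main__":
    for weeks in (1, 4, 9, 12):
        print(f"{weeks:2d} weeks: {p_no_meeting(P_WEEK, weeks):.4f}")
```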

Conclusion

What is the best way to increase the probability to meet Norbert?

  • Wait for a few more weeks?
  • Go to the fitness center on a daily basis?
  • Stay longer in the fitness center?

I have better advice: pick up the phone and call Norbert! That is what I did this week; we met in a restaurant and had a nice chat.

One of the lessons of this article could be: pick up your phone and call a friend you haven’t met or spoken to for a long time. Don’t wait or postpone. Sometimes you have to overcome the probabilities and not wait for random events to happen.

Running Monte Carlo Simulations

I also ran a Monte Carlo simulation in SAS Viya to simulate these probabilities. After my holidays (end of August) I will publish an article on SAS Communities which discusses the simulation framework and the SAS code for that.

Here I am just summarizing a few findings and lessons learned while implementing the Monte Carlo simulation.

Large number of iterations needed

Only when I increased the number of iterations to 1,000,000 or more did I get reasonably stable estimates for the probabilities “TimeSlotOverlap” and “G_meets_N”.

With 100,000 iterations I still saw very high variability. It looks like the large number of time slots, and deriving the overlap probability by simulation, requires a large number of iterations.
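To give an idea of what such a simulation can look like, here is a minimal sketch in Python. It is not the SAS Viya code, and it only simulates the weekly event “G_meets_N” under the assumptions of the calculation above; all names (`simulate_week`, `estimate_p_meet`, the constants) are hypothetical.

```python
import random

# Non-consecutive weekday pairs (Mon=1 ... Fri=5).
COMBOS = [(1, 3), (1, 4), (1, 5), (2, 4), (2, 5), (3, 5)]
N_SLOTS = 57   # start slots 6:30 am ... 8:30 pm in 15-minute steps
MAX_GAP = 4    # 4 slots = 60 minutes: overlapping or adjoining visits

def simulate_week(rng):
    """Return True if Gerhard and Norbert meet at least once in one week."""
    days_g, days_n = rng.choice(COMBOS), rng.choice(COMBOS)
    for _day in set(days_g) & set(days_n):
        # On each shared day both draw a start slot independently.
        if abs(rng.randrange(N_SLOTS) - rng.randrange(N_SLOTS)) <= MAX_GAP:
            return True
    return False

def estimate_p_meet(n_weeks, seed=None):
    """Share of simulated weeks with at least one meeting."""
    rng = random.Random(seed)
    return sum(simulate_week(rng) for _ in range(n_weeks)) / n_weeks

if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, estimate_p_meet(n, seed=1))
```

Running it with different numbers of iterations and seeds shows how the estimate settles down as the number of simulated weeks grows.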

Be careful when assigning the visit days in your simulation

In the end I chose the following solution to assign the visit days for Gerhard and Norbert: I randomly selected two days for each of them and retained only the records in which neither pair consists of identical or consecutive days.

Days_G[1] = rand('Table',1/5,1/5,1/5,1/5,1/5);
Days_G[2] = rand('Table',1/5,1/5,1/5,1/5,1/5);
Days_N[1] = rand('Table',1/5,1/5,1/5,1/5,1/5);
Days_N[2] = rand('Table',1/5,1/5,1/5,1/5,1/5);
if abs(Days_G[1] - Days_G[2]) > 1 and abs(Days_N[1] - Days_N[2]) > 1 then do;

This procedure makes sure that all combinations (Mon+Wed, Mon+Thu, …) are equally likely. Other approaches can easily produce unequal probabilities.
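The same idea, draw two days uniformly and reject pairs that are identical or consecutive, can be sketched in Python to check the resulting frequencies. This version redraws until a valid pair appears instead of dropping the record, which gives the same distribution; the function names are made up for the example.

```python
import random
from collections import Counter

def draw_day_pair(rng):
    """Draw two weekdays (Mon=1 ... Fri=5); redraw until they are at least 2 apart."""
    while True:
        d1, d2 = rng.randint(1, 5), rng.randint(1, 5)
        if abs(d1 - d2) > 1:
            return tuple(sorted((d1, d2)))

def combo_frequencies(n, seed=None):
    """Relative frequency of each day combination in n accepted draws."""
    rng = random.Random(seed)
    counts = Counter(draw_day_pair(rng) for _ in range(n))
    return {combo: counts[combo] / n for combo in sorted(counts)}

if __name__ == "__main__":
    for combo, freq in combo_frequencies(60_000, seed=1).items():
        print(combo, round(freq, 4))
```

Each of the six combinations should come out close to 1/6.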

For example, you get unequal probabilities when you assign the first visit day randomly from Monday to Wednesday and then assign the second day based on the value of the first day:

Days_G[1] = rand('Table',1/3,1/3,1/3,0,0);
if Days_G[1] = 1 then Days_G[2] = rand('Table',0,0,1/3,1/3,1/3);
else if Days_G[1] = 2 then Days_G[2] = rand('Table',0,0,0,1/2,1/2);
else if Days_G[1] = 3 then Days_G[2] = 5;

At first sight this code might look fine. However, when you run the simulation you find that the combinations are no longer equally likely: Wednesday+Friday, for instance, shows up in a third of the weeks instead of a sixth, so Friday is clearly over-represented.
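The bias of the sequential assignment can be worked out exactly, without simulating. The following Python sketch mirrors the logic of the SAS snippet above (first day uniform on Monday to Wednesday, second day uniform on the days still allowed); `SECOND_DAY`, `sequential_combo_probs` and `day_probs` are illustrative names.

```python
from fractions import Fraction

# Allowed second days for each first day, as in the sequential SAS snippet.
SECOND_DAY = {1: [3, 4, 5], 2: [4, 5], 3: [5]}

def sequential_combo_probs():
    """Exact probability of each day combination under the sequential scheme."""
    probs = {}
    for first, options in SECOND_DAY.items():
        for second in options:
            probs[(first, second)] = Fraction(1, 3) * Fraction(1, len(options))
    return probs

def day_probs(combo_probs):
    """Probability that each weekday is one of the two visit days."""
    return {d: sum(p for c, p in combo_probs.items() if d in c)
            for d in range(1, 6)}

if __name__ == "__main__":
    probs = sequential_combo_probs()
    for combo, p in probs.items():
        print(combo, p, "(equal weighting would be 1/6)")
    print(day_probs(probs))
```

Under this scheme the combinations starting on Monday get 1/9 each, those starting on Tuesday 1/6 each, and Wednesday+Friday gets 1/3.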

It is not enough that your simulations run through without errors

This might sound obvious: the fact that the simulation code does not produce errors does not mean that it is correct. You definitely know that.

Especially when a lot of calculations cascade in your simulation procedure, it can be hard to detect that something went wrong. At the end you receive a result, and it is hard to decide whether it is right or wrong, because you most likely simulated it precisely because it is hard to calculate.

I therefore consistently add checks of my parameters’ distributions to my simulation code. Only after checking some of the univariate and multivariate frequencies of my intermediate simulation results did I notice two (stupid) mistakes in my code.

Gerhard Svolba

Applying data science and machine learning methods-Generating relevant findings to better understand business processes