Introduction to Monte Carlo Simulations
Introduction and History
Monte Carlo simulations, named after the famous casino in Monaco, have a rich history dating back to the 1940s. The method was developed by Stanislaw Ulam, a mathematician working on the Manhattan Project, and was named by Nicholas Metropolis, inspired by Ulam’s uncle’s interest in Monaco’s casinos. Originally used in physics to simulate neutron diffusion in fissile material, Monte Carlo methods quickly found applications in various fields, including finance.
The essence of Monte Carlo simulation lies in using random sampling to solve problems that might be deterministic in principle. In finance, this approach has become invaluable for dealing with the inherent uncertainties of markets and complex financial instruments. In bioinformatics and epidemiology, Monte Carlo simulations are used to model the spread of diseases and predict outcomes.
Licensing
The dmplot package is released under the MIT license, allowing free use and modification. Users must:
- Cite the original author (see LICENSE for details).
- Include the license in any redistribution.
Loading Sample Data
We’ll use the same sample data as in the README:
ticker <- "BTC/USDT"
data <- get_market_data(
symbols = ticker,
from = lubridate::now() - lubridate::days(7),
to = lubridate::now(),
frequency = "1 hour"
)
head(data)
#> symbol datetime open high low close volume
#> 4439 BTC/USDT 2024-01-15 05:00:00 42596.6 42814.7 42571.2 42734.3 77.96806
#> 4440 BTC/USDT 2024-01-15 06:00:00 42740.5 42797.0 42613.8 42648.2 68.88966
#> 4441 BTC/USDT 2024-01-15 07:00:00 42648.2 42766.3 42519.8 42709.8 84.02044
#> 4442 BTC/USDT 2024-01-15 08:00:00 42715.1 42772.0 42641.0 42682.8 64.09779
#> 4443 BTC/USDT 2024-01-15 09:00:00 42682.8 42756.6 42593.1 42746.8 65.64199
#> 4444 BTC/USDT 2024-01-15 10:00:00 42746.9 42764.0 42540.6 42562.1 56.32170
#> turnover
#> 4439 3330003
#> 4440 2941701
#> 4441 3582246
#> 4442 2736698
#> 4443 2801211
#> 4444 2401779
What are Monte Carlo Simulations?
Monte Carlo simulations are computational algorithms that rely on repeated random sampling to obtain numerical results. In the context of finance, they’re used to model the probability of different outcomes in a process that cannot easily be predicted due to the intervention of random variables.
Uses in Finance
Monte Carlo simulations have numerous applications in finance:
- Asset Pricing: Estimating the future value of assets or portfolios.
- Risk Management: Assessing potential losses and the probability of different risk scenarios.
- Derivatives Pricing: Valuing complex derivatives, especially those with path-dependent payoffs.
- Portfolio Optimisation: Determining optimal asset allocations under various constraints and market scenarios.
- Value at Risk (VaR) Calculations: Estimating the potential loss in value of a portfolio.
- Real Options Analysis: Valuing flexibility in business and investment decisions.
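As a rough illustration of the Value at Risk use case, the sketch below simulates 10-day returns for a hypothetical portfolio and reads the 95% VaR off the simulated loss distribution. The portfolio value, daily volatility, horizon, and the assumption of independent, normally distributed daily returns are all invented for the example and are not taken from dmplot.

```r
set.seed(42)

portfolio_value <- 100000   # hypothetical portfolio value
daily_vol       <- 0.02     # assumed daily volatility (2%)
horizon_days    <- 10
n_sims          <- 10000

# one row per simulation, one column per day of simulated simple returns
daily_returns <- matrix(
  rnorm(n_sims * horizon_days, mean = 0, sd = daily_vol),
  nrow = n_sims
)

# compound the daily returns into one horizon return per simulation
horizon_returns <- apply(1 + daily_returns, 1, prod) - 1

# profit and loss in currency units; losses are negative
pnl <- portfolio_value * horizon_returns

# 95% VaR: the loss that is exceeded in only 5% of simulated scenarios
var_95 <- -quantile(pnl, probs = 0.05, names = FALSE)
var_95
```

Under these assumptions the result should land near 1.645 * 0.02 * sqrt(10) times the portfolio value, which gives a quick sanity check on the simulation.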
Usage of dmplot’s MonteCarlo Class
The code below creates a new MonteCarlo object, runs 2,500 simulations projected 180 days (30 * 6) into the future, and inspects the results.
box::use(dmplot[ MonteCarlo ])
monte <- MonteCarlo$new(data, number_sims = 2500, project_days = 30 * 6)
# run Monte Carlo simulation
monte$carlo()
# the results
monte$data
#> symbol datetime open high low close volume
#> <char> <POSc> <num> <num> <num> <num> <num>
#> 1: BTC/USDT 2024-01-15 05:00:00 42596.6 42814.7 42571.2 42734.3 77.96806
#> 2: BTC/USDT 2024-01-15 06:00:00 42740.5 42797.0 42613.8 42648.2 68.88966
#> 3: BTC/USDT 2024-01-15 07:00:00 42648.2 42766.3 42519.8 42709.8 84.02044
#> ---
#> 4316: BTC/USDT 2024-07-13 03:00:00 57942.8 57985.0 57787.0 57818.5 24.15110
#> 4317: BTC/USDT 2024-07-13 04:00:00 57818.6 57857.5 57773.5 57779.8 33.79860
#> 4318: BTC/USDT 2024-07-13 05:00:00 57779.9 57967.1 57762.9 57931.7 23.49568
#> 1 variable(s) not shown: [turnover <num>]
# the predicted prices
monte$simulation_results
#> close simulation datetime
#> <num> <int> <POSc>
#> 1: 57931.70 1 2024-07-13 05:00:00
#> 2: 57486.15 1 2024-07-14 05:00:00
#> 3: 57566.78 1 2024-07-15 05:00:00
#> ---
#> 449998: 62469.40 2500 2025-01-06 05:00:00
#> 449999: 62706.68 2500 2025-01-07 05:00:00
#> 450000: 63334.52 2500 2025-01-08 05:00:00
# the final prices of each simulation
monte$end_prices
#> close simulation datetime
#> <num> <int> <POSc>
#> 1: 60020.55 1 2025-01-08 05:00:00
#> 2: 65545.35 2 2025-01-08 05:00:00
#> 3: 53832.18 3 2025-01-08 05:00:00
#> ---
#> 2498: 53123.07 2498 2025-01-08 05:00:00
#> 2499: 56955.83 2499 2025-01-08 05:00:00
#> 2500: 63334.52 2500 2025-01-08 05:00:00
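To make the relationship between the two result tables concrete, the sketch below builds a small stand-in for monte$simulation_results (same close and simulation columns, made-up values) and derives the final price of each path from it. It only illustrates the data layout and is not necessarily how dmplot computes end_prices internally.

```r
set.seed(1)

n_sims <- 5
n_days <- 4

# stand-in for monte$simulation_results: long format, one row per path and day
sim_results <- data.frame(
  simulation = rep(seq_len(n_sims), each = n_days),
  close      = round(runif(n_sims * n_days, min = 50000, max = 65000), 2)
)

# the end price of a path is the last close recorded for that simulation
end_prices <- aggregate(
  close ~ simulation,
  data = sim_results,
  FUN  = function(x) x[length(x)]
)

# summarise the distribution of final prices
quantile(end_prices$close, probs = c(0.05, 0.5, 0.95))
```

The same quantile call can be applied directly to monte$end_prices$close to summarise a real run.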
Visualisation
dmplot provides several visualisation methods to help interpret the results.
Below we plot the simulated price paths:
monte$plot_prices()
We can also visualise the distribution of final prices as violin plots, which help us understand the range of possible outcomes:
monte$plot_distribution()
Finally, we can combine historical data with the simulations to see how the modelled scenarios continue from the observed price history:
monte$plot_prices_and_predictions()
These visualisations can provide insights into the range of possible outcomes and the likelihood of different scenarios.
Implementation and design
The basic steps for implementing a Monte Carlo simulation in finance are:
- Define the parameters and inputs of the model.
- Generate random scenarios based on the input parameters.
- Calculate the outcome for each scenario.
- Aggregate the results of all scenarios.
- Analyse the distribution of outcomes.
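The five steps above can be sketched in base R without any package. The example estimates the probability that a price falls 10% below its starting level at some point within 30 days. The starting price, volatility, barrier, and the zero-drift normal return model are assumptions chosen for illustration.

```r
set.seed(123)

# 1. Define the parameters and inputs of the model (all values hypothetical)
start_price <- 100
daily_vol   <- 0.03
n_days      <- 30
n_sims      <- 5000
barrier     <- 0.9 * start_price   # "price falls 10% below the start"

# 2. Generate random scenarios: a matrix of daily returns, one column per path
returns <- matrix(rnorm(n_days * n_sims, mean = 0, sd = daily_vol), nrow = n_days)

# 3. Calculate the outcome for each scenario: build each price path and
#    record whether it ever touched the barrier
paths       <- start_price * apply(1 + returns, 2, cumprod)
hit_barrier <- apply(paths, 2, function(p) any(p <= barrier))

# 4. Aggregate the results of all scenarios
prob_hit <- mean(hit_barrier)

# 5. Analyse the distribution of outcomes
final_prices <- paths[n_days, ]
prob_hit
summary(final_prices)
```

Because the outcome depends on the whole path rather than only the final price, this is the kind of question where simulation is usually more convenient than a closed-form formula.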
Our package condenses this into a three-step process through the MonteCarlo R6 class:
- Create a new MonteCarlo object with historical price data.
- Run the simulation with the desired number of simulations and projection days.
- Visualise the results using the provided plotting methods.
monte <- MonteCarlo$new(
data,
number_sims = 2500,
project_days = 30 * 6
)
monte$carlo()
monte$plot_prices()
Mathematics of Monte Carlo Simulations in Finance
The core of Monte Carlo simulations in finance is often based on the assumption that asset prices follow a geometric Brownian motion, described by the stochastic differential equation:
\[ dS = \mu S\,dt + \sigma S\,dW \]
Where:
- S is the asset price
- μ is the drift (expected return)
- σ is the volatility
- dW is the increment of a Wiener process (Brownian motion)
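Over a discrete time step Δt, the solution of this equation gives the update:
\[ S(t+\Delta t) = S(t)\,\exp\left(\left(\mu - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\sqrt{\Delta t}\,\varepsilon\right) \]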
Where ε is a standard normal random variable.
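Our implementation uses a simpler discrete approximation of this process: each day the price is multiplied by (1 + r), where r is drawn from a normal distribution with mean 0 and a standard deviation equal to the daily volatility estimated from historical data. No drift term is applied.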
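The discrete-time formula above can be sketched in base R as follows. This illustrates the formula itself, independently of the package's C++ code, and the parameter values (initial price, annual drift and volatility, 252 trading days per year) are assumptions. As a check, the simulated mean terminal price is compared with the theoretical expectation S(0)·exp(μT).

```r
set.seed(2024)

s0     <- 100      # initial price (hypothetical)
mu     <- 0.05     # annual drift
sigma  <- 0.2      # annual volatility
dt     <- 1 / 252  # one trading day, in years
n_days <- 252
n_sims <- 20000

# epsilon: standard normal draws, one column per simulated path
eps <- matrix(rnorm(n_days * n_sims), nrow = n_days)

# log-return of each step: (mu - 0.5 * sigma^2) * dt + sigma * sqrt(dt) * epsilon
log_steps <- (mu - 0.5 * sigma^2) * dt + sigma * sqrt(dt) * eps

# terminal price: S(T) = S(0) * exp(sum of the step log-returns)
s_T <- s0 * exp(colSums(log_steps))

# under GBM the expected terminal price is S(0) * exp(mu * T)
c(simulated = mean(s_T), theoretical = s0 * exp(mu * n_days * dt))
```

The two numbers should agree closely. A large gap would usually point to a mistake in the drift correction term.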
Best Practices and What Not to Do
While Monte Carlo simulations are powerful, they should be used carefully:
- Do: Understand your inputs. The quality of your simulation depends heavily on the quality of your input parameters.
- Don’t: Rely solely on historical data for parameter estimation. Past performance doesn’t guarantee future results.
- Do: Run a sufficient number of simulations. The sampling error of a Monte Carlo estimate shrinks roughly in proportion to 1/√N, so quadrupling the number of simulations only halves the error, and there’s a trade-off with computational time.
- Don’t: Ignore the limitations of your model. All models are simplifications of reality.
- Do: Validate your model against real-world data when possible.
- Don’t: Forget about extreme events. Standard models often underestimate the probability of extreme events.
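The point about simulation counts can be checked empirically. The sketch below estimates the mean final price of a zero-drift random walk with 100, 1,000 and 10,000 paths and reports the standard error of each estimate. The price, volatility and horizon are hypothetical, and simulate_final is a helper defined only for this example.

```r
set.seed(7)

start_price <- 100
daily_vol   <- 0.02
n_days      <- 30

# simulate n final prices from a zero-drift multiplicative random walk
simulate_final <- function(n) {
  returns <- matrix(rnorm(n * n_days, mean = 0, sd = daily_vol), nrow = n)
  start_price * apply(1 + returns, 1, prod)
}

sim_counts <- c(100, 1000, 10000)

# standard error of the estimated mean final price for each simulation count
std_errors <- sapply(sim_counts, function(n) {
  finals <- simulate_final(n)
  sd(finals) / sqrt(n)
})

data.frame(n_sims = sim_counts, std_error = std_errors)
```

Each tenfold increase in paths should cut the standard error by a factor of about √10, which is why very high precision becomes expensive.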
Monte Carlo Simulation Implementation Details
C++ implementation algorithm explanation
The core of our Monte Carlo simulation is implemented in C++ for performance. Let’s break down the monte_carlo function:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List monte_carlo(double seed_price, double daily_vol, int num_sims, int num_days) {
  // one row per simulated day across all paths
  int total_rows = num_sims * num_days;
  Rcpp::NumericVector close(total_rows);
  Rcpp::IntegerVector sim_idx(total_rows);
  Rcpp::NumericVector end_price(num_sims);
  Rcpp::IntegerVector end_idx(num_sims);

  int row_index = 0;
  for (int i = 0; i < num_sims; ++i) {
    // every path starts from the same seed price
    double current_price = seed_price;
    for (int j = 0; j < num_days; ++j) {
      // apply a normally distributed daily return (mean 0, sd daily_vol)
      current_price *= (1 + R::rnorm(0, daily_vol));
      close[row_index] = current_price;
      sim_idx[row_index] = i + 1;  // 1-based simulation index for R
      ++row_index;
    }
    // keep the last price of the path
    end_price[i] = current_price;
    end_idx[i] = i + 1;
  }

  // all simulated prices in long format
  Rcpp::DataFrame sim_df = Rcpp::DataFrame::create(
    Rcpp::_["close"] = close,
    Rcpp::_["simulation"] = sim_idx
  );
  // final price of each path
  Rcpp::DataFrame end_df = Rcpp::DataFrame::create(
    Rcpp::_["close"] = end_price,
    Rcpp::_["simulation"] = end_idx
  );

  return Rcpp::List::create(
    Rcpp::_["simulations"] = sim_df,
    Rcpp::_["end_prices"] = end_df
  );
}
- Initialisation: We create vectors to store the simulated prices (close), simulation indices (sim_idx), final prices (end_price), and final simulation indices (end_idx).
- Simulation Loop: We iterate num_sims times, each iteration representing a complete price path.
- Price Path Generation: For each simulation, we start with the seed_price and generate num_days of price movements.
- Daily Price Movement: Each day’s price is calculated using the formula current_price *= (1 + R::rnorm(0, daily_vol)). This is a multiplicative random walk with zero drift, which approximates a driftless geometric Brownian motion when daily_vol is small, where:
  - R::rnorm(0, daily_vol) generates a random number from a normal distribution with mean 0 and standard deviation daily_vol.
  - Multiplying by (1 + ...) ensures the price changes proportionally.
- Data Storage: We store each day’s price and its corresponding simulation index.
- Results Compilation: After all simulations, we create two data frames:
  - simulations: Contains all simulated prices and their corresponding simulation indices.
  - end_prices: Contains only the final price of each simulation path.
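For cross-checking, the same loop can be written in plain R. The function below, monte_carlo_r, is a hypothetical translation written for this example and is not part of dmplot. It mirrors the C++ logic step by step and will be much slower for large inputs. The daily_vol value in the usage lines is an arbitrary assumption.

```r
# Pure R translation of the C++ loop, intended only for cross-checking
monte_carlo_r <- function(seed_price, daily_vol, num_sims, num_days) {
  total_rows <- num_sims * num_days
  close      <- numeric(total_rows)   # pre-allocate, as the C++ version does
  sim_idx    <- integer(total_rows)
  end_price  <- numeric(num_sims)

  row_index <- 1L
  for (i in seq_len(num_sims)) {
    current_price <- seed_price       # every path starts from the seed price
    for (j in seq_len(num_days)) {
      # apply a normally distributed daily return (mean 0, sd daily_vol)
      current_price <- current_price * (1 + rnorm(1, mean = 0, sd = daily_vol))
      close[row_index]   <- current_price
      sim_idx[row_index] <- i
      row_index <- row_index + 1L
    }
    end_price[i] <- current_price     # last price of the path
  }

  list(
    simulations = data.frame(close = close, simulation = sim_idx),
    end_prices  = data.frame(close = end_price, simulation = seq_len(num_sims))
  )
}

set.seed(99)
res <- monte_carlo_r(seed_price = 57931.7, daily_vol = 0.02, num_sims = 3, num_days = 5)
res$end_prices
```

Each row of end_prices matches the last row of the corresponding path in simulations, which is the invariant the C++ function maintains as well.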
Implementation Notes
- We use R’s own random number generator via Rcpp (R::rnorm) for consistency with R’s random number generation.
- The function is optimised for speed by pre-allocating memory for all results and using as few loops as necessary for all calculations.
- The results are returned as R data.frames for easy integration with R code.
R6 Class Implementation
The MonteCarlo R6 class provides a user-friendly interface for running Monte Carlo simulations and analysing the results. Here’s an overview of its structure and functionality:
The MonteCarlo class is organised into three main sections:
- Private fields and methods: These handle internal data validation and preparation.
- Active bindings: Provide access to simulation results.
- Public fields and methods: Allow users to configure, run, and visualise simulations.
Some of the key components are:
- Data Preparation: The prepare method calculates historical returns and volatility from the input data.
- Simulation Execution: The carlo method calls the C++ monte_carlo function and processes the results.
- Visualisation: Three methods (plot_prices, plot_distribution, plot_prices_and_predictions) provide different ways to visualise the simulation results.
Here’s a quick overview of the public methods available in the MonteCarlo class:
#' Monte Carlo Simulation R6 Class
#'
#' @description
#' An R6 class for performing Monte Carlo simulations on financial time series data.
#' This class provides methods for data preparation, simulation execution, and result visualization.
#'
#' @details
#' The MonteCarlo class uses historical price data to calculate volatility and perform
#' Monte Carlo simulations for future price movements. It leverages the C++ implementation
#' of the Monte Carlo algorithm for efficiency.
#'
#' @field data A data.table containing the historical price data.
#' @field simulation_results A data.table containing the results of the Monte Carlo simulation.
#' @field end_prices A data.table containing the final prices from each simulation path.
#' @field log_historical Logical. Whether to use log returns for historical volatility calculation.
#' @field number_sims Integer. The number of simulation paths to generate.
#' @field project_days Integer. The number of days to project into the future.
#' @field start_date POSIXct. The start date for the simulation (last date of historical data).
#' @field verbose Logical. Whether to print progress messages.
#'
#' @export
MonteCarlo <- R6::R6Class(
  "MonteCarlo",
  private = list(
    # method bodies are omitted in this overview
    validate_data = \() {},
    prepare = \(log_historical = FALSE) {},
    seed_price = NA_real_,
    daily_vol = NA_real_
  ),
  active = list(
    #' @field results A list containing simulation results and end prices.
    results = \() {}
  ),
  public = list(
    data = NULL,
    simulation_results = NULL,
    end_prices = NULL,
    log_historical = FALSE,
    number_sims = 1000,
    project_days = 30,
    start_date = NULL,
    verbose = FALSE,
    #' @description
    #' Create a new MonteCarlo object.
    #' @param dt A data.table containing historical price data.
    #' @param log_historical Logical. Whether to use log returns for historical volatility calculation.
    #' @param number_sims Integer. The number of simulation paths to generate.
    #' @param project_days Integer. The number of days to project into the future.
    #' @param verbose Logical. Whether to print progress messages.
    initialize = \(dt, log_historical = FALSE, number_sims = 1000, project_days = 30, verbose = FALSE) {},
    #' @description
    #' Run the Monte Carlo simulation.
    carlo = \() {},
    #' @description
    #' Plot the simulated price paths.
    #' @return A ggplot object showing the simulated price paths.
    plot_prices = \() {},
    #' @description
    #' Plot the distribution of final prices.
    #' @return A ggplot object showing the distribution of final prices.
    plot_distribution = \() {},
    #' @description
    #' Plot historical prices and simulated future prices.
    #' @return A ggplot object showing historical and simulated prices.
    plot_prices_and_predictions = \() {}
  )
)
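The data preparation step described above, which derives returns and volatility from historical closes, can be sketched in base R. The helper estimate_vol is hypothetical and is not the package's private prepare method, whose details are not shown here. Its log_returns argument only mimics the role of the log_historical field. The input is the first six hourly closes from the sample data shown earlier, so the result is a per-period (hourly) volatility.

```r
# hypothetical helper, not part of dmplot
estimate_vol <- function(close, log_returns = FALSE) {
  stopifnot(is.numeric(close), length(close) >= 3, all(close > 0))
  returns <- if (log_returns) {
    diff(log(close))                 # log returns: ln(P_t / P_{t-1})
  } else {
    diff(close) / head(close, -1)    # simple returns: P_t / P_{t-1} - 1
  }
  sd(returns)                        # volatility = standard deviation of returns
}

# first six closing prices from the sample data
close <- c(42734.3, 42648.2, 42709.8, 42682.8, 42746.8, 42562.1)

vol_simple <- estimate_vol(close)
vol_log    <- estimate_vol(close, log_returns = TRUE)
c(simple = vol_simple, log = vol_log)
```

For small price moves the two estimates are nearly identical, so the choice between simple and log returns matters mostly for volatile series or long sampling intervals.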
Performance Considerations
Monte Carlo simulations can be computationally intensive, especially with a large number of simulations or complex models. Our package addresses this by implementing the core simulation logic in C++ via Rcpp, which avoids the overhead of interpreted R loops and is typically much faster than an equivalent loop written in pure R.
Conclusion
Monte Carlo simulations are a powerful tool in the financial analyst’s toolkit. They allow us to model complex, real-world systems and make probabilistic forecasts. However, they should be used judiciously, with a clear understanding of their assumptions and limitations.
Our package aims to make these sophisticated techniques accessible and efficient, allowing analysts to focus on interpreting results rather than implementation details.
Remember, while Monte Carlo simulations can provide valuable insights, they are not crystal balls. They are tools to help inform decision-making, not to predict the future with certainty.