Getting started with the dmplot framework
Introduction
In this vignette I explain the conventions and design choices of dmplot. This will guide you in working with data and in using dmplot and the ggplot2 framework effectively to derive insight from your analyses.
dmplot is not limited to financial data; its conventions and design choices apply to any time series data. However, the package does focus on financial data analysis and visualisation.
Licensing
The dmplot package is released under the MIT license, allowing free use and modification. Users must:
- Cite the original author (see LICENSE for details).
- Include the license in any redistribution.
Setup
Let’s install the necessary libraries.
I often use the following packages:
- data.table for working with large datasets.
- dmplot for plotting financial and time series datasets.
- kucoin for interacting with the KuCoin API and getting cryptocurrency financial data.
- ggplot2 for plotting.
- box for loading modules in R.
I strongly recommend using data.table for any work in finance. Working efficiently with large datasets is one of the primary reasons data.table was created. To install data.table on M1 macOS, consult this guide for building from source: gist.github.com/dereckmezquita/ed860601138a46cf591a1bdcc95db0a2
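# data.table is built from source here (see the guide above for M1 macOS)
install.packages("data.table", type = "source")
install.packages("TTR")
install.packages("ggplot2")
install.packages("box")
# remotes provides install_github(); lubridate is used later to build the date range
install.packages("remotes")
install.packages("lubridate")
remotes::install_github("dereckmezquita/dmplot")
remotes::install_github("dereckmezquita/kucoin")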
Now, let’s load the required libraries:
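A minimal sketch of the loading step, assuming that dt is the box alias for data.table, since later chunks call dt$as.data.table, and that get_market_data is attached from kucoin, since it is later called without a prefix. Both are assumptions inferred from the rest of this vignette.

```r
# Assumption: `dt` is the module alias used later as dt$as.data.table()
box::use(dt = data.table)
# Assumption: get_market_data() is exported by the kucoin package
box::use(kucoin[get_market_data])
```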
dmplot and “Tidy Data” for Financial Data Analysis and Visualization
The dmplot package provides a toolkit for plotting financial and time series datasets in the ggplot2 framework. It includes functions for plotting candlestick charts, moving averages, Bollinger Bands, MACD, RSI, and Stochastic Oscillator.
To make the best use of ggplot2 when visualising our financial analyses, we must adhere to the “tidy data” convention, whereby each column is a variable, each row is an observation, and each cell is a single value. This is the format that dmplot expects.
Thus, any calculated indicators should be added as new columns to the dataset.
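As a minimal sketch of why this layout helps, the example below adds an indicator as a new column and maps it to its own ggplot2 layer. The prices table and the sma3 column are made up for illustration; frollmean comes from data.table and the plot uses plain ggplot2 rather than any dmplot function.

```r
library(data.table)
library(ggplot2)

# Made-up daily closing prices, one row per observation
prices <- data.table(
    day = 1:8,
    close = c(10, 11, 12, 11, 13, 14, 13, 15)
)

# The indicator becomes a new column: a 3-period simple moving average
prices[, sma3 := frollmean(close, 3)]

# Each column maps directly to an aesthetic, one layer per variable
p <- ggplot(prices, aes(x = day)) +
    geom_line(aes(y = close)) +
    geom_line(aes(y = sma3), linetype = "dashed", na.rm = TRUE)
```

Because the indicator lives in the same table as the prices, adding another one is a matter of one more column and one more layer.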
I offer two points of guidance:
- Use data.table.
- Functions must return a named list of values.
The reason for the first point is that data.table is a powerful and highly efficient package for working with large datasets. The reason for the second is that a function which returns a named list of values can easily be used within data.table to create new columns.
petal_ratios <- function(petal_length, petal_width, sepal_length, sepal_width) {
    petal_ratio <- petal_length / petal_width
    sepal_ratio <- sepal_length / sepal_width
    return(list(petal_ratio = petal_ratio, sepal_ratio = sepal_ratio))
}
iris2 <- dt$as.data.table(iris)
head(iris2)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <num> <num> <num> <num> <fctr>
#> 1: 5.1 3.5 1.4 0.2 setosa
#> 2: 4.9 3.0 1.4 0.2 setosa
#> 3: 4.7 3.2 1.3 0.2 setosa
#> 4: 4.6 3.1 1.5 0.2 setosa
#> 5: 5.0 3.6 1.4 0.2 setosa
#> 6: 5.4 3.9 1.7 0.4 setosa
iris2[,
    c("petal_ratio", "sepal_ratio") := petal_ratios(
        Petal.Length,
        Petal.Width,
        Sepal.Length,
        Sepal.Width
    )
]
head(iris2)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species petal_ratio
#> <num> <num> <num> <num> <fctr> <num>
#> 1: 5.1 3.5 1.4 0.2 setosa 7.00
#> 2: 4.9 3.0 1.4 0.2 setosa 7.00
#> 3: 4.7 3.2 1.3 0.2 setosa 6.50
#> 4: 4.6 3.1 1.5 0.2 setosa 7.50
#> 5: 5.0 3.6 1.4 0.2 setosa 7.00
#> 6: 5.4 3.9 1.7 0.4 setosa 4.25
#> 1 variable(s) not shown: [sepal_ratio <num>]
As you can see, sticking to this convention allows us to use the data.table framework easily and efficiently to calculate new columns and thus new indicators.
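The same convention carries over to grouped calculations, which matter once a table holds more than one symbol. The sketch below uses a hypothetical helper, ret_and_sma, and made-up prices for two made-up symbols; neither is part of dmplot.

```r
library(data.table)

# Hypothetical helper (not part of dmplot): simple return and a moving
# average, returned as a named list so both can be assigned at once
ret_and_sma <- function(x, n = 2) {
    list(
        ret = x / shift(x) - 1,
        sma = frollmean(x, n)
    )
}

# Made-up closing prices for two symbols in one tidy table
prices <- data.table(
    symbol = rep(c("AAA", "BBB"), each = 4),
    close = c(10, 11, 12, 11, 100, 102, 101, 105)
)

# by = symbol restarts the calculation for each symbol, so the first
# row of every group has no previous value to compare against
prices[, c("ret", "sma") := ret_and_sma(close), by = symbol]
```

Grouping with by keeps values from one symbol from leaking into the indicators of the next.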
Loading Sample Data
We’ll use the same sample data as in the README:
ticker <- "BTC/USDT"
data <- get_market_data(
    symbols = ticker,
    from = lubridate::now() - lubridate::days(7),
    to = lubridate::now(),
    frequency = "1 hour"
)
head(data)
#> symbol datetime open high low close volume
#> <char> <POSc> <num> <num> <num> <num> <num>
#> 1: BTC/USDT 2024-07-06 05:00:00 56057.9 56485.8 56025.2 56363.1 79.48139
#> 2: BTC/USDT 2024-07-06 06:00:00 56363.1 56573.3 56346.6 56413.8 41.94569
#> 3: BTC/USDT 2024-07-06 07:00:00 56413.7 56672.4 56402.5 56602.2 98.02022
#> 4: BTC/USDT 2024-07-06 08:00:00 56602.2 56655.3 56508.8 56555.8 49.06419
#> 5: BTC/USDT 2024-07-06 09:00:00 56555.9 56767.0 56441.6 56759.9 42.89421
#> 6: BTC/USDT 2024-07-06 10:00:00 56757.4 56887.1 56645.3 56786.6 45.64148
#> 1 variable(s) not shown: [turnover <num>]
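If the API is unreachable, a synthetic stand-in with the same column layout may be enough to follow the rest of this vignette. The sketch below is an illustration under stated assumptions: the prices are a seeded random walk rather than market data, and turnover is approximated as volume times close, which may not match how KuCoin defines it.

```r
library(data.table)

set.seed(42)
n <- 168  # 7 days of hourly candles

# Random-walk closing prices; each candle opens at the previous close
close <- 56000 + cumsum(rnorm(n, sd = 150))
open <- c(close[1], head(close, -1))

synthetic <- data.table(
    symbol = "BTC/USDT",
    datetime = seq(
        as.POSIXct("2024-07-06 05:00:00", tz = "UTC"),
        by = "1 hour",
        length.out = n
    ),
    open = open,
    # high and low are pushed outside the open/close range
    high = pmax(open, close) + runif(n, 0, 100),
    low = pmin(open, close) - runif(n, 0, 100),
    close = close,
    volume = runif(n, 20, 100)
)

# Assumption: turnover approximated as volume * close
synthetic[, turnover := volume * close]
```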
Calculating Financial Indicators
I provide a host of functions for calculating financial indicators in the dmplot package. These functions are designed to be used within the data.table framework and return a named list of values.
However, if you need to use outside functions, you can easily wrap them in a function which returns a named list of values.
Here we demonstrate by wrapping TTR::EMA; TTR::BBands is imported as well, while the Bollinger Bands below are calculated with bb from dmplot:
box::use(TTR[EMA, BBands])
box::use(dmplot[ bb ])
# redefine our function to return a list
ema <- function(x, n, wilder = TRUE) {
    return(as.list(as.data.frame(EMA(x, n = n, wilder = wilder))))
}
# calculate the short and long moving averages
data[, ema_short := ema(close, n = 10, wilder = TRUE)]
data[, ema_long := ema(close, n = 50, wilder = TRUE)]
# calculate the bollinger bands
data[,
    c("bb_lower", "bb_mavg", "bb_upper", "bb_pct") := bb(
        close, n = 10,
        sd = 2
    )
]
head(data)
#> symbol datetime open high low close volume
#> <char> <POSc> <num> <num> <num> <num> <num>
#> 1: BTC/USDT 2024-07-06 05:00:00 56057.9 56485.8 56025.2 56363.1 79.48139
#> 2: BTC/USDT 2024-07-06 06:00:00 56363.1 56573.3 56346.6 56413.8 41.94569
#> 3: BTC/USDT 2024-07-06 07:00:00 56413.7 56672.4 56402.5 56602.2 98.02022
#> 4: BTC/USDT 2024-07-06 08:00:00 56602.2 56655.3 56508.8 56555.8 49.06419
#> 5: BTC/USDT 2024-07-06 09:00:00 56555.9 56767.0 56441.6 56759.9 42.89421
#> 6: BTC/USDT 2024-07-06 10:00:00 56757.4 56887.1 56645.3 56786.6 45.64148
#> 7 variable(s) not shown: [turnover <num>, ema_short <num>, ema_long <num>, bb_lower <num>, bb_mavg <num>, bb_upper <num>, bb_pct <num>]
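The same wrapping idea extends to outside functions that return several columns at once. The sketch below wraps TTR::BBands, which returns a matrix with columns dn, mavg, up and pctB, in a hypothetical helper called bbands_list (not part of dmplot) and applies it to made-up prices.

```r
library(data.table)

# Hypothetical wrapper (not part of dmplot): split the matrix returned by
# TTR::BBands() into a named list, one element per new column
bbands_list <- function(x, n = 20, sd = 2) {
    res <- TTR::BBands(x, n = n, sd = sd)
    list(
        bb_lower = res[, "dn"],
        bb_mavg = res[, "mavg"],
        bb_upper = res[, "up"],
        bb_pct = res[, "pctB"]
    )
}

# Made-up closing prices from a seeded random walk
set.seed(1)
prices <- data.table(close = 100 + cumsum(rnorm(30)))

prices[,
    c("bb_lower", "bb_mavg", "bb_upper", "bb_pct") := bbands_list(
        close, n = 5,
        sd = 2
    )
]
tail(prices, 3)
```

The first n - 1 rows are NA because the rolling window is not yet full; this is worth keeping in mind when plotting.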