## A simple but efficient technique to increase the amount of time series data

*This blog post is available as a **Jupyter notebook on GitHub**.*

Augmentations have become an indispensable component of computer vision pipelines. However, their popularity hasn't reached the same heights in other domains, such as time series. In this tutorial, I'll delve into the world of time series augmentations, shedding light on their significance and providing concrete examples of their application using the powerful generative time series modeling library, TSGM [5].

Our starting point is a dataset denoted (𝐗, 𝐲). Here, 𝐱ᵢ ∈ 𝐗 are multivariate time series (meaning each time point is a multi-dimensional feature vector), and y are labels. Predicting the labels y is called a downstream task. Our goal is to use (𝐗, 𝐲) to produce additional samples (𝐗*, 𝐲*), which could help us solve the downstream task more effectively (in terms of predictive performance or robustness). For simplicity, we won't work with labels in this tutorial, but the methods described here are straightforward to generalize to the labeled case, and the software implementations we use are easily extended to the supervised case by adding extra parameters to the `.generate` method (see examples below).

Without further ado, let's consider time series augmentations one by one.

In TSGM, all augmentations are neatly organized in `tsgm.models.augmentations`, and you can check out the comprehensive documentation available at TSGM documentation.

Now, let's kickstart the coding examples by installing tsgm:

`pip install tsgm`

Moving forward, we import tsgm and load an exemplary dataset. The tensor `X` now contains 100 sine time series of length 64, with 2 features each, with random shifts, frequencies, and amplitudes (the maximum amplitude is 20).

```python
# import the libraries
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import random
from tensorflow import keras
import tsgm

# and now generate the dataset
X = tsgm.utils.gen_sine_dataset(100, 64, 2, max_value=20)
```

First, as the first augmentation, we consider jittering.

Time series data are augmented with random Gaussian noise (Wikipedia)

In tsgm, Gaussian noise augmentation can be applied as follows:

```python
aug_model = tsgm.models.augmentations.GaussianNoise()
samples = aug_model.generate(X=X, n_samples=10, variance=0.2)
```

The idea behind Gaussian noise augmentation is that adding a small amount of jitter to a time series probably won't change it significantly, but will increase the number of such noisy samples in our dataset. This often makes downstream models more robust to noisy inputs or improves predictive performance.

The choice of hyperparameters for Gaussian noise, and the way the noise is added (e.g., the noise can increase towards the end of a time series), is a tricky question and depends on the particular dataset and downstream problem. It is often worth experimenting and seeing how these parameters affect the performance of the target model.

Here, we provide a visualization of samples from the original sine dataset and augmented samples.
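For intuition, jittering is easy to sketch by hand in plain NumPy, including the time-dependent variant mentioned above, in which the noise grows toward the end of the series. The toy data, shapes, and linear noise schedule below are illustrative assumptions, not TSGM's implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
X_toy = rng.normal(size=(100, 64, 2))  # toy stand-in for the sine dataset

# constant-variance jittering: add N(0, 0.2) noise to every time point
jittered = X_toy + rng.normal(scale=np.sqrt(0.2), size=X_toy.shape)

# time-dependent variant: noise standard deviation grows linearly
# from 0 to 0.5 toward the end of each series
sigma_t = np.linspace(0.0, 0.5, X_toy.shape[1]).reshape(1, -1, 1)
jittered_ramp = X_toy + rng.normal(size=X_toy.shape) * sigma_t
```

In the ramped variant, the first time point receives no noise at all, while later points get progressively noisier.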

Another approach to time series augmentation is simply to shuffle the features. This approach is suitable only for particular multivariate time series: those that are invariant to all, or to particular, permutations of features. For instance, it can be applied to time series where each feature represents identical independent measurements from various sensors.

To explain this approach, let's take the example of five identical sensors, labeled S_1, S_2, S_3, S_4, and S_5. For the sake of illustration, let's assume that sensors 1–4 are exchangeable with respect to rotations. Then it makes sense to try augmenting the data with feature rotations with respect to the S_1, …, S_5 sensors.

Similarly to the previous example, the augmentation can work as follows:

```python
aug_model = tsgm.models.augmentations.Shuffle()
samples = aug_model.generate(X=X, n_samples=3)
```

Here, we show one sample from a time series with 5 features, and an augmented sample, analogously to the image above.
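Under the hood, feature shuffling amounts to permuting the last axis of the data tensor. Here is a minimal NumPy sketch; the toy data and helper are assumptions for illustration, not the TSGM code:

```python
import numpy as np

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(100, 64, 5))  # 5 exchangeable "sensor" features

def shuffle_features(x, rng):
    # permute the feature (last) axis; valid only when the features
    # are exchangeable measurements
    perm = rng.permutation(x.shape[-1])
    return x[..., perm]

augmented = np.stack([shuffle_features(X_toy[i], rng) for i in range(3)])
```

Note that each augmented sample contains exactly the same values as its source, only with the feature columns reordered.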

Slice and shuffle augmentation [3] cuts a time series into slices and shuffles those pieces. This augmentation can be performed for time series that exhibit some form of invariance over time. For instance, imagine a time series measured from wearable devices for several days. A good strategy in this case is to slice the time series by days and, by shuffling those days, obtain additional samples. Slice and shuffle augmentation is visualized in the following image:

```python
aug_model = tsgm.models.augmentations.SliceAndShuffle()
samples = aug_model.generate(X=X, n_samples=10, n_segments=3)
```

Let's view the augmented and original samples:
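Conceptually, this augmentation can be sketched in a few lines of NumPy; the toy series and helper below are illustrative assumptions, while TSGM's `SliceAndShuffle` is the tested implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(64, 2))  # one multivariate series

def slice_and_shuffle(x, n_segments, rng):
    # cut the time axis into n_segments chunks and permute their order
    segments = np.array_split(x, n_segments, axis=0)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order], axis=0)

sample = slice_and_shuffle(x, n_segments=3, rng=rng)
```

The augmented series keeps every original value; only the order of the segments changes.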

Magnitude warping [3] changes the magnitude of each sample in a time series dataset by multiplying the original time series with a cubic spline curve. This process scales the magnitude of time series, which can be beneficial in many cases, such as our synthetic example with sines. The spline has `n_knots` knots at random magnitudes distributed as *N(1, σ²)*, where *σ* is set by the parameter `sigma` in the function `.generate`.

```python
aug_model = tsgm.models.augmentations.MagnitudeWarping()
samples = aug_model.generate(X=X, n_samples=10, sigma=1)
```

Here is an example of original data and augmented samples generated with `MagnitudeWarping`.
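For intuition, here is a rough NumPy sketch of the idea. Note that it uses `np.interp` as a piecewise-linear stand-in for the cubic spline, so it only approximates the behavior described above; the toy data and defaults are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(64, 2))  # one series: 64 timesteps, 2 features

def magnitude_warp(x, n_knots=4, sigma=0.25):
    # one random warping curve per feature: knot values ~ N(1, sigma^2),
    # interpolated over the time axis (piecewise-linear here, cubic
    # spline in the actual method)
    t = np.arange(x.shape[0])
    knot_pos = np.linspace(0, x.shape[0] - 1, n_knots)
    curves = np.stack(
        [np.interp(t, knot_pos, rng.normal(1.0, sigma, n_knots))
         for _ in range(x.shape[1])],
        axis=-1,
    )
    return x * curves

warped = magnitude_warp(x)
```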

In this technique [4], the selected windows in the time series data are either sped up or slowed down. Then, the whole resulting time series is scaled back to the original size in order to keep the timesteps at the original length. See an example of such augmentation below:

Such augmentation can be beneficial, e.g., in modeling equipment. In such applications, sensor measurements can change their speed of change depending on how the equipment is used.

In tsgm, as always, the generation can be done via

```python
aug_model = tsgm.models.augmentations.WindowWarping()
samples = aug_model.generate(X=X, n_samples=10, scales=(0.5,), window_ratio=0.5)
```

An example of a generated time series can be found below.
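A rough sketch of the mechanics in plain NumPy: resample a random window at a different speed, then rescale the whole series back to the original length. This is an illustrative approximation of [4], not TSGM's implementation, and the toy data and helpers are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(64, 2))  # one series: 64 timesteps, 2 features

def resample(arr, new_len):
    # linearly resample each feature column to new_len points
    old_idx = np.arange(arr.shape[0])
    new_idx = np.linspace(0, arr.shape[0] - 1, new_len)
    return np.stack([np.interp(new_idx, old_idx, arr[:, f])
                     for f in range(arr.shape[1])], axis=-1)

def window_warp(x, window_ratio=0.5, scale=0.5):
    n = x.shape[0]
    w = int(n * window_ratio)
    start = rng.integers(0, n - w + 1)
    # speed the selected window up (scale < 1) or slow it down (scale > 1)
    window = resample(x[start:start + w], max(1, int(w * scale)))
    pieces = np.concatenate([x[:start], window, x[start + w:]], axis=0)
    # then rescale the whole series back to the original length
    return resample(pieces, n)

warped = window_warp(x)
```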

Dynamic Time Warping Barycentric Averaging (DTWBA) [2] is an augmentation method based on Dynamic Time Warping (DTW). DTW is a method for measuring the similarity between time series. The idea is to "sync" these time series, as demonstrated in the following picture.

More details on DTW computation are available at https://rtavenar.github.io/blog/dtw.html.

DTWBA goes like this:

1. The algorithm picks one time series to initialize the DTWBA result. This time series can either be given explicitly or chosen randomly from the dataset.
2. For each of the `N` time series, the algorithm computes the DTW distance and the path (the path is the mapping that minimizes the distance).
3. After computing all `N` DTW distances, the algorithm updates the DTWBA result by averaging with respect to all the paths found above.
4. The algorithm repeats steps (2) and (3) until the DTWBA result converges.
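To make step (2) concrete, here is a minimal pure-NumPy dynamic program for the DTW distance between two univariate series. It is an illustrative sketch with an absolute-difference local cost; see tslearn for a complete implementation:

```python
import numpy as np

def dtw_distance(a, b):
    # classic O(len(a) * len(b)) dynamic program: each cell holds the
    # cheapest cumulative cost of aligning prefixes a[:i] and b[:j]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

t = np.linspace(0, 2 * np.pi, 32)
print(dtw_distance(np.sin(t), np.sin(t)))  # identical series: prints 0.0
```

Tracking the argmin choices in the inner loop would additionally recover the warping path that DTWBA averages over.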

A reference implementation can be found in tslearn, and a description in [2].

In tsgm, the samples can be generated as follows:

```python
aug_model = tsgm.models.augmentations.DTWBarycentricAveraging()
initial_timeseries = random.sample(range(X.shape[0]), 10)
initial_timeseries = X[initial_timeseries]
samples = aug_model.generate(X=X, n_samples=10,
                             initial_timeseries=initial_timeseries)
```

Another approach to augmentation is to train a machine learning model on historical data and use it to generate novel synthetic samples. This is a black-box method because it is hard to interpret how the new samples were generated. Several such methods can be applied to time series; in particular, tsgm has VAEs, GANs, and Gaussian processes. An example of generating synthetic time series with VAEs is:

```python
n, n_ts, n_features = 1000, 24, 5
data = tsgm.utils.gen_sine_dataset(n, n_ts, n_features)
scaler = tsgm.utils.TSFeatureWiseScaler()
scaled_data = scaler.fit_transform(data)

architecture = tsgm.models.zoo["vae_conv5"](n_ts, n_features, 10)
encoder, decoder = architecture.encoder, architecture.decoder

vae = tsgm.models.cvae.BetaVAE(encoder, decoder)
vae.compile(optimizer=keras.optimizers.Adam())
vae.fit(scaled_data, epochs=1, batch_size=64)
samples = vae.generate(10)
```

We explored several methods for synthetic time series generation. Many of them introduce inductive biases into the model and are useful in practical settings.

How to choose? First, analyze whether your problem contains invariances. Is the data invariant to random noise? Is it invariant to feature shuffling?

Next, choose a broad set of methods and verify whether any of them improves the performance on your downstream problem (tsgm has a downstream performance metric). Then, select the set of augmentation methods that gives the largest performance boost.

*Last, but not least, I thank Letizia Iannucci and Georgy Gritsenko for their help and useful discussions about the writing of this post. Unless otherwise noted, all images are by the author.*

This blog post is part of the project TSGM, in which we are creating a tool for enhancing time series pipelines via augmentation and synthetic data generation. If you found it helpful, take a look at our repo and consider citing the paper about TSGM:

```bibtex
@article{nikitin2023tsgm,
  title={TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series},
  author={Nikitin, Alexander and Iannucci, Letizia and Kaski, Samuel},
  journal={arXiv preprint arXiv:2305.11567},
  year={2023}
}
```

[1] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition". IEEE Transactions on Acoustics, Speech, and Signal Processing, 26(1), 43–49 (1978).

[2] F. Petitjean, A. Ketterlin & P. Gançarski. A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, Elsevier, 2011, Vol. 44, Num. 3, pp. 678–693.

[3] Um TT, Pfister FM, Pichler D, Endo S, Lang M, Hirche S, Fietzek U, Kulić D (2017) Data augmentation of wearable sensor data for Parkinson's disease monitoring using convolutional neural networks. In: Proceedings of the 19th ACM International Conference on Multimodal Interaction, pp. 216–220.

[4] Rashid, K.M. and Louis, J., 2019. Window-warping: a time series data augmentation of IMU data for construction equipment activity identification. In ISARC. Proceedings of the International Symposium on Automation and Robotics in Construction (Vol. 36, pp. 651–657). IAARC Publications.

[5] Nikitin, A., Iannucci, L. and Kaski, S., 2023. TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series. *arXiv preprint arXiv:2305.11567*. arXiv link.