Synthetic Time Series Data Generator

Synthetic Time Series Data Generator (ts-data-generator) is a professional-grade Python library and command-line interface (CLI) engineered for data scientists, ML engineers, and software developers who require realistic, deterministic, and highly configurable synthetic time series datasets.

Whether you are benchmarking anomaly detection models, testing forecasting algorithms, or populating frontend dashboards before live data becomes available, ts-data-generator provides clean, composable building blocks to simulate complex, real-world temporal patterns.

Deterministic by Design: Every generated dataset is reproducible from a seed, which drives a PCG64-backed pseudo-random number generator (PRNG). The same seed yields the same data across machines, environments, and Python versions.
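
To illustrate what seed-based reproducibility means in practice, here is a minimal sketch that uses NumPy's public PCG64 bit generator directly. It does not show ts-data-generator internals, which may wire the seed differently; the helper name draw_noise is hypothetical.

```python
import numpy as np


def draw_noise(seed: int, n: int = 5) -> np.ndarray:
    """Draw n standard-normal samples from a PCG64-backed generator.

    Hypothetical helper for illustration; not part of ts-data-generator.
    """
    rng = np.random.Generator(np.random.PCG64(seed))
    return rng.standard_normal(n)


first = draw_noise(42)
second = draw_noise(42)
other = draw_noise(43)

# Same seed: identical samples. Different seed: a different stream.
same_seed_matches = bool(np.array_equal(first, second))
different_seed_differs = not np.array_equal(first, other)
print(same_seed_matches, different_seed_differs)
```

Creating a fresh generator per call, rather than sharing global random state, is what keeps each run independent of whatever code ran before it.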

🚀 Getting Started in 5 Minutes

1. Installation

Install the package with pip or with uv (recommended):

pip install ts-data-generator
# Or with uv:
uv pip install ts-data-generator

No install needed? Run the CLI directly with uvx:
uvx --from ts-data-generator tsdata --help
Use --from (not --with) because the package name (ts-data-generator) differs from the executable name (tsdata).

Optional extras (install features as needed):

# Schema imputing / CSV reverse-engineering (requires scipy)
pip install "ts-data-generator[imputer]"

# Built-in line plotting (requires matplotlib)
pip install "ts-data-generator[plotting]"

# Country-specific holiday detection (requires holidays)
pip install holidays

# All optional features
pip install "ts-data-generator[imputer,plotting]" holidays

2. Choose Your Workflow

ts-data-generator supports two workflows: the CLI for rapid prototyping in the terminal, and the Python API for scripted pipelines.

💻 Rapid Terminal Prototyping (CLI)

Generate a dataset with dimensions and composed metrics in a single command:

tsdata generate \
  --start 2024-01-01 \
  --end 2024-01-07 \
  --granularity h \
  --dims "region:US,EU,AP" \
  --mets "sales:LinearTrend(slope=10)+SinusoidalTrend(amplitude=10,freq=24)" \
  --output sales_data.csv

🐍 Pipeline Integration (Python API)

Compose your generators directly inside your training/validation pipelines or notebooks:

from ts_data_generator import DataGen
from ts_data_generator.utils.trends import LinearTrend, SinusoidalTrend

dg = DataGen(seed=42)
dg.start_datetime = "2024-01-01"
dg.end_datetime = "2024-01-07"
dg.to_granularity("h")

# Composing a metric from multiple trends
dg.add_metric(
    "sales",
    {
        LinearTrend(offset=10.0, slope=10),
        SinusoidalTrend(amplitude=10.0, freq=24.0)
    }
)

df = dg.data  # The generated data as a pandas DataFrame
dg.plot()  # Built-in line plot (needs the optional plotting extra)

🧩 Architectural Highlights

The generator is built around modular composition:

Realistic Trends & Seasonality: Compose complex signals by layering multiple trends (Sinusoidal, Linear, AR Noise, Markov Chains, stock-like random walks) onto a single metric. See: Trend Functions.
Contextual Dimensions: Enrich your metrics with dimensions (such as region, device_id, or user_type) using built-in or custom infinite iterables. See: Dimensions.
Stochastic Anomaly Injection: Inject realistic anomalies (isolated spikes, bursty data drops, or gradual concept drifts) after your trends are calculated to benchmark your detection pipelines. See: Anomalies.
Schema Imputing: Bootstrap a generation config by analyzing an existing historical CSV file. See: Imputer.
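
The layering and anomaly-injection ideas above can be sketched in plain NumPy and pandas. This is an illustration of the concept under stated assumptions, not the library's implementation: it assumes freq=24 means a 24-step period, and the spike size and count are arbitrary values chosen for the example.

```python
import numpy as np
import pandas as pd

# Hourly index over the same window used in the examples above.
index = pd.date_range("2024-01-01", "2024-01-07", freq="h")
t = np.arange(len(index))  # elapsed hours since the start

# Layer 1: linear growth. Layer 2: a sinusoid with an assumed 24-hour period.
linear = 10.0 + 10.0 * t
seasonal = 10.0 * np.sin(2 * np.pi * t / 24.0)
sales = pd.Series(linear + seasonal, index=index, name="sales")

# Inject isolated spikes only after the trends have been computed,
# so the clean signal stays available for comparison.
rng = np.random.Generator(np.random.PCG64(42))
spike_positions = rng.choice(len(sales), size=3, replace=False)
with_anomalies = sales.copy()
with_anomalies.iloc[spike_positions] += 500.0

changed = int((with_anomalies != sales).sum())
print(len(sales), changed)
```

Keeping the clean series and the spiked copy side by side gives a detection benchmark its ground truth: the positions in spike_positions are exactly the points a detector should flag.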

⚖️ Why Composable Primitives?

Most synthetic data generators sit at one of two extremes: they are either too simple (generating basic white noise) or too complex (requiring expensive black-box GAN models that are hard to interpret).

ts-data-generator sits between the two. With Composable Primitives, you keep control over the mathematical rules governing your data. You specify the rules explicitly (base growth, seasonal variation, noise patterns, and failure events), and the generator handles temporal alignment, index building, dimension broadcasting, and execution.
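
As a rough picture of what index building and dimension broadcasting involve, the sketch below builds a timestamp-by-region grid with pandas and repeats one metric across every region. It assumes the metric is computed once per timestamp and shared by all dimension values; the library's actual behavior and column names may differ.

```python
import numpy as np
import pandas as pd

timestamps = pd.date_range("2024-01-01", periods=48, freq="h")
regions = ["US", "EU", "AP"]

# Index building: one row per (timestamp, region) pair.
grid = pd.MultiIndex.from_product(
    [timestamps, regions], names=["timestamp", "region"]
)
frame = grid.to_frame(index=False)

# Dimension broadcasting: compute the metric once per timestamp,
# then repeat each value for every region. from_product varies the
# last level fastest, so np.repeat lines up with the row order.
t = np.arange(len(timestamps))
metric = 10.0 * t + 10.0 * np.sin(2 * np.pi * t / 24.0)
frame["sales"] = np.repeat(metric, len(regions))

print(frame.head(6))
```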
