Benchmarking and Stress Testing

Note

TunnelMesh includes built-in benchmarking tools to measure throughput and latency, and to test mesh performance under adverse network conditions. Benchmark traffic flows through actual mesh tunnels, giving realistic performance metrics.

Overview

The benchmark system consists of three tools:

Tool                     Purpose                                     Use Case
tunnelmesh benchmark     Network throughput and latency testing      Test tunnel performance between peers
tunnelmesh-benchmarker   Automated periodic benchmarking service     Continuous performance monitoring in Docker
tunnelmesh-s3bench       S3 storage stress testing with narratives   Test S3, deduplication, versioning, RBAC, shares

Network benchmark traffic flows through the actual mesh tunnel (TUN device → encrypted tunnel → peer), giving you realistic performance metrics for file transfers and real-time applications.

Quick Start

Local CLI Benchmark

# Basic speed test (10MB upload)
tunnelmesh benchmark peer-name

# Larger transfer for more accurate throughput measurement
tunnelmesh benchmark peer-name --size 100MB

# Download test
tunnelmesh benchmark peer-name --size 50MB --direction download

# Save results to JSON
tunnelmesh benchmark peer-name --output results.json

Docker Automated Benchmarks

# Start the full stack including benchmarker
cd docker
docker compose up -d

# View benchmark logs
docker compose logs -f benchmarker

# Results are saved to the benchmark-results volume
docker compose exec server ls /results/

Network Benchmark CLI Reference

tunnelmesh benchmark <peer-name> [flags]

Flags:
  --size string        Transfer size (default "10MB")
                       Examples: 1MB, 100MB, 1GB

  --direction string   Transfer direction (default "upload")
                       Options: upload, download

  --output string      Save results to JSON file

  --timeout duration   Benchmark timeout (default 2m0s)

  --port int           Benchmark server port (default 9998)

Chaos Testing Flags:
  --packet-loss float  Packet loss percentage, 0-100 (default 0)

  --latency duration   Additional latency to inject (default 0)
                       Examples: 10ms, 100ms, 1s

  --jitter duration    Random latency variation ±jitter (default 0)
                       Examples: 5ms, 20ms

  --bandwidth string   Bandwidth limit (default unlimited)
                       Examples: 1mbps, 10mbps, 100mbps, 1gbps

Chaos Testing

Warning

Chaos testing impacts performance: Packet loss, latency, and bandwidth limits intentionally degrade performance. Use these flags to test resilience, not to measure baseline performance. Always run clean benchmarks first.

Chaos testing simulates adverse network conditions to stress test your mesh and verify resilience.

Use Cases

Scenario         Flags                                                             Simulates
Lossy WiFi       --packet-loss 2                                                   Occasional packet drops
Mobile network   --latency 100ms --jitter 30ms                                     High, variable latency
Congested link   --bandwidth 5mbps                                                 Bandwidth-constrained path
Worst case       --packet-loss 5 --latency 200ms --jitter 50ms --bandwidth 1mbps   Very poor connection

Examples

# Simulate flaky WiFi (2% packet loss)
tunnelmesh benchmark peer-1 --size 50MB --packet-loss 2

# Simulate mobile/satellite connection (high latency + jitter)
tunnelmesh benchmark peer-1 --size 10MB --latency 150ms --jitter 50ms

# Simulate bandwidth-constrained link
tunnelmesh benchmark peer-1 --size 100MB --bandwidth 10mbps

# Combined stress test
tunnelmesh benchmark peer-1 --size 20MB \
  --packet-loss 3 \
  --latency 50ms \
  --jitter 20ms \
  --bandwidth 20mbps

# Save results for comparison
tunnelmesh benchmark peer-1 --size 50MB --output baseline.json
tunnelmesh benchmark peer-1 --size 50MB --packet-loss 5 --output with-loss.json

Docker Benchmarker

Caution

Aggressive stress testing: The Docker benchmarker runs continuous benchmarks with 3-6 simultaneous transfers at all times. This keeps the mesh under constant load. Use in development/testing, not production.

The Docker benchmarker runs aggressive continuous benchmarks with multiple concurrent transfers and randomised chaos settings. The mesh is always under load.

Default Behaviour

  • Interval: New batch every 30 seconds
  • Concurrency: 3 simultaneous transfers per batch
  • Size: 100MB per transfer
  • Direction: 70% uploads, 30% downloads (randomised)
  • Chaos: Randomly selected preset per transfer

With overlapping batches, you'll typically have 3-6 active transfers at any time.

Chaos Presets

Each transfer randomly picks from these network condition presets:

Preset              Packet Loss   Latency   Jitter   Bandwidth
clean               0%            0ms       0ms      unlimited
subtle              0.1%          2ms       ±1ms     unlimited
lossy-wifi          2%            5ms       ±3ms     unlimited
mobile-3g           1%            100ms     ±50ms    5 Mbps
mobile-4g           0.5%          30ms      ±15ms    25 Mbps
satellite           0.5%          300ms     ±50ms    10 Mbps
congested           3%            20ms      ±30ms    1 Mbps
bandwidth-10mbps    0%            0ms       0ms      10 Mbps
bandwidth-50mbps    0%            0ms       0ms      50 Mbps
bandwidth-100mbps   0%            0ms       0ms      100 Mbps

Environment Variables

# Basic configuration
COORD_SERVER_URL: http://localhost:8080  # Coordination server
AUTH_TOKEN: your-token                    # Authentication token
LOCAL_PEER: benchmarker                   # This peer's name
BENCHMARK_INTERVAL: 30s                   # Time between batch starts
BENCHMARK_CONCURRENCY: 3                  # Simultaneous transfers per batch
BENCHMARK_SIZE: 100MB                     # Transfer size per test
OUTPUT_DIR: /results                      # Where to save JSON results

# Chaos randomisation
RANDOMIZE_CHAOS: true                     # Random preset per transfer (default)
# Set RANDOMIZE_CHAOS=false for all clean benchmarks

Controlling the Benchmarker

# Start with default aggressive settings
docker compose up -d benchmarker

# Watch the chaos unfold
docker compose logs -f benchmarker

# Run with more concurrency
docker compose run -e BENCHMARK_CONCURRENCY=5 benchmarker

# Run all clean benchmarks (no chaos)
docker compose run -e RANDOMIZE_CHAOS=false benchmarker

# Faster interval (more overlap)
docker compose run -e BENCHMARK_INTERVAL=15s benchmarker

Viewing Results

# List benchmark results
docker compose exec server ls -la /results/

# View latest result
docker compose exec server cat /results/benchmark_*.json | jq .

# Copy results to host
docker cp tunnelmesh-server:/results ./benchmark-results/

Understanding Results

JSON Output Format

{
  "id": "bench-abc123",
  "local_peer": "server",
  "remote_peer": "client-1",
  "direction": "upload",
  "timestamp": "2024-01-15T10:30:00Z",
  "requested_size_bytes": 104857600,
  "transferred_size_bytes": 104857600,
  "duration_ms": 1250,
  "throughput_bps": 83886080,
  "throughput_mbps": 671.09,
  "latency_min_ms": 0.5,
  "latency_max_ms": 3.2,
  "latency_avg_ms": 1.1,
  "success": true,
  "chaos": {
    "packet_loss_percent": 0.1,
    "latency": 2000000,
    "jitter": 1000000
  }
}
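As a sanity check, the reported throughput can be recomputed from the byte count and duration (the chaos `latency` and `jitter` fields appear to be serialized in nanoseconds: 2000000 ≈ 2ms, consistent with the subtle preset). A minimal shell sketch using the sample values above:

```shell
# Recompute throughput_mbps from the sample result:
# throughput = bytes * 8 bits / (duration in seconds) / 1e6
bytes=104857600   # transferred_size_bytes
ms=1250           # duration_ms
mbps=$(awk -v b="$bytes" -v ms="$ms" 'BEGIN { printf "%.2f", b * 8 / (ms / 1000) / 1000000 }')
echo "$mbps Mbps"   # 671.09 Mbps, matching the reported throughput_mbps
```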

Key Metrics

Metric            Description                  Good Values
throughput_mbps   Megabits per second          Depends on link speed
latency_avg_ms    Average round-trip time      <5ms LAN, <50ms WAN
latency_max_ms    Worst-case latency           Should be close to avg
success           Whether transfer completed   Should be true

Comparing Results

# Compare baseline vs chaos results
jq -s '.[0].throughput_mbps as $base |
       .[1].throughput_mbps as $chaos |
       {baseline: $base, with_chaos: $chaos,
        degradation_pct: (($base - $chaos) / $base * 100)}' \
  baseline.json with-loss.json

Troubleshooting

Benchmark Fails to Connect

Error: cannot resolve peer "peer-1": is the mesh daemon running?
  • Ensure the mesh daemon is running: tunnelmesh status
  • Check peer is online: tunnelmesh peers
  • Verify DNS resolution: dig peer-1.tunnelmesh

Low Throughput

  1. Check for packet loss: run without chaos flags (packet loss defaults to 0) to establish a clean baseline
  2. Verify transport type: SSH is slower than UDP
  3. Check CPU usage during benchmark
  4. Try larger transfer size for more accurate measurement

High Latency Variance

  • High jitter may indicate network congestion
  • Check for competing traffic on the mesh
  • Verify both peers have stable connections

Tip

Best practice: Always establish a clean baseline first with --packet-loss 0. Then compare chaos results against the baseline to quantify performance degradation. Use larger transfers (100MB+) for accurate throughput measurement.

Best Practices

  1. Baseline First: Run without chaos to establish baseline performance
  2. Multiple Runs: Run 3-5 benchmarks and average results
  3. Warm Up: The first benchmark after mesh startup may be slower
  4. Size Matters: Use larger transfers (100MB+) for accurate throughput measurement
  5. Monitor Both Ends: Check CPU/memory on both peers during stress tests
  6. Save Results: Always use --output for reproducible comparisons
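Practices 2 and 6 can be combined: save each run with --output, then average throughput_mbps across the saved files. A self-contained sketch (the run-N.json files below are stand-ins for real results; awk is used to avoid a jq dependency):

```shell
# Average throughput_mbps across several saved benchmark results.
# The run-*.json files here are placeholders for real `--output` files.
for i in 1 2 3; do
  printf '{"throughput_mbps": %d}\n' "$((600 + i * 10))" > "run-$i.json"
done
avg=$(awk -F'[:}]' '/throughput_mbps/ { sum += $2; n++ } END { printf "%.1f", sum / n }' run-*.json)
echo "average over 3 runs: $avg Mbps"
rm -f run-*.json
```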

S3 Storage Benchmarking

The tunnelmesh-s3bench tool provides narrative-driven stress testing for S3 storage, deduplication, versioning, RBAC, and file shares. Unlike traditional benchmarks that generate random traffic, it simulates realistic scenarios with characters, departments, and workflows.

S3 Benchmark Quick Start

# List available story scenarios
tunnelmesh-s3bench list

# Show scenario details
tunnelmesh-s3bench describe alien_invasion

# Run a scenario (accelerated 100x for testing)
tunnelmesh-s3bench run alien_invasion --time-scale 100 --output-json results.json

# Run with all features enabled
tunnelmesh-s3bench run alien_invasion \
  --time-scale 100 \
  --enable-mesh \
  --enable-adversary \
  --enable-workflows \
  --test-deletion \
  --test-expiration \
  --test-permissions

Story Scenarios

Stories are narrative-driven workloads that test multiple S3 features simultaneously:

alien_invasion - 72 hours of first contact, invasion, and human resistance

  • 3 characters (military commander, scientist, adversary)
  • 4 departments with file shares (command, science, public, classified)
  • Tests deduplication, versioning, RBAC, quotas, and expiration

Each story includes:

  • Characters: Users with different roles and clearance levels
  • Departments: File shares with quotas and access control
  • Workflows: Realistic document creation, editing, collaboration
  • Adversaries: Users who attempt unauthorised access or data exfiltration

What It Tests

Feature         Description
Deduplication   Multiple users uploading identical files
Versioning      Document editing with version history
RBAC            Role-based access control across departments
File Shares     Department shares with quotas and permissions
Mesh Dynamics   Users joining/leaving mesh network
Quota Limits    Storage limits and over-quota handling
Expiration      Automatic cleanup of old objects
Permissions     Authorisation enforcement for all operations

Flags

--time-scale float         Time acceleration (default: 1.0)
                           100 = 72 hours compressed to 43 minutes
                           1000 = 72 hours compressed to 4.3 minutes

--concurrency int          Concurrent workers per character (default: 5)

--enable-mesh              Simulate users joining/leaving mesh
--enable-adversary         Enable adversarial behaviour
--enable-workflows         Enable realistic document workflows
--test-deletion            Test soft/hard deletion
--test-expiration          Test object expiration
--test-permissions         Test permission boundaries
--test-quota               Test quota enforcement
--test-retention           Test version retention policies

--quota-override MB        Override department quotas
--expiry-override duration Override expiration times

--output-json string       Save results to JSON file
--endpoint string          S3 endpoint (default: http://localhost:8080)
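The wall-clock cost of a run is just the story duration divided by --time-scale; a quick sketch checking the figures quoted above:

```shell
# Real duration of a 72-hour story at a given --time-scale.
story_hours=72
scale=100
minutes=$(awk -v h="$story_hours" -v s="$scale" 'BEGIN { printf "%.1f", h * 60 / s }')
echo "$minutes minutes"   # 43.2 minutes at 100x (4.3 minutes at 1000x)
```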

Example Output

{
  "story": "alien_invasion",
  "duration_seconds": 2592.5,
  "time_scale": 100,
  "statistics": {
    "total_operations": 1247,
    "successful_operations": 1215,
    "failed_operations": 32,
    "bytes_uploaded": 524288000,
    "bytes_downloaded": 262144000,
    "dedupe_savings_bytes": 104857600,
    "versions_created": 156,
    "objects_expired": 23,
    "permission_denials": 32
  },
  "characters": {
    "General Sarah Chen": {
      "operations": 487,
      "bytes_transferred": 209715200,
      "permission_denials": 0
    },
    "Dr. James Wright": {
      "operations": 512,
      "bytes_transferred": 314572800,
      "permission_denials": 0
    },
    "Eve Martinez": {
      "operations": 248,
      "bytes_transferred": 262144000,
      "permission_denials": 32
    }
  }
}
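Derived ratios such as success rate and deduplication savings follow directly from the statistics block; a small sketch using the sample values above (numbers are taken from the example output, not live data):

```shell
# Success rate and dedupe savings from the sample statistics.
total=1247; ok=1215
uploaded=524288000; dedupe_saved=104857600
summary=$(awk -v ok="$ok" -v total="$total" -v up="$uploaded" -v saved="$dedupe_saved" \
  'BEGIN { printf "success: %.1f%%, dedupe savings: %.1f%%", ok / total * 100, saved / up * 100 }')
echo "$summary"   # success: 97.4%, dedupe savings: 20.0%
```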

S3 Benchmark Use Cases

Scenario              Command
Quick validation      --time-scale 1000 (72h → 4.3 min)
Realistic load test   --time-scale 10 (72h → 7.2h)
Full stress test      --enable-all --time-scale 100
Adversary testing     --enable-adversary --test-permissions
Quota testing         --quota-override 100 --test-quota

S3 Benchmark Best Practices

  1. Start Fast: Use --time-scale 1000 for quick validation
  2. Enable Gradually: Add features one at a time to isolate issues
  3. Save Results: Always use --output-json for analysis
  4. Watch Logs: Monitor S3 server logs during execution
  5. Check Quotas: Ensure sufficient storage for test scenario
  6. Verify Cleanup: Check object expiration and deletion after test

TunnelMesh is released under the AGPL-3.0 License.