Benchmarking and Stress Testing

Note

TunnelMesh includes built-in benchmarking tools to measure throughput and latency, and to test mesh performance under adverse network conditions. Benchmark traffic flows through actual mesh tunnels, giving realistic performance metrics.

Overview

The benchmark system consists of three tools:

Tool                     Purpose                                     Use Case
tunnelmesh benchmark     Network throughput and latency testing      Test tunnel performance between peers
tunnelmesh-benchmarker   Automated periodic benchmarking service     Continuous performance monitoring in Docker
tunnelmesh-s3bench       S3 storage stress testing with narratives   Test S3, deduplication, versioning, RBAC, shares

Network benchmark traffic flows through the actual mesh tunnel (TUN device → encrypted tunnel → peer), giving you realistic performance metrics for file transfers and real-time applications.

Quick Start

Local CLI Benchmark

# Basic speed test (10MB upload)
tunnelmesh benchmark peer-name

# Larger transfer for more accurate throughput measurement
tunnelmesh benchmark peer-name --size 100MB

# Download test
tunnelmesh benchmark peer-name --size 50MB --direction download

# Save results to JSON
tunnelmesh benchmark peer-name --output results.json

Docker Automated Benchmarks

# Start the full stack including benchmarker
cd docker
docker compose up -d

# View benchmark logs
docker compose logs -f benchmarker

# Results are saved to the benchmark-results volume
docker compose exec server ls /results/

Network Benchmark CLI Reference

tunnelmesh benchmark <peer-name> [flags]

Flags:
  --size string        Transfer size (default "10MB")
                       Examples: 1MB, 100MB, 1GB

  --direction string   Transfer direction (default "upload")
                       Options: upload, download

  --output string      Save results to JSON file

  --timeout duration   Benchmark timeout (default 2m0s)

  --port int           Benchmark server port (default 9998)

Chaos Testing Flags:
  --packet-loss float  Packet loss percentage, 0-100 (default 0)

  --latency duration   Additional latency to inject (default 0)
                       Examples: 10ms, 100ms, 1s

  --jitter duration    Random latency variation ±jitter (default 0)
                       Examples: 5ms, 20ms

  --bandwidth string   Bandwidth limit (default unlimited)
                       Examples: 1mbps, 10mbps, 100mbps, 1gbps

Chaos Testing

Warning

Chaos testing impacts performance: Packet loss, latency, and bandwidth limits intentionally degrade performance. Use these flags to test resilience, not to measure baseline performance. Always run clean benchmarks first.

Chaos testing simulates adverse network conditions to stress test your mesh and verify resilience.

Use Cases

Scenario         Flags                                                             Simulates
Lossy WiFi       --packet-loss 2                                                   Occasional packet drops
Mobile network   --latency 100ms --jitter 30ms                                     High, variable latency
Congested link   --bandwidth 5mbps                                                 Bandwidth-constrained path
Worst case       --packet-loss 5 --latency 200ms --jitter 50ms --bandwidth 1mbps   Very poor connection

Examples

# Simulate flaky WiFi (2% packet loss)
tunnelmesh benchmark peer-1 --size 50MB --packet-loss 2

# Simulate mobile/satellite connection (high latency + jitter)
tunnelmesh benchmark peer-1 --size 10MB --latency 150ms --jitter 50ms

# Simulate bandwidth-constrained link
tunnelmesh benchmark peer-1 --size 100MB --bandwidth 10mbps

# Combined stress test
tunnelmesh benchmark peer-1 --size 20MB \
  --packet-loss 3 \
  --latency 50ms \
  --jitter 20ms \
  --bandwidth 20mbps

# Save results for comparison
tunnelmesh benchmark peer-1 --size 50MB --output baseline.json
tunnelmesh benchmark peer-1 --size 50MB --packet-loss 5 --output with-loss.json

Docker Benchmarker

Caution

Aggressive stress testing: The Docker benchmarker runs continuous benchmarks with 3-6 simultaneous transfers at all times. This keeps the mesh under constant load. Use in development/testing, not production.

The Docker benchmarker runs aggressive continuous benchmarks with multiple concurrent transfers and randomised chaos settings. The mesh is always under load.

Default Behaviour

  • Interval: New batch every 30 seconds
  • Concurrency: 3 simultaneous transfers per batch
  • Size: 100MB per transfer
  • Direction: 70% uploads, 30% downloads (randomised)
  • Chaos: Randomly selected preset per transfer

With overlapping batches, you'll typically have 3-6 active transfers at any time.

Chaos Presets

Each transfer randomly picks from these network condition presets:

Preset              Packet Loss   Latency   Jitter   Bandwidth
clean               0%            0ms       0ms      unlimited
subtle              0.1%          2ms       ±1ms     unlimited
lossy-wifi          2%            5ms       ±3ms     unlimited
mobile-3g           1%            100ms     ±50ms    5 Mbps
mobile-4g           0.5%          30ms      ±15ms    25 Mbps
satellite           0.5%          300ms     ±50ms    10 Mbps
congested           3%            20ms      ±30ms    1 Mbps
bandwidth-10mbps    0%            0ms       0ms      10 Mbps
bandwidth-50mbps    0%            0ms       0ms      50 Mbps
bandwidth-100mbps   0%            0ms       0ms      100 Mbps

Environment Variables

# Basic configuration
COORD_SERVER_URL: http://localhost:8080  # Coordination server
AUTH_TOKEN: your-token                    # Authentication token
LOCAL_PEER: benchmarker                   # This peer's name
BENCHMARK_INTERVAL: 30s                   # Time between batch starts
BENCHMARK_CONCURRENCY: 3                  # Simultaneous transfers per batch
BENCHMARK_SIZE: 100MB                     # Transfer size per test
OUTPUT_DIR: /results                      # Where to save JSON results

# Chaos randomisation
RANDOMIZE_CHAOS: true                     # Random preset per transfer (default)
# Set RANDOMIZE_CHAOS=false for all clean benchmarks

Controlling the Benchmarker

# Start with default aggressive settings
docker compose up -d benchmarker

# Watch the chaos unfold
docker compose logs -f benchmarker

# Run with more concurrency
docker compose run -e BENCHMARK_CONCURRENCY=5 benchmarker

# Run all clean benchmarks (no chaos)
docker compose run -e RANDOMIZE_CHAOS=false benchmarker

# Faster interval (more overlap)
docker compose run -e BENCHMARK_INTERVAL=15s benchmarker

Viewing Results

# List benchmark results
docker compose exec server ls -la /results/

# View latest result
docker compose exec server cat /results/benchmark_*.json | jq .

# Copy results to host
docker cp tunnelmesh-server:/results ./benchmark-results/

Understanding Results

JSON Output Format

{
  "id": "bench-abc123",
  "local_peer": "server",
  "remote_peer": "client-1",
  "direction": "upload",
  "timestamp": "2024-01-15T10:30:00Z",
  "requested_size_bytes": 104857600,
  "transferred_size_bytes": 104857600,
  "duration_ms": 1250,
  "throughput_bps": 83886080,
  "throughput_mbps": 671.09,
  "latency_min_ms": 0.5,
  "latency_max_ms": 3.2,
  "latency_avg_ms": 1.1,
  "success": true,
  "chaos": {
    "packet_loss_percent": 0.1,
    "latency": 2000000,
    "jitter": 1000000
  }
}
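As a sanity check, the reported throughput can be recomputed from the byte count and duration (the chaos `latency` and `jitter` fields appear to be serialized in nanoseconds: 2000000 ≈ 2ms, consistent with the subtle preset). A minimal shell sketch using the sample values above:

```shell
# Recompute throughput_mbps from the sample result:
# throughput = bytes * 8 bits / (duration in seconds) / 1e6
bytes=104857600   # transferred_size_bytes
ms=1250           # duration_ms
mbps=$(awk -v b="$bytes" -v ms="$ms" 'BEGIN { printf "%.2f", b * 8 / (ms / 1000) / 1000000 }')
echo "$mbps Mbps"   # 671.09 Mbps, matching the reported throughput_mbps
```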

Key Metrics

Metric            Description                  Good Values
throughput_mbps   Megabits per second          Depends on link speed
latency_avg_ms    Average round-trip time      <5ms LAN, <50ms WAN
latency_max_ms    Worst-case latency           Should be close to avg
success           Whether transfer completed   Should be true

Comparing Results

# Compare baseline vs chaos results
jq -s '.[0].throughput_mbps as $base |
       .[1].throughput_mbps as $chaos |
       {baseline: $base, with_chaos: $chaos,
        degradation_pct: (($base - $chaos) / $base * 100)}' \
  baseline.json with-loss.json

Troubleshooting

Benchmark Fails to Connect

Error: cannot resolve peer "peer-1": is the mesh daemon running?
  • Ensure the mesh daemon is running: tunnelmesh status
  • Check peer is online: tunnelmesh peers
  • Verify DNS resolution: dig peer-1.tunnelmesh

Low Throughput

  1. Check for packet loss: run without chaos flags (packet loss defaults to 0) to establish a clean baseline
  2. Verify transport type: SSH is slower than UDP
  3. Check CPU usage during benchmark
  4. Try larger transfer size for more accurate measurement

High Latency Variance

  • High jitter may indicate network congestion
  • Check for competing traffic on the mesh
  • Verify both peers have stable connections

Tip

Best practice: Always establish a clean baseline first with --packet-loss 0. Then compare chaos results against the baseline to quantify performance degradation. Use larger transfers (100MB+) for accurate throughput measurement.

Best Practices

  1. Baseline First: Run without chaos to establish baseline performance
  2. Multiple Runs: Run 3-5 benchmarks and average results
  3. Warm Up: The first benchmark after mesh startup may be slower
  4. Size Matters: Use larger transfers (100MB+) for accurate throughput measurement
  5. Monitor Both Ends: Check CPU/memory on both peers during stress tests
  6. Save Results: Always use --output for reproducible comparisons
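Practices 2 and 6 can be combined: save each run with --output, then average throughput_mbps across the saved files. A self-contained sketch (the run-N.json files below are stand-ins for real results; awk is used to avoid a jq dependency):

```shell
# Average throughput_mbps across several saved benchmark results.
# The run-*.json files here are placeholders for real `--output` files.
for i in 1 2 3; do
  printf '{"throughput_mbps": %d}\n' "$((600 + i * 10))" > "run-$i.json"
done
avg=$(awk -F'[:}]' '/throughput_mbps/ { sum += $2; n++ } END { printf "%.1f", sum / n }' run-*.json)
echo "average over 3 runs: $avg Mbps"
rm -f run-*.json
```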

S3 Storage Benchmarking

The tunnelmesh-s3bench tool provides narrative-driven stress testing for S3 storage, deduplication, versioning, RBAC, and file shares. Unlike traditional benchmarks that generate random traffic, it simulates realistic scenarios with characters, departments, and workflows.

S3 Benchmark Quick Start

# List available story scenarios
tunnelmesh-s3bench list

# Show scenario details
tunnelmesh-s3bench describe alien_invasion

# Run a scenario (accelerated 100x for testing)
tunnelmesh-s3bench run alien_invasion --time-scale 100 --output-json results.json

# Run with all features enabled
tunnelmesh-s3bench run alien_invasion \
  --time-scale 100 \
  --enable-mesh \
  --enable-adversary \
  --enable-workflows \
  --test-deletion \
  --test-expiration \
  --test-permissions

Story Scenarios

Stories are narrative-driven workloads that test multiple S3 features simultaneously:

alien_invasion - 72 hours of first contact, invasion, and human resistance

  • 3 characters (military commander, scientist, adversary)
  • 4 departments with file shares (command, science, public, classified)
  • Tests deduplication, versioning, RBAC, quotas, and expiration

Each story includes:

  • Characters: Users with different roles and clearance levels
  • Departments: File shares with quotas and access control
  • Workflows: Realistic document creation, editing, collaboration
  • Adversaries: Users who attempt unauthorised access or data exfiltration

What It Tests

Feature         Description
Deduplication   Multiple users uploading identical files
Versioning      Document editing with version history
RBAC            Role-based access control across departments
File Shares     Department shares with quotas and permissions
Mesh Dynamics   Users joining/leaving mesh network
Quota Limits    Storage limits and over-quota handling
Expiration      Automatic cleanup of old objects
Permissions     Authorisation enforcement for all operations

Flags

--time-scale float         Time acceleration (default: 1.0)
                           100 = 72 hours compressed to 43 minutes
                           1000 = 72 hours compressed to 4.3 minutes

--concurrency int          Concurrent workers per character (default: 5)

--enable-mesh              Simulate users joining/leaving mesh
--enable-adversary         Enable adversarial behaviour
--enable-workflows         Enable realistic document workflows
--test-deletion            Test soft/hard deletion
--test-expiration          Test object expiration
--test-permissions         Test permission boundaries
--test-quota               Test quota enforcement
--test-retention           Test version retention policies

--quota-override MB        Override department quotas
--expiry-override duration Override expiration times

--output-json string       Save results to JSON file
--endpoint string          S3 endpoint (default: http://localhost:8080)
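The wall-clock cost of a run is just the story duration divided by --time-scale; a quick sketch checking the figures quoted above:

```shell
# Real duration of a 72-hour story at a given --time-scale.
story_hours=72
scale=100
minutes=$(awk -v h="$story_hours" -v s="$scale" 'BEGIN { printf "%.1f", h * 60 / s }')
echo "$minutes minutes"   # 43.2 minutes at 100x (4.3 minutes at 1000x)
```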

Example Output

{
  "story": "alien_invasion",
  "duration_seconds": 2592.5,
  "time_scale": 100,
  "statistics": {
    "total_operations": 1247,
    "successful_operations": 1215,
    "failed_operations": 32,
    "bytes_uploaded": 524288000,
    "bytes_downloaded": 262144000,
    "dedupe_savings_bytes": 104857600,
    "versions_created": 156,
    "objects_expired": 23,
    "permission_denials": 32
  },
  "characters": {
    "General Sarah Chen": {
      "operations": 487,
      "bytes_transferred": 209715200,
      "permission_denials": 0
    },
    "Dr. James Wright": {
      "operations": 512,
      "bytes_transferred": 314572800,
      "permission_denials": 0
    },
    "Eve Martinez": {
      "operations": 248,
      "bytes_transferred": 262144000,
      "permission_denials": 32
    }
  }
}
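Derived ratios such as success rate and deduplication savings follow directly from the statistics block; a small sketch using the sample values above (numbers are taken from the example output, not live data):

```shell
# Success rate and dedupe savings from the sample statistics.
total=1247; ok=1215
uploaded=524288000; dedupe_saved=104857600
summary=$(awk -v ok="$ok" -v total="$total" -v up="$uploaded" -v saved="$dedupe_saved" \
  'BEGIN { printf "success: %.1f%%, dedupe savings: %.1f%%", ok / total * 100, saved / up * 100 }')
echo "$summary"   # success: 97.4%, dedupe savings: 20.0%
```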

S3 Benchmark Use Cases

Scenario              Command
Quick validation      --time-scale 1000 (72h → 4.3 min)
Realistic load test   --time-scale 10 (72h → 7.2h)
Full stress test      --enable-all --time-scale 100
Adversary testing     --enable-adversary --test-permissions
Quota testing         --quota-override 100 --test-quota

S3 Benchmark Best Practices

  1. Start Fast: Use --time-scale 1000 for quick validation
  2. Enable Gradually: Add features one at a time to isolate issues
  3. Save Results: Always use --output-json for analysis
  4. Watch Logs: Monitor S3 server logs during execution
  5. Check Quotas: Ensure sufficient storage for test scenario
  6. Verify Cleanup: Check object expiration and deletion after test

TunnelMesh is released under the AGPL-3.0 License.