Configuring Resiliency ⚙️

How Graph Compose Handles Durability 🛡️

Graph Compose ensures your workflows are durable and resilient by leveraging Temporal under the hood. This means:

💾 Automatic State Persistence: Every step of your workflow is automatically persisted
🔄 Seamless Recovery: If a node fails, we automatically resume from the last successful state
🌐 Distributed Execution: Your workflows are executed across multiple machines for high availability
📊 Built-in Monitoring: Track the health and performance of your workflows

All you need to do is provide the configuration - we handle all the complex infrastructure for you!

🚀 TLDR: Add retry policies to nodes and workflows to handle failures:

const node = {
  activityConfig: {
    retryPolicy: {
      initialInterval: "1s",    // Wait 1 second before the first retry
      maximumInterval: "10s",   // Maximum wait time between retries is 10 seconds
      maximumAttempts: 3,       // Attempt the operation up to 3 times
      backoffCoefficient: 2.0   // Double the wait time after each failed attempt (1s, then 2s)
    },
    startToCloseTimeout: "30s" // Each attempt has 30 seconds to complete before timing out
  }
}

Interactive Retry Calculator ⚡

The retry calculator shows when retries would occur if a node fails, based on four settings (shown here with the calculator's initial values):

🕒 Initial Delay Before First Retry, in seconds. retryPolicy.initialInterval: "1s"
⏱️ Maximum Delay Between Retries, in seconds (0 for no limit). retryPolicy.maximumInterval: "60s"
🔄 Maximum Attempts: 1 initial attempt + up to 4 retries if failures occur. retryPolicy.maximumAttempts: 5
📈 Backoff Coefficient: multiplier for increasing delay between retries. retryPolicy.backoffCoefficient: 2

With these values, if failures occur, retries will happen after delays of 1s, 2s, 4s, and 8s.
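The calculation behind a calculator like this can be sketched in a few lines. This is a minimal sketch, not Graph Compose's actual implementation: the retryDelays function and the RetrySettings type are hypothetical names, intervals are given as plain seconds rather than duration strings, and a maximum interval of 0 is treated as "no limit", as in the calculator.

```typescript
// Hypothetical helper (not part of the Graph Compose API): lists the delay
// before each retry, in seconds, for a given set of retry settings.
interface RetrySettings {
  initialIntervalSeconds: number; // delay before the first retry
  maximumIntervalSeconds: number; // cap on any single delay (0 = no limit)
  maximumAttempts: number;        // 1 initial attempt + retries
  backoffCoefficient: number;     // multiplier applied after each failure
}

function retryDelays(settings: RetrySettings): number[] {
  const delays: number[] = [];
  // maximumAttempts includes the initial attempt, so there are at most
  // maximumAttempts - 1 retries and therefore maximumAttempts - 1 waits.
  for (let retry = 1; retry < settings.maximumAttempts; retry++) {
    const uncapped =
      settings.initialIntervalSeconds *
      Math.pow(settings.backoffCoefficient, retry - 1);
    const capped =
      settings.maximumIntervalSeconds > 0
        ? Math.min(uncapped, settings.maximumIntervalSeconds)
        : uncapped;
    delays.push(capped);
  }
  return delays;
}

// The calculator's initial values: 1s initial, 60s maximum, 5 attempts, 2x backoff.
const defaultDelays = retryDelays({
  initialIntervalSeconds: 1,
  maximumIntervalSeconds: 60,
  maximumAttempts: 5,
  backoffCoefficient: 2,
});
console.log(defaultDelays.join(", ")); // 1, 2, 4, 8
```

Lowering maximumIntervalSeconds to 5 in the same call would flatten the schedule to 1, 2, 4, 5, which is the effect of the maximum delay setting.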

Retry Policies 🔄

Why Retries Matter

In distributed systems, temporary failures are common:

🌐 Network hiccups
🔌 Service outages
🕒 Rate limiting
💫 Transient errors

Graph Compose automatically handles these issues by retrying failed operations according to your configuration.

Retry Options

The specific options available for configuring retry behavior are defined in the schema below. This schema is automatically generated from our API specification and serves as the definitive source of truth for all available parameters and their descriptions.

View Complete RetryPolicy Schema

Understanding Backoff Coefficient 📈

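The backoff coefficient determines how quickly the retry interval grows between attempts. The wait after a failed attempt, where attempt is the number of the attempt that just failed (the initial attempt is 1), is:

wait_time = min(initialInterval * (backoffCoefficient ^ (attempt - 1)), maximumInterval)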

🔄 Experiment with the retry calculator

Let's look at a real-world example of retrying an API call to a payment processor:

{
  activityConfig: {
    retryPolicy: {
      initialInterval: "2s",    // Wait 2 seconds before the first retry
      maximumInterval: "1m",    // Maximum wait time between retries is 1 minute
      maximumAttempts: 5,       // Attempt the operation up to 5 times (1 initial + 4 retries)
      backoffCoefficient: 2.0   // Double the wait time after each failed attempt
    },
    startToCloseTimeout: "90s" // Each attempt has 90 seconds to complete
  }
}

Here's what happens on each retry:

First attempt fails → Wait 2s (initial interval)
Second attempt fails → Wait 4s (2s × 2.0¹)
Third attempt fails → Wait 8s (2s × 2.0²)
Fourth attempt fails → Wait 16s (2s × 2.0³)
Fifth attempt fails → No further retries (maximumAttempts reached); the failure is reported

This exponential backoff is ideal for:

💳 Payment processing retries (allowing time for bank processing)
🌐 External API rate limits to reset
🔄 Database connection recovery
📦 Resource cleanup or provisioning

Note: The actual wait time will never exceed maximumInterval, even if the calculated value is larger.

Let's trace the execution flow if the operation fails repeatedly:

Initial Attempt: Runs immediately. Fails.
Wait: Waits for initialInterval (2 seconds).
Retry 1: Runs. Fails.
Wait: Waits for initialInterval * backoffCoefficient^1 (2s * 2.0^1 = 4 seconds).
Retry 2: Runs. Fails.
Wait: Waits for initialInterval * backoffCoefficient^2 (2s * 2.0^2 = 8 seconds).
Retry 3: Runs. Fails.
Wait: Waits for initialInterval * backoffCoefficient^3 (2s * 2.0^3 = 16 seconds).
Retry 4: Runs. Fails.
Final Failure: Since maximumAttempts is 5 (1 initial + 4 retries), no further wait or retry follows, and the workflow proceeds to handle the failure.

The total time spent waiting between attempts in this scenario is 2 + 4 + 8 + 16 = 30 seconds. Remember that each attempt also has its own startToCloseTimeout of 90 seconds.
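The waiting and running time for the payment example can also be tallied in code. The sketch below is illustrative only: worstCaseSeconds is a hypothetical helper, durations are plain seconds, and it assumes the worst case where every attempt runs for its full startToCloseTimeout before failing. There is one wait after each failed attempt except the last, so five attempts produce four waits.

```typescript
// Hypothetical helper (not part of the Graph Compose API): upper bound on the
// time spent before the final failure, assuming every attempt uses its full
// timeout and every attempt fails.
function worstCaseSeconds(
  initialInterval: number,
  maximumInterval: number,
  maximumAttempts: number,
  backoffCoefficient: number,
  startToCloseTimeout: number
): { waiting: number; running: number; total: number } {
  let waiting = 0;
  // One wait after each failed attempt except the last one.
  for (let failed = 1; failed < maximumAttempts; failed++) {
    const delay = initialInterval * Math.pow(backoffCoefficient, failed - 1);
    waiting += Math.min(delay, maximumInterval);
  }
  const running = maximumAttempts * startToCloseTimeout;
  return { waiting, running, total: waiting + running };
}

// Payment processor example: 2s initial, 1m cap, 5 attempts, 2x backoff, 90s timeout.
const payment = worstCaseSeconds(2, 60, 5, 2.0, 90);
console.log(`waiting=${payment.waiting}s running=${payment.running}s total=${payment.total}s`);
// waiting=30s running=450s total=480s
```

Under these assumptions the retry waits are a small share of the worst case; the per-attempt timeout dominates, so it is worth sizing startToCloseTimeout as carefully as the retry policy.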

Common Patterns 🎯

Aggressive Retries for Flaky APIs

{
  activityConfig: {
    retryPolicy: {
      initialInterval: "100ms",  // Wait 100 milliseconds before the first retry
      maximumInterval: "1s",     // Maximum wait time between retries is 1 second
      maximumAttempts: 10,       // Attempt the operation up to 10 times (1 initial + 9 retries)
      backoffCoefficient: 1.5    // Increase wait time by 50% after each failed attempt
    },
    startToCloseTimeout: "5s"    // Each attempt has 5 seconds to complete
  }
}

Careful Retries for Critical Operations

{
  activityConfig: {
    retryPolicy: {
      initialInterval: "5s",     // Wait 5 seconds before the first retry
      maximumInterval: "1m",     // Maximum wait time between retries is 1 minute
      maximumAttempts: 3,        // Attempt the operation up to 3 times (1 initial + 2 retries)
      backoffCoefficient: 2.0    // Double the wait time after each failed attempt
    },
    startToCloseTimeout: "2m"   // Each attempt has 2 minutes to complete
  }
}
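To compare patterns like these, it can help to list the waits each one produces. The sketch below is an illustration under stated assumptions, not Graph Compose code: parseDurationMs and waitsMs are hypothetical helpers, the parser only understands the ms, s, and m suffixes used on this page, and the backoff is assumed to follow the formula described earlier.

```typescript
// Hypothetical helper: converts "100ms", "5s", or "1m" into milliseconds.
function parseDurationMs(text: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m)$/.exec(text);
  if (match === null) {
    throw new Error(`Unsupported duration: ${text}`);
  }
  const unitMs: Record<string, number> = { ms: 1, s: 1000, m: 60000 };
  return Number(match[1]) * unitMs[match[2]];
}

interface RetryPolicy {
  initialInterval: string;
  maximumInterval: string;
  maximumAttempts: number;
  backoffCoefficient: number;
}

// Hypothetical helper: the wait before each retry, in milliseconds.
function waitsMs(policy: RetryPolicy): number[] {
  const cap = parseDurationMs(policy.maximumInterval);
  const waits: number[] = [];
  let next = parseDurationMs(policy.initialInterval);
  // One wait per retry; the initial attempt counts toward maximumAttempts.
  for (let failed = 1; failed < policy.maximumAttempts; failed++) {
    waits.push(Math.min(next, cap));
    next *= policy.backoffCoefficient;
  }
  return waits;
}

const aggressive = waitsMs({
  initialInterval: "100ms",
  maximumInterval: "1s",
  maximumAttempts: 10,
  backoffCoefficient: 1.5,
});
const careful = waitsMs({
  initialInterval: "5s",
  maximumInterval: "1m",
  maximumAttempts: 3,
  backoffCoefficient: 2.0,
});

console.log(aggressive.join(", ")); // 100, 150, 225, 337.5, 506.25, 759.375, 1000, 1000, 1000
console.log(careful.join(", "));    // 5000, 10000
```

In this sketch the aggressive pattern reaches its 1s cap on the seventh wait, while the careful pattern never gets near its 1m cap because it stops after two retries.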

Under the Hood: Temporal Integration 🔧

Graph Compose uses Temporal to provide enterprise-grade durability:

🏢 Business Continuity: Your workflows continue even during infrastructure updates
🔒 Exactly-Once Execution: No duplicate operations or lost work
📝 Complete Audit Trail: Track every step and retry of your workflow
⚡ Zero Data Loss: All progress is automatically persisted

You get all these benefits automatically - just focus on your workflow logic and let us handle the rest!

Ready to make your workflows more resilient? Let's go! 🚀

Learn about Error Boundaries ✨

Workflow API Reference

Node API Reference