From 0a37eaca2d4b55fd9725e30445144ae596ec74e0 Mon Sep 17 00:00:00 2001
From: William Kempster
Date: Wed, 18 Feb 2026 09:40:33 +0000
Subject: [PATCH] docs: add data availability and gateway expectations guidance

---
 content/build/access/fetch-data.mdx           |   7 +
 content/build/access/gateway-expectations.mdx | 152 ++++++++++++++++++
 content/build/access/index.mdx                |  10 +-
 content/build/access/meta.json                |   1 +
 .../manage/environment-variables.mdx          |   8 +-
 content/learn/gateways/data-availability.mdx  | 109 +++++++++++++
 content/learn/gateways/data-retrieval.mdx     |  15 +-
 content/learn/gateways/index.mdx              |  17 +-
 content/learn/gateways/meta.json              |   1 +
 9 files changed, 312 insertions(+), 8 deletions(-)
 create mode 100644 content/build/access/gateway-expectations.mdx
 create mode 100644 content/learn/gateways/data-availability.mdx

diff --git a/content/build/access/fetch-data.mdx b/content/build/access/fetch-data.mdx
index b62d01823..67398d77a 100644
--- a/content/build/access/fetch-data.mdx
+++ b/content/build/access/fetch-data.mdx
@@ -18,6 +18,13 @@ Gateways are the most performant way to fetch data from Arweave, providing signi
 - **Network Optimization** - Distributed infrastructure for better performance
 - **Content Delivery** - Optimized serving with compression and CDN features
 
+
+  For production planning before implementation, see [Gateway
+  Expectations](/build/access/gateway-expectations) for access-mode
+  recommendations, practical limits, and real-world `turbo-gateway.com`
+  scenarios.
+
+
 ## REST APIs for Fetching Data
 
 Gateways support multiple API endpoints for accessing data:
diff --git a/content/build/access/gateway-expectations.mdx b/content/build/access/gateway-expectations.mdx
new file mode 100644
index 000000000..223540043
--- /dev/null
+++ b/content/build/access/gateway-expectations.mdx
@@ -0,0 +1,152 @@
+---
+title: "Gateway Expectations"
+description: "Production-first guidance for choosing gateway access modes, understanding limits, and planning reliable delivery"
+---
+
+import { Card, Cards } from "fumadocs-ui/components/card";
+import { Network, Building2, Server, Terminal } from "lucide-react";
+
+For conceptual background, start with [Data Availability](/learn/gateways/data-availability).
+
+## Production Quick Start
+
+If you need stronger uptime and performance guarantees:
+
+1. Use a **managed or self-hosted gateway** as your primary path.
+2. Keep the **open ar.io gateway network** as resilience and backup.
+3. Use **Wayfinder** when you want client-side gateway failover.
+
+This gives you both predictable primary delivery and decentralized fallback.
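The primary-plus-backup ordering in the quick start can be sketched in client code. This is a minimal illustration under stated assumptions, not Wayfinder's implementation: the `GATEWAYS` hostnames, the `fetch_with_fallback` helper, and the injectable `fetch` callable are hypothetical names introduced for this example.

```python
import urllib.request
from typing import Callable, Sequence

# Hypothetical hosts: a dedicated primary first, then an open-network fallback.
GATEWAYS = ["https://gateway.example.com", "https://turbo-gateway.com"]


def http_get(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL and return the body; network failures raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(
    tx_id: str,
    gateways: Sequence[str] = GATEWAYS,
    fetch: Callable[[str], bytes] = http_get,
) -> tuple[str, bytes]:
    """Try each gateway in order; return (gateway, body) from the first success."""
    errors = []
    for base in gateways:
        try:
            return base, fetch(f"{base}/{tx_id}")
        except OSError as exc:  # URLError, HTTPError, and timeouts are OSError subclasses
            errors.append(f"{base}: {exc}")
    raise RuntimeError("all gateways failed: " + "; ".join(errors))
```

Passing `fetch` as a parameter keeps the ordering logic testable without network access. A production client would likely add retry limits and backoff on top of this.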
+
+## Access Options and Recommended Uses
+
+| Access mode | Recommended use | Guarantee profile |
+| --- | --- | --- |
+| **Open network** | Prototypes, low-risk public traffic, fast launch | Strong retrievability, variable per-gateway performance/limits |
+| **Self-hosted gateway** | Teams that want direct control over infra and policy | You define and operate the guarantee level |
+| **Managed gateway** | Teams that want high availability with less ops overhead | Provider-backed operational guarantees |
+| **Hybrid (recommended for production)** | Critical apps and customer-facing workloads | Dedicated primary guarantees + decentralized backup path |
+
+## How to Inspect Any Gateway
+
+Use `GET /ar-io/info`:
+
+```bash
+curl -sS https:///ar-io/info | jq
+```
+
+Focus on these fields:
+
+- `rateLimiter.enabled`
+- `rateLimiter.dataEgress.buckets.resource`
+- `rateLimiter.dataEgress.buckets.ip`
+- `x402.enabled`
+- `x402.dataEgress.pricing`
+- `x402.dataEgress.rateLimiterCapacityMultiplier`
+
+Interpretation:
+
+- `capacity` / `capacityBytes`: burst allowance.
+- `refillRate` / `refillRateBytesPerSec`: sustained unpaid throughput.
+- `x402` pricing + multiplier: how paid overflow extends access.
+
+## turbo-gateway.com Snapshot (As of February 16, 2026)
+
+Source: `https://turbo-gateway.com/ar-io/info` queried on February 16, 2026.
+
+### Rate limiter buckets
+
+| Bucket | Value |
+| --- | --- |
+| Resource `capacityBytes` | `1,024,000,000` bytes (~`976.56` MiB burst) |
+| Resource `refillRateBytesPerSec` | `102,400` bytes/s (`100` KiB/s sustained) |
+| IP `capacityBytes` | `102,400,000` bytes (~`97.66` MiB burst) |
+| IP `refillRateBytesPerSec` | `20,480` bytes/s (`20` KiB/s sustained) |
+
+### x402 data egress
+
+| Field | Value |
+| --- | --- |
+| Enabled | `true` |
+| Network | `base` |
+| Per-byte price | `0.0000000001` USDC |
+| Min price | `0.001000` USDC |
+| Max price | `1.000000` USDC |
+| Capacity multiplier | `10` |
+
+
+  This is a dated snapshot. Re-check the live endpoint before relying on exact
+  numbers.
+
+
+## Real-World Scenarios (Illustrative)
+
+These examples use current turbo-gateway bucket values and assume bytes are not already absorbed by downstream cache/CDN.
+
+### 1) Small API payload pattern (10 KiB response)
+
+- **Unpaid sustained pace (per client IP):** about **2 requests/second** (`20 KiB/s ÷ 10 KiB`).
+- **Burst behavior:** IP burst bucket (~97.66 MiB) can absorb large short spikes before refill rate becomes the limiter.
+- **With x402 paid path:** additional paid tokens extend access when free tokens are exhausted.
+
+### 2) Static site page-load pattern (~1.5 MiB cold load)
+
+- **Unpaid sustained pace (per client IP):** about **0.8 page loads/minute** if every load is fully uncached.
+- **Burst behavior:** IP burst bucket supports roughly **65 cold loads** in a short window (`97.66 MiB ÷ 1.5 MiB`).
+- **In practice:** browser cache, gateway cache, and CDN caching usually reduce byte pressure significantly.
+
+### 3) Larger file delivery pattern (100 MiB object)
+
+- **Unpaid behavior:** one request can quickly consume most/all IP burst allowance.
+- **Refill reality:** recovering ~100 MiB of unpaid headroom at `20 KiB/s` is slow (~85 minutes).
+- **With x402 paid path:** requests continue through paid tokens, which is important for sustained large-transfer workloads.
+
+
+  These scenarios are directional planning examples, not hard guarantees. Real
+  behavior depends on caching, traffic shape, and your selected access mode.
+
+
+## How Limits Surface to Clients
+
+- **`402 Payment Required`**: request exceeded free limits and a payment path is available.
+- **`429 Too Many Requests`**: request is rate limited without a usable payment path.
+
+Treat both as control signals and retry/failover events in client logic.
+
+## Client Checklist
+
+- Use exponential backoff with jitter for retriable failures.
+- Implement gateway failover (or use Wayfinder).
+- Use `ETag`/`If-None-Match` and range requests where appropriate.
+- Follow redirects (including sandbox redirects).
+- Monitor `X-Cache`, `X-AR-IO-Verified`, `X-AR-IO-Trusted`, and `Content-Digest`.
+- Re-check `/ar-io/info` regularly for policy/limit changes.
+
+## Next Steps
+
+
+ }
+ />
+ }
+ />
+ }
+ />
+ }
+ />
+
diff --git a/content/build/access/index.mdx b/content/build/access/index.mdx
index 617e059ab..576876d42 100644
--- a/content/build/access/index.mdx
+++ b/content/build/access/index.mdx
@@ -3,10 +3,12 @@ title: "Access Data"
 description: "Learn how to retrieve and query data from Arweave's permanent storage network"
 ---
 
-import { Globe, Search, Link, Compass } from 'lucide-react';
+import { Globe, Search, Link, Compass, Gauge } from 'lucide-react';
 
 Once data is stored on Arweave, it's permanently available. Here's how to access it efficiently for your applications.
 
+For production planning, start with **Gateway Expectations** to choose an access mode and understand practical limits.
+
 ## Access Methods
 
 Different methods serve different needs. Each provides unique capabilities for retrieving data from Arweave.
@@ -131,6 +133,12 @@ curl https://arweave.net/[transaction-id-from-above]
 ## Additional Access Options
+ }
+ />
+  If uptime and performance guarantees matter, run a dedicated gateway path
+  (self-hosted or managed) and keep the decentralized network as resilience and
+  backup.
+
+
+## What We Can Guarantee (Mode-Based)
+
+### Network-level guarantees
+
+With the ar.io gateway model, you get:
+
+- **Retrievability through multiple paths**: requests can resolve through cache, trusted gateways, ar.io peers, chunk reconstruction, and tx-data fallback.
+- **Open access across independent operators**: no single required public gateway.
+- **Integrity/trust signals in responses**: headers such as `X-AR-IO-Verified`, `X-AR-IO-Trusted`, `X-AR-IO-Digest`, and `Content-Digest`.
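A client can check one of the integrity headers listed above against the bytes it actually received. The sketch below assumes the gateway emits `Content-Digest` in the RFC 9530 form `sha-256=:<base64>:`; that format is an assumption to confirm against live responses. The helper names are hypothetical, and the `X-AR-IO-*` headers are not interpreted here.

```python
import base64
import hashlib
from typing import Mapping, Optional


def sha256_from_content_digest(header: str) -> Optional[bytes]:
    """Extract the raw SHA-256 digest from an RFC 9530 style Content-Digest value."""
    for item in header.split(","):
        name, _, value = item.strip().partition("=")
        # RFC 9530 wraps the base64 digest in colons: sha-256=:<base64>:
        if name.lower() == "sha-256" and value.startswith(":") and value.endswith(":"):
            return base64.b64decode(value[1:-1])
    return None


def body_matches_digest(headers: Mapping[str, str], body: bytes) -> Optional[bool]:
    """Return True/False if a SHA-256 Content-Digest is present, else None."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    header = lowered.get("content-digest")
    expected = sha256_from_content_digest(header) if header else None
    if expected is None:
        return None
    return hashlib.sha256(body).digest() == expected
```

Returning `None` when no usable digest is present lets the caller distinguish "not verifiable" from "verified and mismatched".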
+
+### Network-level non-guarantees
+
+At the public-network layer, there is no single global guarantee for:
+
+- One universal latency target.
+- One universal throughput target.
+- One universal rate-limit profile across all operators.
+
+### Dedicated gateway guarantees (managed or self-hosted)
+
+When you run dedicated infrastructure (or have it managed for you), you can provide stronger guarantees, including:
+
+- High-availability architecture and failover policy.
+- Predictable performance targets for your workload profile.
+- Explicit traffic policies and operational ownership.
+
+In short: the network guarantees resilient access paths, while dedicated gateways are how you guarantee stricter production outcomes.
+
+## Recommended Production Patterns
+
+1. **Direct network access**: simplest path for non-critical or moderate workloads.
+2. **Hybrid fallback**: dedicated primary gateway, network as backup.
+3. **Managed HA primary + optional self-hosted secondary**: strongest balance of guarantees and independence.
+
+## How Availability Works Under the Hood
+
+Most gateways follow a layered read path:
+
+1. Local cache.
+2. Trusted gateways.
+3. Ar.io network peers.
+4. Chunk-based reconstruction.
+5. Transaction-data fallback.
+
+This layered path is why data stays reachable even when a single source is slow or unavailable.
+
+## When to Use Wayfinder
+
+Use [Wayfinder](/build/access/wayfinder) when you want client-side routing and failover across gateways without hard-coding a single endpoint.
+
+Wayfinder is a strong default for production read paths that need resilience.
+
+## Next Steps
+
+
+ }
+ />
+ }
+ />
+ }
+ />
+ }
+ />
+
diff --git a/content/learn/gateways/data-retrieval.mdx b/content/learn/gateways/data-retrieval.mdx
index 9e95dfc15..16f677b0c 100644
--- a/content/learn/gateways/data-retrieval.mdx
+++ b/content/learn/gateways/data-retrieval.mdx
@@ -3,10 +3,17 @@ title: "Data Retrieval"
 description: "How ar.io gateways retrieve and share data from multiple sources including trusted peers and Arweave nodes"
 ---
 
-import { Shield, Cpu, Globe, Settings } from 'lucide-react';
+import { Shield, Cpu, Globe, Settings, Gauge } from 'lucide-react';
 
 Ar.io gateways use a sophisticated multi-tier architecture to retrieve and serve Arweave data. This system ensures high availability, fast response times, and data integrity by leveraging multiple data sources with automatic fallback mechanisms.
 
+
+  Next in the learning flow: [Data Availability](/learn/gateways/data-availability)
+  for guarantee levels and access-strategy recommendations, then [Gateway
+  Expectations](/build/access/gateway-expectations) for practical production
+  limits and scenarios.
+
+
 ## How Gateways Retrieve Data
 
 When a gateway needs to serve data, it follows a hierarchical retrieval pattern, trying each source in order until the data is successfully retrieved:
@@ -170,6 +177,12 @@ The data retrieval system is fundamental to ar.io's mission of providing reliabl
 href="/build/access"
 icon={}
 />
+ }
+ />
 } />
+ }
+ />