
Connect to DataStream

This tutorial walks you through connecting to the DataStream event ingestion and query API. Choose your preferred language below. Each tab covers the full setup flow from installation through querying your first events.

Step 1: Install the SDK

The DataStream Python SDK supports Python 3.8 and above. We recommend using a virtual environment to isolate your project dependencies. The SDK is distributed through PyPI and can be installed with pip or any compatible package manager.

# Create and activate a virtual environment
python -m venv datastream-env
source datastream-env/bin/activate  # On Windows: datastream-env\Scripts\activate

# Install the DataStream SDK
pip install datastream-sdk

# Or with optional async support
pip install "datastream-sdk[async]"

# Or if you use Poetry
poetry add datastream-sdk

# Verify the installation
python -c "import datastream; print(datastream.__version__)"

If you plan to use the async client, make sure aiohttp is installed. The [async] extra pulls it in automatically, but if you manage dependencies manually you may need to add it yourself; the SDK requires aiohttp>=3.8 at minimum.

For production deployments, we strongly recommend pinning the SDK version in your requirements file. Releases follow semver, so breaking changes land only in major version bumps, but pinning still avoids surprises during deploys.
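A pinned entry in requirements.txt might look like the following (the version number shown is illustrative, not necessarily the current release):

```
datastream-sdk==1.4.2
```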

Step 2: Initialize the Client

Before making any API calls, you need to set up authentication. DataStream uses API keys for server-to-server communication. You can create keys in the DataStream dashboard under Settings > API Keys. Each key is scoped to a project and can have read, write, or admin permissions.


import os
from datastream import DataStreamClient
from datastream.auth import APIKeyAuth
from datastream.exceptions import AuthenticationError, RateLimitError

# Load credentials from environment variables (recommended)
api_key = os.environ.get("DATASTREAM_API_KEY")
project_id = os.environ.get("DATASTREAM_PROJECT_ID")

if not api_key:
    raise ValueError(
        "DATASTREAM_API_KEY environment variable is required. "
        "Create one at https://dashboard.datastream.io/settings/api-keys"
    )

# Initialize the client with authentication
client = DataStreamClient(
    api_key=api_key,
    project_id=project_id,
    # Optional configuration
    base_url="https://api.datastream.io/v2",  # default
    timeout=30,          # request timeout in seconds
    max_retries=3,       # automatic retry on transient failures
    retry_backoff=1.5,   # exponential backoff multiplier
)

# Verify the connection
try:
    status = client.health_check()
    print(f"Connected to DataStream API v{status.api_version}")
    print(f"Project: {status.project_name} ({status.project_id})")
    print(f"Rate limit: {status.rate_limit_remaining}/{status.rate_limit_total} requests")
except AuthenticationError as e:
    print(f"Authentication failed: {e.message}")
    print(f"Hint: {e.hint}")
except Exception as e:
    print(f"Connection error: {e}")

The client is thread-safe and maintains a connection pool internally. You should create a single instance and reuse it across your application rather than creating a new client for each request. The connection pool defaults to 10 connections but can be configured via the pool_size parameter.

# For applications that need higher throughput
client = DataStreamClient(
    api_key=api_key,
    project_id=project_id,
    pool_size=25,                  # increase connection pool
    pool_maxsize=50,               # max connections
    keep_alive_timeout=30,         # keep-alive duration
    enable_compression=True,       # gzip request/response bodies
)

# You can also use the client as a context manager
with DataStreamClient(api_key=api_key, project_id=project_id) as client:
    # Client will be properly closed when the block exits
    status = client.health_check()
    print(f"Connected: {status.project_name}")

Step 3: Send Your First Event

Events are the core data type in DataStream. Each event has a name, a timestamp, and a payload of key-value properties. Property values can be strings, numbers, booleans, or nested objects. The SDK handles serialization automatically.

from datastream.events import Event, EventBatch
from datastream.properties import Properties
from datetime import datetime, timezone
import uuid

# Send a single event
event = Event(
    name="user.signup",
    timestamp=datetime.now(timezone.utc),
    properties=Properties(
        user_id="usr_8a3f2b",
        email="alice@example.com",
        plan="starter",
        source="organic",
        metadata={
            "referrer": "https://blog.example.com/getting-started",
            "utm_campaign": "launch_2024",
            "browser": "Chrome/120",
            "os": "macOS 14.2",
        },
    ),
    # Optional: provide your own idempotency key
    idempotency_key=str(uuid.uuid4()),
)

try:
    result = client.events.send(event)
    print(f"Event sent: {result.event_id}")
    print(f"Ingested at: {result.ingested_at}")
    print(f"Processing lag: {result.processing_lag_ms}ms")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after} seconds")
except Exception as e:
    print(f"Failed to send event: {e}")

# Send a batch of events for better throughput
events = [
    Event(
        name="page.view",
        properties=Properties(
            user_id="usr_8a3f2b",
            page="/dashboard",
            duration_ms=1523,
            referrer="/login",
        ),
    ),
    Event(
        name="feature.used",
        properties=Properties(
            user_id="usr_8a3f2b",
            feature="data_export",
            format="csv",
            row_count=15420,
        ),
    ),
    Event(
        name="api.request",
        properties=Properties(
            user_id="usr_8a3f2b",
            endpoint="/v2/events/query",
            method="POST",
            status_code=200,
            latency_ms=89,
        ),
    ),
]

batch = EventBatch(events=events)
results = client.events.send_batch(batch)
print(f"Batch sent: {results.accepted}/{results.total} events accepted")
for error in results.errors:
    print(f"  Error on event {error.index}: {error.message}")

Batching is significantly more efficient than sending events individually. The SDK supports batches of up to 1,000 events per request. For very high-throughput scenarios, consider using the async client with the buffered sender, which automatically batches events in the background.

# Async buffered sender for high-throughput scenarios
import asyncio
from datastream.async_client import AsyncDataStreamClient
from datastream.buffer import BufferedSender

async def main():
    async_client = AsyncDataStreamClient(
        api_key=api_key,
        project_id=project_id,
    )

    # BufferedSender collects events and flushes them in batches
    sender = BufferedSender(
        client=async_client,
        batch_size=500,         # flush every 500 events
        flush_interval=5.0,     # or every 5 seconds, whichever comes first
        max_queue_size=10000,   # backpressure threshold
        on_error=lambda e: print(f"Batch error: {e}"),
    )

    await sender.start()

    # Now you can send events without worrying about batching
    for i in range(10000):
        await sender.enqueue(Event(
            name="load_test.event",
            properties=Properties(index=i, batch="test_run_1"),
        ))

    # Flush remaining events and shut down
    await sender.stop()
    await async_client.close()

asyncio.run(main())

Step 4: Query Events

DataStream provides a powerful query API that lets you filter, aggregate, and analyze your event data. Queries use a builder pattern for readable, composable query construction.

from datastream.query import Query, TimeRange, Aggregation, Filter
from datetime import timedelta

# Simple query: get recent signup events
query = (
    Query("user.signup")
    .time_range(TimeRange.last(hours=24))
    .limit(100)
    .order_by("timestamp", descending=True)
)

results = client.events.query(query)
print(f"Found {results.total_count} signup events in the last 24 hours")
for event in results.events:
    print(f"  {event.timestamp}: {event.properties.get('email')} ({event.properties.get('plan')})")

# Aggregation query: count events by plan type
agg_query = (
    Query("user.signup")
    .time_range(TimeRange.last(days=30))
    .group_by("properties.plan")
    .aggregate(Aggregation.count())
    .order_by("count", descending=True)
)

agg_results = client.events.query(agg_query)
print("\nSignups by plan (last 30 days):")
for bucket in agg_results.buckets:
    print(f"  {bucket.key}: {bucket.count}")

# Filtered query with multiple conditions
filtered_query = (
    Query("api.request")
    .time_range(TimeRange.between(
        start=datetime(2024, 1, 1, tzinfo=timezone.utc),
        end=datetime(2024, 1, 31, tzinfo=timezone.utc),
    ))
    .where(Filter.gt("properties.latency_ms", 500))
    .where(Filter.eq("properties.method", "POST"))
    .where(Filter.in_("properties.status_code", [500, 502, 503]))
    .select("properties.endpoint", "properties.latency_ms", "properties.status_code")
    .limit(50)
)

slow_requests = client.events.query(filtered_query)
print(f"\nSlow POST requests with errors: {slow_requests.total_count}")
for event in slow_requests.events:
    props = event.properties
    print(f"  {props['endpoint']}: {props['status_code']} ({props['latency_ms']}ms)")

# Time-series aggregation for charting
timeseries_query = (
    Query("page.view")
    .time_range(TimeRange.last(days=7))
    .group_by_time(interval="1h")
    .aggregate(Aggregation.count())
    .aggregate(Aggregation.avg("properties.duration_ms"))
)

timeseries = client.events.query(timeseries_query)
print("\nHourly page views (last 7 days):")
for point in timeseries.series:
    print(f"  {point.timestamp}: {point.count} views, avg {point.avg_duration_ms:.0f}ms")

# Funnel analysis
funnel_query = (
    Query.funnel()
    .time_range(TimeRange.last(days=30))
    .step("user.signup")
    .step("onboarding.started")
    .step("onboarding.completed")
    .step("feature.used")
    .conversion_window(hours=48)
    .group_by("properties.plan")
)

funnel = client.events.query(funnel_query)
print("\nConversion funnel (30 days):")
for step in funnel.steps:
    print(f"  {step.name}: {step.count} ({step.conversion_rate:.1f}%)")

Notes

Rate Limits: The default rate limit is 1,000 requests per minute per API key. Batch endpoints count as a single request regardless of the number of events in the batch. If you need higher limits, contact support or upgrade your plan.
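When you hit the limit, the SDK raises RateLimitError with a retry_after value (see Step 3). As a self-contained sketch of how you might honor it, here is a generic retry wrapper; RateLimited is a local stand-in for the SDK's exception, and in real code you would catch datastream.exceptions.RateLimitError and pass client.events.send as the send callable.

```python
import time

class RateLimited(Exception):
    """Local stand-in for the SDK's RateLimitError (carries retry_after)."""
    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after

def send_with_retry(send, event, max_attempts=5, sleep=time.sleep):
    """Call send(event), waiting retry_after seconds whenever the
    server reports a rate limit, up to max_attempts tries."""
    for attempt in range(max_attempts):
        try:
            return send(event)
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(exc.retry_after)
```

Note that the client's built-in max_retries already covers transient failures; an explicit wrapper like this is only useful when you want custom logging or backpressure around rate-limit waits.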

Type Hints: The Python SDK is fully typed and works with mypy, pyright, and similar type checkers. All public methods have complete type annotations including overloaded signatures for different return types.

Django/Flask Integration: The SDK includes optional middleware for popular web frameworks. Install with pip install "datastream-sdk[django]" or pip install "datastream-sdk[flask]" to get automatic request tracking, error reporting, and user identification. See the framework integration guide for details.
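The middleware's actual API isn't documented here, but to illustrate what automatic request tracking involves, here is a minimal generic WSGI middleware that times each request and reports it through any callback you supply; in practice the callback would build an Event and hand it to client.events.send or a BufferedSender. This is a sketch, not the SDK's implementation.

```python
import time

class RequestTracker:
    """Minimal WSGI middleware: wraps an app, times each request,
    and reports endpoint/method/status/latency via a callback."""

    def __init__(self, app, report):
        self.app = app
        self.report = report  # called with a dict of request properties

    def __call__(self, environ, start_response):
        status_holder = {}

        def capturing_start_response(status, headers, exc_info=None):
            # Remember the status line before passing it through
            status_holder["status"] = status
            return start_response(status, headers, exc_info)

        start = time.monotonic()
        response = self.app(environ, capturing_start_response)
        self.report({
            "endpoint": environ.get("PATH_INFO", "/"),
            "method": environ.get("REQUEST_METHOD", "GET"),
            "status": status_holder.get("status", ""),
            "latency_ms": (time.monotonic() - start) * 1000,
        })
        return response
```

Note that this measures latency up to the point the response iterable is returned, not fully streamed; a production middleware would also wrap the response body to capture total time.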

For more advanced usage, refer to the Python SDK Advanced Guide, which covers topics like custom serializers, webhook handling, retry policies, and proxy configuration. The SDK source code is available on GitHub at datastream/datastream-python.