Observability in Xferity — Logs, Prometheus Metrics, Health Endpoints, and Audit Records

Observability

Xferity exposes observability at three layers:

logs for runtime behavior and diagnostics
metrics for service and workload monitoring
audit records for flow and file traceability

These layers serve different purposes and should not be treated as interchangeable.

Logs

Structured logs are used for runtime diagnostics, warnings, and error investigation.

They are useful for:

startup failures
configuration validation warnings
protocol and connectivity failures
worker behavior
notification delivery errors

The CLI exposes log access through the logs command, including tailing and level filtering.

Metrics

Xferity exports Prometheus-format metrics through the Web runtime.

Metrics cover areas such as:

flow runs and durations
jobs enqueued, completed, and queue depth
transfer bytes, files, and errors
retries and dead-letter activity
notification outcomes
authentication failures and rate-limit denials
certificate expiry state
secret-resolution outcomes
selected AS2 and audit-sidecar metrics
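
As an illustration of consuming this output, the following minimal sketch parses Prometheus text-format samples that have already been fetched from /metrics with admin credentials. The metric names in SAMPLE (xferity_queue_depth, xferity_transfer_errors_total) and the flow label are hypothetical placeholders, not confirmed Xferity metric names, and the parser only handles simple samples (no exemplars).

```python
import re

# Matches label pairs such as flow="inbound-sftp", allowing escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical sample text; real Xferity metric names may differ.
SAMPLE = """\
# HELP xferity_queue_depth Jobs waiting in the queue (hypothetical name).
# TYPE xferity_queue_depth gauge
xferity_queue_depth 7
xferity_transfer_errors_total{flow="inbound-sftp"} 2
"""


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            name, rest = line.split("{", 1)
            label_part, _, tail = rest.rpartition("}")
            labels = dict(LABEL_RE.findall(label_part))
        else:
            name, _, tail = line.partition(" ")
            labels = {}
        # The first token after the name/labels is the value;
        # an optional timestamp may follow and is ignored here.
        value = float(tail.split()[0])
        samples.append((name.strip(), labels, value))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_samples(SAMPLE):
        print(name, labels, value)
```

In practice a Prometheus server or an existing client library would normally do this parsing; the sketch is only meant to show the shape of the data.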

Metrics access boundary

/metrics sits behind authenticated admin access. It is not an anonymous scrape endpoint, so any scraper must present admin credentials.

Health checks

The runtime exposes several health-related endpoints.

The access model is:

/health/worker for unauthenticated worker readiness
/health for general service health behind authenticated access
/health/secrets behind authenticated access
/health/certificates behind authenticated access

The health payload reports runtime conditions such as state-store writability, audit-path writability, and worker readiness.
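
A simple external probe against the unauthenticated /health/worker endpoint might look like the sketch below. It assumes that readiness is signalled by an HTTP 200 response, which this page does not confirm; the base URL in the usage example and the helper names http_status and worker_ready are illustrative only.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def worker_ready(base_url, fetch=http_status):
    """Assumption: an HTTP 200 from /health/worker means the worker is ready."""
    status = fetch(base_url.rstrip("/") + "/health/worker")
    return status == 200


if __name__ == "__main__":
    # Hypothetical address; replace with the actual Web runtime address.
    print(worker_ready("http://localhost:8080"))
```

The fetch parameter lets the probe be exercised without a running service. The authenticated endpoints (/health, /health/secrets, /health/certificates) would additionally need credentials, which are not shown here.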

Audit records

Audit records are distinct from logs. They are structured lifecycle records intended for file and flow traceability.

Use:

logs to understand service behavior
metrics to understand health and trends
audit records to answer what happened to a specific file or run

See Audit Logging.

Crypto diagnostics and observability

When a flow uses pgp.provider=gnupg or pgp.provider=auto, diagnostics can show:

provider mode
resolved GnuPG binary path
GnuPG version
whether fallback capability is available on this host

This is useful before rollout, especially on Windows hosts or systems where gpg is not installed in a standard location.

Crypto log fields

Crypto operations emit structured fields that help operators understand whether fallback happened:

provider
mode
fallback_used
fallback_reason
fallback_subreason
cleanup_status

Typical interpretation:

provider=gopenpgp and fallback_used=false means the native path handled the operation
fallback_used=true means the native path failed with a classified compatibility case and GnuPG was tried once
cleanup_status=failed means the crypto operation may have succeeded, but temporary crypto workspace cleanup did not complete cleanly and should be investigated
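
The interpretation rules above can be sketched as a small helper. This assumes each log entry is available as a JSON object with fallback_used encoded as a boolean; the actual log encoding is not specified here, and the sample values "auto" and "ok" are hypothetical.

```python
import json


def interpret_crypto_event(fields):
    """Turn the structured crypto fields of one log entry into operator notes."""
    notes = []
    if fields.get("fallback_used") is True:
        reason = fields.get("fallback_reason", "unknown")
        notes.append(f"native path failed ({reason}); GnuPG was tried once")
    elif fields.get("provider") == "gopenpgp" and fields.get("fallback_used") is False:
        notes.append("native path handled the operation")
    if fields.get("cleanup_status") == "failed":
        # The crypto operation itself may still have succeeded.
        notes.append("temporary crypto workspace cleanup failed; investigate")
    return notes


if __name__ == "__main__":
    # Hypothetical log line; field values other than the field names are assumed.
    line = ('{"provider": "gopenpgp", "mode": "auto", '
            '"fallback_used": false, "cleanup_status": "ok"}')
    print(interpret_crypto_event(json.loads(line)))
```

A helper like this could feed an alert on cleanup_status=failed while treating a one-off fallback as informational.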

Secret safety in crypto logs

Crypto observability is intentionally sanitized.

Xferity avoids logging:

passphrases
key material
raw GnuPG stderr when it may contain sensitive data

Instead, logs use structured fields and redacted output summaries.

Boundaries and limits

To keep this page precise:

metrics do not replace audit records for file evidence
logs do not replace audit retention
health endpoints do not replace external probes or end-to-end checks
built-in telemetry does not replace SIEM or incident response processes