Gateway troubleshooting

This page is the deep runbook. Start at /help/troubleshooting if you want the fast triage flow first.

Command ladder

Run these first, in this order:
pllan status
pllan gateway status
pllan logs --follow
pllan doctor
pllan channels status --probe
Expected healthy signals:
  • pllan gateway status shows Runtime: running and RPC probe: ok.
  • pllan doctor reports no blocking config/service issues.
  • pllan channels status --probe shows connected/ready channels.
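The first healthy signal can be checked mechanically. The following is a minimal sketch, assuming the strings `Runtime: running` and `RPC probe: ok` appear verbatim in the output of `pllan gateway status`; the exact output layout is an assumption, and `gateway_healthy` is a hypothetical helper name, not part of the CLI.

```shell
# gateway_healthy: read `pllan gateway status` output on stdin and succeed
# only if both healthy signals listed above are present.
# Assumption: both strings appear verbatim somewhere in the output.
gateway_healthy() {
  status_text=$(cat)
  printf '%s\n' "$status_text" | grep -q 'Runtime: running' || return 1
  printf '%s\n' "$status_text" | grep -q 'RPC probe: ok' || return 1
  return 0
}
```

Possible usage: `pllan gateway status | gateway_healthy && echo "gateway looks healthy"`.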

Anthropic 429 extra usage required for long context

Use this when logs/errors include: HTTP 429: rate_limit_error: Extra usage is required for long context requests.
pllan logs --follow
pllan models status
pllan config get agents.defaults.models
Look for:
  • Selected Anthropic Opus/Sonnet model has params.context1m: true.
  • Current Anthropic credential is not eligible for long-context usage.
  • Requests fail only on long sessions/model runs that need the 1M beta path.
Fix options:
  1. Disable context1m for that model to fall back to the normal context window.
  2. Use an Anthropic API key with billing, or enable Anthropic Extra Usage on the subscription account.
  3. Configure fallback models so runs continue when Anthropic long-context requests are rejected.
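To confirm that this is the failure you are seeing, it can help to search captured log text for the exact error string. This is a small sketch, assuming the message appears in the logs as quoted above; `has_long_context_429` is a hypothetical helper name.

```shell
# has_long_context_429: succeed if any line on stdin carries the
# long-context 429 message quoted in this section.
has_long_context_429() {
  grep -qF 'Extra usage is required for long context requests'
}
```

Feed it saved log text rather than an open-ended stream: when piped from `pllan logs --follow` it returns on the first match, but it keeps waiting if the message never appears.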

No replies

If channels are up but nothing answers, check routing and policy before reconnecting anything.
pllan status
pllan channels status --probe
pllan pairing list --channel <channel> [--account <id>]
pllan config get channels
pllan logs --follow
Look for:
  • Pairing pending for DM senders.
  • Group mention gating (requireMention, mentionPatterns).
  • Channel/group allowlist mismatches.
Common signatures:
  • drop guild message (mention required → group message ignored until mention.
  • pairing request → sender needs approval.
  • blocked / allowlist → sender/channel was filtered by policy.
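The signatures above can be turned into a quick classifier for a single log line. This is a sketch only: the category labels and the function name `classify_no_reply` are made up for illustration, and the matching assumes the signature text appears in the log line as shown.

```shell
# classify_no_reply LINE: map one log line to a rough cause, using the
# signature fragments listed above. Labels are illustrative only.
classify_no_reply() {
  case "$1" in
    *"drop guild message"*) echo "mention-gating" ;;
    *"pairing request"*)    echo "pairing-pending" ;;
    *blocked*|*allowlist*)  echo "policy-filtered" ;;
    *)                      echo "unknown" ;;
  esac
}
```

For example, `classify_no_reply "pairing request from sender"` prints `pairing-pending`.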

Dashboard Control UI connectivity

When the dashboard/Control UI will not connect, validate the URL, auth mode, and secure-context assumptions.
pllan gateway status
pllan status
pllan logs --follow
pllan doctor
pllan gateway status --json
Look for:
  • Correct probe URL and dashboard URL.
  • Auth mode/token mismatch between client and gateway.
  • HTTP usage where device identity is required.
Common signatures:
  • device identity required → non-secure context or missing device auth.
  • device nonce required / device nonce mismatch → client is not completing the challenge-based device auth flow (connect.challenge + device.nonce).
  • device signature invalid / device signature expired → client signed the wrong payload (or stale timestamp) for the current handshake.
  • AUTH_TOKEN_MISMATCH with canRetryWithDeviceToken=true → client can do one trusted retry with cached device token.
  • repeated unauthorized after that retry → shared token/device token drift; refresh token config and re-approve/rotate device token if needed.
  • gateway connect failed: → wrong host/port/url target.
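When scanning a longer log excerpt, a stream filter that prints one hint per matching line can be handy. The sketch below assumes the signatures appear in log lines as written above; the hint wording paraphrases this list, and `dashboard_hint` is a hypothetical helper name.

```shell
# dashboard_hint: read log lines on stdin and print a short hint for each
# line that contains one of the device-auth or connect signatures above.
# Lines without a known signature are ignored.
dashboard_hint() {
  while IFS= read -r line; do
    case "$line" in
      *"device identity required"*)
        echo "non-secure context or missing device auth" ;;
      *"device nonce required"*|*"device nonce mismatch"*)
        echo "client is not completing the challenge flow" ;;
      *"device signature invalid"*|*"device signature expired"*)
        echo "client signed the wrong payload or a stale timestamp" ;;
      *"gateway connect failed:"*)
        echo "wrong host/port/url target" ;;
    esac
  done
}
```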

Auth detail codes quick map

Use error.details.code from the failed connect response to pick the next action:
| Detail code | Meaning | Recommended action |
| --- | --- | --- |
| AUTH_TOKEN_MISSING | Client did not send a required shared token. | Paste/set token in the client and retry. For dashboard paths: pllan config get gateway.auth.token then paste into Control UI settings. |
| AUTH_TOKEN_MISMATCH | Shared token did not match gateway auth token. | If canRetryWithDeviceToken=true, allow one trusted retry. If still failing, run the token drift recovery checklist. |
| AUTH_DEVICE_TOKEN_MISMATCH | Cached per-device token is stale or revoked. | Rotate/re-approve device token using devices CLI, then reconnect. |
| PAIRING_REQUIRED | Device identity is known but not approved for this role. | Approve pending request: pllan devices list then pllan devices approve <requestId>. |
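The mapping from detail code to next action can also be expressed as a lookup, for example inside a support script. This is an illustrative sketch: `next_action_for` is a hypothetical helper, and the action text is a condensed paraphrase of the recommended actions above.

```shell
# next_action_for CODE: print the recommended next step for an
# error.details.code value. Returns non-zero for codes not listed here.
next_action_for() {
  case "$1" in
    AUTH_TOKEN_MISSING)
      echo "set the shared token in the client and retry" ;;
    AUTH_TOKEN_MISMATCH)
      echo "allow one trusted retry if canRetryWithDeviceToken=true, else run token drift recovery" ;;
    AUTH_DEVICE_TOKEN_MISMATCH)
      echo "rotate or re-approve the device token, then reconnect" ;;
    PAIRING_REQUIRED)
      echo "run pllan devices list, then pllan devices approve <requestId>" ;;
    *)
      echo "unmapped code: $1"
      return 1 ;;
  esac
}
```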
Device auth v2 migration check:
pllan --version
pllan doctor
pllan gateway status
If logs show nonce/signature errors, update the connecting client and verify it:
  1. waits for connect.challenge
  2. signs the challenge-bound payload
  3. sends connect.params.device.nonce with the same challenge nonce

Gateway service not running

Use this when the service is installed but the process does not stay up.
pllan gateway status
pllan status
pllan logs --follow
pllan doctor
pllan gateway status --deep
Look for:
  • Runtime: stopped with exit hints.
  • Service config mismatch (Config (cli) vs Config (service)).
  • Port/listener conflicts.
Common signatures:
  • Gateway start blocked: set gateway.mode=local → local gateway mode is not enabled. Fix: set gateway.mode="local" in your config (or run pllan configure). If you are running Pllan via Podman using the dedicated pllan user, the config lives at ~pllan/.pllan/pllan.json.
  • refusing to bind gateway ... without auth → non-loopback bind without token/password.
  • another gateway instance is already listening / EADDRINUSE → port conflict.
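The three startup signatures above map to three distinct fixes, which a small classifier makes explicit. This is a sketch under the assumption that the signature text appears in the log line as shown; the labels and the name `classify_start_failure` are hypothetical.

```shell
# classify_start_failure LINE: name the likely cause of a gateway start
# failure from one log line. Labels are illustrative only.
classify_start_failure() {
  case "$1" in
    *"Gateway start blocked: set gateway.mode=local"*)
      echo "enable-local-mode" ;;
    *"refusing to bind gateway"*"without auth"*)
      echo "bind-needs-auth" ;;
    *"another gateway instance is already listening"*|*EADDRINUSE*)
      echo "port-conflict" ;;
    *)
      echo "unknown" ;;
  esac
}
```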

Channel connected messages not flowing

If channel state is connected but message flow is dead, focus on policy, permissions, and channel-specific delivery rules.
pllan channels status --probe
pllan pairing list --channel <channel> [--account <id>]
pllan status --deep
pllan logs --follow
pllan config get channels
Look for:
  • DM policy (pairing, allowlist, open, disabled).
  • Group allowlist and mention requirements.
  • Missing channel API permissions/scopes.
Common signatures:
  • mention required → message ignored by group mention policy.
  • pairing / pending approval traces → sender is not approved.
  • missing_scope, not_in_channel, Forbidden, 401/403 → channel auth/permissions issue.
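To gauge how often the permission-style signatures occur in a saved log excerpt, a line count is often enough. The sketch below is a rough heuristic, not an exact parser: it assumes the tokens appear literally in log lines, and a bare 401 or 403 may match unrelated numbers. `count_channel_auth_errors` is a hypothetical helper name.

```shell
# count_channel_auth_errors: print how many stdin lines contain one of the
# channel auth/permission signatures listed above. 401/403 only match when
# not embedded in a longer run of digits.
count_channel_auth_errors() {
  grep -cE 'missing_scope|not_in_channel|Forbidden|(^|[^0-9])(401|403)([^0-9]|$)'
}
```

Note that `grep -c` prints `0` and exits non-zero when nothing matches.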

Cron and heartbeat delivery

If cron or heartbeat did not run or did not deliver, verify scheduler state first, then delivery target.
pllan cron status
pllan cron list
pllan cron runs --id <jobId> --limit 20
pllan system heartbeat last
pllan logs --follow
Look for:
  • Cron enabled and next wake present.
  • Job run history status (ok, skipped, error).
  • Heartbeat skip reasons (quiet-hours, requests-in-flight, alerts-disabled).
Common signatures:
  • cron: scheduler disabled; jobs will not run automatically → cron disabled.
  • cron: timer tick failed → scheduler tick failed; check file/log/runtime errors.
  • heartbeat skipped with reason=quiet-hours → outside active hours window.
  • heartbeat: unknown accountId → invalid account id for heartbeat delivery target.
  • heartbeat skipped with reason=dm-blocked → heartbeat target resolved to a DM-style destination while agents.defaults.heartbeat.directPolicy (or per-agent override) is set to block.
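Because heartbeat skips carry a `reason=` value, the reasons can be pulled out of a log excerpt directly. This is a minimal sketch, assuming skip lines look like the signatures above (`heartbeat skipped ... reason=<value>`); `heartbeat_skip_reasons` is a hypothetical helper name.

```shell
# heartbeat_skip_reasons: print the reason= value from every
# "heartbeat skipped" line on stdin, one value per line.
# Other lines are dropped.
heartbeat_skip_reasons() {
  sed -n 's/.*heartbeat skipped.*reason=\([A-Za-z0-9_-]*\).*/\1/p'
}
```

Piping the result through `sort | uniq -c` gives a quick tally per reason.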

Node paired tool fails

If a node is paired but tools fail, isolate foreground, permission, and approval state.
pllan nodes status
pllan nodes describe --node <idOrNameOrIp>
pllan approvals get --node <idOrNameOrIp>
pllan logs --follow
pllan status
Look for:
  • Node online with expected capabilities.
  • OS permission grants for camera/mic/location/screen.
  • Exec approvals and allowlist state.
Common signatures:
  • NODE_BACKGROUND_UNAVAILABLE → node app must be in foreground.
  • *_PERMISSION_REQUIRED / LOCATION_PERMISSION_REQUIRED → missing OS permission.
  • SYSTEM_RUN_DENIED: approval required → exec approval pending.
  • SYSTEM_RUN_DENIED: allowlist miss → command blocked by allowlist.
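The node error codes follow a regular shape, so a script can derive the missing permission from the code itself. The sketch below assumes the error text starts with the code as shown above; `explain_node_error` is a hypothetical helper and its output wording is illustrative.

```shell
# explain_node_error TEXT: translate a node tool error into the cause
# listed above. For *_PERMISSION_REQUIRED codes, the prefix before
# _PERMISSION_REQUIRED is reported as the permission name.
explain_node_error() {
  case "$1" in
    NODE_BACKGROUND_UNAVAILABLE*)
      echo "node app must be in foreground" ;;
    *_PERMISSION_REQUIRED*)
      perm=${1%%_PERMISSION_REQUIRED*}
      echo "missing OS permission: $perm" ;;
    "SYSTEM_RUN_DENIED: approval required"*)
      echo "exec approval pending" ;;
    "SYSTEM_RUN_DENIED: allowlist miss"*)
      echo "command blocked by allowlist" ;;
    *)
      echo "unknown" ;;
  esac
}
```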

Browser tool fails

Use this when browser tool actions fail even though the gateway itself is healthy.
pllan browser status
pllan browser start --browser-profile pllan
pllan browser profiles
pllan logs --follow
pllan doctor
Look for:
  • Valid browser executable path.
  • CDP profile reachability.
  • Local Chrome availability for existing-session / user profiles.
Common signatures:
  • Failed to start Chrome CDP on port → browser process failed to launch.
  • browser.executablePath not found → configured path is invalid.
  • No Chrome tabs found for profile="user" → the Chrome MCP attach profile has no open local Chrome tabs.
  • Browser attachOnly is enabled ... not reachable → attach-only profile has no reachable target.

If you upgraded and something suddenly broke

Most post-upgrade breakage is config drift or stricter defaults now being enforced.

1) Auth and URL override behavior changed

pllan gateway status
pllan config get gateway.mode
pllan config get gateway.remote.url
pllan config get gateway.auth.mode
What to check:
  • If gateway.mode=remote, CLI calls may be targeting remote while your local service is fine.
  • Explicit --url calls do not fall back to stored credentials.
Common signatures:
  • gateway connect failed: → wrong URL target.
  • unauthorized → endpoint reachable but wrong auth.

2) Bind and auth guardrails are stricter

pllan config get gateway.bind
pllan config get gateway.auth.token
pllan gateway status
pllan logs --follow
What to check:
  • Non-loopback binds (lan, tailnet, custom) need auth configured.
  • Old keys like gateway.token do not replace gateway.auth.token.
Common signatures:
  • refusing to bind gateway ... without auth → bind+auth mismatch.
  • RPC probe: failed while runtime is running → gateway alive but inaccessible with current auth/url.
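The bind guardrail can be reasoned about as a simple rule: the non-loopback bind modes named above need auth. The following sketch mirrors that rule for pre-flight checks. It is not the gateway's actual implementation; it only checks a token (the gateway also accepts a password), and the function names are hypothetical.

```shell
# bind_needs_auth BIND: succeed when BIND is one of the non-loopback
# modes named above (lan, tailnet, custom).
bind_needs_auth() {
  case "$1" in
    lan|tailnet|custom) return 0 ;;
    *) return 1 ;;
  esac
}

# check_bind_auth BIND TOKEN: report a bind+auth mismatch when a
# non-loopback bind is paired with an empty token.
check_bind_auth() {
  if bind_needs_auth "$1" && [ -z "$2" ]; then
    echo "mismatch: bind=$1 requires auth"
    return 1
  fi
  echo "ok"
}
```

For example, `check_bind_auth lan ""` prints `mismatch: bind=lan requires auth`. Feeding it the output of `pllan config get gateway.bind` and `pllan config get gateway.auth.token` would assume those commands print the bare value.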

3) Pairing and device identity state changed

pllan devices list
pllan pairing list --channel <channel> [--account <id>]
pllan logs --follow
pllan doctor
What to check:
  • Pending device approvals for dashboard/nodes.
  • Pending DM pairing approvals after policy or identity changes.
Common signatures:
  • device identity required → device auth not satisfied.
  • pairing required → sender/device must be approved.
If the service config and runtime still disagree after checks, reinstall service metadata from the same profile/state directory:
pllan gateway install --force
pllan gateway restart