Gateway troubleshooting

This page is the deep runbook. Start at /help/troubleshooting if you want the fast triage flow first.

Command ladder

Run these first, in this order:

pllan status
pllan gateway status
pllan logs --follow
pllan doctor
pllan channels status --probe

Expected healthy signals:

pllan gateway status shows Runtime: running and RPC probe: ok.
pllan doctor reports no blocking config/service issues.
pllan channels status --probe shows connected/ready channels.
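The first healthy signal can be checked mechanically. The sketch below is a minimal example, not part of the Pllan CLI: the function name gateway_is_healthy is hypothetical, and it assumes the status output contains the literal strings Runtime: running and RPC probe: ok exactly as quoted above.

```shell
# Hypothetical helper: reads `pllan gateway status` output on stdin and
# succeeds only if both healthy signals named above are present.
# Assumption: the output contains these literal strings somewhere.
gateway_is_healthy() {
  status_text=$(cat)
  case "$status_text" in
    *"Runtime: running"*) ;;
    *) return 1 ;;
  esac
  case "$status_text" in
    *"RPC probe: ok"*) ;;
    *) return 1 ;;
  esac
  return 0
}
```

A typical use would be to pipe pllan gateway status into gateway_is_healthy and continue down the ladder only when it fails.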

Anthropic 429 extra usage required for long context

Use this when logs/errors include: HTTP 429: rate_limit_error: Extra usage is required for long context requests.

pllan logs --follow
pllan models status
pllan config get agents.defaults.models

Look for:

Selected Anthropic Opus/Sonnet model has params.context1m: true.
Current Anthropic credential is not eligible for long-context usage.
Requests fail only on long sessions/model runs that need the 1M beta path.

Fix options:

Disable context1m for that model to fall back to the normal context window.
Use an Anthropic API key with billing, or enable Anthropic Extra Usage on the subscription account.
Configure fallback models so runs continue when Anthropic long-context requests are rejected.

No replies

If channels are up but nothing answers, check routing and policy before reconnecting anything.

pllan status
pllan channels status --probe
pllan pairing list --channel <channel> [--account <id>]
pllan config get channels
pllan logs --follow

Look for:

Pairing pending for DM senders.
Group mention gating (requireMention, mentionPatterns).
Channel/group allowlist mismatches.

Common signatures:

drop guild message (mention required → group message ignored until it includes a mention.
pairing request → sender needs approval.
blocked / allowlist → sender/channel was filtered by policy.
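To triage a busy log quickly, the signatures above can be turned into a small classifier. This is a sketch under stated assumptions: classify_no_reply and its output labels are hypothetical names, and matching is by substring because real log lines may carry extra fields around these fragments.

```shell
# Hypothetical helper: maps one log line to the likely cause, using only
# the signatures listed above. Labels are illustrative, not CLI output.
classify_no_reply() {
  case "$1" in
    *"drop guild message (mention required"*) echo "mention-gating" ;;
    *"pairing request"*)                      echo "pairing-pending" ;;
    *blocked*|*allowlist*)                    echo "policy-filtered" ;;
    *)                                        echo "unknown" ;;
  esac
}
```

One way to use it is to read pllan logs --follow line by line and pass each line to classify_no_reply, ignoring lines that come back as unknown.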

Dashboard/Control UI connectivity

When the dashboard/Control UI will not connect, validate the URL, auth mode, and secure-context assumptions.

pllan gateway status
pllan status
pllan logs --follow
pllan doctor
pllan gateway status --json

Look for:

Correct probe URL and dashboard URL.
Auth mode/token mismatch between client and gateway.
HTTP usage where device identity is required.

Common signatures:

device identity required → non-secure context or missing device auth.
device nonce required / device nonce mismatch → client is not completing the challenge-based device auth flow (connect.challenge + device.nonce).
device signature invalid / device signature expired → client signed the wrong payload (or stale timestamp) for the current handshake.
AUTH_TOKEN_MISMATCH with canRetryWithDeviceToken=true → client can do one trusted retry with cached device token.
repeated unauthorized after that retry → shared token/device token drift; refresh token config and re-approve/rotate device token if needed.
gateway connect failed: → wrong host/port/url target.

Auth detail codes quick map

Use error.details.code from the failed connect response to pick the next action:

Detail code	Meaning	Recommended action
`AUTH_TOKEN_MISSING`	Client did not send a required shared token.	Paste/set token in the client and retry. For dashboard paths: `pllan config get gateway.auth.token` then paste into Control UI settings.
`AUTH_TOKEN_MISMATCH`	Shared token did not match gateway auth token.	If `canRetryWithDeviceToken=true`, allow one trusted retry. If still failing, run the token drift recovery checklist.
`AUTH_DEVICE_TOKEN_MISMATCH`	Cached per-device token is stale or revoked.	Rotate/re-approve device token using devices CLI, then reconnect.
`PAIRING_REQUIRED`	Device identity is known but not approved for this role.	Approve pending request: `pllan devices list` then `pllan devices approve <requestId>`.
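
The table above can also be expressed as a lookup for scripts or client tooling. This is a minimal sketch: next_action_for_detail_code is a hypothetical name, and the action strings are condensed from the table rather than produced by Pllan.

```shell
# Hypothetical helper: maps error.details.code to the recommended action
# from the table above. Unlisted codes are reported, not guessed at.
next_action_for_detail_code() {
  case "$1" in
    AUTH_TOKEN_MISSING)
      echo "set the shared token in the client and retry" ;;
    AUTH_TOKEN_MISMATCH)
      echo "allow one trusted retry if canRetryWithDeviceToken=true, else run token drift recovery" ;;
    AUTH_DEVICE_TOKEN_MISMATCH)
      echo "rotate or re-approve the device token, then reconnect" ;;
    PAIRING_REQUIRED)
      echo "approve the pending request: pllan devices list, then pllan devices approve <requestId>" ;;
    *)
      echo "unmapped detail code: $1" ;;
  esac
}
```

Keeping the fallback branch explicit means a new or unknown detail code surfaces as such instead of being mistaken for a token problem.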

Device auth v2 migration check:

pllan --version
pllan doctor
pllan gateway status

If logs show nonce/signature errors, update the connecting client and verify it:

waits for connect.challenge
signs the challenge-bound payload
sends connect.params.device.nonce with the same challenge nonce

Gateway service not running

Use this when the service is installed but the process does not stay up.

pllan gateway status
pllan status
pllan logs --follow
pllan doctor
pllan gateway status --deep

Look for:

Runtime: stopped with exit hints.
Service config mismatch (Config (cli) vs Config (service)).
Port/listener conflicts.

Common signatures:

Gateway start blocked: set gateway.mode=local → local gateway mode is not enabled. Fix: set gateway.mode="local" in your config (or run pllan configure). If you are running Pllan via Podman using the dedicated pllan user, the config lives at ~pllan/.pllan/pllan.json.
refusing to bind gateway ... without auth → non-loopback bind without token/password.
another gateway instance is already listening / EADDRINUSE → port conflict.
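
The three start-failure signatures above map to distinct fixes, so it can help to label them when scanning logs. The sketch below assumes substring matching is sufficient; classify_gateway_start_failure and its labels are hypothetical.

```shell
# Hypothetical helper: labels a gateway start failure using only the
# signatures listed above. The "..." in the bind signature is matched
# with a wildcard because the elided text is not specified here.
classify_gateway_start_failure() {
  case "$1" in
    *"Gateway start blocked: set gateway.mode=local"*)
      echo "local-mode-not-enabled" ;;
    *"refusing to bind gateway"*"without auth"*)
      echo "non-loopback-bind-without-auth" ;;
    *"another gateway instance is already listening"*|*EADDRINUSE*)
      echo "port-conflict" ;;
    *)
      echo "unknown" ;;
  esac
}
```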

Channel connected, messages not flowing

If channel state is connected but message flow is dead, focus on policy, permissions, and channel-specific delivery rules.

pllan channels status --probe
pllan pairing list --channel <channel> [--account <id>]
pllan status --deep
pllan logs --follow
pllan config get channels

Look for:

DM policy (pairing, allowlist, open, disabled).
Group allowlist and mention requirements.
Missing channel API permissions/scopes.

Common signatures:

mention required → message ignored by group mention policy.
pairing / pending approval traces → sender is not approved.
missing_scope, not_in_channel, Forbidden, 401/403 → channel auth/permissions issue.

Cron and heartbeat delivery

If cron or heartbeat did not run or did not deliver, verify scheduler state first, then delivery target.

pllan cron status
pllan cron list
pllan cron runs --id <jobId> --limit 20
pllan system heartbeat last
pllan logs --follow

Look for:

Cron enabled and next wake present.
Job run history status (ok, skipped, error).
Heartbeat skip reasons (quiet-hours, requests-in-flight, alerts-disabled).

Common signatures:

cron: scheduler disabled; jobs will not run automatically → cron disabled.
cron: timer tick failed → scheduler tick failed; check file/log/runtime errors.
heartbeat skipped with reason=quiet-hours → outside active hours window.
heartbeat: unknown accountId → invalid account id for heartbeat delivery target.
heartbeat skipped with reason=dm-blocked → heartbeat target resolved to a DM-style destination while agents.defaults.heartbeat.directPolicy (or per-agent override) is set to block.
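
Heartbeat skips carry their cause in a reason= field, so the reason can be extracted rather than matched line by line. This is a sketch only: explain_heartbeat_skip is a hypothetical name, and it assumes the log line has the form heartbeat skipped with reason=<value> followed by a space, a comma, or end of line.

```shell
# Hypothetical helper: pulls the reason=... value out of a
# "heartbeat skipped" log line and explains the reasons documented above.
explain_heartbeat_skip() {
  case "$1" in
    *"heartbeat skipped"*reason=*) ;;
    *) echo "not a heartbeat skip line"; return 1 ;;
  esac
  reason=${1#*reason=}      # drop everything up to and including "reason="
  reason=${reason%%[ ,]*}   # keep the value up to the next space or comma
  case "$reason" in
    quiet-hours) echo "outside the active hours window" ;;
    dm-blocked)  echo "target resolved to a DM-style destination while directPolicy is block" ;;
    *)           echo "skip reason: $reason" ;;
  esac
}
```

Reasons without a documented explanation are echoed back unchanged rather than interpreted.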

Node paired, tool fails

If a node is paired but tools fail, isolate foreground, permission, and approval state.

pllan nodes status
pllan nodes describe --node <idOrNameOrIp>
pllan approvals get --node <idOrNameOrIp>
pllan logs --follow
pllan status

Look for:

Node online with expected capabilities.
OS permission grants for camera/mic/location/screen.
Exec approvals and allowlist state.

Common signatures:

NODE_BACKGROUND_UNAVAILABLE → node app must be in foreground.
*_PERMISSION_REQUIRED / LOCATION_PERMISSION_REQUIRED → missing OS permission.
SYSTEM_RUN_DENIED: approval required → exec approval pending.
SYSTEM_RUN_DENIED: allowlist miss → command blocked by allowlist.
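
Because the permission signature is a family of codes (*_PERMISSION_REQUIRED), a wildcard match covers it in one branch. The sketch below uses a hypothetical function name, explain_node_error, and takes the error text as its only argument.

```shell
# Hypothetical helper: maps a node tool error to the next step, using the
# signatures listed above. Any code ending in _PERMISSION_REQUIRED is
# treated as a missing OS permission.
explain_node_error() {
  case "$1" in
    NODE_BACKGROUND_UNAVAILABLE*)
      echo "bring the node app to the foreground" ;;
    *_PERMISSION_REQUIRED*)
      echo "grant the missing OS permission" ;;
    "SYSTEM_RUN_DENIED: approval required"*)
      echo "exec approval is pending" ;;
    "SYSTEM_RUN_DENIED: allowlist miss"*)
      echo "command is blocked by the allowlist" ;;
    *)
      echo "unknown" ;;
  esac
}
```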

Browser tool fails

Use this when browser tool actions fail even though the gateway itself is healthy.

pllan browser status
pllan browser start --browser-profile pllan
pllan browser profiles
pllan logs --follow
pllan doctor

Look for:

Valid browser executable path.
CDP profile reachability.
Local Chrome availability for existing-session / user profiles.

Common signatures:

Failed to start Chrome CDP on port → browser process failed to launch.
browser.executablePath not found → configured path is invalid.
No Chrome tabs found for profile="user" → the Chrome MCP attach profile has no open local Chrome tabs.
Browser attachOnly is enabled ... not reachable → attach-only profile has no reachable target.

If you upgraded and something suddenly broke

Most post-upgrade breakage is config drift or stricter defaults now being enforced.

1) Auth and URL override behavior changed

pllan gateway status
pllan config get gateway.mode
pllan config get gateway.remote.url
pllan config get gateway.auth.mode

What to check:

If gateway.mode=remote, CLI calls may be targeting the remote gateway while your local service is fine.
Explicit --url calls do not fall back to stored credentials.

Common signatures:

gateway connect failed: → wrong URL target.
unauthorized → endpoint reachable but wrong auth.

2) Bind and auth guardrails are stricter

pllan config get gateway.bind
pllan config get gateway.auth.token
pllan gateway status
pllan logs --follow

What to check:

Non-loopback binds (lan, tailnet, custom) need auth configured.
Old keys like gateway.token do not replace gateway.auth.token.

Common signatures:

refusing to bind gateway ... without auth → bind+auth mismatch.
RPC probe: failed while runtime is running → gateway alive but inaccessible with current auth/url.

3) Pairing and device identity state changed

pllan devices list
pllan pairing list --channel <channel> [--account <id>]
pllan logs --follow
pllan doctor

What to check:

Pending device approvals for dashboard/nodes.
Pending DM pairing approvals after policy or identity changes.

Common signatures:

device identity required → device auth not satisfied.
pairing required → sender/device must be approved.

If the service config and runtime still disagree after checks, reinstall service metadata from the same profile/state directory:

pllan gateway install --force
pllan gateway restart
