[Bug]: Lock contention on Linux install — `enable --now` timer races with inline post-install telemetry #62

### Version

`stepsecurity-dev-machine-guard v1.11.0` (linux_amd64)

### OS

Fedora Linux 42 (Cloud Edition), kernel 6.19.12-100.fc42.x86_64

### Command Run

```
./stepsecurity-dev-machine-guard install
```

Invoked via the Linux loader script's `install` path. Reproduced both as the target user and via `sudo` with the `SUDO_USER` privilege drop.

### Expected Behavior

A clean first install: timer registered, initial telemetry uploaded once, no errors recorded in `~/.stepsecurity/agent.error.log`.

### Actual Behavior

Every fresh `install` on Linux leaves the following two errors in `~/.stepsecurity/agent.error.log` even though the install itself appears to succeed and a telemetry upload eventually returns HTTP 200:

```
==========================================
StepSecurity Device Agent v1.11.0
==========================================

[scanning] run-status[failed]: HTTP 400 (terminal, no retry)
[error] acquiring lock: another instance is already running (PID <X>)
```

…where `PID <X>` is the PID of the process that ran the inline post-install telemetry from the binary's `install` command. The lock contender is a *second* concurrent invocation of the binary that the user did not explicitly start.

### Output / Error Messages

Sequence observed during a clean install on Fedora 42 (timestamps abbreviated):

```
18:43:24  [loader] Running binary install...
18:43:25  [binary] systemd user timer configuration completed successfully
18:43:25  [binary]   Service: ~/.config/systemd/user/stepsecurity-dev-machine-guard.service
18:43:25  [binary]   Timer:   ~/.config/systemd/user/stepsecurity-dev-machine-guard.timer
18:43:25  [binary] Installation complete!
18:43:25  [binary] Sending initial telemetry...
18:43:25  [binary] Lock acquired (PID: 76249)         <-- inline post-install telemetry
18:43:25  [error]  run-status[failed]: HTTP 400 (terminal, no retry)
18:43:25  [error]  acquiring lock: another instance is already running (PID 76249)   <-- racing process
18:43:31  [binary] Telemetry collection completed successfully
18:43:31  [binary] Lock released (PID: 76249)
```

### Root cause (suspected)

The Linux install path enables and **immediately starts** the timer, then runs initial telemetry inline:

- `internal/systemd/systemd.go:81` — `systemctl --user enable --now stepsecurity-dev-machine-guard.timer`
- Timer unit (same file, ~lines 181-183):
  ```ini
  OnBootSec=5min
  Persistent=true
  ```
- `cmd/stepsecurity-dev-machine-guard/main.go:132-137` — after the `systemd.Install()` call returns, the binary calls `telemetry.Run(...)` inline.

On any host whose uptime exceeds 5 minutes (effectively every install in the wild), the `OnBootSec=5min` elapse point is already in the past when `enable --now` starts the timer, so systemd elapses the timer at once and starts the service. `Persistent=true` is probably not the trigger here: per `systemd.timer(5)` it only affects `OnCalendar=` timers, while an `OnBootSec=` timer that is already in the past on activation elapses immediately regardless.

That timer-triggered service runs `send-telemetry` and tries to acquire the singleton lock at the same moment as the inline `telemetry.Run()` in `main.go`. The two race, and the loser logs `acquiring lock: another instance is already running` and exits non-zero. In the runs above the loser was the systemd-launched process, since the inline call started a fraction of a second earlier.

This appears to be Linux-specific. The macOS path (`launchd.Install`) and Windows path (`schtasks.Install`) presumably don't have an equivalent "fire immediately on register" behavior, hence no equivalent race.

### Suggested fixes (any one is sufficient)

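1. **Drop `--now` from the `enable` call** on Linux (`systemd.go:81`) so the inline `telemetry.Run()` in `main.go:132-137` is the only initial run. Caveat: without `--now` the timer is enabled but not started, so it only begins ticking once the user manager next activates it (for example at the next login or boot), not within the current session.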
2. **Skip the inline `telemetry.Run()` on Linux** in the `install` case in `main.go`, and rely on `enable --now` to trigger the first scan via the timer.
3. **Hold the singleton lock around the entire `install` command** so the timer-triggered service blocks until the inline run releases it (rather than failing fast with "another instance is already running"). This also fixes the misleading error in `agent.error.log`.

Option 1 is the smallest, most local change.

### Additional Context

- Reproduced on a fresh Fedora 42 VM with no prior agent state (after `rm -rf ~/.stepsecurity` and removing `~/.config/systemd/user/stepsecurity-*`).
- Reproduces both when the loader is invoked as the target user and as `sudo` (with the loader correctly dropping privileges via `runuser` + `XDG_RUNTIME_DIR`/`DBUS_SESSION_BUS_ADDRESS`).
- Telemetry uploads still succeed end-to-end (HTTP 200), so this is a "correctness of error log" issue rather than a functional install failure — but the persistent `[error] acquiring lock: another instance is already running` message in `agent.error.log` is alarming for operators reviewing logs and looks like a real concurrency bug rather than a self-inflicted race.
- Separately worth noting (not the same bug, but related symptom amplifier): the Linux loader template that's served from the dashboard runs `binary install` and then *also* runs `binary send-telemetry` immediately after, which produces a third back-to-back telemetry invocation for every install. With the timer-fires-immediately behavior above, that's effectively three telemetry runs queued up at install time. Worth removing the redundant `send-telemetry` from the loader template, but the race in this issue is reproducible without the loader's extra call too.
