Monitoring Dashboard Overview
Who this is for
All users who want to understand what the Monitoring dashboard shows and how to navigate it.
What you will complete
Learn how the Monitoring dashboard is organized, what each panel shows, and how to use it as your daily infrastructure health check.
Before you begin
- At least one server must be connected and in a Running state.
- The monitoring agent must be installed on the server (this happens automatically during provisioning).
- Navigate to Monitoring in the left sidebar.
Overview
The Monitoring dashboard is your near-real-time view of infrastructure health across your entire fleet. It consolidates CPU, memory, disk, network, and load metrics from every connected server into one screen, alongside active alerts and recent events. The monitoring agent on each server reports metrics every 60 seconds.
The two main views
View 1: Fleet overview
The default view shows a summary card for each server in your organization. Each card displays:
- Server name and provider — the server's hostname and cloud provider (AWS, GCP, Azure, DigitalOcean)
- Current CPU % — real-time CPU utilization
- Current RAM % — memory usage as a percentage of total available
- Disk % — current disk utilization
- Status indicator — a colored dot showing health: green (healthy), amber (warning), red (critical), grey (offline or no data)
- Last updated — when metrics were last received
Use the fleet overview to scan for anomalies across all servers at a glance.
View 2: Per-server detail
Click any server card to open its detailed monitoring view. This shows:
- Historical metric charts — CPU, RAM, disk, and network throughput over selectable time ranges (1 hour, 6 hours, 24 hours, 7 days)
- Load average — 1-minute, 5-minute, and 15-minute load averages
- Network I/O — inbound and outbound throughput in Kbps
- Active alerts for this server — any currently firing alert rules
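Load averages are usually easier to interpret relative to the number of CPU cores on the server. The following is a minimal sketch, not part of the product: the helper name load_per_core is hypothetical, and it assumes you run it on the server itself over SSH. A per-core value that stays well above 1.0 generally suggests more work is queued than the cores can handle.

```python
import os


def load_per_core(load_averages, cpu_count):
    """Divide the 1-, 5-, and 15-minute load averages by the CPU core count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(round(load / cpu_count, 2) for load in load_averages)


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one, five, fifteen = load_per_core(os.getloadavg(), os.cpu_count() or 1)
    print(f"load per core: 1m={one} 5m={five} 15m={fifteen}")
```

For example, a 1-minute load average of 4.0 on a 4-core server works out to 1.0 per core, which means the cores are roughly fully occupied.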
The Alerts tab
The Monitoring page includes an Alerts tab, which you can also reach directly from Alerts in the sidebar. The tab shows:
- Alert rules — all configured threshold rules, with their current status (Active, Firing, Paused)
- Alert events — history of when each rule fired, resolved, or was snoozed
See KB-06-03 for how to create alert rules.
Time range selector
On the per-server detail view, use the time range selector to adjust the chart window:
- 1h — last hour (highest resolution, 1-minute intervals)
- 6h — last 6 hours
- 24h — last 24 hours (default)
- 7d — last 7 days (daily averages)
Longer time ranges use aggregated data. For diagnosing a specific incident, use the 1h or 6h view for the best resolution.
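To illustrate why aggregated ranges can hide short events, here is a minimal sketch of downsampling by averaging. The function downsample_mean and the sample values are hypothetical, and the dashboard's actual aggregation method is not documented here. The sketch only assumes that each aggregated point is the average of the underlying 1-minute samples.

```python
def downsample_mean(samples, bucket_size):
    """Average consecutive samples in fixed-size buckets (last bucket may be shorter)."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    buckets = (samples[i:i + bucket_size] for i in range(0, len(samples), bucket_size))
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical CPU % at 1-minute intervals for one hour:
# steady 20% with a single 5-minute spike to 100%.
cpu_1m = [20] * 25 + [100] * 5 + [20] * 30

per_10m = downsample_mean(cpu_1m, 10)  # six 10-minute averages
per_hour = downsample_mean(cpu_1m, 60)  # one hourly average

print(max(cpu_1m))            # 100: the spike is fully visible at 1-minute resolution
print(max(per_10m))           # 60.0: the spike is diluted by the quiet minutes around it
print(round(per_hour[0], 1))  # 26.7: the spike is barely noticeable
```

The same data looks progressively calmer as the bucket grows, which is why the shorter views are the better choice when investigating a specific incident.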
Filtering and sorting
On the fleet overview, you can filter servers by:
- Cloud provider (AWS, GCP, Azure, DigitalOcean)
- Cloud account
- Status (healthy, warning, critical, offline)
Use filters when managing a large fleet to focus on the servers that need attention.
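The same triage logic can be expressed in code. The sketch below is illustrative only: the ServerCard class, its field names, the severity ordering, and the sample data are all hypothetical and do not represent a documented export format or API. It filters cards by provider and status, then sorts the most severe first.

```python
from dataclasses import dataclass

# Assumed ordering for illustration: critical first, healthy last.
SEVERITY = {"critical": 0, "warning": 1, "offline": 2, "healthy": 3}


@dataclass
class ServerCard:
    name: str
    provider: str
    status: str
    cpu_pct: float


def filter_cards(cards, provider=None, statuses=None):
    """Return matching cards, most severe first, then highest CPU first."""
    selected = [
        card for card in cards
        if (provider is None or card.provider == provider)
        and (statuses is None or card.status in statuses)
    ]
    return sorted(selected, key=lambda card: (SEVERITY[card.status], -card.cpu_pct))


# Hypothetical fleet data.
cards = [
    ServerCard("web-1", "AWS", "healthy", 35.0),
    ServerCard("web-2", "AWS", "critical", 97.0),
    ServerCard("db-1", "GCP", "warning", 82.0),
    ServerCard("cache-1", "AWS", "warning", 75.0),
]

needs_attention = filter_cards(cards, provider="AWS", statuses={"warning", "critical"})
print([card.name for card in needs_attention])  # ['web-2', 'cache-1']
```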
What success looks like
- All server cards on the fleet overview show green status indicators.
- No alert events appear in the firing state.
- Metric charts show stable, expected values for your workload.
Common errors and fixes
"Server card shows grey / no data"
- Cause: The monitoring agent is not sending data. The server may be offline, or the agent may not be installed.
- Fix: Check the server's status on the Servers page. If the server is running, see KB-06-10 for non-SSH monitoring.
"Metrics appear to be delayed or stuck"
- Cause: The monitoring agent sends metrics every 60 seconds, so the dashboard can trail live values by up to about a minute, and spikes shorter than 60 seconds may not appear.
- Fix: No action is needed; this is expected behavior. The monitoring system captures samples at 60-second intervals. For sub-minute visibility, connect to the server over SSH and run top or htop directly.
"I can see metrics but no alert rules are shown"
- Cause: No alert rules have been created for this server yet.
- Fix: Go to the Alerts tab and create rules. See KB-06-03.