Overview
Nekotopia operates a monitoring stack to keep services reliable, tune performance, and detect issues early. Our telemetry is built on widely used open-source tools, centred on Prometheus and Grafana, and gives near-real-time visibility into network health and VPN service performance.
Monitoring Stack
Core Components
| Component | Purpose | Description |
|---|---|---|
| Prometheus | Metrics Database | Time-series collection and storage with powerful query language (PromQL) |
| Grafana | Visualisation | Dashboards, alerting, and data exploration |
| MKTXP | MikroTik Exporter | Exports RouterOS metrics (interfaces, queues, firewall, wireless) |
| Node Exporter | Host Metrics | CPU, memory, disk, network stats from Linux servers |
| cAdvisor | Container Metrics | Docker container resource usage and performance |
| pmacct | Flow Analysis | NetFlow/IPFIX traffic accounting and analysis |
Architecture
The monitoring system follows a pull-based collection model: Prometheus scrapes each exporter over HTTP, stores the results centrally, and Grafana visualises them.
Data Collection Flow
1. Data Sources generate metrics:
- MikroTik Router (network stats, queues, firewall)
- Linux Hosts (system resources)
- Docker Containers (application metrics)
2. Exporters expose metrics in Prometheus format:
- MKTXP → MikroTik metrics on :9436
- Node Exporter → Host metrics on :9100
- cAdvisor → Container metrics on :8080
3. Prometheus scrapes all exporters every 15 seconds and stores 30 days of time-series data.
4. Grafana queries Prometheus and displays dashboards for operators.
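To illustrate step 2, the sketch below parses a small sample of the Prometheus text exposition format, which is what exporters serve over HTTP for Prometheus to scrape. It is a simplified parser written for this page, not the code Prometheus itself uses: lines carrying timestamps are skipped and escaped quotes in label values are not handled. The function name `parse_exposition` and the sample text are illustrative assumptions.

```python
import re

# Simplified pattern for one sample line: metric name, optional {labels}, value.
# Assumption: no timestamps and no escaped quotes inside label values.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip lines this sketch does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hand-written sample in the style of Node Exporter output (values are made up).
EXAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_load1 0.42
"""

for name, labels, value in parse_exposition(EXAMPLE):
    print(name, labels, value)
```

Each exporter listed above serves text of this shape, which is why a single scraper can collect router, host, and container metrics in the same way.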
Visual Flow
╔═══════════════════════════════════════════════════════════════════╗
║                           DATA SOURCES                            ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                   ║
║       ┌─────────────┐    ┌─────────────┐    ┌─────────────┐       ║
║       │  MikroTik   │    │    Linux    │    │   Docker    │       ║
║       │   Router    │    │    Hosts    │    │ Containers  │       ║
║       └──────┬──────┘    └──────┬──────┘    └──────┬──────┘       ║
║              │                  │                  │              ║
║              ▼                  ▼                  ▼              ║
║       ┌─────────────┐    ┌─────────────┐    ┌─────────────┐       ║
║       │    MKTXP    │    │    Node     │    │  cAdvisor   │       ║
║       │    :9436    │    │    :9100    │    │    :8080    │       ║
║       └──────┬──────┘    └──────┬──────┘    └──────┬──────┘       ║
╚══════════════╪══════════════════╪══════════════════╪══════════════╝
               │                  │                  │
               └──────────────────┼──────────────────┘
                                  │
                                  ▼
                  ╔═══════════════════════════════╗
                  ║          PROMETHEUS           ║
                  ║      (Metrics Database)       ║
                  ║                               ║
                  ║  • Scrapes every 15 sec       ║
                  ║  • 30 day retention           ║
                  ║  • Alerting rules             ║
                  ╚═══════════════╤═══════════════╝
                                  │
                                  ▼
                  ╔═══════════════════════════════╗
                  ║            GRAFANA            ║
                  ║         (Dashboards)          ║
                  ║                               ║
                  ║  • VPN Statistics             ║
                  ║  • Bandwidth Graphs           ║
                  ║  • System Health              ║
                  ╚═══════════════════════════════╝
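The 15-second scrape interval and 30-day retention together determine how many samples Prometheus keeps per series. The sketch below works through that arithmetic. The series count of 10,000 is a hypothetical figure chosen for illustration, not a measurement of this deployment, and the 2 bytes per sample is an assumption based on the upper end of the 1 to 2 bytes per sample that the Prometheus storage documentation quotes as a typical average.

```python
SCRAPE_INTERVAL_S = 15   # from the scrape configuration described above
RETENTION_DAYS = 30      # from the retention described above


def samples_per_series(interval_s, retention_days):
    """Number of samples one series accumulates over the retention window."""
    return retention_days * 24 * 60 * 60 // interval_s


def estimated_bytes(series_count, interval_s, retention_days, bytes_per_sample):
    """Rough on-disk estimate; ignores index and write-ahead-log overhead."""
    return series_count * samples_per_series(interval_s, retention_days) * bytes_per_sample


per_series = samples_per_series(SCRAPE_INTERVAL_S, RETENTION_DAYS)
print(per_series)  # 172800 samples per series

# Hypothetical: 10,000 active series at an assumed 2 bytes per sample.
total = estimated_bytes(10_000, SCRAPE_INTERVAL_S, RETENTION_DAYS, 2)
print(total / 1e9)  # roughly 3.46 GB
```

The estimate scales linearly with the number of active series, so adding high-cardinality labels has a direct cost in storage.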
NetFlow Traffic Analysis
For detailed traffic analysis, NetFlow data follows a separate path:
┌─────────────┐           ┌─────────────┐           ┌─────────────┐
│  MikroTik   │  NetFlow  │   pmacct    │  metrics  │ Prometheus  │
│   Router    │  ──────▶  │  Collector  │  ──────▶  │             │
│             │    v9     │             │           │             │
└─────────────┘           └──────┬──────┘           └─────────────┘
                                 │
                                 ▼
                          ┌─────────────┐
                          │   Grafana   │
                          │   Traffic   │
                          │  Analysis   │
                          └─────────────┘
Key Metrics Collected
Network Metrics
| Metric | Description |
|---|---|
| Interface Throughput | Bytes in/out per interface |
| Packet Rates | Packets per second, errors, drops |
| Queue Statistics | Per-user bandwidth enforcement |
| Firewall Counters | Rule hit counts, blocked connections |
| WireGuard Peers | Handshake status, data transfer |
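Interface throughput and packet rates are derived from monotonically increasing counters rather than stored as rates. The sketch below shows the basic idea: sum the increases between consecutive samples, treat any decrease as a counter reset, and divide by the elapsed time. It is a simplified stand-in for what PromQL's `rate()` does (it performs no extrapolation at the window edges), and the function name and sample values are illustrative assumptions.

```python
def counter_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    A drop in value is treated as a reset to zero (for example after a reboot).
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            increase += cur  # counter restarted from zero
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Made-up byte counter scraped every 15 seconds.
rx_bytes = [(0, 1_000), (15, 4_000), (30, 7_000)]
bytes_per_second = counter_rate(rx_bytes)
print(bytes_per_second * 8)  # throughput in bits per second
```

Handling resets matters in practice: without it, a router reboot would show up as a large negative spike in the bandwidth graphs.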
System Metrics
| Metric | Description |
|---|---|
| CPU/Memory | Server and router resource usage |
| Disk I/O | Storage performance |
| Container Health | Docker service status |
| Process Monitoring | Critical service uptime |
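Prometheus records an `up` series for every scrape target, with value 1 when the last scrape succeeded and 0 when it failed, which gives a simple health signal for the exporters themselves. The sketch below builds a request URL for the Prometheus HTTP API's instant-query endpoint and extracts the failing targets from a response. The base address `http://prometheus.internal:9090`, the instance names, and the response payload are hypothetical placeholders; only the `/api/v1/query` path and the response shape follow the documented API.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def down_instances(response):
    """List the `instance` label of every series in an instant-query response."""
    if response.get("status") != "success":
        return []
    return [series["metric"].get("instance", "unknown")
            for series in response["data"]["result"]]


# Hypothetical address; substitute the real internal monitoring endpoint.
URL = instant_query_url("http://prometheus.internal:9090", "up == 0")

# Hand-written payload in the shape the API returns, for illustration only.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "10.0.0.5:9100", "job": "node"},
             "value": [1700000000.0, "0"]},
        ],
    },
}

print(URL)
print(down_instances(SAMPLE_RESPONSE))
```

The same query can be used in a Grafana panel or an alerting rule; the Python wrapper is only there to make the request and response shapes concrete.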
Traffic Analysis
| Metric | Description |
|---|---|
| Flow Records | Source/destination IP, ports, protocols |
| User Bandwidth | Per-VPN-peer traffic accounting |
| Top Talkers | Highest bandwidth consumers |
| Protocol Distribution | Traffic breakdown by application |
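The "Top Talkers" view can be understood as a simple aggregation over flow records: sum the bytes per source address and keep the largest totals. The sketch below shows that aggregation on a few made-up records. The field names `src_ip` and `bytes` and the addresses are assumptions chosen for illustration and may not match the field names pmacct emits in this deployment.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Return the n source addresses with the most bytes, largest first."""
    totals = Counter()
    for flow in flows:
        totals[flow["src_ip"]] += flow["bytes"]
    return totals.most_common(n)


# Made-up flow records; real records would also carry ports and protocol.
FLOWS = [
    {"src_ip": "10.8.0.2", "dst_ip": "192.0.2.10", "bytes": 5_000},
    {"src_ip": "10.8.0.3", "dst_ip": "192.0.2.20", "bytes": 1_200},
    {"src_ip": "10.8.0.2", "dst_ip": "192.0.2.30", "bytes": 2_500},
    {"src_ip": "10.8.0.4", "dst_ip": "192.0.2.10", "bytes": 300},
]

for address, total_bytes in top_talkers(FLOWS, n=2):
    print(address, total_bytes)
```

Grouping by a different key, such as protocol or destination port, gives the protocol distribution in the same way.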
Access
Grafana dashboards are available to administrators at the internal monitoring endpoint. User-facing statistics are exposed through the Nekotopia dashboard where appropriate.