Overview

Nekotopia operates ~~a comprehensive~~ monitoring infrastructure to ensure service reliability, performance optimisation, and proactive issue detection. Our telemetry stack is built on industry-standard open-source ~~tools, providing real-time visibility into network and VPN service performance.~~ tools.

What We Monitor

We continuously monitor the health and performance of the Torus network:

Area	What We Track
Network Health	VPN tunnel status, bandwidth utilisation, packet loss
Service Availability	Core services uptime, API responsiveness
Resource Usage	Infrastructure capacity and headroom
Security Events	Unusual traffic patterns, connection anomalies

Monitoring Stack

Core Components

~~Component~~	~~Purpose~~	~~Description~~
~~Prometheus~~	~~Metrics Database~~	~~Time-series collection and storage with powerful query language (PromQL)~~
~~Grafana~~	~~Visualisation~~	~~Dashboards, alerting, and data exploration~~
~~MKTXP~~	~~MikroTik Exporter~~	~~Exports RouterOS metrics (interfaces, queues, firewall, wireless)~~
~~Node Exporter~~	~~Host Metrics~~	~~CPU, memory, disk, network stats from Linux servers~~
~~cAdvisor~~	~~Container Metrics~~	~~Docker container resource usage and performance~~
~~pmacct~~	~~Flow Analysis~~	~~NetFlow/IPFIX accounting and analysis~~

Architecture

~~The monitoring system follows a pull-based collection model with centralised storage and visualisation.~~

Data Collection Flow

~~1. Data Sources generate metrics:~~

~~MikroTik Router (network stats, queues, firewall)~~

~~Linux Hosts (system resources)~~

~~Docker Containers (application metrics)~~

Alerting

Automated alerts notify administrators of:

Service degradation or outages

Capacity thresholds approaching limits

Security-relevant events

Infrastructure component failures
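As an illustration of the "capacity thresholds approaching limits" alert, the following is a minimal sketch of a threshold check. The resource names, the sample readings, and the 80% warning ratio are all assumptions made for the example; they are not Nekotopia's actual alert rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityReading:
    """One capacity measurement. Resource names and units are illustrative."""
    resource: str
    used: float
    limit: float


def approaching_limit(reading: CapacityReading, warn_ratio: float = 0.8) -> bool:
    """Return True once usage reaches the warning fraction of the limit."""
    if reading.limit <= 0:
        raise ValueError("limit must be positive")
    return reading.used / reading.limit >= warn_ratio


def alerts_for(readings, warn_ratio: float = 0.8):
    """Build one human-readable alert line per reading that is near its limit."""
    return [
        f"{r.resource}: {r.used / r.limit:.0%} of capacity"
        for r in readings
        if approaching_limit(r, warn_ratio)
    ]


# Hypothetical readings: an uplink at 85% and a peer table at 16%.
readings = [
    CapacityReading("uplink_mbps", used=850.0, limit=1000.0),
    CapacityReading("vpn_peers", used=40.0, limit=250.0),
]
print(alerts_for(readings))  # ['uplink_mbps: 85% of capacity']
```

Only the uplink reading crosses the assumed 80% ratio, so it is the only one reported. A real deployment would more likely express this as an alerting rule in the monitoring system rather than in application code.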

Dashboards

Real-time dashboards provide visibility into:

Overall network health status

Active VPN connections

Bandwidth consumption by tier

Historical performance trends

User-Facing Metrics

Some metrics are exposed to users through the dashboard:

Your connection status and uptime

Your bandwidth usage

Network latency to the hub

~~2. Exporters expose metrics in Prometheus format:~~

MKTXP ~~→ MikroTik metrics on~~ :9436

Node Exporter ~~→ Host metrics on~~ :9100

cAdvisor ~~→ Container metrics on~~ :8080

~~3. Prometheus~~ ~~scrapes all exporters every 15 seconds and stores 30 days of time-series data.~~

~~4. Grafana~~ ~~queries Prometheus and displays dashboards for operators.~~

Visual Flow

╔═══════════════════════════════════════════════════════════════════╗
║                        DATA SOURCES                               ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                   ║
║   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          ║
║   │  MikroTik   │    │   Linux     │    │   Docker    │          ║
║   │   Router    │    │   Hosts     │    │ Containers  │          ║
║   └──────┬──────┘    └──────┬──────┘    └──────┬──────┘          ║
║          │                  │                  │                 ║
║          ▼                  ▼                  ▼                 ║
║   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐          ║
║   │    MKTXP    │    │    Node     │    │  cAdvisor   │          ║
║   │  :9436      │    │  :9100      │    │  :8080      │          ║
║   └──────┬──────┘    └──────┬──────┘    └──────┬──────┘          ║
╚══════════╪══════════════════╪══════════════════╪══════════════════╝
           │                  │                  │
           └──────────────────┼──────────────────┘
                              │
                              ▼
              ╔═══════════════════════════════╗
              ║        PROMETHEUS             ║
              ║      (Metrics Database)       ║
              ║                               ║
              ║  • Scrapes every 15 sec       ║
              ║  • 30 day retention           ║
              ║  • Alerting rules             ║
              ╚═══════════════╤═══════════════╝
                              │
                              ▼
              ╔═══════════════════════════════╗
              ║          GRAFANA              ║
              ║        (Dashboards)           ║
              ║                               ║
              ║  • VPN Statistics             ║
              ║  • Bandwidth Graphs           ║
              ║  • System Health              ║
              ╚═══════════════════════════════╝
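The numbers in the diagram can be made concrete with a short sketch. The ports, the 15-second scrape interval, and the 30-day retention come from the diagram; the hostnames are hypothetical placeholders, and the `/metrics` path is assumed to be the conventional Prometheus endpoint on each exporter.

```python
SCRAPE_INTERVAL_SECONDS = 15  # "Scrapes every 15 sec" in the diagram
RETENTION_DAYS = 30           # "30 day retention" in the diagram

# Ports are taken from the diagram; hostnames are hypothetical.
EXPORTERS = {
    "mktxp": ("router-exporter.example.internal", 9436),
    "node": ("host.example.internal", 9100),
    "cadvisor": ("docker.example.internal", 8080),
}


def scrape_urls(exporters):
    """Build the URL a pull-based collector would request for each exporter."""
    return {
        job: f"http://{host}:{port}/metrics"
        for job, (host, port) in exporters.items()
    }


def samples_per_series(interval_seconds: int, retention_days: int) -> int:
    """Number of samples kept for one time series over the retention window."""
    return retention_days * 24 * 60 * 60 // interval_seconds


for job, url in scrape_urls(EXPORTERS).items():
    print(job, url)

print(samples_per_series(SCRAPE_INTERVAL_SECONDS, RETENTION_DAYS))  # 172800
```

At one sample every 15 seconds, each series accumulates 172,800 samples over 30 days, which gives a rough sense of how storage scales with the number of exported series.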

NetFlow Traffic Analysis

For more details on what data we collect about users, see Data Collection.

~~For detailed traffic analysis, NetFlow data follows a separate path:~~

┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│  MikroTik   │ NetFlow │   pmacct    │ metrics │ Prometheus  │
│   Router    │ ──────▶ │  Collector  │ ──────▶ │             │
│             │   v9    │             │         │             │
└─────────────┘         └──────┬──────┘         └─────────────┘
                               │
                               ▼
                        ┌─────────────┐
                        │   Grafana   │
                        │  Traffic    │
                        │  Analysis   │
                        └─────────────┘
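To show the kind of aggregation a flow collector performs before handing metrics on, here is a minimal sketch that totals bytes per source address and ranks the heaviest senders. It does not parse NetFlow v9 and is not pmacct's API; the `FlowRecord` fields and the sample addresses (from documentation and private ranges) are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FlowRecord(NamedTuple):
    """Simplified flow record; real NetFlow records carry many more fields."""
    src_ip: str
    dst_ip: str
    protocol: str
    byte_count: int


def top_talkers(flows: Iterable[FlowRecord], n: int = 3):
    """Sum bytes per source address and return the n largest senders."""
    totals = Counter()
    for flow in flows:
        totals[flow.src_ip] += flow.byte_count
    return totals.most_common(n)


# Hypothetical flows: two from one peer, one from another.
flows = [
    FlowRecord("10.0.0.2", "198.51.100.7", "tcp", 1200),
    FlowRecord("10.0.0.3", "198.51.100.9", "udp", 300),
    FlowRecord("10.0.0.2", "203.0.113.5", "tcp", 800),
]
print(top_talkers(flows, n=2))  # [('10.0.0.2', 2000), ('10.0.0.3', 300)]
```

Grouping by `protocol` instead of `src_ip` would give a protocol breakdown in the same way.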

Key Metrics Collected

Network Metrics

~~Metric~~	~~Description~~
~~Interface Throughput~~	~~Bytes in/out per interface~~
~~Packet Rates~~	~~Packets per second, errors, drops~~
~~Queue Statistics~~	~~Per-user bandwidth enforcement~~
~~Firewall Counters~~	~~Rule hit counts, blocked connections~~
~~WireGuard Peers~~	~~Handshake status, data transfer~~

System Metrics

~~Metric~~	~~Description~~
~~CPU/Memory~~	~~Server and router resource usage~~
~~Disk I/O~~	~~Storage performance~~
~~Container Health~~	~~Docker service status~~
~~Process Monitoring~~	~~Critical service uptime~~

Traffic Analysis

~~Metric~~	~~Description~~
~~Flow Records~~	~~Source/destination IP, ports, protocols~~
~~User Bandwidth~~	~~Per-VPN-peer traffic accounting~~
~~Top Talkers~~	~~Highest bandwidth consumers~~
~~Protocol Distribution~~	~~Traffic breakdown by application~~

Access

~~Grafana dashboards are available to administrators at the internal monitoring endpoint. User-facing statistics are exposed through the Nekotopia dashboard where appropriate.~~