QNX-Based SAT Monitoring System for EAST Poloidal Field Power Supply
This paper presents the design and implementation of a SAT (System Analysis Toolkit) monitoring system on QNX 6.20 for the EAST tokamak poloidal field power supply. The system enables non-intrusive, real-time observation of kernel events, thread states, interrupts, and message passing, providing developers with deep visibility into system behavior for fault diagnosis and optimization.
🔍 Overview of SAT
The QNX System Analysis Toolkit captures kernel-level events with minimal overhead: the instrumented kernel retains more than 98% of normal kernel throughput. Key components include:
- Kernel Buffers: Circular buffers (1024 slots × 16 bytes) storing timestamped events.
- Data Interceptors and Translators: Capture and preprocess events for real-time or offline analysis.
- Filters: Reduce data volume while preserving critical events.
- Multi-Process Architecture: Ensures efficient, concurrent data collection and processing.
Workflow:
- Kernel intercepts events and stores them in buffers.
- When a buffer reaches its high-water mark (~70% of capacity), the TraceLog process is triggered.
- TraceLog processes events, applies filters, and forwards data to GUI or storage.
- Remote users access data via Photon microGUI over LAN.
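The buffering step of this workflow can be sketched as a ring buffer with a high-water mark. This is a minimal illustration, not the kernel's actual implementation: the `trace_event_t` field layout, the function names, and the overwrite-oldest policy on overflow are all assumptions, and the sketch is single-threaded (no locking).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS      1024u                        /* slots per buffer, as described above */
#define HIGH_WATER_MARK ((RING_SLOTS * 70u) / 100u)  /* ~70% of capacity = 716 slots */

/* Hypothetical 16-byte event record; the real SAT layout may differ. */
typedef struct {
    uint32_t header;     /* event class and type */
    uint32_t timestamp;  /* clock ticks at capture time */
    uint32_t data[2];    /* event payload */
} trace_event_t;

_Static_assert(sizeof(trace_event_t) == 16, "event record must be 16 bytes");

typedef struct {
    trace_event_t slots[RING_SLOTS];
    size_t head;   /* next slot to write */
    size_t tail;   /* next slot to read */
    size_t count;  /* slots currently in use */
} ring_t;

/* Store one event. Returns true once the buffer has reached the
 * high-water mark, meaning the consumer should be woken.
 * Assumed policy: when the buffer is full, the oldest event is overwritten. */
static bool ring_put(ring_t *r, trace_event_t ev)
{
    r->slots[r->head] = ev;
    r->head = (r->head + 1) % RING_SLOTS;
    if (r->count == RING_SLOTS)
        r->tail = (r->tail + 1) % RING_SLOTS;  /* drop the oldest event */
    else
        r->count++;
    return r->count >= HIGH_WATER_MARK;
}

/* Copy up to max events into out, oldest first; returns the number copied. */
static size_t ring_drain(ring_t *r, trace_event_t *out, size_t max)
{
    size_t n = 0;
    assert(out != NULL);
    while (r->count > 0 && n < max) {
        out[n++] = r->slots[r->tail];
        r->tail = (r->tail + 1) % RING_SLOTS;
        r->count--;
    }
    return n;
}
```

A producer would call `ring_put()` for each event and wake the consumer when it returns true; the consumer then calls `ring_drain()` to empty the buffer before it overflows. Triggering at 70% rather than 100% leaves headroom for events that arrive while the consumer is being scheduled.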
⚙️ Software Architecture
The monitoring system uses a multi-process, multi-thread design:
- Init Process: Creates the shared memory region, initializes the system, and launches the monitoring processes.
- Monitor Process: Tracks application processes and threads and detects dead or zombie states.
- TraceLog Process: Core engine that configures filters, captures kernel events, performs analysis, and dispatches data.
- Com Process: Sends monitoring data to the GUI over UDP every 20 ms.
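As an illustration of the Monitor Process's liveness check, the sketch below uses the portable POSIX idiom of calling `kill()` with signal 0. The source does not say how the original system detects dead processes, so this technique is an assumption, and the names `process_exists`, `monitor_scan`, and `proc_state_t` are hypothetical. Note that a zombie still has a process table entry and is reported as existing here; telling zombies apart needs OS-specific process information.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

typedef enum { PROC_ALIVE, PROC_DEAD } proc_state_t;

/* Returns true if a process with this PID exists. Signal 0 makes kill()
 * run its existence and permission checks without delivering a signal. */
static bool process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM;  /* process exists, but we may not signal it */
}

/* Check every watched PID, record its state, and return how many are dead. */
static size_t monitor_scan(const pid_t *pids, proc_state_t *states, size_t n)
{
    size_t dead = 0;
    for (size_t i = 0; i < n; i++) {
        states[i] = process_exists(pids[i]) ? PROC_ALIVE : PROC_DEAD;
        if (states[i] == PROC_DEAD)
            dead++;
    }
    return dead;
}
```

A monitor loop would call `monitor_scan()` periodically over the table of registered PIDs and raise an alert whenever the return value is nonzero.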
Shared memory provides low-latency, high-bandwidth data exchange between these processes, and priority-based access control prevents race conditions on the shared region.
Example: Shared Memory Setup
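/* Needs <fcntl.h>, <sys/mman.h>, <unistd.h>, <stdio.h>, <stdlib.h>. */
shd = shm_open(SM_NM, O_RDWR | O_CREAT, 0777);  /* create or open the named object */
if (shd == -1 || ftruncate(shd, SHM_SIZE) == -1) { perror("shm setup"); exit(EXIT_FAILURE); }
struct_p = (subsat_t *)mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shd, 0);
if (struct_p == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }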
🛠 Data Filtering and Event Management
To handle high-volume kernel events, multiple filter strategies are implemented:
- Fast/Verbose Filter: Controls event detail density.
- Static Rule Filter: Selects specific event classes or PIDs.
- Dynamic Rule Filter: Conditional runtime filtering for flexibility.
- Post-Processing Filter: Offline analysis of all captured events.
Filters significantly reduce processing overhead while maintaining critical visibility.
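The static and dynamic rule filters can be sketched as follows. This is an illustrative model only: the event class bits, the `event_info_t` and `static_rule_t` types, and the callback signature are assumptions made for the example, not the SAT's actual interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical event classes, one bit each so they combine into a mask. */
enum {
    EV_CLASS_KERCALL   = 1u << 0,
    EV_CLASS_INTERRUPT = 1u << 1,
    EV_CLASS_THREAD    = 1u << 2,
    EV_CLASS_COMM      = 1u << 3
};

typedef struct {
    uint32_t ev_class;  /* exactly one EV_CLASS_* bit */
    pid_t    pid;       /* process that generated the event */
} event_info_t;

/* Static rule: fixed before tracing starts. */
typedef struct {
    uint32_t class_mask;  /* classes to keep */
    pid_t    pid;         /* PID to keep, or 0 for any PID */
} static_rule_t;

/* Dynamic rule: a predicate evaluated per event at run time. */
typedef bool (*dynamic_rule_fn)(const event_info_t *ev, void *ctx);

static bool static_rule_match(const static_rule_t *rule, const event_info_t *ev)
{
    if ((ev->ev_class & rule->class_mask) == 0)
        return false;
    return rule->pid == 0 || rule->pid == ev->pid;
}

/* Example dynamic rule: keep events whose PID is at least *(pid_t *)ctx. */
static bool pid_at_least(const event_info_t *ev, void *ctx)
{
    return ev->pid >= *(const pid_t *)ctx;
}

/* Copy events that pass the static rule and, if given, the dynamic rule.
 * Returns the number of events written to out. */
static size_t filter_events(const event_info_t *in, size_t n, event_info_t *out,
                            const static_rule_t *rule, dynamic_rule_fn dyn, void *ctx)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!static_rule_match(rule, &in[i]))
            continue;
        if (dyn != NULL && !dyn(&in[i], ctx))
            continue;
        out[kept++] = in[i];
    }
    return kept;
}
```

Applying the cheap mask test first means most unwanted events are rejected with a single bitwise AND before any callback runs, which is the kind of ordering that keeps filter overhead low under high event rates.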
📡 Network Communication and GUI
- UDP: Connectionless, non-blocking transmission keeps latency low.
- Photon microGUI: Provides remote monitoring, real-time event display, and alerts for fault conditions.
- LAN Scope: Any host or process on the LAN can be monitored.
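A non-blocking UDP sender with a 20 ms period, as used for GUI updates, might look like the sketch below. The function names, the drop-on-congestion policy, and the use of a simple relative sleep are assumptions for illustration; the original Com Process may be structured differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Open a UDP socket in non-blocking mode. Returns the descriptor or -1. */
static int udp_open_nonblocking(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1)
        return -1;
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one datagram to ip:port. Returns bytes sent, 0 if the socket would
 * block (the snapshot is dropped rather than delaying the caller), or -1. */
static ssize_t udp_send_snapshot(int fd, const char *ip, unsigned short port,
                                 const void *buf, size_t len)
{
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;  /* not a valid IPv4 address */
    ssize_t n = sendto(fd, buf, len, 0, (struct sockaddr *)&dst, sizeof dst);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return n;
}

/* Send the same buffer once per 20 ms period for the given number of cycles.
 * Returns how many datagrams were sent. A relative sleep is used for
 * brevity, so the period drifts by the time spent sending. */
static int com_run(int fd, const char *ip, unsigned short port,
                   const void *buf, size_t len, int cycles)
{
    const struct timespec period = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 };
    int sent = 0;
    for (int i = 0; i < cycles; i++) {
        if (udp_send_snapshot(fd, ip, port, buf, len) > 0)
            sent++;
        nanosleep(&period, NULL);
    }
    return sent;
}
```

Dropping a snapshot when the socket would block suits a display feed: the next update arrives 20 ms later, so a stale frame is worth less than keeping the sender on schedule.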
The system allows operators and developers to observe kernel and application-level events with minimal intrusion, preserving real-time control performance.
✅ Practical Results and Application
Applied to the EAST poloidal field (PF) power supply, the system:
- Captured real-time kernel events across multiple PF units.
- Detected hidden performance bottlenecks and anomalies.
- Enabled precise diagnostics for rare faults.
- Improved system maintainability and operational reliability.
The SAT system proved capable of continuous monitoring without impacting the deterministic control cycles of the QNX-based PF controllers.
🔮 Modern Perspective (2026)
Enhancements for current systems include:
- Upgrading to QNX SDP 8.x for multi-core scalability and safety-certified RTOS variants.
- Integration with the QNX Momentics System Profiler, advanced core dumps, and observability dashboards (Prometheus + Grafana).
- Containerized real-time processes and real-time analytics for centralized monitoring of distributed systems.
- Use of Time-Sensitive Networking (TSN, the IEEE 802.1 family of standards) for deterministic, low-latency telemetry.