Add project structure and roadmap documentation
- Created `project-structure.md` to outline the directory layout, crate dependencies, design principles, module guidelines, and naming conventions for the NxMesh codebase.
- Introduced `roadmap.md` detailing the development phases, milestones, tasks, deliverables, and resource requirements for the NxMesh project, spanning from foundational setup to enterprise features.
814
docs/features.md
Normal file
@@ -0,0 +1,814 @@
# NxMesh Feature Specification

## Table of Contents
1. [Core Features](#core-features)
2. [Master Features](#master-features)
3. [Agent Features](#agent-features)
4. [Configuration Management](#configuration-management)
5. [Observability](#observability)
6. [Security Features](#security-features)

---

## Core Features

### CF-001: Multi-tenancy with Organizations and Workspaces

**Description**: Support for multiple organizations with isolated workspaces within each organization.

**Requirements**:
- Organizations are top-level resource containers
- Each organization can have multiple workspaces
- Resources (agents, configs, certificates) are scoped to a workspace
- Cross-workspace visibility is configurable

**Data Model**:
```rust
struct Organization {
    id: Uuid,
    name: String,
    slug: String,           // URL-friendly identifier
    created_at: DateTime,
    settings: OrganizationSettings,
}

struct Workspace {
    id: Uuid,
    organization_id: Uuid,
    name: String,
    slug: String,
    created_at: DateTime,
}
```

**API Endpoints**:
- `GET /api/v1/organizations` - List organizations
- `POST /api/v1/organizations` - Create organization
- `GET /api/v1/organizations/{id}/workspaces` - List workspaces
- `POST /api/v1/organizations/{id}/workspaces` - Create workspace

---

### CF-002: Agent Registration and Lifecycle Management

**Description**: Agents must register with the master before receiving configurations.

**Registration Flow**:
1. Administrator generates bootstrap token in Master UI
2. Token is provided to agent via environment variable or config file
3. Agent establishes TLS connection to master (verifies server certificate)
4. Agent sends bootstrap token for registration
5. Master validates token and establishes shared secret:
   - Master generates session_key (per-agent) + key_id
   - Session key used for HMAC request signing
   - Primary/secondary key design for rotation

**Agent States**:
```rust
enum AgentState {
    Pending,      // Registered but never connected
    Online,       // Connected and healthy
    Offline,      // Disconnected
    Degraded,     // Connected but health checks failing
    Maintenance,  // Manually placed in maintenance mode
}
```

**Agent Metadata**:
```rust
struct Agent {
    id: Uuid,
    workspace_id: Uuid,
    name: String,
    hostname: String,
    ip_address: String,
    version: String,
    state: AgentState,
    deployment_mode: DeploymentMode,  // DockerSidecar, K8sSidecar, Standalone
    last_seen_at: DateTime,
    capabilities: Vec<String>,        // e.g., ["http3", "websocket", "rate_limiting"]
    labels: HashMap<String, String>,  // e.g., {"env": "prod", "region": "us-east"}
}
```

**API Endpoints**:
- `POST /api/v1/agents/register` - Register new agent
- `GET /api/v1/agents` - List agents
- `GET /api/v1/agents/{id}` - Get agent details
- `POST /api/v1/agents/{id}/tokens` - Generate registration token
- `DELETE /api/v1/agents/{id}` - Deregister agent

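The agent states listed above could be derived from a few per-agent observations. The following is a minimal sketch under stated assumptions: the `AgentObservation` struct, its field names, and the precedence order (maintenance first, then pending, offline, degraded) are hypothetical and are not defined by this specification.

```rust
/// Mirrors the `AgentState` enum from the spec above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Pending,
    Online,
    Offline,
    Degraded,
    Maintenance,
}

/// Hypothetical inputs the master might track per agent (assumption).
struct AgentObservation {
    ever_connected: bool,
    connected: bool,
    health_checks_passing: bool,
    maintenance_requested: bool,
}

/// Derives the state from the latest observation.
/// Assumed precedence: Maintenance > Pending > Offline > Degraded > Online.
fn derive_state(obs: &AgentObservation) -> AgentState {
    if obs.maintenance_requested {
        return AgentState::Maintenance;
    }
    if !obs.ever_connected {
        return AgentState::Pending;
    }
    if !obs.connected {
        return AgentState::Offline;
    }
    if obs.health_checks_passing {
        AgentState::Online
    } else {
        AgentState::Degraded
    }
}
```
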
---

### CF-003: Real-time Configuration Distribution

**Description**: Push configuration changes to agents in real-time with delivery guarantees.

**Requirements**:
- Config changes propagate to all affected agents within 5 seconds
- Support for targeted updates (specific agents or groups)
- Config versioning with rollback capability
- Delivery confirmation from agents

**Configuration Scope**:
```rust
enum ConfigScope {
    Global,              // All agents
    Workspace,           // All agents in workspace
    AgentGroup(String),  // Agents with specific label selector
    Agent(Uuid),         // Single agent
}
```

**Delivery Guarantees**:
- At-least-once delivery
- Automatic retry with exponential backoff
- Config checksum verification
- Offline agents receive updates on reconnection

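To illustrate how a configuration scope might be resolved to target agents, here is a hedged sketch. Several details are assumptions made for the example only: `u128` stands in for `Uuid` so the code needs no external crates, `AgentInfo` and `scope_matches` are hypothetical names, and the label selector is assumed to be a single `key=value` pair (the specification does not define the selector syntax).

```rust
use std::collections::HashMap;

/// Same variants as the `ConfigScope` enum above; `u128` replaces `Uuid`
/// so the sketch is std-only (assumption).
enum ConfigScope {
    Global,
    Workspace,
    AgentGroup(String),
    Agent(u128),
}

/// Hypothetical, trimmed-down view of an agent.
struct AgentInfo {
    id: u128,
    workspace_id: u128,
    labels: HashMap<String, String>,
}

/// Returns true when a config owned by `config_workspace` with the given
/// scope should be delivered to `agent`.
fn scope_matches(scope: &ConfigScope, config_workspace: u128, agent: &AgentInfo) -> bool {
    match scope {
        ConfigScope::Global => true,
        ConfigScope::Workspace => agent.workspace_id == config_workspace,
        // Assumed selector syntax: exactly one "key=value" pair.
        ConfigScope::AgentGroup(selector) => match selector.split_once('=') {
            Some((key, value)) => {
                agent.workspace_id == config_workspace
                    && agent.labels.get(key).map(String::as_str) == Some(value)
            }
            None => false,
        },
        ConfigScope::Agent(id) => agent.id == *id,
    }
}
```
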
---

## Master Features

### MF-001: RESTful API

**Description**: Comprehensive REST API for all operations.

**Base URL**: `/api/v1`

**Resource Endpoints**:

| Resource | Endpoints |
|----------|-----------|
| Organizations | GET, POST, PATCH, DELETE `/organizations` |
| Workspaces | GET, POST, PATCH, DELETE `/workspaces` |
| Agents | GET, POST, PATCH, DELETE `/agents` |
| VirtualHosts | GET, POST, PATCH, DELETE `/virtual-hosts` |
| Upstreams | GET, POST, PATCH, DELETE `/upstreams` |
| Certificates | GET, POST, DELETE `/certificates` |
| AccessLogs | GET `/access-logs` |
| Metrics | GET `/metrics` |

**Response Format**:
```json
{
  "data": { ... },
  "meta": {
    "page": 1,
    "per_page": 20,
    "total": 100
  },
  "links": {
    "self": "/api/v1/agents?page=1",
    "next": "/api/v1/agents?page=2",
    "prev": null
  }
}
```

**Error Format**:
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid configuration",
    "details": [
      {"field": "server_name", "message": "Invalid domain format"}
    ]
  }
}
```

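The `meta` and `links` objects in the response format above are related by simple arithmetic. The sketch below shows one way the links could be computed from the pagination metadata; `PageMeta`, `PageLinks`, and `page_links` are hypothetical names, and the `?page=N` query format is taken from the example response rather than from a defined contract.

```rust
/// Pagination metadata matching the `meta` object in the response format.
struct PageMeta {
    page: u32,
    per_page: u32,
    total: u32,
}

/// Links matching the `links` object; `None` corresponds to JSON `null`.
struct PageLinks {
    self_link: String,
    next: Option<String>,
    prev: Option<String>,
}

/// Builds pagination links for a collection path such as "/api/v1/agents".
fn page_links(base_path: &str, meta: &PageMeta) -> PageLinks {
    // Guard against a zero page size, then use ceiling division.
    let per_page = meta.per_page.max(1);
    // An empty result set is still treated as having one (empty) page.
    let last_page = ((meta.total + per_page - 1) / per_page).max(1);
    let link = |p: u32| format!("{base_path}?page={p}");
    PageLinks {
        self_link: link(meta.page),
        next: (meta.page < last_page).then(|| link(meta.page + 1)),
        prev: (meta.page > 1).then(|| link(meta.page - 1)),
    }
}
```
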
---

### MF-002: Web-based Admin Console (Embedded)

**Description**: Modern web UI for managing the entire system. Built with React + Vite and served as static files embedded directly in the master binary.

**Pages**:

| Page | Features |
|------|----------|
| Dashboard | Agent status, recent events, traffic overview |
| Agents | List, detail view, logs, metrics graphs |
| Configurations | Virtual host editor, upstream management |
| Certificates | SSL certificate list, expiration alerts |
| Access Control | Users, roles, permissions management |
| Settings | Organization settings, integrations |

**Key UI Features**:
- Real-time updates via WebSocket
- Monaco editor for nginx configuration
- Visual topology view (agent connections)
- Dark/light mode support
- Responsive design

---

### MF-003: Configuration Template Engine

**Description**: Templating system for generating nginx configurations.

**Template Variables**:
```handlebars
# Example virtual host template
server {
    listen {{port}} {{#if ssl}}ssl{{/if}} {{#if http2}}http2{{/if}};
    server_name {{server_name}};

    {{#if ssl}}
    ssl_certificate {{ssl_certificate_path}};
    ssl_certificate_key {{ssl_certificate_key_path}};
    {{/if}}

    location {{location_path}} {
        proxy_pass http://{{upstream_name}};
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;

        {{#each custom_headers}}
        add_header {{name}} "{{value}}";
        {{/each}}

        {{#if rate_limiting}}
        limit_req zone={{rate_limit_zone}} burst={{rate_limit_burst}};
        {{/if}}
    }
}
```

**Built-in Templates**:
- `default` - Standard reverse proxy
- `spa` - Single Page Application (with fallback to index.html)
- `api` - API gateway with rate limiting
- `static` - Static file serving with caching
- `websocket` - WebSocket proxy with connection upgrades

---

### MF-004: Certificate Management (ACME)

**Description**: Automatic SSL/TLS certificate provisioning via Let's Encrypt.

**Features**:
- ACME v2 protocol support
- HTTP-01 and DNS-01 challenges
- Automatic renewal (30 days before expiry)
- Wildcard certificate support (DNS-01)
- Certificate monitoring and alerts

**Certificate Entity**:
```rust
struct Certificate {
    id: Uuid,
    workspace_id: Uuid,
    domain: String,
    is_wildcard: bool,
    provider: CertificateProvider,    // LetsEncrypt, Custom
    status: CertificateStatus,        // Pending, Active, Expired, Error
    issued_at: DateTime,
    expires_at: DateTime,
    auto_renew: bool,
    certificate_pem: Option<String>,  // Encrypted at rest
    private_key_pem: Option<String>,  // Encrypted at rest
}
```

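The "30 days before expiry" renewal rule can be expressed as a small predicate over `expires_at` and `auto_renew`. This is only a sketch: it uses `std::time::SystemTime` in place of the spec's `DateTime`, and `needs_renewal` and `RENEWAL_WINDOW` are hypothetical names. It also assumes an already expired certificate with `auto_renew` enabled should be renewed immediately.

```rust
use std::time::{Duration, SystemTime};

/// Renewal window from the feature list: 30 days before expiry.
const RENEWAL_WINDOW: Duration = Duration::from_secs(30 * 24 * 60 * 60);

/// Returns true when a certificate should be renewed at time `now`.
fn needs_renewal(expires_at: SystemTime, auto_renew: bool, now: SystemTime) -> bool {
    if !auto_renew {
        return false;
    }
    match expires_at.duration_since(now) {
        // Still valid: renew once the remaining lifetime is inside the window.
        Ok(remaining) => remaining <= RENEWAL_WINDOW,
        // `expires_at` is earlier than `now`: already expired (assumed renewable).
        Err(_) => true,
    }
}
```
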
---

## Agent Features

### AF-001: Nginx Lifecycle Management

**Description**: Agent manages nginx process lifecycle based on deployment mode.

**Docker Sidecar Mode**:
- Shares PID namespace with nginx container (via `pid: service:nginx`)
- Directly signals nginx process for reload/restart
- Monitors nginx via health checks

**Standalone Mode**:
- Direct process management (signals to PID from file)
- systemd integration (optional, for service management)
- PID file monitoring

**Lifecycle Actions**:
- `start` - Start nginx
- `stop` - Graceful shutdown
- `reload` - Hot reload configuration
- `restart` - Full restart
- `test` - Validate configuration

---

### AF-002: Configuration Rendering and Application

**Description**: Agent renders nginx configs from master templates and applies them using atomic symlink swaps for zero-downtime updates.

**Config Directory Structure**:
```
/etc/nginx/
├── nginx.conf                         # Contains: include /etc/nginx/conf.d/current/*.conf
├── conf.d/
│   ├── current -> ./20260302143000/   # Symlink to active deployment
│   ├── 20260302143000/                # Active config (timestamped)
│   │   ├── default.conf
│   │   └── upstream.conf
│   ├── 20260302141500/                # Previous deployment (for rollback)
│   │   ├── default.conf
│   │   └── upstream.conf
│   └── 20260302140000/                # Older deployment (cleanup candidate)
```

**Config Rendering Flow**:
1. Receive ConfigUpdate from master
2. Create new deployment folder: `./conf.d/<timestamp>/`
3. Render nginx config files into timestamped folder
4. **Validate** new config: `nginx -t -c /etc/nginx/conf.d/<timestamp>/nginx.conf`
5. If validation passes, **atomically update symlink**: `current` → `<timestamp>/`
6. Execute graceful nginx reload
7. Verify reload success (health check)
8. Report status to master
9. Cleanup old deployments (keep N recent versions)

**Atomic Config Swap**:
```rust
async fn apply_config(&self, config: ConfigUpdate) -> Result<()> {
    let timestamp = generate_timestamp();
    let deploy_dir = self.conf_d_path.join(&timestamp);
    let symlink_path = self.conf_d_path.join("current");

    // 1. Render config to new timestamped directory
    self.render_config(&config, &deploy_dir).await?;

    // 2. Validate BEFORE switching symlink (point to new folder directly)
    self.validate_config(&deploy_dir).await?;

    // 3. Atomic symlink swap (Unix: symlink + rename)
    let temp_link = self.conf_d_path.join("current.tmp");
    tokio::fs::symlink(&deploy_dir, &temp_link).await?;
    tokio::fs::rename(&temp_link, &symlink_path).await?;  // Atomic operation

    // 4. Reload nginx (picks up new symlink target)
    self.reload_nginx().await?;

    // 5. Verify and cleanup
    self.verify_health().await?;
    self.cleanup_old_deployments(5).await?;  // Keep last 5 versions

    self.report_success(config.id, timestamp).await;
    Ok(())
}
```

**Rollback Strategy**:
```rust
async fn rollback(&self, target_timestamp: &str) -> Result<()> {
    let target_dir = self.conf_d_path.join(target_timestamp);
    let symlink_path = self.conf_d_path.join("current");

    // Verify target exists
    if !target_dir.exists() {
        return Err(Error::RollbackTargetNotFound);
    }

    // Atomic symlink swap back to previous deployment
    let temp_link = self.conf_d_path.join("current.tmp");
    tokio::fs::symlink(&target_dir, &temp_link).await?;
    tokio::fs::rename(&temp_link, &symlink_path).await?;

    // Reload nginx
    self.reload_nginx().await?;
    Ok(())
}
```

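The cleanup step (keep N recent versions) is not spelled out above, so the following sketch shows one possible selection rule. It assumes deployment folders are named with 14-digit timestamps as in the directory tree, that entries with other names (such as `current`) are ignored, and that the active deployment is never removed even if it falls outside the retention count. `deployments_to_remove` is a hypothetical helper and only decides which names to delete; it does not touch the filesystem.

```rust
/// Given the entries of `conf.d/`, returns the deployment folders that can be
/// deleted while keeping the `keep` most recent ones and the active one.
fn deployments_to_remove(mut names: Vec<String>, active: &str, keep: usize) -> Vec<String> {
    // Only consider folders named like "20260302143000" (assumed format).
    names.retain(|n| n.len() == 14 && n.bytes().all(|b| b.is_ascii_digit()));
    // Fixed-width timestamps sort chronologically as plain strings; newest first.
    names.sort_unstable_by(|a, b| b.cmp(a));
    names
        .into_iter()
        .skip(keep)
        .filter(|n| n.as_str() != active)
        .collect()
}
```
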
---

### AF-003: Health Monitoring and Reporting

**Description**: Continuous health monitoring of nginx and the host system.

**Health Checks**:
- **Nginx Health**: HTTP request to nginx health endpoint
- **Configuration Health**: Verify current config matches expected
- **Resource Health**: CPU, memory, disk usage
- **Connection Health**: Active connections, request rate

**Health Report Structure**:
```rust
struct HealthReport {
    agent_id: Uuid,
    timestamp: DateTime,
    nginx_status: NginxStatus,
    system_metrics: SystemMetrics,
    config_checksum: String,
    alerts: Vec<Alert>,
}

struct NginxStatus {
    is_running: bool,
    pid: Option<u32>,
    uptime_seconds: u64,
    active_connections: u32,
    requests_per_second: f64,
}

struct SystemMetrics {
    cpu_percent: f64,
    memory_used_mb: u64,
    memory_total_mb: u64,
    disk_used_gb: u64,
    disk_total_gb: u64,
}
```

**Reporting Interval**: Configurable (default: 30 seconds)

---

### AF-004: Metrics Collection and Export

**Description**: Collect and expose metrics in Prometheus format.

**Metrics Endpoint**: `GET /metrics` (on agent)

**Built-in Metrics**:
```
# Nginx metrics (parsed from stub_status)
nxmesh_nginx_connections_active{agent_id="..."} 42
nxmesh_nginx_connections_reading{agent_id="..."} 5
nxmesh_nginx_connections_writing{agent_id="..."} 30
nxmesh_nginx_connections_waiting{agent_id="..."} 7
nxmesh_nginx_requests_total{agent_id="..."} 1234567

# Agent metrics
nxmesh_agent_uptime_seconds{agent_id="..."} 86400
nxmesh_agent_master_connection_status{agent_id="..."} 1
nxmesh_agent_config_version{agent_id="...",version="123"} 1

# System metrics
nxmesh_system_cpu_percent{agent_id="..."} 25.5
nxmesh_system_memory_used_bytes{agent_id="..."} 1073741824
nxmesh_system_disk_used_bytes{agent_id="..."} 53687091200
```

**Custom Metrics**: Agents can collect custom metrics from nginx access logs

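As an illustration of how the sample lines above could be produced, here is a small formatter for a single metric sample in the Prometheus text exposition format. `render_metric` is a hypothetical helper, not part of any NxMesh API; it escapes backslashes, double quotes, and newlines in label values, which is what the text format requires. A real exporter would more likely use an existing Prometheus client crate.

```rust
/// Renders one sample line, e.g. `name{label="value"} 42`.
fn render_metric(name: &str, labels: &[(&str, &str)], value: f64) -> String {
    let rendered: Vec<String> = labels
        .iter()
        .map(|(key, raw)| {
            // Escape characters that are special inside a quoted label value.
            let escaped = raw
                .replace('\\', "\\\\")
                .replace('"', "\\\"")
                .replace('\n', "\\n");
            format!("{key}=\"{escaped}\"")
        })
        .collect();
    format!("{name}{{{}}} {value}", rendered.join(","))
}
```
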
---

### AF-005: Offline Operation and Recovery

**Description**: Agent can operate independently when master is unreachable.

**Offline Capabilities**:
- Continue serving traffic with cached configuration
- Local health monitoring continues
- Metrics are buffered for later transmission
- Automatic reconnection attempts

**Recovery Flow**:
1. Detect disconnection from master
2. Enter "offline mode"
3. Continue operating with cached config
4. Buffer metrics and logs
5. Attempt reconnection with exponential backoff
6. On reconnection:
   - Sync configuration (compare checksums)
   - Transmit buffered metrics
   - Resume normal operation

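Step 5 of the recovery flow calls for exponential backoff without giving parameters. The sketch below shows the usual doubling-with-cap calculation; the function name `reconnect_delay` and the base and maximum delays used by a caller are assumptions, and a production agent would probably also add random jitter so that many agents do not reconnect at the same instant.

```rust
use std::time::Duration;

/// Delay before reconnection attempt number `attempt` (0-based):
/// `base * 2^attempt`, capped at `max`.
fn reconnect_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```
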
---

## Configuration Management

### CM-001: Virtual Host Configuration

**Description**: Define nginx server blocks (virtual hosts) via API/UI.

**VirtualHost Entity**:
```rust
struct VirtualHost {
    id: Uuid,
    workspace_id: Uuid,
    name: String,                    // Human-readable name
    server_name: String,             // Domain name(s), comma-separated
    listen_port: u16,                // Usually 80 or 443
    ssl_enabled: bool,
    ssl_certificate_id: Option<Uuid>,

    // Routing configuration
    locations: Vec<Location>,

    // Advanced settings
    http2_enabled: bool,
    http3_enabled: bool,
    gzip_enabled: bool,
    rate_limiting: Option<RateLimitConfig>,

    // Target agents
    target_agents: AgentSelector,
}

struct Location {
    path: String,                    // e.g., "/api" or "~ \.php$"
    proxy_pass: Option<String>,      // e.g., "http://backend"
    upstream_id: Option<Uuid>,
    root: Option<String>,            // For static files
    index: Option<String>,           // e.g., "index.html"
    custom_headers: Vec<Header>,
    rewrite_rules: Vec<RewriteRule>,
}
```

**Validation Rules**:
- `server_name` must be valid domain(s)
- `listen_port` must be 1-65535
- SSL certificate must exist if `ssl_enabled` is true
- At least one location must be defined

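The validation rules above map naturally onto a function that returns a list of field errors in the shape of the `details` array from MF-001. This is a simplified sketch: `VirtualHostInput`, `is_valid_domain`, and `validate` are hypothetical names, the port is held in a `u32` so that out-of-range input can be represented, and the domain check only covers plain hostnames and a leading `*.` wildcard (nginx itself accepts more forms, such as regular expressions).

```rust
/// Hypothetical, flattened view of the fields that the rules refer to.
struct VirtualHostInput<'a> {
    server_name: &'a str, // comma-separated domain names
    listen_port: u32,     // wider than u16 so invalid input can be rejected
    ssl_enabled: bool,
    has_certificate: bool,
    location_count: usize,
}

/// Simplified hostname check: dot-separated labels of letters, digits, and
/// hyphens, with an optional leading "*." wildcard.
fn is_valid_domain(name: &str) -> bool {
    let name = name.strip_prefix("*.").unwrap_or(name);
    !name.is_empty()
        && name.len() <= 253
        && name.split('.').all(|label| {
            !label.is_empty()
                && label.len() <= 63
                && !label.starts_with('-')
                && !label.ends_with('-')
                && label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        })
}

/// Returns `(field, message)` pairs; an empty list means the input is valid.
fn validate(vh: &VirtualHostInput) -> Vec<(&'static str, &'static str)> {
    let mut errors = Vec::new();
    if !vh.server_name.split(',').all(|d| is_valid_domain(d.trim())) {
        errors.push(("server_name", "Invalid domain format"));
    }
    if !(1..=65535).contains(&vh.listen_port) {
        errors.push(("listen_port", "Must be between 1 and 65535"));
    }
    if vh.ssl_enabled && !vh.has_certificate {
        errors.push(("ssl_certificate_id", "Certificate required when SSL is enabled"));
    }
    if vh.location_count == 0 {
        errors.push(("locations", "At least one location must be defined"));
    }
    errors
}
```
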
---

### CM-002: Upstream Configuration

**Description**: Define backend server pools for load balancing.

**Upstream Entity**:
```rust
struct Upstream {
    id: Uuid,
    workspace_id: Uuid,
    name: String,                    // Used as upstream identifier

    // Load balancing algorithm
    algorithm: LoadBalanceAlgorithm, // RoundRobin, LeastConnections, IPHash, etc.

    // Backend servers
    servers: Vec<UpstreamServer>,

    // Health check configuration
    health_check: Option<HealthCheckConfig>,

    // Connection settings
    keepalive_connections: Option<u32>,
    keepalive_timeout: Option<u32>,
}

struct UpstreamServer {
    address: String,                 // IP:port or hostname:port
    weight: u32,                     // Default: 1
    backup: bool,                    // Backup server
    down: bool,                      // Temporarily down
    max_fails: u32,                  // Default: 1
    fail_timeout: u32,               // Seconds, default: 10
}

enum LoadBalanceAlgorithm {
    RoundRobin,
    LeastConnections,
    IPHash,
    WeightedRoundRobin,
}
```

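In NxMesh the actual balancing is performed by nginx from the rendered upstream block, so the following is purely an illustration of what the `weight` and `down` fields mean for `WeightedRoundRobin`. It sketches the smooth weighted round-robin technique (each eligible server accumulates its weight, the highest accumulator wins and is reduced by the total weight). The `Server` struct and `pick` function are hypothetical, and `backup` handling is left out for brevity.

```rust
/// Hypothetical runtime view of an upstream server.
struct Server {
    address: &'static str,
    weight: u32,
    down: bool,
    current: i64, // running accumulator used by the algorithm
}

/// Picks the next server using smooth weighted round-robin.
/// Servers marked `down` or with zero weight are skipped.
fn pick(servers: &mut [Server]) -> Option<&'static str> {
    let mut total: i64 = 0;
    let mut best: Option<usize> = None;
    for i in 0..servers.len() {
        if servers[i].down || servers[i].weight == 0 {
            continue;
        }
        // Every eligible server gains its weight on each round.
        servers[i].current += i64::from(servers[i].weight);
        total += i64::from(servers[i].weight);
        // Strict comparison keeps the earliest server on ties.
        if best.map_or(true, |b| servers[i].current > servers[b].current) {
            best = Some(i);
        }
    }
    let chosen = best?;
    // The winner pays back the total so the others catch up.
    servers[chosen].current -= total;
    Some(servers[chosen].address)
}
```
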
---

### CM-003: Configuration Versioning

**Description**: Track all configuration changes with full history.

**Versioning Features**:
- Every change creates a new version
- Versions are immutable
- Rollback to any previous version
- Diff between versions
- Audit log of who changed what

**Version Entity**:
```rust
struct ConfigVersion {
    id: Uuid,
    resource_type: String,           // "virtual_host", "upstream", etc.
    resource_id: Uuid,
    version_number: u64,             // Auto-incrementing
    data: Json,                      // Full configuration snapshot
    checksum: String,                // SHA-256 of data
    created_by: Uuid,                // User ID
    created_at: DateTime,
    change_summary: String,          // Human-readable description
}
```

**API Endpoints**:
- `GET /api/v1/virtual-hosts/{id}/versions` - List versions
- `GET /api/v1/virtual-hosts/{id}/versions/{version}` - Get specific version
- `POST /api/v1/virtual-hosts/{id}/rollback` - Rollback to version
- `GET /api/v1/virtual-hosts/{id}/diff?from=v1&to=v2` - Compare versions

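Because versions are immutable and every change creates a new version, one consistent reading is that a rollback does not rewind history but appends a new version whose data is copied from the target. The sketch below illustrates that interpretation; it is an assumption, not something the specification states. `VersionHistory` is a hypothetical in-memory stand-in for the version store, and `String` replaces the spec's `Json` type to keep the example std-only.

```rust
/// Trimmed-down version record (subset of the `ConfigVersion` fields).
#[derive(Clone, Debug, PartialEq)]
struct ConfigVersion {
    version_number: u64,
    data: String,
    change_summary: String,
}

/// Append-only history for a single resource.
#[derive(Default)]
struct VersionHistory {
    versions: Vec<ConfigVersion>,
}

impl VersionHistory {
    /// Stores a new immutable snapshot and returns its version number.
    fn commit(&mut self, data: &str, summary: &str) -> u64 {
        let version_number = self.versions.len() as u64 + 1;
        self.versions.push(ConfigVersion {
            version_number,
            data: data.to_string(),
            change_summary: summary.to_string(),
        });
        version_number
    }

    /// Rolls back by committing a copy of the target version's data.
    /// Returns `None` if the target version does not exist.
    fn rollback(&mut self, target: u64) -> Option<u64> {
        let data = self
            .versions
            .iter()
            .find(|v| v.version_number == target)?
            .data
            .clone();
        Some(self.commit(&data, &format!("Rollback to version {target}")))
    }

    /// The most recent version, if any.
    fn current(&self) -> Option<&ConfigVersion> {
        self.versions.last()
    }
}
```
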
---

## Observability

### OB-001: Structured Logging

**Description**: Comprehensive logging with structured format.

**Log Levels**: ERROR, WARN, INFO, DEBUG, TRACE

**Log Fields**:
```json
{
  "timestamp": "2026-03-02T10:30:00Z",
  "level": "INFO",
  "component": "agent",
  "agent_id": "550e8400-e29b-41d4-a716-446655440000",
  "trace_id": "abc123",
  "span_id": "def456",
  "message": "Configuration applied successfully",
  "fields": {
    "config_id": "config-123",
    "version": 42,
    "duration_ms": 150
  }
}
```

**Log Targets**:
- Master: systemd journal, file, or centralized (ELK/Loki)
- Agent: stdout (Docker), file (standalone), or remote

---

### OB-002: Distributed Tracing

**Description**: OpenTelemetry tracing for request flow visualization.

**Traced Operations**:
- Configuration push (master → agent → nginx)
- Health check cycles
- Certificate issuance
- API requests

**Span Attributes**:
- `nxmesh.agent_id`
- `nxmesh.config_id`
- `nxmesh.workspace_id`
- `nxmesh.organization_id`

---

### OB-003: Access Log Aggregation

**Description**: Collect and query nginx access logs from all agents.

**Features**:
- Centralized access log storage
- Real-time log streaming
- SQL-like query interface
- Log retention policies

**Access Log Schema**:
```rust
struct AccessLogEntry {
    id: Uuid,
    agent_id: Uuid,
    timestamp: DateTime,

    // Request details
    remote_addr: String,
    method: String,
    uri: String,
    protocol: String,
    host: String,

    // Response details
    status: u16,
    body_bytes_sent: u64,
    response_time_ms: f64,

    // Additional fields
    user_agent: Option<String>,
    referer: Option<String>,
    request_id: Option<String>,
}
```

**Query API**:
```graphql
# Example query
query {
  accessLogs(
    filter: {
      agentId: "...",
      timeRange: { from: "2026-03-01", to: "2026-03-02" },
      statusCode: { gte: 500 }
    },
    limit: 100
  ) {
    timestamp
    method
    uri
    status
    responseTimeMs
  }
}
```

---

## Security Features

### SF-001: Authentication and Authorization

**Description**: Multi-method authentication with fine-grained RBAC.

**Authentication Methods**:
- JWT (for API/Web UI)
- Password-based login (local user accounts)
- OAuth2/OIDC (Google, GitHub, enterprise SSO)
- API Keys (for service accounts)
- **TLS + Shared Secret** (for agent communication)
  - Server-side TLS (auto-generated self-signed or custom certificates)
  - Bootstrap token for initial registration
  - Session key with HMAC signing for ongoing requests
  - Primary/secondary key rotation

**RBAC Model**:
```rust
struct Role {
    id: Uuid,
    name: String,
    permissions: Vec<Permission>,
}

enum Permission {
    // Organization scope
    OrganizationRead,
    OrganizationWrite,
    OrganizationDelete,

    // Workspace scope
    WorkspaceRead,
    WorkspaceWrite,
    WorkspaceDelete,

    // Agent scope
    AgentRead,
    AgentWrite,
    AgentReload,
    AgentDelete,

    // Config scope
    ConfigRead,
    ConfigWrite,
    ConfigDeploy,
    ConfigDelete,

    // Certificate scope
    CertificateRead,
    CertificateWrite,
    CertificateDelete,

    // User management
    UserRead,
    UserWrite,
    UserDelete,
}
```

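With the RBAC model above, an authorization check reduces to asking whether any of a user's roles grants the required permission. The sketch below shows that check with a subset of the `Permission` variants; `is_allowed` is a hypothetical helper, the `id` field is omitted, and scoping a role to a particular organization or workspace is not modeled because the specification does not describe how roles are bound to scopes.

```rust
/// Subset of the `Permission` enum from the RBAC model, for illustration.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Permission {
    AgentRead,
    AgentReload,
    ConfigRead,
    ConfigWrite,
    ConfigDeploy,
}

/// Role as in the RBAC model, without the `id` field.
#[allow(dead_code)]
struct Role {
    name: String,
    permissions: Vec<Permission>,
}

/// Returns true when at least one role grants `required`.
fn is_allowed(roles: &[Role], required: Permission) -> bool {
    roles.iter().any(|role| role.permissions.contains(&required))
}
```
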
---

### SF-002: Secret Management

**Description**: Secure storage and distribution of sensitive data.

**Secrets**:
- SSL private keys
- API tokens
- Database passwords
- External service credentials

**Security Measures**:
- Encryption at rest (AES-256-GCM)
- Encryption in transit (TLS 1.3)
- Automatic secret rotation
- Audit logging for secret access

---

### SF-003: Network Security

**Description**: Network-level security controls.

**Features**:
- IP allowlisting for agent connections
- Rate limiting on API endpoints
- DDoS protection recommendations
- Security headers enforcement (HSTS, CSP, etc.)

**Agent Connection Security**:
- **TLS Encryption**: Server-side TLS (auto-generated or custom certificates)
  - Development: Self-signed certificates auto-generated on first start
  - Production: Valid certificates (Let's Encrypt or corporate CA)
- **Bootstrap Authentication**: One-time token for initial registration
- **Session Authentication**: HMAC-signed requests with shared session key
- **Key Rotation**: Primary/secondary key design for seamless rotation
- **Certificate Pinning**: Optional fingerprint verification for additional security