# DreamChat System Architecture

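## Overview

DreamChat is a character simulation platform with a modular, extensible architecture. The system follows clean architecture principles: a React client talks to a NestJS backend over REST and WebSocket, and the backend separates domain modules, services, and infrastructure while storing its data in PostgreSQL with the pgvector extension.
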
## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                Client Layer                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │    React    │  │    Vite     │  │  WebSocket  │  │  OpenAPI Generator  │ │
│  │    (UI)     │  │   (Build)   │  │   Client    │  │    (API Client)     │ │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              API Gateway Layer                              │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                            NestJS Backend                             │  │
│  │ ┌─────────────┐  ┌────────────────┐  ┌─────────────┐  ┌─────────────┐ │  │
│  │ │    Auth     │  │     Guards     │  │ Validation  │  │  WebSocket  │ │  │
│  │ │   Module    │  │ (JWT/Keycloak) │  │    Pipes    │  │   Gateway   │ │  │
│  │ └─────────────┘  └────────────────┘  └─────────────┘  └─────────────┘ │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
             ┌─────────────────────────┼─────────────────────────┐
             ▼                         ▼                         ▼
 ┌──────────────────────┐  ┌──────────────────────┐  ┌──────────────────────┐
 │    Domain Modules    │  │    Service Layer     │  │    Infrastructure    │
 │  ┌────────────────┐  │  │  ┌────────────────┐  │  │  ┌────────────────┐  │
 │  │   Character    │  │  │  │  LLM Service   │  │  │  │   LangChain    │  │
 │  │     Module     │  │  │  │  (OpenRouter)  │  │  │  │  Integration   │  │
 │  └────────────────┘  │  │  └────────────────┘  │  │  └────────────────┘  │
 │  ┌────────────────┐  │  │  ┌────────────────┐  │  │  ┌────────────────┐  │
 │  │  Chat Module   │  │  │  │ Memory Service │  │  │  │  Vector Store  │  │
 │  │  (MVP Focus)   │  │  │  │  (Vector DB)   │  │  │  │   (pgvector)   │  │
 │  └────────────────┘  │  │  └────────────────┘  │  │  └────────────────┘  │
 │  ┌────────────────┐  │  │  ┌────────────────┐  │  │  ┌────────────────┐  │
 │  │  Story Module  │  │  │  │ Import Service │  │  │  │   Puppeteer    │  │
 │  │   (Phase 2)    │  │  │  │   (Adapter)    │  │  │  │   (Scraper)    │  │
 │  └────────────────┘  │  │  └────────────────┘  │  │  └────────────────┘  │
 │  ┌────────────────┐  │  │  ┌────────────────┐  │  │  ┌────────────────┐  │
 │  │   Multi-Char   │  │  │  │ File Processor │  │  │  │   PDF Parser   │  │
 │  │   (Phase 3)    │  │  │  │    Service     │  │  │  │   MD Parser    │  │
 │  └────────────────┘  │  │  └────────────────┘  │  │  └────────────────┘  │
 └──────────────────────┘  └──────────────────────┘  └──────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                                 Data Layer                                  │
│  ┌─────────────┐  ┌─────────────────────┐  ┌─────────────────────────────┐  │
│  │ PostgreSQL  │  │ pgvector Extension  │  │     Keycloak (External)     │  │
│  │  (Primary)  │  │   (Vector Store)    │  │       (Auth Provider)       │  │
│  └─────────────┘  └─────────────────────┘  └─────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Module Dependencies

```
┌─────────────────────────────────────────────────────────────────┐
│                       Application Module                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │    Auth     │  │    User     │  │         Config          │  │
│  │   Module    │──│   Module    │  │        (Global)         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                 │
           ┌─────────────────────┼─────────────────────┐
           ▼                     ▼                     ▼
   ┌───────────────┐     ┌───────────────┐   ┌───────────────────┐
   │   Character   │     │     Chat      │   │   Import/Export   │
   │    Module     │     │    Module     │   │      Module       │
   │               │     │  (MVP Focus)  │   │                   │
   │ • Attributes  │     │               │   │ • File Adapter    │
   │ • Personality │     │ • Messages    │   │ • Web Adapter     │
   │ • Backstory   │     │ • WebSocket   │   │ • Preprocessing   │
   └───────────────┘     └───────────────┘   └───────────────────┘
                                 │
                 ┌───────────────┴───────────────┐
                 ▼                               ▼
       ┌───────────────────┐           ┌───────────────────┐
       │   Story Module    │           │ Multi-Char Module │
       │     (Phase 2)     │           │     (Phase 3)     │
       │                   │           │                   │
       │ • Branching Tree  │           │ • Group Chat      │
       │ • Open-ended Gen  │           │ • Char-to-Char    │
       │ • Tree View API   │           │ • Address Direct  │
       └───────────────────┘           └───────────────────┘
```

## Component Details

### Backend (NestJS)

#### 1. Auth Module
```typescript
// Dual authentication strategy
- KeycloakStrategy (OAuth2/OIDC)
- LocalStrategy (Password-based)
- JWT Guard for stateless auth
- Roles: USER, ADMIN

// Prisma User model
- id, email, username
- passwordHash, keycloakSub
- role, isActive
```

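The role model above (`USER`, `ADMIN`, plus the `isActive` flag) could be enforced with a small, framework-independent check that a guard calls after the JWT has been verified. This is a minimal sketch: the `AuthUser` shape and the `canAccess` helper are hypothetical names introduced for illustration, not part of the DreamChat codebase.

```typescript
// Hypothetical types for illustration; the real User model is defined in Prisma.
type Role = "USER" | "ADMIN";

interface AuthUser {
  id: string;
  role: Role;
  isActive: boolean;
}

// Returns true when the user is active and holds one of the required roles.
// An empty requirement list is treated as "any active, authenticated user".
function canAccess(user: AuthUser, requiredRoles: readonly Role[]): boolean {
  if (!user.isActive) return false;
  if (requiredRoles.length === 0) return true;
  return requiredRoles.includes(user.role);
}
```

A guard would typically call `canAccess(request.user, ["ADMIN"])` and reject the request when it returns `false`.
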
#### 2. Character Module
```typescript
- CharacterController (REST)
- CharacterService (Business Logic)
- CharacterRepository (Prisma)
- DTOs: CreateCharacterDto, UpdateCharacterDto, CharacterResponseDto

Entities:
- Character
  - id, name, avatar
  - personalityPrompt: string
  - attributes: JSON (complex attribute system)
  - backstory: string
  - createdBy: User
  - createdAt, updatedAt
```

#### 3. Chat Module (MVP)
```typescript
- ChatGateway (WebSocket)
- ChatService
- MessageService
- ConversationRepository (Prisma)

Prisma Models:
- Conversation
  - id, title
  - characterId (relation)
  - userId (relation)
  - messages: Message[]
  - messageCount, totalTokens
  - createdAt, updatedAt

- Message
  - id, role (MessageRole enum: user | assistant | system)
  - content: String
  - tokensUsed: Int?
  - model: String?
  - metadata: Json?
  - conversationId (relation)
  - createdAt: DateTime

WebSocket Events:
- client:send_message → server:receive_message → llm:generate → server:stream_response → client:receive_chunk
```

#### 4. Memory Service (LangChain + pgvector + Local Embeddings)
```typescript
- EmbeddingService (Adapter Pattern)
  - generateEmbeddings(texts: string[]): Promise<number[][]>
  - getDimension(): number

Implementations:
- LocalEmbeddingProvider: Loads HuggingFace model via @xenova/transformers
- HuggingFaceAPIProvider: Uses HuggingFace Inference API

- VectorStoreService (uses Prisma with pgvector extension)
  - addDocument(conversationId, content, metadata)
  - similaritySearch(conversationId, query, k=5)
  - Uses raw Prisma queries with pgvector operators

- MemoryManager
  - buildContext(conversationId, currentMessage): string
  - summarizeOldMessages(conversationId): Promise<void>
  - retrieveRelevantMemories(conversationId, query): Document[]

Prisma Model:
- VectorMemory
  - id
  - conversationId (relation)
  - content: String
  - embedding: Unsupported("vector") // pgvector type
  - metadata: Json?
  - createdAt: DateTime
```

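To make the idea behind `similaritySearch(conversationId, query, k=5)` concrete, the sketch below ranks stored memories by cosine similarity in plain TypeScript. It is an in-memory illustration only: the design above delegates this ranking to pgvector (whose `<=>` operator computes cosine distance) through raw queries, and the `StoredMemory` type and `topK` helper are hypothetical names.

```typescript
// Hypothetical in-memory stand-in for a VectorMemory row.
interface StoredMemory {
  content: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, i) => {
    const other = b[i] ?? 0;
    dot += value * other;
    normA += value * value;
    normB += other * other;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k memories most similar to the query embedding, best first.
function topK(query: number[], memories: StoredMemory[], k = 5): StoredMemory[] {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.memory);
}
```

In the real service the query text would first be embedded through `EmbeddingService.generateEmbeddings`, and the embedding dimension would have to match `getDimension()`.
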
#### 5. LLM Service (Adapter Pattern)
```typescript
interface LLMProvider {
  generate(messages: Message[]): Promise<string>;
  stream(messages: Message[]): AsyncIterable<string>;
  getTokenCount(text: string): number;
}

class OpenRouterProvider implements LLMProvider { ... }
class OpenAIProvider implements LLMProvider { ... }
class OllamaProvider implements LLMProvider { ... }

// Configuration via environment variables:
//   LLM_PROVIDER=openrouter
//   LLM_MODEL=openai/gpt-4o
//   LLM_API_KEY=...
```

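One way the `LLM_PROVIDER` setting could select an adapter is a small registry keyed by provider name. The sketch below is an assumption about how that wiring might look, not the project's implementation: `EchoProvider`, `providerRegistry`, `createProvider`, and the `ChatMessage` type are hypothetical, and the echo provider stands in for a real network-backed adapter so the example stays self-contained.

```typescript
// Hypothetical message shape; the real Message model is defined in Prisma.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

interface LLMProvider {
  generate(messages: ChatMessage[]): Promise<string>;
  stream(messages: ChatMessage[]): AsyncIterable<string>;
  getTokenCount(text: string): number;
}

// Stand-in provider that echoes the last message; useful for local tests.
class EchoProvider implements LLMProvider {
  async generate(messages: ChatMessage[]): Promise<string> {
    const last = messages[messages.length - 1];
    return last ? last.content : "";
  }

  async *stream(messages: ChatMessage[]): AsyncIterable<string> {
    const text = await this.generate(messages);
    for (const word of text.split(" ")) {
      yield word;
    }
  }

  // Crude approximation: counts whitespace-separated words, not model tokens.
  getTokenCount(text: string): number {
    return text.split(/\s+/).filter(Boolean).length;
  }
}

type Env = Record<string, string | undefined>;
type ProviderFactory = (env: Env) => LLMProvider;

// Real adapters (OpenRouter, OpenAI, Ollama) would be registered here as well.
const providerRegistry = new Map<string, ProviderFactory>([
  ["echo", () => new EchoProvider()],
]);

// Resolves the provider named by LLM_PROVIDER, failing fast on bad config.
function createProvider(env: Env): LLMProvider {
  const name = env["LLM_PROVIDER"];
  if (!name) throw new Error("LLM_PROVIDER is not set");
  const factory = providerRegistry.get(name);
  if (!factory) throw new Error(`Unknown LLM_PROVIDER: ${name}`);
  return factory(env);
}
```

Passing the environment in as a parameter (for example `createProvider(process.env)` at startup) keeps the factory easy to test and keeps the rest of the code dependent only on the `LLMProvider` interface.
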
#### 6. Import Module (Adapter Pattern)
```typescript
interface ImportAdapter {
  canHandle(source: ImportSource): boolean;
  import(source: ImportSource): Promise<Document[]>;
}

// File Adapters
class TextFileAdapter implements ImportAdapter { ... }
class PdfFileAdapter implements ImportAdapter { ... }
class MarkdownFileAdapter implements ImportAdapter { ... }

// Web Adapters (Predefined scrapers)
abstract class WebScraperAdapter implements ImportAdapter {
  protected abstract canHandleUrl(url: string): boolean;
  protected abstract extractContent(page: Page): Promise<string>;
}

class AO3Scraper extends WebScraperAdapter { ... }
class FanfictionNetScraper extends WebScraperAdapter { ... }
// Each scraper validates URL pattern before processing
// Uses Puppeteer for headless browser

// Data Preprocessing Pipeline
class DataPreprocessor {
  clean(text: string): string;
  chunk(text: string, maxChunkSize: number): string[];
  extractEntities(text: string): Entity[];
}
```

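The `canHandle` method above suggests that the import service picks the first adapter that accepts a given source. A minimal sketch of that selection step follows; the `ImportSource` and `ImportedDocument` shapes, the `PlainTextStubAdapter`, and `selectAdapter` are hypothetical and only illustrate the dispatch, not actual file parsing.

```typescript
// Hypothetical shapes; the source does not define ImportSource or Document.
interface ImportSource {
  kind: "file" | "url";
  location: string;
  mimeType?: string;
}

interface ImportedDocument {
  content: string;
  source: string;
}

interface ImportAdapter {
  canHandle(source: ImportSource): boolean;
  import(source: ImportSource): Promise<ImportedDocument[]>;
}

// Stub adapter: accepts plain-text files and returns placeholder content
// instead of reading from disk, so the sketch stays self-contained.
class PlainTextStubAdapter implements ImportAdapter {
  canHandle(source: ImportSource): boolean {
    return source.kind === "file" && source.mimeType === "text/plain";
  }

  async import(source: ImportSource): Promise<ImportedDocument[]> {
    return [{ content: `placeholder for ${source.location}`, source: source.location }];
  }
}

// Picks the first registered adapter that accepts the source, or fails loudly.
function selectAdapter(
  adapters: readonly ImportAdapter[],
  source: ImportSource,
): ImportAdapter {
  const adapter = adapters.find((candidate) => candidate.canHandle(source));
  if (!adapter) {
    throw new Error(`No import adapter for ${source.kind} source: ${source.location}`);
  }
  return adapter;
}
```

Because selection depends only on `canHandle`, adding a new file type or scraper means registering one more adapter without touching the calling code.
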
### Frontend (React + Vite)

#### Component Hierarchy
```
App
├── AuthProvider (Keycloak + Local)
├── Router
│   ├── /login
│   │   └── LoginPage
│   │       ├── KeycloakLoginButton
│   │       └── PasswordLoginForm
│   ├── /characters
│   │   └── CharacterListPage
│   │       ├── CharacterCard[]
│   │       └── CreateCharacterButton
│   ├── /characters/:id
│   │   └── CharacterDetailPage
│   │       ├── CharacterAttributesEditor
│   │       ├── PersonalityPromptEditor
│   │       └── ChatHistory (if any)
│   ├── /chat/:conversationId (MVP Focus)
│   │   └── ChatPage
│   │       ├── ChatHeader (character info)
│   │       ├── MessageList
│   │       │   └── MessageBubble[]
│   │       └── ChatInput
│   │           └── MessageComposer
│   ├── /stories (Phase 2)
│   │   └── StoryListPage
│   │       └── StoryTreeView
│   └── /import
│       └── ImportPage
│           ├── FileUpload (Drag & Drop)
│           │   └── FileDropzone
│           ├── UrlInput
│           │   └── ScraperSelector
│           └── ProcessingProgress
└── Layout
    ├── Sidebar
    └── Header
```

#### State Management
```typescript
// Using Zustand or React Query
- authStore: AuthState
- characterStore: Character[]
- chatStore:
  - currentConversation: Conversation
  - messages: Message[]
  - isStreaming: boolean
  - wsConnection: WebSocket
- importStore: ImportJob[]
```

#### API Client Generation
```bash
# Generated from OpenAPI spec
npx openapi-generator-cli generate \
  -i http://localhost:3000/api-json \
  -g typescript-fetch \
  -o src/api/generated
```

## Data Flow

### Chat Flow (MVP)
```
┌─────────┐     ┌──────────┐     ┌────────────┐     ┌─────────────┐
│  User   │────▶│ Frontend │────▶│ WebSocket  │────▶│    Chat     │
│         │     │          │     │  Gateway   │     │   Gateway   │
└─────────┘     └──────────┘     └────────────┘     └──────┬──────┘
                                                           │
                         ┌─────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────┐
│                      ChatService                       │
│  1. Save user message to DB                            │
│  2. Call MemoryManager.buildContext()                  │
│  3. Retrieve relevant memories (vector search)         │
│  4. Build system prompt + context + user message       │
│  5. Call LLMService.generateStream()                   │
└────────────────────────┬───────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────┐
│                       LLMService                       │
│  1. Select provider (OpenRouter)                       │
│  2. Format messages for provider                       │
│  3. Stream response chunks                             │
│  4. Return async iterator                              │
└────────────────────────┬───────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────┐
│                    Stream Response                     │
│  1. Send chunks via WebSocket                          │
│  2. Accumulate full response                           │
│  3. Save assistant message to DB                       │
│  4. Store in vector memory                             │
└────────────────────────────────────────────────────────┘
```

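The streaming part of this flow (save the user message, forward each chunk, accumulate the full reply, then persist it) can be sketched with an async iterator. This is a simplified illustration under stated assumptions: `StreamDeps` and `handleUserMessage` are hypothetical names, the dependencies are injected so no database or socket is needed, and context building and vector-memory storage are left out.

```typescript
// Hypothetical dependency bundle; real code would inject NestJS services.
interface StreamDeps {
  // Returns the model reply as a stream of text chunks.
  streamReply: (prompt: string) => AsyncIterable<string>;
  // Pushes one chunk to the client, e.g. over the WebSocket connection.
  sendChunk: (chunk: string) => void;
  // Persists a message for the current conversation.
  saveMessage: (role: "user" | "assistant", content: string) => Promise<void>;
}

// Saves the user message, relays chunks as they arrive, and stores the
// accumulated assistant reply once the stream has finished.
async function handleUserMessage(text: string, deps: StreamDeps): Promise<string> {
  await deps.saveMessage("user", text);

  let fullReply = "";
  for await (const chunk of deps.streamReply(text)) {
    deps.sendChunk(chunk);
    fullReply += chunk;
  }

  await deps.saveMessage("assistant", fullReply);
  return fullReply;
}
```

Saving the assistant message only after the loop completes means a stream that fails midway leaves no partial reply in the database; whether that is the desired behavior is a design decision the source does not specify.
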
### File Import Flow
```
┌──────────┐     ┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   User   │────▶│  Frontend   │────▶│  POST /api   │────▶│   Import    │
│  Upload  │     │ FileSelect  │     │ /import/file │     │ Controller  │
└──────────┘     └─────────────┘     └──────────────┘     └──────┬──────┘
                                                                 │
                                                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                          ImportService                          │
│  1. Validate file (type, size < 50MB)                           │
│  2. Select adapter based on mime-type                           │
│  3. Parse file to raw text                                      │
│  4. Run DataPreprocessor.clean()                                │
│  5. Chunk into segments                                         │
│  6. Store in import_documents table                             │
└─────────────────────────────────────────────────────────────────┘
```

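Steps 1 and 5 of this flow (validation and chunking) could look like the sketch below. The source does not specify a chunking strategy, so the paragraph-based packing here is an assumption, as are the helper names `validateUpload` and `chunkText`, the accepted mime types, and the reading of 50MB as 50 × 1024 × 1024 bytes.

```typescript
// Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 50 * 1024 * 1024;

// Assumption: mime types matching the text, PDF, and Markdown adapters.
const ALLOWED_MIME_TYPES: readonly string[] = [
  "text/plain",
  "application/pdf",
  "text/markdown",
];

// Step 1: accept only known mime types and files strictly under the size limit.
function validateUpload(sizeBytes: number, mimeType: string): boolean {
  return sizeBytes > 0 && sizeBytes < MAX_FILE_BYTES && ALLOWED_MIME_TYPES.includes(mimeType);
}

// Step 5: pack whole paragraphs into chunks of at most maxChunkSize characters.
// A single paragraph longer than the limit is split at fixed offsets.
function chunkText(text: string, maxChunkSize: number): string[] {
  if (maxChunkSize <= 0) throw new Error("maxChunkSize must be positive");

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of text.split(/\n{2,}/)) {
    const piece = paragraph.trim();
    if (piece === "") continue;

    if (piece.length > maxChunkSize) {
      if (current !== "") {
        chunks.push(current);
        current = "";
      }
      for (let i = 0; i < piece.length; i += maxChunkSize) {
        chunks.push(piece.slice(i, i + maxChunkSize));
      }
      continue;
    }

    const candidate = current === "" ? piece : `${current}\n\n${piece}`;
    if (candidate.length > maxChunkSize) {
      chunks.push(current);
      current = piece;
    } else {
      current = candidate;
    }
  }

  if (current !== "") chunks.push(current);
  return chunks;
}
```

Splitting on paragraph boundaries keeps related sentences together, which tends to help later retrieval; a token-based limit would be a natural refinement once the embedding model is fixed.
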
### Web Scraping Flow
```
┌───────────┐     ┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   User    │────▶│  Frontend   │────▶│  POST /api   │────▶│   Import    │
│ Enter URL │     │  URL Input  │     │ /import/url  │     │ Controller  │
└───────────┘     └─────────────┘     └──────────────┘     └──────┬──────┘
                                                                  │
                                                                  ▼
┌──────────────────────────────────────────────────────────────────┐
│                         WebImportService                         │
│  1. Validate URL format                                          │
│  2. Find matching scraper (or reject)                            │
│  3. Launch Puppeteer, navigate to URL                            │
│  4. Extract content using scraper selectors                      │
│  5. Run DataPreprocessor.clean()                                 │
│  6. Chunk and store                                              │
└──────────────────────────────────────────────────────────────────┘
```

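Steps 1 and 2 (validate the URL, then find a matching scraper or reject) can be sketched as an exact host-name match against a fixed list. The rule table, the host names, the HTTPS-only restriction, and the `findScraper` helper are assumptions made for illustration; the source only states that each scraper validates a URL pattern.

```typescript
// Hypothetical rule table; host names are assumptions, not taken from the source.
interface ScraperRule {
  name: string;
  hosts: readonly string[];
}

const SCRAPER_RULES: readonly ScraperRule[] = [
  { name: "AO3Scraper", hosts: ["archiveofourown.org"] },
  { name: "FanfictionNetScraper", hosts: ["fanfiction.net", "www.fanfiction.net"] },
];

// Returns the name of the scraper responsible for the URL, or null when the
// URL is malformed, not HTTPS, or not served by a known host.
function findScraper(rawUrl: string): string | null {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return null;
  }
  if (url.protocol !== "https:") return null;

  // Exact host comparison avoids look-alikes such as "archiveofourown.org.evil.example".
  const host = url.hostname.toLowerCase();
  const rule = SCRAPER_RULES.find((candidate) => candidate.hosts.includes(host));
  return rule ? rule.name : null;
}
```

Rejecting anything that does not match before Puppeteer is launched keeps the headless browser from being pointed at arbitrary or internal addresses.
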
## Directory Structure (pnpm Monorepo)

```
dreamchat/
├── apps/
│   ├── backend/
│   │   ├── src/
│   │   │   ├── app.module.ts
│   │   │   ├── main.ts
│   │   │   ├── config/
│   │   │   ├── common/
│   │   │   ├── modules/
│   │   │   │   ├── auth/
│   │   │   │   ├── users/
│   │   │   │   ├── characters/
│   │   │   │   ├── chat/
│   │   │   │   ├── import/
│   │   │   │   ├── story/
│   │   │   │   └── multi-character/
│   │   │   └── shared/
│   │   │       ├── services/
│   │   │       └── prisma/
│   │   ├── test/
│   │   ├── Dockerfile
│   │   └── package.json
│   │
│   └── frontend/
│       ├── src/
│       │   ├── main.tsx
│       │   ├── App.tsx
│       │   ├── api/
│       │   ├── components/
│       │   ├── pages/
│       │   ├── stores/
│       │   ├── hooks/
│       │   └── utils/
│       ├── public/
│       ├── Dockerfile
│       └── package.json
│
├── packages/
│   ├── shared/                 # Shared types & WebSocket definitions
│   │   ├── src/
│   │   │   ├── websocket/
│   │   │   │   ├── events.ts   # WebSocket event types
│   │   │   │   ├── messages.ts
│   │   │   │   └── index.ts
│   │   │   ├── api/
│   │   │   │   ├── dto.ts      # Shared DTOs
│   │   │   │   └── index.ts
│   │   │   └── index.ts
│   │   ├── package.json
│   │   └── tsconfig.json
│   │
│   └── config/                 # Shared configurations
│       ├── eslint/
│       └── typescript/
│
├── prisma/                     # Database schema (shared)
│   ├── schema.prisma
│   ├── migrations/
│   └── seed.ts
│
├── docker-compose.yml
├── pnpm-workspace.yaml
├── package.json                # Root package.json
├── .npmrc
├── .devcontainer/
└── doc/
```

### Package Management

```yaml
# pnpm-workspace.yaml
packages:
  - 'apps/*'
  - 'packages/*'
```

```bash
# Install all dependencies
pnpm install

# Add dependency to specific app
pnpm --filter @dreamchat/backend add @nestjs/jwt

# Add shared package to apps (quoted so the shell does not expand the "*")
pnpm --filter @dreamchat/backend add '@dreamchat/shared@workspace:*'
```

## Security Considerations

1. **Authentication**: JWT tokens with refresh strategy
2. **Authorization**: Role-based access control (RBAC)
3. **File Upload**:
   - Size limit: 50MB
   - Mime-type validation
   - Storage outside web root
4. **Web Scraping**:
   - URL whitelist (predefined scrapers only)
   - Rate limiting per domain
   - Content sanitization
5. **WebSocket**:
   - Auth token validation on connection
   - Message rate limiting per user
6. **Database**:
   - Prepared statements (Prisma)
   - Connection pooling
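
The per-user and per-domain rate limits listed above could be implemented with a sliding-window counter keyed by user ID or domain. The following is a minimal in-memory sketch; the `SlidingWindowLimiter` class and its limits are hypothetical, and a deployment with several backend instances would need shared storage instead of a local `Map`.

```typescript
// Hypothetical in-memory limiter: allows at most `limit` events per key
// within any window of `windowMs` milliseconds.
class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the event and returns true if the key is still under its limit;
  // returns false (without recording) when the limit has been reached.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((time) => time > cutoff);

    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }

    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

For WebSocket messages the key would be the authenticated user ID, checked before a message reaches `ChatService`; for scraping, the key would be the target domain.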