# CLAUDE.md

Extends: @~/.claude/CLAUDE.md (mandatory base instructions)

Everything in the base instructions MUST be followed strictly.

## Project Overview

`pangu.js` is a text spacing library that automatically inserts whitespace between CJK (Chinese, Japanese, Korean) characters and half-width characters (alphabetical letters, numerical digits, and symbols) for better readability.

- Language: TypeScript (migrated from JavaScript)
- Dependencies: Zero runtime dependencies
- Build targets:
  1. npm package - JavaScript library for Node.js and browser use (ESM/CommonJS/UMD)
  2. Chrome extension (Manifest V3) - Automatically adds spacing to web pages
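
To make the core idea in the overview concrete, here is a minimal sketch of inserting a space wherever a CJK character touches a half-width letter or digit. The `naiveSpacing` helper and the character ranges are assumptions chosen for illustration; the actual rules in `src/shared/index.ts` cover far more Unicode blocks and symbols.

```typescript
// Simplified illustration only: kana, CJK ideographs, and Hangul syllables.
// The real library handles many more ranges and symbol rules.
const CJK = '\\u3040-\\u30ff\\u3400-\\u4dbf\\u4e00-\\u9fff\\uac00-\\ud7af';

const CJK_THEN_ALNUM = new RegExp(`([${CJK}])([A-Za-z0-9])`, 'g');
const ALNUM_THEN_CJK = new RegExp(`([A-Za-z0-9])([${CJK}])`, 'g');

// Hypothetical helper, not the library's spacingText().
function naiveSpacing(text: string): string {
  return text
    .replace(CJK_THEN_ALNUM, '$1 $2') // CJK followed by a letter or digit
    .replace(ALNUM_THEN_CJK, '$1 $2'); // letter or digit followed by CJK
}
```

Text that is already spaced passes through unchanged, because neither pattern matches across an existing space.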

## Common Development Commands

### Building

```bash
npm run build              # Build all targets (library + extension)
npm run build:lib          # Build library (shared, browser, node)
npm run build:extension    # Build Chrome extension TypeScript files
npm run clean              # Clean all build artifacts
npm run watch              # Watch both library and extension files
npm run watch:lib          # Watch library files for changes
npm run watch:extension    # Watch extension files (uses nodemon)
```

### Testing

```bash
npm run test               # Run all tests (vitest + playwright)
npm run test:shared        # Test core/shared logic
npm run test:node          # Test Node.js-specific code
npm run test:browser       # Test browser code (uses Playwright)
```

### Linting

```bash
npm run lint               # Run ESLint on src/ and scripts/
npm run lint:fix           # Run ESLint with auto-fix
```

### Publishing & Packaging

```bash
# Version management
npm run publish-package 1.2.3   # Bump version, update docs, build, commit, and tag

# Extension packaging
npm run pack-extension          # Package browser extensions
npm run pack-extension:chrome   # Package Chrome extension only (.zip)
```

## Code Architecture

### Directory Structure

```
src/
├── shared/                    # Core text spacing logic (platform-agnostic)
│   └── index.ts               # Main Pangu class with regex patterns
├── browser/                   # Browser-specific implementation
│   ├── pangu.ts               # BrowserPangu class with DOM manipulation
│   ├── pangu.umd.ts           # UMD wrapper for browser builds
│   ├── dom-walker.ts          # DOM tree traversal utilities
│   ├── task-scheduler.ts      # Idle-time task scheduling
│   ├── visibility-detector.ts # CSS visibility detection
│   └── banner.txt             # Build banner text
└── node/                      # Node.js implementation
    ├── index.ts               # NodePangu class with file operations
    ├── index.cjs.ts           # CommonJS re-export wrapper
    └── cli.ts                 # Command-line interface
```

### Build Output Structure

```
dist/                           # Library builds
├── shared/                     # Core module
│   ├── index.js                # ESM module
│   └── index.cjs               # CommonJS module
├── browser/                    # Browser builds
│   ├── pangu.js                # ESM bundle
│   └── pangu.umd.js            # UMD bundle (window.pangu)
└── node/                       # Node.js builds
    ├── index.js                # ESM module
    ├── index.cjs               # CommonJS module
    ├── cli.js                  # CLI executable (ESM)
    └── cli.cjs                 # CLI executable (CommonJS)
```

### Core API

**All Platforms (Shared):**

- `spacingText(text)` - Process text strings (main method)
- `hasProperSpacing(text)` - Check if text already has proper spacing

**Node.js-specific (NodePangu):**

- `spacingFile(path)` - Process files asynchronously (returns Promise)
- `spacingFileSync(path)` - Process files synchronously

**Browser-specific (BrowserPangu):**

- `spacingPage()` - Process entire page (title + body)
- `spacingNode(node)` - Process specific DOM node and its descendants
- `autoSpacingPage(config?)` - Auto-spacing with MutationObserver
  - `config.pageDelayMs` - Delay before initial page spacing (default: 1000ms)
  - `config.nodeDelayMs` - Delay before spacing new nodes (default: 500ms)
  - `config.nodeMaxWaitMs` - Max wait time for node mutations (default: 2000ms)
- `stopAutoSpacingPage()` - Stop auto-spacing
- `isElementVisuallyHidden(element)` - Check if element is hidden by CSS
- `taskScheduler.config` - Task scheduling configuration (direct property access)
- `visibilityDetector.config` - Visibility detection configuration (direct property access)
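
The optional `config` argument of `autoSpacingPage` can be thought of as a partial override of the documented defaults. The sketch below shows one way such a merge might look; the `AutoSpacingPageConfig` type name and the `resolveAutoSpacingConfig` helper are hypothetical, and only the field names and default values come from the list above.

```typescript
// Field names and defaults follow the API list; the type name is an assumption.
interface AutoSpacingPageConfig {
  pageDelayMs: number;
  nodeDelayMs: number;
  nodeMaxWaitMs: number;
}

const DEFAULT_AUTO_SPACING_CONFIG: AutoSpacingPageConfig = {
  pageDelayMs: 1000,
  nodeDelayMs: 500,
  nodeMaxWaitMs: 2000,
};

// Hypothetical helper: caller-supplied fields win, missing fields fall back
// to the defaults. Note that an explicit `undefined` would also override.
function resolveAutoSpacingConfig(
  config: Partial<AutoSpacingPageConfig> = {},
): AutoSpacingPageConfig {
  return { ...DEFAULT_AUTO_SPACING_CONFIG, ...config };
}
```

With this shape, a call such as `resolveAutoSpacingConfig({ nodeDelayMs: 100 })` changes only the node delay and leaves the other two values at their defaults.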

### Build System

- **Build Tool**: Vite 6.x with TypeScript
- **TypeScript**: Configured with separate tsconfig files for browser/node
- **Output Formats**: ESM, CommonJS, and UMD
- **Source Maps**: Generated for all builds
- **Type Definitions**: Auto-generated .d.ts files with vite-plugin-dts
- **Watch Mode**: Concurrent watch for library and extension development

### Testing Strategy

- **Unit Tests**: Vitest 3.x for shared/node code
- **Browser Tests**: Playwright 1.53.x for cross-browser testing
- **Test Fixtures**: Located in `fixtures/`
- **Coverage**: 106 tests covering various Unicode blocks
- **Test Structure**: Separate test directories for shared, node, and browser code

### Chrome Extension

- **Manifest Version**: V3 (modern Chrome extension format)
- **Location**: `browser-extensions/chrome/`
- **Source**: TypeScript files in `browser-extensions/chrome/src/`
- **Build Output**: `browser-extensions/chrome/dist/`
- **Build Command**: `npm run build:extension`
- **Permissions**: Uses `activeTab` instead of broad `tabs` permission
- **Content Scripts**: Dynamically registered based on user settings
- **Match Patterns**: Uses Chrome's match pattern format for blacklist/whitelist
- **UI Framework**: Pure TypeScript

### Chrome Extension Architecture

- **Service Worker**: `service-worker.ts` - Handles background tasks and content script registration
- **Content Script**: `content-script.ts` - Injected into web pages for auto-spacing
- **Popup**: `popup.ts` - Extension popup UI
- **Options**: `options.ts` - Settings page
- **Utils**: `utils/` - Shared utilities including type definitions

#### Settings Structure

```typescript
interface Settings {
  spacing_mode: 'spacing_when_load' | 'spacing_when_click';
  spacing_rule: 'blacklist' | 'whitelist';
  blacklist: string[];
  whitelist: string[];
  is_mute_sound_effects: boolean;
}
```
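
Settings read back from extension storage arrive untyped, so a runtime check against this interface can be useful. The type guard below is a sketch under that assumption; `isSettings` is a hypothetical helper and not something the extension is known to define.

```typescript
interface Settings {
  spacing_mode: 'spacing_when_load' | 'spacing_when_click';
  spacing_rule: 'blacklist' | 'whitelist';
  blacklist: string[];
  whitelist: string[];
  is_mute_sound_effects: boolean;
}

// Hypothetical guard: narrows an unknown value to Settings at runtime.
function isSettings(value: unknown): value is Settings {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const isStringArray = (x: unknown): boolean =>
    Array.isArray(x) && x.every((item) => typeof item === 'string');
  return (
    (v.spacing_mode === 'spacing_when_load' || v.spacing_mode === 'spacing_when_click') &&
    (v.spacing_rule === 'blacklist' || v.spacing_rule === 'whitelist') &&
    isStringArray(v.blacklist) &&
    isStringArray(v.whitelist) &&
    typeof v.is_mute_sound_effects === 'boolean'
  );
}
```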

#### Idle Processing Configuration

```typescript
interface IdleSpacingConfig {
  enabled: boolean; // Default: true
  chunkSize: number; // Default: 40 (text nodes per cycle)
  timeout: number; // Default: 2000ms
}
```

#### Visibility Check Configuration

```typescript
interface VisibilityCheckConfig {
  enabled: boolean; // Default: true in VisibilityDetector, false in BrowserPangu
  commonHiddenPatterns: {
    clipRect: boolean; // clip: rect(1px, 1px, 1px, 1px)
    displayNone: boolean; // display: none
    visibilityHidden: boolean; // visibility: hidden
    opacityZero: boolean; // opacity: 0
    heightWidth1px: boolean; // height: 1px; width: 1px
  };
}
```
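
As an illustration of how these flags might be applied, the sketch below checks a set of computed style values against each enabled pattern. `StyleLike` and `isVisuallyHidden` are hypothetical names; a plain object stands in for `CSSStyleDeclaration` so the example has no DOM dependency, and the exact comparisons the library performs may differ.

```typescript
interface VisibilityCheckConfig {
  enabled: boolean;
  commonHiddenPatterns: {
    clipRect: boolean;
    displayNone: boolean;
    visibilityHidden: boolean;
    opacityZero: boolean;
    heightWidth1px: boolean;
  };
}

// Assumed subset of computed style values needed by the patterns above.
interface StyleLike {
  display: string;
  visibility: string;
  opacity: string;
  clip: string;
  height: string;
  width: string;
}

// Hypothetical check: returns true if any enabled pattern matches.
function isVisuallyHidden(style: StyleLike, config: VisibilityCheckConfig): boolean {
  if (!config.enabled) return false;
  const p = config.commonHiddenPatterns;
  if (p.displayNone && style.display === 'none') return true;
  if (p.visibilityHidden && style.visibility === 'hidden') return true;
  if (p.opacityZero && parseFloat(style.opacity) === 0) return true;
  // Normalize whitespace so "rect(1px, 1px, 1px, 1px)" and "rect(1px,1px,1px,1px)" both match.
  if (p.clipRect && style.clip.replace(/\s+/g, '') === 'rect(1px,1px,1px,1px)') return true;
  if (p.heightWidth1px && style.height === '1px' && style.width === '1px') return true;
  return false;
}
```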

#### Task Scheduler Configuration

```typescript
interface TaskSchedulerConfig {
  enabled: boolean; // Whether to use task scheduling
  chunkSize: number; // Number of tasks to process per chunk
  timeout: number; // Timeout between chunks in milliseconds
}
```
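
The chunked processing these options describe could be sketched as follows. `runInChunks` and the `Defer` callback type are hypothetical; the deferral function is injected so the example stays independent of the browser, where it might wrap `requestIdleCallback(callback, { timeout })`.

```typescript
interface TaskSchedulerConfig {
  enabled: boolean;
  chunkSize: number;
  timeout: number;
}

type Task = () => void;
// Assumed abstraction over requestIdleCallback or setTimeout.
type Defer = (callback: () => void, timeout: number) => void;

// Hypothetical scheduler: runs chunkSize tasks, then yields via defer().
function runInChunks(tasks: Task[], config: TaskSchedulerConfig, defer: Defer): void {
  if (!config.enabled) {
    // Scheduling disabled: run everything synchronously.
    for (const task of tasks) task();
    return;
  }
  const size = Math.max(1, config.chunkSize);
  let index = 0;
  const step = (): void => {
    const end = Math.min(index + size, tasks.length);
    for (const task of tasks.slice(index, end)) task();
    index = end;
    // Only schedule another step while work remains.
    if (index < tasks.length) defer(step, config.timeout);
  };
  step();
}
```

Injecting `defer` keeps the chunking logic testable without a DOM; the first chunk runs immediately and each later chunk runs when the deferral fires.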

## Development Guidelines

### Code Style

- Follow existing patterns in the codebase
- ESLint 9.x with unicorn/prefer-node-protocol enabled
- Prettier 3.x with @trivago/prettier-plugin-sort-imports
- Maintain zero runtime dependencies
- Keep regex patterns readable with comments
- Always use `node:` prefix for Node.js built-in modules

### Implementation Details

- Core spacing logic: `src/shared/index.ts`
- Core test cases: `tests/shared/index.test.ts`
- Browser DOM processing: Uses the TreeWalker API, roughly 5.5x faster than the previous XPath-based traversal
- Idle processing: Uses requestIdleCallback() for non-blocking operations
- Visibility detection: Detects CSS-hidden elements to avoid unnecessary spacing
- Task scheduling: `src/browser/task-scheduler.ts` - Manages task queue for async processing
- Visibility detector: `src/browser/visibility-detector.ts` - Checks element visibility

### Performance Optimizations v7

- **TreeWalker Migration**: Replaced XPath with TreeWalker API for ~5.5x performance gain
- **Idle Processing**: Heavy operations use requestIdleCallback() to prevent blocking
- **Visibility Checks**: Skip spacing for CSS-hidden elements (disabled by default)
- **Debounced MutationObserver**: Batches DOM mutations for efficient processing
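
For the debounced MutationObserver, one common way to combine a debounce delay with a maximum wait is to flush at whichever deadline comes first. The sketch below shows that arithmetic only; `flushDeadline` is a hypothetical helper, the defaults are borrowed from `nodeDelayMs` and `nodeMaxWaitMs`, and whether the library computes the deadline this way is an assumption.

```typescript
// Hypothetical helper: returns the timestamp (ms) at which a batch of
// mutations should be processed.
// - Each new mutation pushes the deadline to lastMutationAt + nodeDelayMs.
// - The deadline never exceeds firstMutationAt + nodeMaxWaitMs, so a steady
//   stream of mutations cannot postpone processing indefinitely.
function flushDeadline(
  firstMutationAt: number,
  lastMutationAt: number,
  nodeDelayMs = 500,
  nodeMaxWaitMs = 2000,
): number {
  return Math.min(lastMutationAt + nodeDelayMs, firstMutationAt + nodeMaxWaitMs);
}
```

For a batch whose first mutation arrives at 0 ms and whose latest arrives at 1800 ms, the debounce alone would wait until 2300 ms, but the maximum wait caps the flush at 2000 ms.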

### Paranoid Text Spacing Algorithm v7

**Core Features:**

- **Context-Aware Symbol Handling**:

  - Operators (`= + - * / < > & ^`): Always add spaces when CJK is present; `/` is subject to the dual-behavior rule below
  - Separators (`_ |`): Never add spaces regardless of context
  - Dual-behavior slash `/`: Single occurrence = operator (add spaces), multiple = file path separator (no spaces)

- **Smart Pattern Recognition**:

  - Preserves compound words: `state-of-the-art`, `GPT-5`, `claude-4-opus`
  - Handles programming terms correctly: `C++`, `A+`, `i++`, `D-`, `C#`, `F#`
  - Protects file paths: Unix (`/usr/bin`, `src/main.py`) and Windows (`C:\Users\`)
  - Special handling for grades: `A+` before CJK becomes `A+ ` not `A + `

- **Improved Punctuation**:

  - No longer converts half-width punctuation to full-width
  - Smart handling of quotes, brackets, and special characters
  - Preserves multiple consecutive punctuation marks

- **HTML Support**:

  - Processes text within HTML attributes while preserving tag structure
  - Protects HTML tags from being altered by spacing rules

- **Performance Enhancements**:
  - 5.5x faster with TreeWalker API replacing XPath
  - Non-blocking processing with requestIdleCallback()
  - CSS visibility detection to skip hidden elements
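
The dual-behavior slash rule can be sketched as a simple count: one slash next to CJK text is treated as an operator, while several slashes suggest a file path and are left alone. `spaceSingleSlash` is a hypothetical helper, counting over the whole string is a simplification, and the character ranges are assumptions; the actual algorithm is more context-aware.

```typescript
// Simplified CJK ranges for illustration only.
const CJK_CLASS = '\\u3040-\\u30ff\\u3400-\\u4dbf\\u4e00-\\u9fff\\uac00-\\ud7af';
const HAS_CJK = new RegExp(`[${CJK_CLASS}]`);

// Hypothetical helper: spaces a lone slash when CJK is present, and leaves
// strings with zero or multiple slashes (likely paths) untouched.
function spaceSingleSlash(text: string): string {
  const slashCount = text.split('/').length - 1;
  if (slashCount !== 1 || !HAS_CJK.test(text)) return text;
  // Collapse any existing whitespace around the slash into single spaces.
  return text.replace(/\s*\/\s*/, ' / ');
}
```

Under these assumptions, a path such as `src/main.py` is untouched because it contains no CJK, and `/usr/bin` is untouched because it contains more than one slash.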

## Future Improvements

See @.claude/TODO.md for planned improvements and technical debt.