# ShardingSphere AI Development Guide
This guide is written **for AI coding agents only**. Follow it literally; improvise only when the rules explicitly authorize it.
## Core Immutable Principles
1. **Quality First**: code quality and system security are non-negotiable.
2. **Think Before Action**: perform deep analysis and planning before coding.
3. **Tools First**: prioritize the proven best toolchain.
4. **Transparent Records**: keep every key decision and change traceable.
5. **Continuous Improvement**: learn from each execution and keep optimizing.
6. **Results Oriented**: judge success solely by whether the target is achieved.
7. **Coding Standards**: `CODE_OF_CONDUCT.md` is the binding “law”; use it as the first reference for rule interpretation and cite relevant lines. See Governance Basics for precedence and session review expectations.
## Quality Standards
### Engineering Principles
- **Architecture**: follow SOLID, DRY, separation of concerns, and YAGNI (build only what you need).
- **Code Quality**:
  - Use clear naming and reasonable abstractions.
  - Delete unused code; when changing functionality, remove legacy compatibility shims.
  - Keep variable declarations adjacent to first use; if a value must be retained, declare it `final` to satisfy Checkstyle VariableDeclarationUsageDistance (see the sketch after this list).
- **Complete Implementation**: no MVPs/placeholders/TODOs—deliver fully runnable solutions.
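A minimal sketch of the declaration rule, with all names hypothetical:

```java
public final class QualifiedNameUtils {
    
    // Declare values next to their first use; values that must be retained are final,
    // which keeps Checkstyle's VariableDeclarationUsageDistance check green.
    public static String buildQualifiedName(final String schema, final String table) {
        final String separator = ".";
        return schema + separator + table;
    }
}
```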
### Performance Standards
- **Algorithm Awareness**: account for time and space complexity.
- **Resource Management**: optimize memory usage and I/O behavior.
- **Boundary Handling**: cover exceptional situations and edge conditions.
### Testing Requirements
- **Test-Driven**: design for testability, ensure unit-test coverage, and keep background unit tests under 60s to avoid job stalls.
- **Quality Assurance**: run static checks, formatting, and code reviews.
- **Checkstyle Gate**: do not hand off code with Checkstyle/Spotless failures—run the relevant module check locally and fix before completion.
- **Continuous Verification**: rely on automated tests and integration validation.
- **Test Naming Simplicity**: keep test names concise and scenario-focused; describe the scenario directly and avoid "ReturnsXXX"-style, overly wordy, or AI-sounding names.
- **Coverage Discipline**: follow the dedicated coverage & branch checklist before coding when coverage targets are stated.
- **Dedicated and scoped tests**: each public production method must be covered by dedicated test methods; each test method covers only one scenario and invokes the target public method at most once (repeat only when the same scenario needs extra assertions), and different branches/inputs belong in separate test methods.
- **Parameterized test naming**: all parameterized tests must set an explicit `name` including the index (e.g., `"[{index}] foo={0}"`) so scenarios are distinguishable in reports (see the test sketch after this list).
- **Mocking Rule**: default to mocks; see Mocking & SPI Guidance for static/constructor mocking and spy avoidance details.
- **Reflection Rule**: when tests must touch fields or methods via reflection, use `Plugins.getMemberAccessor()`—direct reflection APIs are forbidden.
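The sketch below illustrates the naming, scoping, parameterized-name, and reflection rules together. The production class `ValueNormalizer` and its field are hypothetical; `Plugins.getMemberAccessor()` is Mockito's member accessor.

```java
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;

import java.lang.reflect.Field;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;
import org.mockito.internal.configuration.plugins.Plugins;

// Hypothetical production class under test.
final class ValueNormalizer {
    
    private String defaultValue = "";
    
    String normalize(final String input) {
        return null == input ? defaultValue : input.trim();
    }
}

class ValueNormalizerTest {
    
    // One scenario per test method, one invocation of the target public method, concise scenario-focused names.
    @Test
    void assertNormalizeWithNullInput() {
        assertThat(new ValueNormalizer().normalize(null), is(""));
    }
    
    @Test
    void assertNormalizeWithSurroundingWhitespace() {
        assertThat(new ValueNormalizer().normalize("  foo "), is("foo"));
    }
    
    // Parameterized tests set an explicit name with the index so report entries are distinguishable.
    @ParameterizedTest(name = "[{index}] input={0}")
    @ValueSource(strings = {"foo", "bar"})
    void assertNormalizeWithPlainInput(final String input) {
        assertThat(new ValueNormalizer().normalize(input), is(input));
    }
    
    // Reflection goes through Mockito's member accessor; direct reflection APIs are forbidden.
    @Test
    void assertNormalizeWithInjectedDefault() throws ReflectiveOperationException {
        ValueNormalizer normalizer = new ValueNormalizer();
        Field field = ValueNormalizer.class.getDeclaredField("defaultValue");
        Plugins.getMemberAccessor().set(field, normalizer, "N/A");
        assertThat(normalizer.normalize(null), is("N/A"));
    }
}
```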
## Tool Usage Guide
### Exa - Web Search
**Purpose**: fetch the latest web information, official links, or announcements.
**When to Trigger**
- Needing current events, announcements, or security advisories.
- Looking up official website entry points.
- Verifying external information sources.
**Key Parameters**
- Query keywords: at most 12.
- `web_search_exa`: keep result counts moderate.
### mcp-deepwiki
**Purpose**: deep semantic document retrieval, knowledge aggregation, and multi-source summarization.
**When to Trigger**
- Explaining technical concepts or contrasting standards.
- Describing algorithm principles.
- Integrating multiple official sources.
**Key Parameters**
- `topic`: technical topic or concept (e.g., "adaptive servo control").
- `depth`: 1-3 to control semantic layers.
## ⚠️ Dangerous Operation Confirmation Mechanism
### High-Risk Operation Checklist
Obtain explicit confirmation **before** doing any of the following:
- **File System**: deleting files/directories, bulk edits, or moving system files.
- **Code Submission**: `git commit`, `git push`, `git reset --hard`.
- **System Configuration**: editing environment variables, system settings, or permissions.
- **Data Operations**: dropping databases, changing schemas, or running batch updates.
- **Network Requests**: sending sensitive data or calling production APIs.
- **Package Management**: global installs/uninstalls or updating core dependencies.
### Confirmation Template
> ⚠️ Dangerous operation detected!
> Operation type: [specific action]
> Scope of impact: [affected area]
> Risk assessment: [potential consequence]
> Please confirm whether to continue. [Requires explicit “yes”, “confirm”, or “proceed”]
## Workflow
- Use Sequential Thinking when tasks need decomposition: 6-10 steps (fallback 3-5), one sentence each, actionable.
- Intake: choose the strategy for the task, confirm tool availability/fallbacks, and capture constraints (forbidden APIs, output format, coverage/test expectations).
- Plan: inspect existing code with tools before edits, finish the plan before coding, and set the quality/verification bar.
- Implement: keep scope minimal, follow quality standards, record decisions, and handle edge cases; honor instruction precedence from Core Principle #7.
- Validate: run the narrowest meaningful checks (see Verification & Commands) and prefer scoped runs; note any sandbox or limit blocks and alternatives.
- Report & self-check: share intent, edits, verification results, and next steps; ensure all required instructions, coverage, and mocking rules are satisfied, with remaining risks called out.
## Compliance Guardrails & Checklists
- **Pre-task checklist (do before planning/coding):** re-read AGENTS.md and `CODE_OF_CONDUCT.md`; restate user goal, constraints, forbidden tools/APIs, coverage expectations, sandbox/network/approval limits; prefer `rg`/`./mvnw`/`apply_patch`; avoid destructive commands (`git reset --hard`, `git checkout --`, bulk deletes) and generated paths like `target/`.
- **Risk gate:** if any action fits the Dangerous Operation Checklist, pause and use the confirmation template before proceeding.
- **Planning rules:** use Sequential Thinking with 3-10 actionable steps (no single-step plans) via the plan tool for non-trivial tasks; convert all hard requirements (SPI usage, mocking rules, coverage/test naming, forbidden APIs) into a checklist inside the plan and do not code until each item is addressed or explicitly waived.
- **Execution discipline:** inspect existing code before edits; keep changes minimal; default to mocks and SPI loaders; keep variable declarations near first use and mark retained values `final`; delete dead code and avoid placeholders/TODOs.
- **Post-task self-check (before replying):** confirm all instructions were honored; verify no placeholders/unused code; ensure Checkstyle/Spotless gates for touched modules are satisfied or explain why not run and what to run; list commands with exit codes; call out risks and follow-ups.
- **Final response template:** include intent/why, changed files with paths, rationale per file/section, commands run (with exit codes), verification status, and remaining risks/next actions (if tests skipped, state reason and the exact command to run).
## Coverage & Branch Checklist
- When coverage targets are declared (including 100%), list every branch/path with its planned test before coding.
- Map each branch to exactly one test; add cases until all declared branches are covered or explicitly waived (a worked branch list follows this checklist).
- For utilities with multiple return paths, record the branch list and update it if the code changes.
- Use Jacoco to confirm expectations when coverage is in question; document any unreachable code instead of adding redundant tests.
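For illustration, a recorded branch list for a hypothetical two-branch utility (all names invented):

```java
// Branch list for ShadowFlagResolver#resolve (recorded before coding; update it if the code changes):
//   1. hintValue present -> returns the hint value -> assertResolveWithHint
//   2. hintValue absent  -> returns the default    -> assertResolveWithoutHint
public final class ShadowFlagResolver {
    
    public boolean resolve(final Boolean hintValue, final boolean defaultValue) {
        return null == hintValue ? defaultValue : hintValue;
    }
}
```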
## Response Style
### Language and Tone
- **Friendly and Natural**: interact like a professional peer; avoid stiff formal language.
- **Use Light Accents**: prepend headings or bullets with emojis such as ✨⚠️ to highlight key points.
- **Hit the Point Fast**: start with a sentence that captures the core idea, especially for complex problems.
### Content Organization
- **Hierarchical**: separate sections with headings/subheadings; split long content into sections.
- **Focused Bullets**: break long paragraphs into short sentences or bullets, each covering a single idea.
- **Logical Flow**: use ordered lists for multi-step work (1. 2. 3.) and unordered lists for peers (- or *).
- **Proper Spacing**: keep blank lines or `---` between blocks to boost readability.
> ❌ Avoid complex tables in the terminal (especially for long, code-heavy, or narrative content).
### Visual & Layout Optimization
- **Keep It Simple**: limit each line length to ≤200 characters.
- **Leave White Space**: use blank lines wisely to avoid cramped output.
- **Align Consistently**: stick to one indentation and bullet style (use `-` instead of mixing symbols).
- **Emphasize Key Points**: highlight critical items with **bold** or *italics*.
### Technical Content Guidelines
#### Code & Data Presentation
- **Code Blocks**: wrap multi-line code/config/logs inside Markdown fences with language hints (e.g., a fenced Java block).
- **Focus on the Core**: trim unrelated snippets (like imports) to spotlight essential logic.
- **Diff Markers**: show changes with `+` / `-` for quick scanning.
- **Line Numbers**: add them when needed (e.g., debugging scenarios).
#### Structured Data
- **Prefer Lists**: use lists over tables in most cases.
- **Tables Sparingly**: only use Markdown tables when strict alignment is required (e.g., parameter comparisons).
### Interaction & User Experience
- **Immediate Feedback**: respond quickly; avoid long silent periods.
- **Visible Status**: surface progress for important actions (e.g., “Processing...”).
- **Friendly Errors**: clearly explain failures and suggest actionable fixes.
### ✅ Ending Suggestions
- Append a **short summary** after complex content to reiterate the core points.
- **Guide the Next Step**: close with actionable advice, instructions, or an invitation for follow-up questions.
### Brevity & Signal
- Prefer tables/bullets over prose walls; cite file paths (`kernel/src/...`) directly.
- Eliminate repeated wording; reference prior sections instead of restating.
- Default to ASCII; only mirror existing non-ASCII content when necessary.
## Governance Basics
- Follow the instruction order from Core Principle #7 and surface conflicts with rationale when they arise.
- Technical choices must satisfy ASF transparency: include license headers, document intent, and keep rationales visible to reviewers.
- Default to the smallest safe change: monthly feature trains plus weekly patch windows reward incremental fixes unless the product requires deeper refactors.
- Secure approvals for structural changes (new modules, configs, knobs); localized doc or code tweaks may land after self-review when you surface the evidence reviewers expect (tests, configs, reproduction steps).
- Maintain deterministic builds, measurable coverage, and clear rollback notes; avoid speculative work without a benefit statement.
## Platform Snapshot
- ShardingSphere layers sharding, encryption, traffic governance, and observability on top of existing databases.
- Module map summary:
- `infra`/`database`/`parser`/`kernel`/`mode`: shared infrastructure for SQL parsing, routing, and governance metadata.
- `jdbc`/`jdbc-dialect`/`proxy`: integration points for clients and protocols.
- `features`: sharding, read/write splitting, encryption, shadow, traffic control.
- `agent`: bytecode helpers; `examples`, `docs`, `distribution`: runnable demos and release assets.
- Layout standard: `src/main/java` + `src/test/java`; anything under `target/` is generated and must not be edited.
### Data Flow & Integration Map
1. **Client request** enters via `jdbc` or `proxy`.
2. **SQL parsing/rewriting** happens in `parser` plus `infra` dialect layers.
3. **Routing & planning** runs inside `kernel` using metadata from `database` and hints from `mode`.
4. **Feature hooks** (sharding/encryption/etc.) in `features` adjust routes or payloads.
5. **Executor/adapters** send work to physical databases and gather results.
6. **Observability & governance** loops feed metrics/traffic rules back through `mode`.
Use this flow to justify design changes and to debug regressions.
### Deployment Modes
- **Proxy cluster + registry** (ZooKeeper/Etcd): clients speak MySQL/PostgreSQL to `proxy`; governance state resides in `mode`.
- **JDBC embedded**: applications embed the `jdbc` driver with YAML/Spring configs describing sharding and encryption.
- **Hybrid**: compute happens in applications while governance/observability leverage `mode`.
Always state which topology, registry, and engine versions (e.g., MySQL 5.7 vs 8.0) your change targets.
## Design Playbook
- **Preferred styles:** elegant, minimal solutions that keep methods/tests lean, use guard clauses, and delete dead code immediately.
- **Patterns to lean on:** builders/factories from `infra`, SPI-driven extensions, immutable DTOs for plan descriptions, explicit strategy enums.
- **Anti-patterns:** duplicating parsing logic, bypassing metadata caches, silently accepting invalid configs, static singletons in shared modules, or overbuilt helpers.
- **Known pitfalls:** routing regressions when shadow rules are skipped, timezone drift from poor time-mocking, forgetting standalone vs cluster (`mode`) validation, missing ASF headers, Mockito inline mocks breaking on JDKs that block self-attach.
- **Success recipe:** explain why the change exists, cite the affected data-flow step, keep public APIs backward compatible, and record defaults/knobs alongside code changes.
## Verification & Commands
- Core commands:
  - `./mvnw clean install -B -T1C -Pcheck`: full build.
  - `./mvnw test -pl <module> [-am]`: scoped unit tests.
  - `./mvnw spotless:apply -Pcheck [-pl <module>]`: formatting.
  - `./mvnw -pl <module> -DskipITs -Dspotless.skip=true -Dtest=ClassName test`: fast verification of a single class.
  - `./mvnw -pl proxy -am -DskipTests package`: proxy packaging and perf smoke.
- Coverage: when tests change or targets demand it, run `./mvnw test jacoco:check@jacoco-check -Pcoverage-check` or scoped `-pl <module> -am -Djacoco.skip=false test jacoco:report`; pair with the Coverage & Branch Checklist.
- Style: `./mvnw checkstyle:check -Pcheck` (scoped with `-pl <module> -am -Pcheck` when possible) unless told otherwise.
- Scoped defaults: prefer module-scoped runs over whole-repo builds; include `-Dsurefire.failIfNoSpecifiedTests=false` when targeting specific tests.
- Testing ground rules: JUnit 5 + Mockito, `ClassNameTest` naming, Arrange–Act–Assert, mock external systems/time/network, reset static caches, and reuse swappers/helpers for complex configs (a minimal sketch follows this list).
- API bans: if a user forbids a tool/assertion, add it to the plan, avoid it during implementation, and cite verification searches (e.g., `rg assertEquals`) in the final report.
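A compact Arrange–Act–Assert sketch under these rules; `MetaDataLoader` and `TableSummaryService` are hypothetical stand-ins for a real collaborator and class under test:

```java
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Arrays;
import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical collaborator representing an external system that must be mocked.
interface MetaDataLoader {
    
    List<String> loadTableNames(String databaseName);
}

// Hypothetical class under test.
final class TableSummaryService {
    
    private final MetaDataLoader loader;
    
    TableSummaryService(final MetaDataLoader loader) {
        this.loader = loader;
    }
    
    String summarize(final String databaseName) {
        return databaseName + ": " + loader.loadTableNames(databaseName).size() + " tables";
    }
}

class TableSummaryServiceTest {
    
    @Test
    void assertSummarizeWithTwoTables() {
        // Arrange: mock the external metadata source instead of building a real one.
        MetaDataLoader loader = mock(MetaDataLoader.class);
        when(loader.loadTableNames("foo_db")).thenReturn(Arrays.asList("t_order", "t_item"));
        // Act: invoke the target public method exactly once.
        String actual = new TableSummaryService(loader).summarize("foo_db");
        // Assert.
        assertThat(actual, is("foo_db: 2 tables"));
    }
}
```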
## Run & Triage Quick Sheet
- **Proxy quick start:** `./mvnw -pl proxy -am package` then `shardingsphere-proxy/bin/start.sh -c conf/server.yaml`; report command, exit code, config path, and protocol.
- **JDBC smoke:** `./mvnw -pl jdbc -am test -Dtest=YourTest` with datasource configs from `examples`; note test name, datasource setup, and failure logs.
- **Config validation:** update standalone `server.yaml` and cluster `mode/` configs together; call out defaults and any edits that affect both.
- **Failure triage:** collect `proxy/logs/` plus `target/surefire-reports`, quote the relevant log lines, map them to the data-flow step, and propose the next diagnostic.
- **Routing mistakes:** check feature-rule configs, metadata freshness, and parser dialect; include SQL + config snippet plus impacted module (`features` or `kernel`), and add/plan targeted tests.
- **Proxy won’t start:** verify configs/mode/ports and reuse known-good example configs; share the log snippet and files inspected without editing generated artifacts.
- **Sandbox/network block:** if a command is denied, state what you ran, why it failed, and the approval or alternative plan required.
- **Single-module tests:** prefer scoped commands over repo-wide runs; use fully-qualified test class names and append `-Dsurefire.failIfNoSpecifiedTests=false` when targeting specific tests (see Verification & Commands for flags).
## Compatibility, Performance & External Systems
- Specify targeted engines and dialect files (MySQL 5.7/8.0, PostgreSQL 13+, openGauss, etc.) and guarantee backward-compatible behavior.
- Call out registry/config-center expectations (ZooKeeper, Etcd, Consul) and note risks when governance logic changes.
- Mention observability or agent impacts (metrics exporters, tracing hooks) whenever touched.
- Capture performance guardrails—baseline vs new latency/QPS, CPU, memory observations—when runtime paths change.
## Collaboration Patterns & Rituals
- **Response focus:**
  - Change requests: list goal, constraints, suspected files, and validation plan.
  - Code reviews: highlight issues, risks, and suggested fixes.
  - Status updates: summarize completed work, verification, and risks/TODOs.
  - Failures/blockers: share the attempted command, why it failed, and what you need next.
- **Anti-patterns:** do not accept vague orders (“optimize stuff”); always clarify module + expected result.
- **Hand-off checklist:** intent, edited files, rationale, executed commands (with exit codes), open risks, related issues/PRs.
- **Release & rollback:**
  1. Restate why the release needs the change and cite the affected modules/configs.
  2. After implementation, capture tests/perf smokes and document Spotless/Jacoco/static results plus translation/backport tasks.
  3. Outline rollback steps (disable knob, revert YAML/module) and how to confirm success.
  4. Prep release-note snippets or anchor updates for reviewers.
- **Escalation etiquette:** commit messages use `module: intent`, reviews prioritize regression risks, approval requests follow “Command → Purpose → Sandbox limitation → Expected output,” and halt for guidance whenever sandbox limits conflict with `CODE_OF_CONDUCT.md`.
## Prompt & Reference Snapshot
- **Parser / dialect:** include target database version, sample SQL, expected AST deltas, and downstream modules relying on the result.
- **Kernel routing & features:** describe metadata shape (tables, binding, sharding/encryption rules), knob values, and which `features` hook processes the change.
- **Proxy runtime & governance:** state startup command, `conf/server.yaml`, registry/mode config, relevant logs, and client protocol.
- **Observability / agent:** mention metrics or tracing plugins touched plus expected signal format and dashboards affected.
- **Docs / config updates:** specify audience (user/admin), file path, translation implications, and any screenshots/assets added.
- **Process & releases:** cite commit/PR intent, release-train context, maturity level, and rollback plan.
- **Compliance & conduct:** flag license-header risk, third-party code usage, and inclusive language considerations.
## Mocking & SPI Guidance
- Favor Mockito over bespoke fixtures; only add new fixture classes when mocks cannot express the scenario.
- Use marker interfaces when distinct rule/attribute types are needed; reuse SPI types such as `ShardingSphereRule` where possible.
- Name tests after the production method under test; never probe private helpers directly—document unreachable branches instead.
- Mock heavy dependencies (database/cache/registry/network) and prefer mocking over building deep object graphs.
- For static or constructor mocking, use `@ExtendWith(AutoMockExtension.class)` with `@StaticMockSettings` (or the extension’s constructor-mocking support). When a class is listed in `@StaticMockSettings`, do not call `mockStatic`/`mockConstruction` directly; stub via `when(...)` instead. Only if the extension cannot be used, and the reason is documented in the plan, may you fall back to `mockStatic`/`mockConstruction` wrapped in try-with-resources (see the sketch at the end of this section).
- Before coding tests, follow the Coverage & Branch Checklist to map inputs/branches to planned assertions.
- When a component is available via SPI (e.g., `TypedSPILoader`, `DatabaseTypedSPILoader`, `PushDownMetaDataRefresher`), obtain the instance through SPI by default; note any exceptions in the plan.
- If the class under test implements `TypedSPI` or `DatabaseTypedSPI`, instantiate it via `TypedSPILoader` or `DatabaseTypedSPILoader` instead of calling `new` directly.
- Do not mix Mockito matchers with raw arguments; choose a single style per invocation, and ensure the Mockito extension aligns with the mocking approach.
- Compliance is mandatory: before any coding, re-read AGENTS.md and convert all hard requirements (SPI usage, no FQCN, mocking rules, coverage targets, planning steps) into a checklist in the plan; do not proceed or report completion until every item is satisfied or explicitly waived by the user.
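To close, a sketch combining the static-mocking and SPI rules. `AutoMockExtension`, `@StaticMockSettings`, and `TypedSPILoader` are the utilities named above, but the package paths, the `ExampleSPI` type, and the `FIXTURE` key are assumptions for illustration:

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.shardingsphere.infra.spi.type.typed.TypedSPI;
import org.apache.shardingsphere.infra.spi.type.typed.TypedSPILoader;
import org.apache.shardingsphere.test.mock.AutoMockExtension;
import org.apache.shardingsphere.test.mock.StaticMockSettings;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// Hypothetical SPI type for illustration.
interface ExampleSPI extends TypedSPI {
}

// Listing TypedSPILoader in @StaticMockSettings activates static mocking for the whole test class;
// stub with when(...) and never call mockStatic/mockConstruction directly here.
@ExtendWith(AutoMockExtension.class)
@StaticMockSettings(TypedSPILoader.class)
class ExampleServiceTest {
    
    @Test
    void assertResolveWithMockedSPI() {
        ExampleSPI expected = mock(ExampleSPI.class);
        when(TypedSPILoader.getService(ExampleSPI.class, "FIXTURE")).thenReturn(expected);
        // ... exercise the code under test that resolves ExampleSPI through the loader ...
    }
}
```

When the class under test itself implements `TypedSPI`, skip the mock and load the real instance with `TypedSPILoader.getService(...)` instead of calling `new`.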
## Session Notes
- MySQLSchemataQueryExecutorFactoryTest: test each public method separately; `accept` and `newInstance` each get their own dedicated test method.