## Why?
Fory has lacked a schema-first Interface Definition Language (IDL) for
cross-language serialization.
Currently, teams must hand-write type registration code and
cross-language models. This is fine, and even the preferred option, when
the system is built in a single language, since we can **serialize domain
objects directly.**
With multiple languages, however, it is error-prone: **type system
inconsistencies** make it difficult to guarantee consistent
schemas, type IDs, and reference-tracking behavior across languages.
Users who want to migrate from Protocol Buffers to Fory for better
performance, reference tracking, or polymorphism support have had to
manually define structs in every language.
This PR addresses this gap by introducing FDL (Fory Definition Language),
a native schema IDL specifically designed for Fory's cross-language
serialization capabilities.
**Key motivations:**
- Enable schema-first development workflow for Fory
- Provide deterministic, cross-language type IDs and registration rules
- Generate native code with minimal/no runtime overhead
- Ensure consistent reference-tracking and polymorphism behavior across
languages
- Simplify migration from Protocol Buffers by supporting similar syntax
patterns
## What does this PR do?
### 1. FDL (Fory Definition Language) Specification
- Defines file structure: `package`, `import`, `enum`, `message`
- Supports field modifiers: `optional`, `ref`, `repeated`
- Supports collection types: `repeated` (list), `map<K,V>`
- Supports nested types (message within message, enum within message)
- Supports `reserved` names/IDs for backward compatibility
- Supports file/type/field `options` (protobuf-style and bracket-style)
### 2. Compiler Frontend (Python-based)
- **Hand-written Lexer** (`fory_compiler/parser/lexer.py`): Tokenizes
FDL source files
- **Recursive-Descent Parser** (`fory_compiler/parser/parser.py`):
Parses tokens into AST
- **AST/IR Definitions** (`fory_compiler/parser/ast.py`): Schema,
Message, Enum, Field, Import, and type representations
- **Schema Validation**: Detects duplicate names, duplicate IDs, unknown
type references, duplicate field numbers
- **Import Resolution**: Supports relative and search-path based imports
- **Circular Import Detection**: Prevents infinite recursion in imports
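
The validation and circular-import checks listed above can be sketched as follows. This is a minimal illustration only, not the code in `fory_compiler`: the `Field` and `Message` classes, `validate`, and `find_import_cycle` are hypothetical names, and the import graph is assumed to be a plain mapping from file path to the paths it imports.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Hypothetical, simplified stand-ins for the real AST nodes.
@dataclass
class Field:
    name: str
    number: int


@dataclass
class Message:
    name: str
    type_id: int
    fields: list[Field] = field(default_factory=list)


def validate(messages: list[Message]) -> list[str]:
    """Collect errors for duplicate names, type IDs, and field numbers."""
    errors: list[str] = []
    seen_names: set[str] = set()
    seen_ids: dict[int, str] = {}
    for msg in messages:
        if msg.name in seen_names:
            errors.append(f"duplicate type name: {msg.name}")
        seen_names.add(msg.name)
        if msg.type_id in seen_ids:
            owner = seen_ids[msg.type_id]
            errors.append(f"type id {msg.type_id} used by {owner} and {msg.name}")
        else:
            seen_ids[msg.type_id] = msg.name
        seen_numbers: set[int] = set()
        for f in msg.fields:
            if f.number in seen_numbers:
                errors.append(f"{msg.name}: duplicate field number {f.number}")
            seen_numbers.add(f.number)
    return errors


def find_import_cycle(imports: dict[str, list[str]], root: str) -> list[str] | None:
    """Depth-first search over the import graph; return the first cycle, else None."""
    stack: list[str] = []   # files on the current import chain
    done: set[str] = set()  # files already fully explored

    def visit(path: str) -> list[str] | None:
        if path in stack:
            # Re-entering a file on the current chain means a cycle.
            return stack[stack.index(path):] + [path]
        if path in done:
            return None
        stack.append(path)
        for dep in imports.get(path, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        stack.pop()
        done.add(path)
        return None

    return visit(root)
```

Collecting every error before reporting, rather than failing on the first one, lets a schema author fix several problems in one pass; whether the actual compiler does this is not stated in this PR.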
### 3. Multi-Language Code Generation
| Language | Output | Features |
|----------|--------|----------|
| **Java** | POJOs with Fory annotations | Getters/setters, equals/hashCode, registration helper class, nested type support |
| **Python** | Dataclasses with type hints | Native type mappings, registration function, nested type flattening |
| **Go** | Structs with struct tags | Fory struct tags, registration function, nested type flattening |
| **Rust** | Structs with derive macros | `#[derive(Fory, ...)]`, registration function, nested type flattening |
| **C++** | Structs with FORY macros | `FORY_STRUCT`, `FORY_FIELD_INFO`, registration helper, nested type support |
### 4. Compiler CLI & Build Integration
- **CLI command**: `fory compile` with comprehensive options
- **Language-specific output directories**: `--java_out`,
`--python_out`, `--go_out`, `--rust_out`, `--cpp_out`
- **Include paths**: `-I/--proto_path` for import resolution
- **Package installable**: `pip install -e .` with `fory` entrypoint
### 5. C++ Improvements
- Added `FORY_PP_IS_EMPTY`, `FORY_PP_HAS_ARGS` macros for empty struct
support
- Added `FORY_STRUCT_0` variant to support structs with no fields
- Fixed `FORY_STRUCT` macro to detect empty argument lists
### 6. Integration Tests
- Full cross-language roundtrip tests in `integration_tests/idl_tests/`
- Tests cover Java ↔ Python ↔ Go ↔ Rust ↔ C++ serialization
compatibility
- CI workflow integration for all language combinations
### 7. Comprehensive Documentation
- FDL Overview - Introduction and quick start
- FDL Syntax Reference - Complete language syntax
- Type System - Primitive types, collections, and mappings
- Compiler Guide - CLI usage and build integration
- Generated Code - Output format for each language
- Protocol Buffers vs FDL - Feature comparison and migration guide
### Example FDL Schema
```protobuf
package addressbook;

message Person [id=100] {
    string name = 1;
    int32 id = 2;
    string email = 3;
    repeated string tags = 4;
    map<string, int32> scores = 5;

    enum PhoneType [id=101] {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
    }

    message PhoneNumber [id=102] {
        string number = 1;
        PhoneType phone_type = 2;
    }

    repeated PhoneNumber phones = 6;
}

message AddressBook [id=103] {
    repeated Person people = 1;
    map<string, Person> people_by_name = 2;
}
```
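
To give a feel for the data model this schema describes, here is a hand-written Python sketch that mirrors it with plain dataclasses. It is not the actual generator output: the flattened class names, the default values, and the mapping of `int32` to `int` are assumptions, and Fory registration is omitted entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


# Nested types are written here as top-level classes; the naming the
# real generator uses for flattened types may differ (assumption).
class PhoneType(IntEnum):
    MOBILE = 0
    HOME = 1
    WORK = 2


@dataclass
class PhoneNumber:
    number: str = ""
    phone_type: PhoneType = PhoneType.MOBILE


@dataclass
class Person:
    name: str = ""
    id: int = 0
    email: str = ""
    tags: list[str] = field(default_factory=list)         # repeated string
    scores: dict[str, int] = field(default_factory=dict)  # map<string, int32>
    phones: list[PhoneNumber] = field(default_factory=list)


@dataclass
class AddressBook:
    people: list[Person] = field(default_factory=list)
    people_by_name: dict[str, Person] = field(default_factory=dict)
```

A value can then be built directly, for example `AddressBook(people=[Person(name="Alice", id=1)])`.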
## Related issues
Closes #3163 #3164 #3165 #3167 #3168 #3169 #3173 #3174 #3175 #3176 #3177
#1197
#1945
#3099
## Does this PR introduce any user-facing change?
Yes, this PR introduces a new FDL compiler tool that can be installed
via pip:
```bash
cd compiler
pip install -e .
fory compile --lang java,python,go,rust,cpp schema.fdl -o output/
```
- [x] Does this PR introduce any public API change?
  - New `fory compile` CLI command
  - New FDL schema language specification
  - Generated code uses existing Fory APIs (annotations, macros, derive
    macros)
- [ ] Does this PR introduce any binary protocol compatibility change?
  - No changes to the serialization protocol
  - Generated code uses standard Fory serialization
## Benchmark
N/A - This PR focuses on code generation tooling. The generated code
uses existing Fory serialization APIs, which have been benchmarked
separately. The compiler itself is a build-time tool and does not affect
runtime serialization performance.