Transformer.Prototype - A better translator architecture #1041

Open
konard wants to merge 3 commits into master from issue-609-145503fd

konard commented Oct 26, 2025

Owner

Summary

This PR implements a prototype of a better translator architecture as requested in issue #609.

Problem

The current translator architecture (RegularExpressions.Transformer) has limitations:

  • Translation unit: Single file
  • Parsing: Regular expressions
  • Algorithm: Markov algorithm (Turing-complete regex substitution)
  • Intermediate storage: File system
  • Context: Limited to single file

Solution

This prototype demonstrates a new architecture with significant improvements:

| Feature | Old Architecture | New Architecture |
| --- | --- | --- |
| Translation Unit | Single file | Entire repository/project |
| Parsing Method | Regular expressions | Roslyn Compiler API (full AST) |
| Transformation Rules | Regex pattern substitution | C# functions on graph |
| Intermediate Storage | File system | Doublets graph database |
| Context | Single file | Entire project with semantics |
| Algorithm | Markov (regex-based) | Function-based |

Architecture

Core Components

  1. IRepositoryTransformer - Main orchestrator

    • Parses entire repository into AST
    • Converts AST to Doublets representation
    • Applies transformation functions
    • Generates target code
  2. IAstToDoubletsConverter - AST → Graph conversion

    • Uses Roslyn Compiler API for parsing
    • Stores AST structure in Doublets
    • Preserves semantic information
  3. ITransformationFunction - Function-based transformations

    • Replaces regex substitution rules
    • Can query full graph structure
    • Access to semantic model
    • Type-safe transformations
  4. IDoubletsToCodeGenerator - Graph → Code generation

    • Generates code from transformed graph
    • Language-specific generators
    • Preserves formatting and style

Transformation Pipeline

Source Repository
       ↓
   [Parse with Roslyn]
       ↓
    AST Trees
       ↓
  [Convert to Doublets]
       ↓
  Doublets Graph (intermediate storage)
       ↓
  [Apply transformation functions]
       ↓
  Transformed Graph
       ↓
  [Generate code]
       ↓
Target Repository
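
The four stages above can be sketched end to end with toy stand-ins. This is a minimal illustration, not the prototype's implementation: only the `ITransformationFunction<ulong>` shape (`CanTransform`/`Transform`) comes from this PR description. `TokenStore`, `RenameToken`, and `RepositoryTransformer` are hypothetical names, whitespace-separated tokens stand in for Roslyn syntax nodes, and a dictionary stands in for the Doublets graph.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy run of the four stages (parse, convert, transform, generate).
var store = new TokenStore();
var transformer = new RepositoryTransformer(
    store,
    new ITransformationFunction<ulong>[] { new RenameToken(store, "string", "std::string") });

var output = transformer.Transform(new Dictionary<string, string>
{
    ["A.cs"] = "string name ;",
});
Console.WriteLine(output["A.cs"]); // std::string name ;

// Shape taken from the PR description: a rule that works on link ids.
public interface ITransformationFunction<TLink>
{
    bool CanTransform(TLink link);
    TLink Transform(TLink link);
}

// Hypothetical stand-in for the graph: one id per distinct token text.
public sealed class TokenStore
{
    private readonly Dictionary<string, ulong> _ids = new();
    private readonly Dictionary<ulong, string> _texts = new();

    public ulong GetOrCreate(string text)
    {
        if (!_ids.TryGetValue(text, out var id))
        {
            id = (ulong)_ids.Count + 1;
            _ids[text] = id;
            _texts[id] = text;
        }
        return id;
    }

    public string TextOf(ulong id) => _texts[id];
}

// Hypothetical rule: replaces one token with another.
public sealed class RenameToken : ITransformationFunction<ulong>
{
    private readonly TokenStore _store;
    private readonly string _oldText, _newText;

    public RenameToken(TokenStore store, string oldText, string newText) =>
        (_store, _oldText, _newText) = (store, oldText, newText);

    public bool CanTransform(ulong link) => _store.TextOf(link) == _oldText;
    public ulong Transform(ulong link) => _store.GetOrCreate(_newText);
}

// Hypothetical orchestrator that runs every source file through the stages.
public sealed class RepositoryTransformer
{
    private readonly TokenStore _store;
    private readonly IReadOnlyList<ITransformationFunction<ulong>> _functions;

    public RepositoryTransformer(TokenStore store, IReadOnlyList<ITransformationFunction<ulong>> functions) =>
        (_store, _functions) = (store, functions);

    public Dictionary<string, string> Transform(IReadOnlyDictionary<string, string> sources)
    {
        var result = new Dictionary<string, string>();
        foreach (var (path, text) in sources)
        {
            // Stages 1-2: "parse" the text and convert each token to a link id.
            var links = text.Split(' ', StringSplitOptions.RemoveEmptyEntries)
                            .Select(token => _store.GetOrCreate(token))
                            .ToList();
            // Stage 3: apply each function to every link it accepts.
            foreach (var function in _functions)
                links = links.Select(l => function.CanTransform(l) ? function.Transform(l) : l).ToList();
            // Stage 4: generate target text from the transformed links.
            result[path] = string.Join(" ", links.Select(l => _store.TextOf(l)));
        }
        return result;
    }
}
```

Keeping each stage behind its own type means a rule only ever sees link ids, so the parser or the storage backend could be swapped without touching the rules.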

Advantages

1. Repository-Level Context

  • Access to entire codebase structure
  • Can analyze cross-file dependencies
  • Can preserve relationships between classes, interfaces, namespaces

2. Semantic Understanding

  • Full AST with type information
  • Can query semantic model for type resolution
  • Better handling of language-specific constructs
  • Distinguishes between types, variables, comments
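
To illustrate the last point, here is a small sketch that counts occurrences of `string` in a snippet twice: once with the word-boundary regex used by the old rules, and once by walking a Roslyn syntax tree. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced, uses only the syntax tree (no semantic model), and is not taken from the prototype's code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Requires the Microsoft.CodeAnalysis.CSharp NuGet package.
const string source = @"
class C
{
    // string in a comment
    string name = ""string"";
}";

// A word-boundary regex cannot tell the comment, the type, and the literal apart.
int regexHits = Regex.Matches(source, @"\bstring\b").Count;

// The syntax tree exposes only the occurrence that is a type:
// the comment is trivia and the literal is a string-literal token.
SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();
int typeHits = root.DescendantNodes()
    .OfType<PredefinedTypeSyntax>()
    .Count(t => t.Keyword.IsKind(SyntaxKind.StringKeyword));

Console.WriteLine($"regex: {regexHits}, syntax tree: {typeHits}"); // regex: 3, syntax tree: 1
```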

3. Flexible Transformations

Instead of:

new SubstitutionRule(@"\bstring\b", "std::string")

We have:

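// IsTypeNode, GetTypeName and CreateTypeNode are graph helpers not shown here.
public class StringTypeTransformation : ITransformationFunction<ulong>
{
    // Accept only links that represent a type node named "string".
    public bool CanTransform(ulong link)
    {
        return IsTypeNode(link) && GetTypeName(link) == "string";
    }

    // Replace the accepted node with the C++ type.
    public ulong Transform(ulong link)
    {
        return CreateTypeNode("std::string");
    }
}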

Benefits:

  • Context-aware (knows if it's actually a type)
  • Can handle qualified names (System.String)
  • Won't match string in comments
  • Access to full graph for complex logic
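
One way the qualified-name case could be handled is to resolve each type to a symbol instead of comparing its spelling. The sketch below does this with Roslyn's semantic model; it assumes the Microsoft.CodeAnalysis.CSharp NuGet package and a runtime where `typeof(object).Assembly.Location` points to the core library. It is an illustration, not the prototype's `CanTransform` logic.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Requires the Microsoft.CodeAnalysis.CSharp NuGet package.
const string source = @"
using System;
class C
{
    string a;
    String b;
    System.String c;
    int d;
}";

var tree = CSharpSyntaxTree.ParseText(source);
var compilation = CSharpCompilation.Create(
    "Demo",
    new[] { tree },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
var model = compilation.GetSemanticModel(tree);

// Keep the fields whose declared type resolves to System.String,
// however that type is spelled in the source.
var stringFields = tree.GetRoot()
    .DescendantNodes()
    .OfType<VariableDeclarationSyntax>()
    .Where(d => model.GetTypeInfo(d.Type).Type?.SpecialType == SpecialType.System_String)
    .Select(d => d.Variables[0].Identifier.Text)
    .ToList();

Console.WriteLine(string.Join(", ", stringFields));
```

The intent is that `a`, `b`, and `c` are all reported while `d` is not, which a single `\bstring\b` rule cannot achieve without extra rules for `String` and `System.String`.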

4. Graph Storage

  • All code structure in Doublets
  • Can apply multiple transformation passes
  • Can inspect and debug intermediate state
  • Enables advanced graph queries
  • Persistent storage for incremental transformations
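
As a rough picture of what storing structure as doublets means, the sketch below keeps links as (source, target) pairs addressed by `ulong` ids. `DoubletStore` is a hypothetical in-memory stand-in written for this example; it is not the Platform.Data.Doublets API, and the way a type node is encoded here is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in, NOT the Platform.Data.Doublets API.
var links = new DoubletStore();
ulong typeMarker = links.CreatePoint();   // assumed marker meaning "type node"
ulong stringName = links.CreatePoint();   // assumed node standing for the name "string"
ulong stringType = links.GetOrCreate(typeMarker, stringName);

// Asking again for the same pair returns the existing link, so passes share nodes.
Console.WriteLine(links.GetOrCreate(typeMarker, stringName) == stringType); // True

// The intermediate state can be dumped between passes for debugging.
foreach (var (id, source, target) in links.All())
    Console.WriteLine($"{id}: {source} -> {target}");

public sealed class DoubletStore
{
    private readonly List<(ulong Source, ulong Target)> _links = new();
    private readonly Dictionary<(ulong, ulong), ulong> _index = new();

    // In this sketch a point is a link whose source and target are itself.
    public ulong CreatePoint()
    {
        ulong id = (ulong)_links.Count + 1;
        _links.Add((id, id));
        _index[(id, id)] = id;
        return id;
    }

    // Returns the id of the (source, target) link, creating it if needed.
    public ulong GetOrCreate(ulong source, ulong target)
    {
        if (_index.TryGetValue((source, target), out var existing))
            return existing;
        ulong id = (ulong)_links.Count + 1;
        _links.Add((source, target));
        _index[(source, target)] = id;
        return id;
    }

    // Enumerates every stored link with its 1-based id.
    public IEnumerable<(ulong Id, ulong Source, ulong Target)> All()
    {
        for (int i = 0; i < _links.Count; i++)
            yield return ((ulong)i + 1, _links[i].Source, _links[i].Target);
    }
}
```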

Implementation Status

This is a prototype demonstrating the architecture. It includes:

✅ Core interfaces defining the architecture
✅ Simplified implementation showing the workflow
✅ Example demonstrating usage
✅ Comprehensive documentation
✅ Builds successfully

For production use, the following would still be needed:

⏳ Complete Doublets API integration
⏳ Language-specific code generators (C++, Java, Python, etc.)
⏳ Comprehensive transformation library matching CSharpToCpp
⏳ Full semantic analysis support
⏳ Performance optimization
⏳ Incremental transformation support

Files Added

  • Platform/Platform.Transformer.Prototype/ - Main prototype implementation

    • Core interfaces: IRepositoryTransformer, IAstToDoubletsConverter, ITransformationFunction, IDoubletsToCodeGenerator
    • Simplified implementations demonstrating the architecture
    • Comprehensive README with architecture documentation
  • examples/transformer-prototype/ - Example usage

    • Demo program showing how to use the new architecture
    • Example transformation function

Comparison with Existing Transformers

The existing RegularExpressions.Transformer.CSharpToCpp has ~100 regex rules that transform C# to C++.

Limitations of regex approach:

  • Can't distinguish context (type vs variable vs comment)
  • No semantic understanding
  • Single-file scope
  • Fragile to code variations
  • Hard to maintain complex transformations

New architecture solves these:

  • Full AST with semantic information
  • Repository-wide context
  • Type-safe transformations
  • Easier to test and maintain
  • Can handle complex multi-step transformations

Testing

The prototype includes a demo program that can be run to see the architecture in action:

cd examples/transformer-prototype
dotnet run

Documentation

See Platform/Platform.Transformer.Prototype/README.md for detailed architecture documentation, examples, and future work.

Related Issues

Fixes #609


🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Adding CLAUDE.md with task information for AI processing.
This file will be removed when the task is complete.

Issue: undefined
konard self-assigned this on Oct 26, 2025
Implements a prototype of the new translator architecture requested in issue #609.

Key improvements over existing RegularExpressions.Transformer:
- Repository-level translation (not file-level)
- AST parsing using Roslyn Compiler API (not regex)
- Doublets as intermediate storage (not files)
- Function-based transformations (not regex substitutions)
- Full project context for transformations

Components:
- IRepositoryTransformer: Main interface for repository transformation
- IAstToDoubletsConverter: Converts AST to Doublets graph
- ITransformationFunction: Function-based transformation rules
- IDoubletsToCodeGenerator: Generates code from Doublets

This prototype demonstrates the architecture. Full implementation would require:
- Complete Doublets API integration
- Language-specific code generators
- Comprehensive transformation library
- Semantic analysis support

Fixes #609

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
konard changed the title from "[WIP] A better translator architecture prototype" to "Transformer.Prototype - A better translator architecture" on Oct 26, 2025
konard marked this pull request as ready for review on October 26, 2025 11:06
konard commented Oct 26, 2025

Owner Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

📎 Log file uploaded as GitHub Gist (417KB)
🔗 View complete solution draft log


The working session has now ended. Feel free to review the solution draft and add any feedback.
