StyleTranslator

StyleTranslator is a neural machine translation project that uses reinforcement learning for style-aware translation: the target text keeps the style of the source text (law, literature, news, or science).

Features

Style-Aware Translation: Preserves document style across languages
Multi-Component Rewards: Format validation + Semantic quality (COMET) + Style consistency (BERT)
GRPO Training: Group Relative Policy Optimization for RL fine-tuning
Hydra Configuration: Flexible, composable configuration management
Modular Architecture: Clean separation with dependency injection

Quick Start

# Install dependencies
conda env create -f environment.yml
conda activate style_translator

# Train style detector (BERT-based classifier)
cd style_detector
python train.py

# Train translation model with RL
cd ../rl
python scripts/train_rl.py

Project Structure

StyleTranslator/
├── style_detector/         # Style classification (BERT)
│   ├── corpus/             # Corpus generation
│   ├── dataset/            # Dataset loaders
│   ├── model/              # StyleDetector model
│   ├── config.yaml         # Training config
│   └── train.py            # Training script
│
├── rl/                     # RL training module
│   ├── configs/            # Hydra configurations
│   │   ├── env/            # Environment settings
│   │   ├── reward/         # Reward presets
│   │   └── model/          # Model configs
│   ├── src/
│   │   ├── rewards/        # Reward system
│   │   ├── trainer/        # GRPO trainer
│   │   └── utils/          # Utilities
│   ├── scripts/            # Entry points
│   └── data/               # Training data

Components

1. Style Detector

BERT-based classifier for detecting text style (law, literature, news, science).

cd style_detector
python train.py  # Train on your corpus

Config: style_detector/config.yaml

2. RL Training

GRPO-based reinforcement learning for style-aware translation.

cd rl

# Default training
python scripts/train_rl.py

# Server environment + style-weighted rewards
python scripts/train_rl.py env=server reward=style_weighted

# Override parameters
python scripts/train_rl.py training.num_epochs=5 device=cuda

Configs: rl/configs/

Reward System

Component	Weight	Description
Format	1.0	XML tag validation (`<think>`, `<translate>`)
Semantic	6.0	COMET translation quality
Style	4.0	BERT style consistency (source ↔ target)

Customize in rl/configs/reward/*.yaml.
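
As a rough illustration of how the three components in the table could be combined, the sketch below computes a weighted sum using the listed weights. It is not the repository's implementation: the function names, the tag layout checked by the regular expression, and the assumption that the semantic and style scores lie in [0, 1] are all hypothetical.

```python
import re

# Weights taken from the table above; combining the components as a plain
# weighted sum is an assumption of this sketch.
WEIGHTS = {"format": 1.0, "semantic": 6.0, "style": 4.0}

# Hypothetical format rule: the completion is a <think>...</think> block
# followed by a <translate>...</translate> block, with only whitespace around them.
FORMAT_RE = re.compile(
    r"\s*<think>(.*?)</think>\s*<translate>(.*?)</translate>\s*",
    re.DOTALL,
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion matches the assumed tag layout, else 0.0."""
    return 1.0 if FORMAT_RE.fullmatch(completion) else 0.0


def total_reward(completion: str, semantic: float, style: float) -> float:
    """Weighted sum of the format, semantic, and style components.

    `semantic` and `style` stand in for scores that would come from COMET and
    the style detector; they are passed in so the sketch needs no models.
    """
    scores = {
        "format": format_reward(completion),
        "semantic": semantic,
        "style": style,
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

With these weights, a well-formed completion whose semantic and style scores are both 0.5 would receive 1.0 + 3.0 + 2.0 = 6.0, so translation quality and style consistency dominate the format check.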

Requirements

Python 3.8+
PyTorch 2.0+
Transformers
TRL (Transformer Reinforcement Learning)
COMET
Hydra
PyTorch Lightning

See environment.yml for full dependencies.

Architecture

Style Detection

Text → BERT → Style Classifier → [law, literature, news, science]
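
The last step of this pipeline, turning classifier outputs into one of the four labels, can be sketched as follows. This is a minimal illustration, not the project's code: the label order in `STYLES` and the `predict_style` helper are assumptions, and the logits would come from the BERT classifier head in practice.

```python
import math

# The four styles named in this README; their index order is assumed.
STYLES = ["law", "literature", "news", "science"]


def predict_style(logits):
    """Map four classifier logits to (label, probability) via softmax."""
    top = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return STYLES[best], probs[best]
```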

RL Training

Source Text → LLM → Translation
           ↓
    Format + Semantic + Style Rewards
           ↓
      GRPO Optimization
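
The "GRPO Optimization" step relies on group-relative advantages: several translations are sampled for the same source text, and each one is scored against the others in its group instead of against a learned value function. The sketch below shows only that normalization step under stated assumptions; the helper name, the use of the population standard deviation, and the `eps` value are illustrative choices, not details taken from this repository or from TRL.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of sampled translations.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    better than their group average get a positive advantage and worse ones
    a negative advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]
```

When every sample in a group receives the same reward, all advantages are zero and that group contributes no learning signal, which is one reason a reward with several graded components can be useful.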

License

MIT License - see LICENSE file for details.
