Universal Baseline Comparison Framework Documentation

This universal baseline comparison framework provides standardized training, evaluation, and hyperparameter tuning across multiple baseline algorithms. It currently focuses on weak supervision methods and is designed to extend to any type of baseline comparison.

Wiki & Documentation

Documentation guidelines, RST syntax guide, and other wiki resources for contributors and users working with the documentation system.

Baseline Algorithms

Comprehensive documentation of implemented baseline algorithms with pseudocode, implementation details, and evaluation results.

API Reference

The reference guide gives a detailed description of the common utilities API: how the shared components work and which parameters they accept. It focuses on the src.common module that all baselines use.

Examples

The example gallery shows how to integrate baselines with minimal changes, illustrating fair comparison setups and shared utility usage.

Overview

This framework provides standardized baseline comparison with minimal modifications to existing code:

  • Common Utilities (src.common): Shared infrastructure for fair comparison

  • BaseTrainer: Abstract interface for wrapping existing baseline implementations (sketched after this list)

  • Configuration Management: TOML-based config with hyperparameter tuning support

  • Standardized Evaluation: Consistent metrics and output formats across all baselines

  • Hyperparameter Tuning: Optuna-based optimization with minimal baseline changes

  • Extensible Design: Easy integration of new baselines with existing open source code
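
The BaseTrainer interface referenced above can be pictured roughly as follows. The method names (load_data, train, evaluate) match the quick example later on this page; the exact signatures and docstrings are an illustrative sketch, not the definitive interface.

from abc import ABC, abstractmethod

class BaseTrainer(ABC):
    """Sketch of the abstract trainer interface (illustrative, not authoritative)."""

    def __init__(self, config):
        self.config = config  # parsed TOML configuration

    @abstractmethod
    def load_data(self):
        """Return (train_data, val_data, test_data) splits."""

    @abstractmethod
    def train(self, train_data):
        """Train the wrapped baseline and return the fitted model."""

    @abstractmethod
    def evaluate(self, model, data):
        """Return a standardized results dictionary for the given split."""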

Quick Example

Here’s how to integrate a baseline with minimal changes:

import src.common as lib
from src.lol.trainer import LoLTrainer

# Load configuration
config = lib.read_config('exp/lol/youtube/eval.toml')

# Create trainer (wraps existing LoL code)
trainer = LoLTrainer(config)

# Standard training pipeline
train_data, val_data, test_data = trainer.load_data()
model = trainer.train(train_data)
results = trainer.evaluate(model, test_data)

# Consistent output format
lib.save_json(results, config.output.folder + '/results.json')
print(f"Test accuracy: {results['metrics']['test']['accuracy']}")

Key Features

Minimal Code Changes

Wrap existing baseline implementations with thin adapter layers, preserving original code.

Fair Comparison Environment

Standardized data loading, preprocessing, evaluation metrics, and random seeds across all baselines.
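
As a minimal sketch, a single global seed can be fixed before each run so that every baseline sees identical shuffling and initialization. The set_global_seed helper and the literal seed value below are illustrative; in practice the seed would come from the shared configuration.

import random
import numpy as np

def set_global_seed(seed):
    # Hypothetical helper: seed the standard-library and NumPy RNGs.
    random.seed(seed)
    np.random.seed(seed)

set_global_seed(42)  # illustrative value; the real seed would come from the config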

Shared Utilities (src.common)

Reusable components for configuration, data loading, evaluation, and hyperparameter tuning.

Consistent Output Format

All baselines produce standardized result structures for easy comparison and analysis.
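
The quick example above reads results['metrics']['test']['accuracy'], which implies a nested result structure roughly like the one sketched here; the surrounding keys and the numeric value are placeholders, not a documented schema.

results = {
    "baseline": "lol",        # assumed identifier of the evaluated baseline
    "dataset": "youtube",     # assumed dataset name, mirroring the config path
    "metrics": {
        "test": {
            "accuracy": 0.0,  # placeholder value
        },
    },
}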

Hyperparameter Tuning

Unified Optuna-based optimization without modifying original baseline hyperparameter logic.
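
A tuning run might look like the sketch below. Only the Optuna calls (create_study, optimize, suggest_float) and the trainer methods from the quick example are taken as given; the learning_rate search space, the config.model.learning_rate attribute, and the 'val' metrics key are assumptions for illustration.

import optuna
import src.common as lib
from src.lol.trainer import LoLTrainer

config = lib.read_config('exp/lol/youtube/eval.toml')

def objective(trial):
    # Hypothetical search space; the real parameter names would live in the TOML config.
    config.model.learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)

    trainer = LoLTrainer(config)
    train_data, val_data, _ = trainer.load_data()
    model = trainer.train(train_data)
    results = trainer.evaluate(model, val_data)
    return results['metrics']['val']['accuracy']  # assumed key for the validation split

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)
print(study.best_params)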

Extensible Design

Easy addition of new baselines by implementing the BaseTrainer interface.
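
A new baseline would then be a thin subclass, roughly as sketched below. The NewBaselineTrainer class is hypothetical, and the import assumes BaseTrainer is exposed by src.common; only the load_data / train / evaluate hooks are taken from the quick example.

from src.common import BaseTrainer  # assumes BaseTrainer is importable from src.common

class NewBaselineTrainer(BaseTrainer):
    """Hypothetical skeleton wrapping an existing open-source implementation."""

    def load_data(self):
        # Load and split the dataset with the original code, returning
        # (train_data, val_data, test_data).
        ...

    def train(self, train_data):
        # Call the original training routine unchanged and return the fitted model.
        ...

    def evaluate(self, model, data):
        # Map the original predictions into the standardized results dictionary,
        # e.g. results['metrics']['test']['accuracy'].
        ...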
