Universal Baseline Comparison Framework Documentation

This universal baseline comparison framework provides standardized training, evaluation, and hyperparameter tuning across multiple baseline algorithms. It currently focuses on weak supervision methods and is designed to extend to any type of baseline comparison.

Wiki & Documentation

Documentation guidelines, RST syntax guide, and other wiki resources for contributors and users working with the documentation system.

Baseline Algorithms

Comprehensive documentation of implemented baseline algorithms with pseudocode, implementation details, and evaluation results.

API Reference

The reference guide gives a detailed description of the common utilities API: how the shared components work and which parameters they accept. It focuses on the src.common module that all baselines use.

Examples

The example gallery shows how to integrate baselines with minimal changes, illustrating fair comparison setups and shared utility usage.

Overview

This framework provides standardized baseline comparison with minimal modifications to existing code:

  • Common Utilities (src.common): Shared infrastructure for fair comparison

  • BaseTrainer: Abstract interface for wrapping existing baseline implementations (sketched after this list)

  • Configuration Management: TOML-based config with hyperparameter tuning support

  • Standardized Evaluation: Consistent metrics and output formats across all baselines

  • Hyperparameter Tuning: Optuna-based optimization with minimal baseline changes

  • Extensible Design: Easy integration of new baselines with existing open source code
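
The BaseTrainer interface referenced above can be pictured roughly as follows. The method names (load_data, train, evaluate) match the quick example later on this page; the exact signatures and docstrings are an illustrative sketch, not the definitive interface.

from abc import ABC, abstractmethod

class BaseTrainer(ABC):
    """Sketch of the abstract trainer interface (illustrative, not authoritative)."""

    def __init__(self, config):
        self.config = config  # parsed TOML configuration

    @abstractmethod
    def load_data(self):
        """Return (train_data, val_data, test_data) splits."""

    @abstractmethod
    def train(self, train_data):
        """Train the wrapped baseline and return the fitted model."""

    @abstractmethod
    def evaluate(self, model, data):
        """Return a standardized results dictionary for the given split."""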

Quick Example

Here’s how to integrate a baseline with minimal changes:

import src.common as lib
from src.lol.trainer import LoLTrainer

# Load configuration
config = lib.read_config('exp/lol/youtube/eval.toml')

# Create trainer (wraps existing LoL code)
trainer = LoLTrainer(config)

# Standard training pipeline
train_data, val_data, test_data = trainer.load_data()
model = trainer.train(train_data)
results = trainer.evaluate(model, test_data)

# Consistent output format
lib.save_json(results, config.output.folder + '/results.json')
print(f"Test accuracy: {results['metrics']['test']['accuracy']}")

Key Features

Minimal Code Changes

Wrap existing baseline implementations with thin adapter layers, preserving original code.

Fair Comparison Environment

Standardized data loading, preprocessing, evaluation metrics, and random seeds across all baselines.
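
As a minimal sketch, a single global seed can be fixed before each run so that every baseline sees identical shuffling and initialization. The set_global_seed helper and the literal seed value below are illustrative; in practice the seed would come from the shared configuration.

import random
import numpy as np

def set_global_seed(seed):
    # Hypothetical helper: seed the standard-library and NumPy RNGs.
    random.seed(seed)
    np.random.seed(seed)

set_global_seed(42)  # illustrative value; the real seed would come from the config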

Shared Utilities (src.common)

Reusable components for configuration, data loading, evaluation, and hyperparameter tuning.

Consistent Output Format

All baselines produce standardized result structures for easy comparison and analysis.
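
The quick example above reads results['metrics']['test']['accuracy'], which implies a nested result structure roughly like the one sketched here; the surrounding keys and the numeric value are placeholders, not a documented schema.

results = {
    "baseline": "lol",        # assumed identifier of the evaluated baseline
    "dataset": "youtube",     # assumed dataset name, mirroring the config path
    "metrics": {
        "test": {
            "accuracy": 0.0,  # placeholder value
        },
    },
}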

Hyperparameter Tuning

Unified Optuna-based optimization without modifying original baseline hyperparameter logic.
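
A tuning run might look like the sketch below. Only the Optuna calls (create_study, optimize, suggest_float) and the trainer methods from the quick example are taken as given; the learning_rate search space, the config.model.learning_rate attribute, and the 'val' metrics key are assumptions for illustration.

import optuna
import src.common as lib
from src.lol.trainer import LoLTrainer

config = lib.read_config('exp/lol/youtube/eval.toml')

def objective(trial):
    # Hypothetical search space; the real parameter names would live in the TOML config.
    config.model.learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)

    trainer = LoLTrainer(config)
    train_data, val_data, _ = trainer.load_data()
    model = trainer.train(train_data)
    results = trainer.evaluate(model, val_data)
    return results['metrics']['val']['accuracy']  # assumed key for the validation split

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)
print(study.best_params)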

Extensible Design

Easy addition of new baselines by implementing the BaseTrainer interface.
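
A new baseline would then be a thin subclass, roughly as sketched below. The NewBaselineTrainer class is hypothetical, and the import assumes BaseTrainer is exposed by src.common; only the load_data / train / evaluate hooks are taken from the quick example.

from src.common import BaseTrainer  # assumes BaseTrainer is importable from src.common

class NewBaselineTrainer(BaseTrainer):
    """Hypothetical skeleton wrapping an existing open-source implementation."""

    def load_data(self):
        # Load and split the dataset with the original code, returning
        # (train_data, val_data, test_data).
        ...

    def train(self, train_data):
        # Call the original training routine unchanged and return the fitted model.
        ...

    def evaluate(self, model, data):
        # Map the original predictions into the standardized results dictionary,
        # e.g. results['metrics']['test']['accuracy'].
        ...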
