Eval-Framework v0.2.9

Getting Started

  • Installation
  • Using the CLI

User Guides

  • Creating Completion Tasks
  • How to Add a New Benchmark to Eval Framework
  • Included Benchmark Tasks
  • Controlling HuggingFace Upload Results Guide
  • Docker Guide
  • How to Evaluate HuggingFace Models with Eval Framework
  • Creating Loglikelihood Tasks
  • Model Arguments
  • Overview Dataloading
  • Understanding Evaluation Results Guide
  • Using Determined
  • Utils in eval-framework
  • Weights and Biases Integration with Eval-Framework

Contributing Guidelines

  • Contributing to Eval Framework
  • Testing

API Reference

  • API Reference
    • eval_framework package
      • eval_framework.context package
      • eval_framework.llm package
      • eval_framework.metrics package
        • eval_framework.metrics.completion package
        • eval_framework.metrics.efficiency package
        • eval_framework.metrics.llm package
        • eval_framework.metrics.loglikelihood package
      • eval_framework.result_processors package
      • eval_framework.tasks package
        • eval_framework.tasks.benchmarks package
Copyright © 2025, Aleph Alpha Research