Agents Arena

The Agents Arena lets you compare the performance of different AI agents by posing the same question to two agents of the same mode. Seeing both answers side by side shows what your AI agents can do and gives you insight into their strengths and limitations.

How It Works

  1. Select a Mode: Choose from three available modes:

    • Coder: For programming and technical questions
    • Casual: For general conversation and everyday queries
    • Retriever: For information retrieval, fact-based questions, function calling, and internet search
  2. Ask Your Question: Enter a question that you'd like both agents to answer.

  3. Compare Responses: Two AI agents of the selected mode will provide their answers to your question.

  4. Rate the Best Answer: After reviewing both responses, you can rate the best answer by clicking the thumbs up icon next to the preferred agent's response.

  5. Start a New Battle: If you want to ask another question or try a different mode, simply click the "New Battle" button on the bottom left side of the interface.

Benefits of the Agents Arena

  • Direct Comparison: Easily compare the performance of different AI agents side by side.
  • Diverse Perspectives: Gain insights from multiple AI-generated responses to the same query.
  • Continuous Improvement: Your feedback helps us refine and enhance your agents and models.
  • Educational Experience: Learn about the strengths and limitations of different AI agents.

Tips for Using the Agents Arena

  1. Be Specific: The more precise your question, the better you can evaluate the agents' performance.
  2. Try Different Modes: Experiment with all three modes to see how agents perform in various contexts.
  3. Consistent Rating: Try to be consistent in your rating criteria to provide valuable feedback.
  4. Explore Edge Cases: Test the agents with challenging or unusual questions to see how they handle complex scenarios.

Agents Leaderboard

The Agents Leaderboard is a dynamic feature that complements the Agents Arena, allowing users to view the performance statistics of their AI agents across different modes.

Accessing the Leaderboard

You can switch between the Arena and the Leaderboard using the toggle button located in the top-right corner of the interface, so you can check agent rankings right after participating in battles.

Leaderboard Features

  1. Mode Selection: Just like in the Arena, you can select the mode (Coder, Casual, or Retriever) to view specific leaderboard results.

  2. Visual Performance Breakdown: A doughnut chart shows the wins of each agent, giving an at-a-glance view of their performance.

  3. Detailed Statistics: Below the chart, you'll find a table with more detailed statistics for each agent, including:

    • Total battles participated
    • Wins
    • Draws
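
As a rough illustration of how the three table columns relate to individual battles, the sketch below tallies them from a list of battle records. The `Battle` record, its field names, and the rule that a battle without a winner counts as a draw are assumptions made for this example; this page does not describe how the Arena stores battles or records draws.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Battle:
    """One Arena battle. All field names are hypothetical."""
    agent_a: str
    agent_b: str
    winner: Optional[str]  # assumption: None means the battle was a draw


def leaderboard_rows(battles):
    """Return {agent: {"battles", "wins", "draws"}} for a list of battles."""
    rows = defaultdict(lambda: {"battles": 0, "wins": 0, "draws": 0})
    for battle in battles:
        # Both participants get credit for taking part.
        for agent in (battle.agent_a, battle.agent_b):
            rows[agent]["battles"] += 1
            if battle.winner is None:
                rows[agent]["draws"] += 1
        # Only the agent that received the thumbs up gets a win.
        if battle.winner is not None:
            rows[battle.winner]["wins"] += 1
    return dict(rows)
```

Calling `leaderboard_rows` with the battles for one mode gives one row per agent, matching the columns listed above.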

Understanding the Rankings

The current ranking system is based on the total number of wins and the win percentage. An Elo rating system will be implemented in the coming months. It will provide a more nuanced ranking that takes into account the difficulty of each battle and the performance of opponents.
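
To make the two approaches concrete, here is a minimal sketch. `rank_agents` orders agents by wins and then by win percentage; using win percentage as the tie-breaker is an assumption, since this page only says both values are used. `elo_update` applies the standard Elo formula; the K-factor of 32 and the function names are illustrative, not the Arena's actual implementation.

```python
def rank_agents(stats):
    """Sort agent names by wins, then win percentage, both descending.

    `stats` maps agent name -> (wins, battles). The tie-break order is an
    assumption made for this sketch.
    """
    def sort_key(name):
        wins, battles = stats[name]
        win_pct = wins / battles if battles else 0.0
        return (wins, win_pct)

    return sorted(stats, key=sort_key, reverse=True)


def expected_score(rating, opponent_rating):
    """Standard Elo expected score of a player against an opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def elo_update(rating_a, rating_b, score_a, k=32):
    """Return new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k=32 is a common default, not a value taken from this page.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the rating total is preserved.
    return rating_a + delta, rating_b - delta
```

With equal ratings, a win moves each agent by 16 points under these settings: `elo_update(1000, 1000, 1.0)` gives `(1016.0, 984.0)`. Beating a higher-rated opponent moves the ratings more than beating a lower-rated one, which is how Elo accounts for the performance of opponents.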

Enjoy your time in the Agents Arena, and may the best AI win!

Note: When the same agent is selected on both sides, the agent will be duplicated and a warning message will appear. The same applies if the workspace only has one agent for the selected mode.