aigeon-ai/judge0-ce
judge0-ce is hosted online, so all tools can be tested directly either in the Inspector tab or in the Online Client.
If you are the rightful owner of judge0-ce and would like to certify it and/or have it hosted online, please leave a comment or send an email to henry@mcphub.com.
Judge0 CE is a robust, scalable, and open-source online code execution system.
Has a README: GitHub repo has a README.md.
Has a License: GitHub repo doesn't have a valid license.
Server can be inspected: View server inspector.
Server schema can be extracted: Can get at least one tool's info from the README or server.
Online hosted on MCPHub: Can be automatically deployed by MCPHub.
Has social accounts: Does not have any social accounts.
Claimed by the author or certified by MCPHub: If you are the author, claim authorship.
AI Evaluation Report
Total Score: 2/10
The agent fails at its core function of submitting and executing code through the Judge0-CE plugin. In every tested scenario, submission creation failed because of persistent input-validation errors or tool malfunction. The agent was also unable to produce a valid comparison between Judge0 CE and Judge0 Extra CE, indicating incomplete data retrieval. The repeated failures point to a misunderstanding or misapplication of the input structure the tool's API requires, and this severely limits the agent's ability to process user requests.
Test case 1
Score: 2/10
Task: Perform the operation of creating a new code submission in Judge0-CE to execute a Python 3.8.1 program that prints 'Hello, World!', and then retrieve and display the execution results.
Evaluation: No valid answer was generated due to tool malfunction. Despite attempts to format the submission correctly, errors in the submission creation process persisted, suggesting a malfunction in the tool or its API.
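The failures above center on how the submission body is structured. As a point of reference, the Judge0 API creates a submission via POST /submissions with a JSON body containing source_code, language_id, and optionally stdin; results are then fetched by the returned token. A minimal sketch of building such a payload follows. The language ID 71 is the ID commonly mapped to Python (3.8.1) on the public CE instance, but IDs vary by deployment, so verify against GET /languages:

```python
import base64
import json

# 71 is commonly Python (3.8.1) on the public Judge0 CE instance;
# confirm against GET /languages on the deployment you target.
PYTHON_381 = 71

def build_submission(source_code: str, language_id: int, stdin: str = "") -> dict:
    """Build the JSON body for POST /submissions.

    Text fields are base64-encoded so the request can carry arbitrary
    bytes; pair this with the ?base64_encoded=true query parameter.
    """
    enc = lambda s: base64.b64encode(s.encode()).decode()
    return {
        "source_code": enc(source_code),
        "language_id": language_id,
        "stdin": enc(stdin),
    }

payload = build_submission("print('Hello, World!')", PYTHON_381)
print(json.dumps(payload, indent=2))
```

This only constructs the request body; actually sending it (e.g. with an HTTP client) and polling the returned token are separate steps that depend on the instance's base URL and authentication.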
Test case 2
Score: 6/10
Task: What are the differences between Judge0 CE and Judge0 Extra CE in terms of supported programming languages?
Evaluation: The response lists only the languages supported by Judge0 CE and includes no information about Judge0 Extra CE, so no valid comparison can be made and the task is incomplete. The failure stems from the operation targets being in an invalid state: the language list for Judge0 Extra CE was never retrieved.
Test case 3
Score: 6/10
Task: Perform the operation of creating a new code submission in Judge0-CE to execute a C program that calculates the factorial of a given number, and then retrieve and display the execution results.
Evaluation: No valid answer was generated due to invalid input. Repeated validation errors show that the source code and language ID were not structured according to the tool's requirements, leading to consistent failures.
Test case 4
Score: 6/10
Task: Perform the operation of creating a new code submission in Judge0-CE to execute a Java program that reads an integer from standard input and prints its square, and then retrieve and display the execution results.
Evaluation: No valid answer was generated due to invalid input. Multiple submission attempts with varying structures failed with validation errors about unexpected keyword arguments; the tool likely requires a specific input format that was never correctly identified.
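The "retrieve and display the execution results" half of these tasks never ran, but in the Judge0 API it amounts to polling GET /submissions/{token} until the status leaves the queue. A minimal polling sketch follows; the fetch callable and the stubbed response are assumptions for illustration (any requests-based wrapper that returns the decoded JSON would fit), and the status IDs should be verified against GET /statuses on the target instance:

```python
import time

# Judge0 status IDs: 1 = In Queue, 2 = Processing; IDs of 3 and above
# are terminal (3 = Accepted, higher values are errors).
TERMINAL = 3

def wait_for_result(fetch, token: str, tries: int = 10, delay: float = 0.0) -> dict:
    """Poll a submission until its status is terminal.

    `fetch` is any callable mapping a token to the decoded JSON of
    GET /submissions/{token}.
    """
    for _ in range(tries):
        sub = fetch(token)
        if sub["status"]["id"] >= TERMINAL:
            return sub
        time.sleep(delay)
    raise TimeoutError(f"submission {token} still queued after {tries} polls")

# Hypothetical stub standing in for a real HTTP fetch: it pretends the
# Java "square of stdin" run finished with input 2 and printed 4.
fake_fetch = lambda token: {"status": {"id": 3, "description": "Accepted"}, "stdout": "4\n"}
result = wait_for_result(fake_fetch, "demo-token")
print(result["stdout"])
```

Keeping the HTTP call behind a callable makes the polling logic testable without a live instance, which is also how the validation failures in these test cases could have been isolated from the network layer.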