👋 Welcome to RefineBench — a comprehensive evaluation library for testing refinement capabilities of language models across multiple settings and domains. To reproduce the full results reported in ...
The Sophia Script is an open-source PowerShell module designed to debloat and fine-tune Windows 11 (and Windows 10). It is ...
COBOL is in the headlines again, and this time it is because of artificial intelligence (AI) – sparking conversations with tools emerging that claim t.
CATArena (Code Agent Tournament Arena) is an open-ended environment where LLMs write executable code agents to battle each other and then learn from each other. CATArena is an engineering-level ...
Abstract: Over the past decades, the speed and bandwidth of internet systems have dramatically improved. Alongside this, the expansion of cloud server providers, in terms of both price and efficiency, ...
Abstract: Our research focuses on the intersection of artificial intelligence (AI) and software development, particularly the role of AI models in automating code generation. With advancements in ...