Benchmark All Open Source Models

9d on MSN

China’s Moonshot releases a new open-source model Kimi K2.5 and a coding agent

The company said that the model was trained on 15 trillion mixed visual and text tokens.

Qwen3-Coder-Next offers vibe coders a powerful open source, ultra-sparse model with 10x higher throughput for repo tasks

On SWE-Bench Verified, the model achieved a score of 70.6%. This performance is notably competitive when placed alongside ...

Kimi K2.5 Makes Agent Work 4.5x Faster: Matching Top Models in Vision & Code

Kimi K2.5 handles up to 100 sub-agents and 1,500 tool calls, cutting task time 4.5x so you finish complex work sooner.

Morningstar

Lakera Launches Open-Source Security Benchmark for LLM Backends in AI Agents

The b3 is built around a new idea called threat snapshots. Instead of simulating an entire AI agent from start to finish, threat snapshots zoom in on the critical points where vulnerabilities in large ...

Infosecurity-magazine.com

Open Source “b3” Benchmark to Boost LLM Security for Agents

The UK AI Security Institute (AISI) has partnered with the commercial security sector on a new open source framework designed to help large language model (LLM) developers improve security posture.

Moonshot AI releases open-source Kimi K2.5 model with 1T parameters

Kimi has a standard mode and a Thinking mode that offers higher output quality. Additionally, a capability called K2.5 Agent ...

Inc Arabia on MSN

MBZUAI releases fully sovereign 70B open-source reasoning model K2 Think V2

Abu Dhabi-based Mohamed bin Zayed University of Artificial Intelligence’s (MBZUAI) Institute of Foundation Models has released K2 Think V2, a 70 billion-parameter open-source reasoning model that the ...

8d on MSN

Tiny startup Arcee AI built a 400B open source LLM from scratch to best Meta’s Llama

30-person startup Arcee AI has released a 400B model called Trinity, which it says is one of the biggest open source foundation models from a US company.

Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies

Mistral AI has launched Voxtral Transcribe 2, a new on-device speech-to-text model family featuring real-time transcription, ...

Searchenginejournal.com

Why OpenAI’s Open Source Models Are A Big Deal

OpenAI has released two new open-weight language models under the permissive Apache 2.0 license. These models are designed to deliver strong real-world performance while running on consumer hardware, ...
