Coding Neuropsychological Testing

GPT-5.3 Codex Raises the Bar, but Opus 4.6 Still Owns Deep Reasoning

In benchmark tests such as SWE-Bench Pro and Terminal-Bench, GPT-5.3 Codex consistently outperformed its predecessors, setting new standards for speed and execution. When compared to Anthropic’s Opus ...

GPT-5.3 Codex, OpenAI's new agentic coding model, helped create itself

GPT-5.3 Codex merges the advanced coding capabilities of GPT-5.2 Codex with the reasoning and professional knowledge of GPT-5 ...

OpenAI’s GPT-5.3-Codex thinks deeper and wider about coding work

The company says its latest model’s agentic skills also apply to a broader set of knowledge work such as presentations and ...

OpenAI’s GPT-5.3-Codex drops as Anthropic upgrades Claude — AI coding wars heat up ahead of Super Bowl ads

OpenAI launched GPT-5.3-Codex as Anthropic released Claude Opus 4.6 in a simultaneous drop that kicks off the AI coding wars, with benchmark claims, enterprise agent ambitions, and cybersecurity ...

blockchain

AI Vibe Coding Tools Test Shows Mixed Results for Non-Developers

A hands-on test of 5 AI coding platforms reveals stark differences in usability for beginners, with Manus and Lovable leading while Cursor fails completely. The promise of building software by simply ...

InfoWorld

Output from vibe coding tools prone to critical security flaws, study finds

Popular vibe coding platforms consistently generate insecure code in response to common programming prompts, including creating vulnerabilities rated as ‘critical,’ new testing has found. Security ...

Bleeping Computer

OpenAI is rolling out GPT-5.2 “Codex-Max” for some users

OpenAI is testing a new model for Codex called "GPT-5.2-Codex-Max." Some users have spotted it when asking Codex which model it is using. OpenAI rolled out Codex with GPT ...

Ars Technica

The Ars Technica AI coding agent test: Minesweeper edition

I just tried this on gemini.google.com, using Gemini 3 Thinking in Canvas mode with the prompt "I want to implement a mine sweeper". It generated a fully functional Minesweeper game playable in HTML ...

Becker's ASC

Texas physician practice to pay $13.6M in drug test fraud case

Austin, Texas-based Advanced Pain Care and its founder Mark Malone, MD, have agreed to pay $13.6 million to resolve allegations of submitting false claims to federal and state healthcare programs for ...

Ars Technica

A new open-weights AI coding model is closing in on proprietary options

On Tuesday, French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieves ...
