Microsoft Critique: GPT and Claude Collaborate to Review AI Responses

Microsoft introduces Critique, a feature for its Copilot Researcher tool. Its mechanism is based on collaboration between two AI models: OpenAI's GPT generates an initial response and Anthropic's Claude acts as a critical reviewer. This internal double-checking process aims to elevate the accuracy of the final result, achieving a 13.8% improvement in deep research tasks.

Two AI assistants, one generating content and the other critically reviewing it, over a document with accuracy improvement charts.

The Multi-Model Orchestration Architecture 🤖

The feature is part of a technical orchestration strategy, where different specialized AI models work in sequence or in parallel. Critique uses a serial flow: one model produces and another evaluates. Alongside it, Council allows comparing outputs from several models at once. This approach reduces dependency on a single provider and mitigates systematic errors, aiming for greater reliability in complex tasks.

AI Holds a Team Meeting to Avoid Mistakes 😅

It seems AI has adopted the corporate culture of endless reviews. Now GPT can't send a report without Claude returning the document full of red comments. It's the classic let me see it before it goes out. In the end, the user receives more polished work, but one wonders if the next step will be the models arguing in a Slack channel about the best way to cite a source.