Microsoft News Now

The Home of Microsoft News Today

Microsoft Open-Sources Harrier, A New Embedding Blockbuster Model For The Agentic Web

Microsoft open-sources Harrier, a new multilingual embedding model family that tops MTEB v2 and targets the agentic web’s grounding layer.
Dave W. Shanahan 3 months ago (Last updated: 3 months ago) 5 minutes read

Microsoft has open-sourced a powerful new family of text embedding models called Harrier, aiming to become the default retrieval backbone for the emerging “agentic web.” The Harrier-OSS-v1 series is designed to improve how AI systems search, rank, and connect information, and it already sits at the top of the multilingual MTEB v2 benchmark, signaling state-of-the-art performance among open-weight models.

Another SOTA model drop! This time from the @Bing team: meet Harrier, a new open-source embedding model with state-of-the-art performance and the #1 spot on the industry standard multilingual MTEB-v2 benchmark. https://t.co/yEG74uAjqv

— Mustafa Suleyman (@mustafasuleyman) April 7, 2026

Microsoft’s Harrier: A New Grounding Engine

In modern AI systems, grounding—the connection between model outputs and real-world, verifiable data—depends heavily on the quality of embeddings that power retrieval. Microsoft frames Harrier as that foundational layer: the model family converts text into dense vectors that can be used for search, RAG, clustering, classification, and more, dramatically influencing factual accuracy, latency, and multi-step agent stability. In its announcement, Microsoft describes Harrier as “industry leading” and explicitly positions it as infrastructure for the agent era, where AI agents must maintain memory, search across diverse sources, and orchestrate long workflows over time.
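
As a rough illustration of how dense vectors drive retrieval, the sketch below ranks documents by cosine similarity to a query. The three-dimensional vectors and document names are made up for the example; real embeddings would come from a model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for model output (hypothetical values).
docs = {
    "azure_pricing": [0.9, 0.1, 0.0],
    "xbox_release": [0.0, 0.2, 0.9],
    "azure_regions": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

results = top_k(query, docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG pipeline, the top-ranked documents would then be passed to the language model as context, which is why the quality of the vectors matters so much for what the model ends up citing.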

The company’s own blog on grounding for the AI web makes clear that embeddings are no longer just a niche retrieval primitive: they shape how Bing, Copilot, and third-party agents decide what to read, cite, and trust.

Benchmark Results: Harrier Tops Multilingual MTEB v2

Harrier-OSS-v1 is a three-model family: a large 27B-parameter flagship and two smaller variants at 0.6B and 270M parameters for more constrained hardware. Across 131 tasks on the multilingual MTEB v2 benchmark, the 27B model reports an average score around the mid‑70s, placing it first among open-source embedding models as of late March and early April 2026. Even the smaller 0.6B and 270M variants significantly outperform many popular open-weight baselines like multilingual-E5-large and Qwen3 embedding models, according to early leaderboards and third-party analyses.
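
To put the three sizes in hardware terms, here is a back-of-the-envelope estimate of the memory needed just to hold the weights. It assumes 16-bit weights (2 bytes per parameter), which the article does not state, and it ignores activations and runtime overhead, so treat the figures as a lower bound rather than a requirement.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate weight storage in gigabytes (10^9 bytes).

    bytes_per_param=2 assumes fp16/bf16 weights; this is an assumption,
    not a figure from the announcement.
    """
    return num_params * bytes_per_param / 1e9

# Parameter counts as reported for the Harrier-OSS-v1 family.
sizes = {"27B": 27e9, "0.6B": 0.6e9, "270M": 270e6}

estimates = {name: weight_memory_gb(n) for name, n in sizes.items()}
for name, gb in estimates.items():
    print(f"{name}: ~{gb:.2f} GB of weights")
```

Under that assumption the flagship needs roughly 54 GB for weights alone, while the two smaller variants fit in about 1.2 GB and 0.54 GB.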

Against closed models, Harrier’s numbers are equally aggressive. Public comparisons circulating in the community show the smallest Harrier model outscoring OpenAI’s text-embedding-3-large and Amazon’s Titan Embed v2 on the same multilingual benchmark, while the 27B variant pushes close to or past Google’s latest Gemini embedding offerings depending on the task mix. For practitioners building retrieval-augmented generation (RAG) systems, those deltas are large enough to mean fewer hallucinations, more consistent citations, and better cross-language search out of the box.

How Harrier Works Under The Hood

Technically, Harrier departs from the BERT-style encoder architecture that has dominated embedding models for years. Instead, Microsoft opted for a decoder-only transformer, the same architectural family used by large language models like GPT-4 and Llama 3, paired with last-token pooling and L2 normalization: the hidden state of the final token is taken as the embedding and then scaled to unit length. The models accept a context window of up to 32,768 tokens, which is crucial for embedding large documents, long web pages, and multi-step agent traces without aggressive chunking.
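
The pooling step can be sketched in a few lines. This is a minimal illustration of last-token pooling followed by L2 normalization, not Harrier's actual code; the hidden states and attention mask are toy values, and the function names are hypothetical.

```python
import math

def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token.

    hidden_states: one vector per token position.
    attention_mask: 1 for real tokens, 0 for padding.
    """
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last_index]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Three token positions; the last one is padding and must be ignored.
hidden = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]

embedding = l2_normalize(last_token_pool(hidden, mask))
print(embedding)  # [0.6, 0.8]
```

Because a decoder-only model attends causally, only the final token has seen the whole input, which is why its hidden state is the natural summary. Normalizing to unit length means a plain dot product between two embeddings equals their cosine similarity.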

The training recipe behind Harrier is equally ambitious. Microsoft constructed a multilingual pipeline comprising more than 2 billion weakly supervised text pairs for contrastive pre-training and over 10 million high-quality pairs for fine-tuning, combining real-world multilingual data with synthetic examples generated by frontier models such as GPT‑5. Large language model–based re-rankers are used as teachers for knowledge distillation, both to clean noisy data and to transfer judgment signals from bigger models into the smaller 0.6B and 270M variants. This approach builds on prior Microsoft work like E5, Multilingual E5, E5-Mistral, and GritLM, but scales both data and supervision far beyond earlier generations.
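
For readers unfamiliar with contrastive training on text pairs, the sketch below shows one common objective, InfoNCE with in-batch negatives. The article does not say which loss Harrier uses, so this is an illustrative stand-in; the temperature of 0.05 and the toy vectors are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce_loss(queries, passages, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    queries[i] is paired with passages[i]; every other passage in the
    batch acts as a negative. Vectors are assumed to be L2-normalized.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, p) / temperature for p in passages]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)

queries = [[1.0, 0.0], [0.0, 1.0]]

# Correctly paired passages give a loss near zero.
aligned = info_nce_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
# Swapped passages pair each query with its opposite and are penalized.
swapped = info_nce_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(f"aligned={aligned:.6f} swapped={swapped:.6f}")
```

The loss pulls each query toward its paired text and pushes it away from the other texts in the batch, which is how weakly supervised pairs can teach a model what "similar" means without explicit labels.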

Designed For Agents, Search, And RAG

Harrier is explicitly instruction-tuned for embedding tasks, meaning it expects a short, task-specific instruction on the query side while encoding documents without instructions. That design lets developers steer retrieval—whether they want similarity search, question answering, translation matching, or clustering—without retraining the model for each new use case. The model family supports over 100 languages, making it especially attractive for global products and multilingual RAG systems that need consistent behavior across locales.
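
The asymmetry between queries and documents might look like the following in application code. The `Instruct: ... Query: ...` template is borrowed from earlier instruction-based embedders and is only an assumption here; the exact prompt format Harrier expects should be taken from its model card. The function names are hypothetical.

```python
def format_query(task_instruction, query):
    """Prefix a query with a task instruction.

    The template below is an assumed, illustrative format, not a
    confirmed Harrier prompt.
    """
    return f"Instruct: {task_instruction}\nQuery: {query}"

def format_document(text):
    """Documents are encoded as-is, with no instruction."""
    return text

task = "Given a web search query, retrieve relevant passages that answer the query"
query_text = format_query(task, "how do embedding models support RAG?")
doc_text = format_document("Embedding models convert text into dense vectors.")

print(query_text)
print(doc_text)
```

Switching from question answering to, say, translation matching then means changing only the `task` string on the query side; the document index does not need to be rebuilt.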

Microsoft is already signaling that Harrier is part of a bigger shift in its own stack. The company has teased a new grounding service that will sit underneath Bing and other experiences, using the same advances in embeddings to improve retrieval quality, semantic understanding, and context selection at scale. For developers, Harrier’s open weights on Hugging Face mean the same underlying technology can now be pulled into custom search engines, enterprise knowledge bots, and autonomous agents that need reliable grounding without relying entirely on proprietary APIs.

Why This Release Matters

For the broader AI ecosystem, Harrier’s release resets the expectations for what open-source embeddings can do. By offering a family that scales from 270M parameters up to 27B, Microsoft is making the retrieval layer both more accessible—small models for edge and on-prem—and more capable, with a flagship that competes directly with or outperforms leading closed offerings on multilingual benchmarks. That balance of portability, quality, and open licensing could make Harrier a default choice for developers who want powerful embeddings without being locked into a single vendor.

It also fits a strategic pattern: Microsoft’s February 2026 Bing grounding blog emphasized that the next generation of AI search is less about flashy chat UIs and more about robust retrieval, citation, and ranking infrastructure. Harrier is a concrete manifestation of that philosophy—an embedding backbone optimized for the agent era, where the line between “search engine,” “copilot,” and “autonomous agent” continues to blur.

About The Author

Dave W. Shanahan

I’m Dave W. Shanahan, a Microsoft enthusiast with a passion for Windows, Xbox, Microsoft 365 Copilot, Azure, and more. I started MSFTNewsNow.com to keep the world updated on Microsoft news. I’m based in Massachusetts, and you can email me at davewshanahan@gmail.com.

Tags: Amazon, Bing, Copilot, Enterprise, Google, Linkedin, Microsoft, OpenAI, Twitter
