Cohere Embed - Embedding Model

Multimodal Embedding Model · AI Processing & RAG

Basic Information

  • Company/Brand: Cohere
  • Country/Region: Canada (Toronto)
  • Official Website: https://cohere.com
  • Type: Multimodal Embedding Model
  • Latest Version: Embed v4 (Released April 2025)
  • Predecessor: Embed v3

Product Description

Cohere Embed is a series of multilingual, multimodal embedding models developed by Cohere that convert text and images into semantic vector representations. The latest release, Embed v4, handles text, images, and interleaved text-and-image content, transforming all of them into unified vector representations. It is reported to rank at the top of the MTEB benchmark with a score of 65.2.

Core Features/Characteristics

  • Multimodal Embedding: Supports embedding for both text and images, as well as interleaved content
  • Multilingual Support: Excellent performance in both English and multilingual environments
  • Flexible Dimensions: Configurable output dimensions from 256 to 1536 (256, 512, 1024, or 1536)
  • Multiple Quantization Formats: float, int8, uint8, binary, ubinary
  • Matryoshka Embedding: Supports truncating vectors to fewer dimensions for further compression
  • Ultra-Long Input: Supports input up to 128,000 tokens
  • Base64 Image Input: Supports direct embedding of Base64 encoded images
  • Search Type Optimization: Supports an input_type parameter that distinguishes queries from documents
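
As a rough illustration of how several of these options might be combined in one request, the sketch below builds JSON request bodies for an embedding call. It makes no network call. The field names (model, input_type, texts, images, output_dimension, embedding_types), the model id "embed-v4.0", and the input_type values "search_document" and "search_query" are assumptions modeled on Cohere's public v2 Embed API and should be checked against the current API reference. The helper names are hypothetical.

```python
import base64
import json

# Dimension bounds and quantization formats as stated in this profile.
MIN_DIM, MAX_DIM = 256, 1536
EMBEDDING_TYPES = {"float", "int8", "uint8", "binary", "ubinary"}


def image_to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a Base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_embed_payload(input_type, texts=None, images=None,
                        output_dimension=MAX_DIM, embedding_types=("float",)):
    """Build a request body for an embed call (field names are assumptions)."""
    if not MIN_DIM <= output_dimension <= MAX_DIM:
        raise ValueError(f"output_dimension must be in [{MIN_DIM}, {MAX_DIM}]")
    unknown = set(embedding_types) - EMBEDDING_TYPES
    if unknown:
        raise ValueError(f"unsupported embedding types: {sorted(unknown)}")
    payload = {
        "model": "embed-v4.0",  # assumed model identifier
        "input_type": input_type,
        "output_dimension": output_dimension,
        "embedding_types": list(embedding_types),
    }
    if texts:
        payload["texts"] = list(texts)
    if images:
        payload["images"] = [image_to_data_uri(b) for b in images]
    return payload


# Documents and queries are embedded with different input_type values.
doc_payload = build_embed_payload(
    "search_document", texts=["Quarterly report, page 1"],
    output_dimension=1024, embedding_types=("int8",))
query_payload = build_embed_payload(
    "search_query", texts=["What was Q3 revenue?"],
    output_dimension=1024, embedding_types=("int8",))
print(json.dumps(query_payload, indent=2))
```

Keeping output_dimension and embedding_types identical for documents and queries matters in practice, since vectors of different sizes or formats cannot be compared directly.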

Technical Specifications

  • Maximum Dimension: 1536
  • Minimum Dimension: 256
  • Maximum Input: 128,000 tokens
  • Quantization Options: float / int8 / uint8 / binary / ubinary
  • Storage Compression: Up to 48x compression with binary quantization
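
To make the storage figures concrete, here is a minimal local sketch of Matryoshka-style truncation, sign-based binary quantization, and the resulting bytes per vector. It assumes that "float" means 32-bit floats, that int8/uint8 use one byte per dimension, and that binary/ubinary use one bit per dimension. The function names are hypothetical, and the sketch does not claim to reproduce how the hosted model quantizes its output.

```python
import math

# Assumed storage cost per dimension, in bits ("float" taken as float32).
BITS_PER_DIM = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}


def bytes_per_vector(dim: int, fmt: str) -> int:
    """Storage needed for one embedding of `dim` dimensions in format `fmt`."""
    return math.ceil(dim * BITS_PER_DIM[fmt] / 8)


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values, then rescale to unit L2 length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def binarize(vec) -> bytes:
    """Sign-based quantization: one bit per dimension, packed 8 per byte."""
    out = bytearray(math.ceil(len(vec) / 8))
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


full = bytes_per_vector(1536, "float")            # 6144 bytes
print(full // bytes_per_vector(1536, "int8"))     # 4
print(full // bytes_per_vector(1536, "binary"))   # 32
print(full // bytes_per_vector(1024, "binary"))   # 48
print(full // bytes_per_vector(256, "binary"))    # 192
```

Under these assumptions, binary quantization alone yields a 32x reduction at a fixed dimension. Larger ratios come from also reducing the output dimension: for example, a 1024-dimension binary vector is 48x smaller than a 1536-dimension float32 vector.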

Business Model

  • Text Embedding: $0.12 / million tokens
  • Image Embedding: $0.47 / million image tokens
  • Deployment Options:
      • Direct API calls via Cohere API
      • AWS Bedrock
      • Azure AI
      • Oracle Cloud
      • Google Cloud
  • Trial Quota: Free trial quota available
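
The per-token prices above translate into a simple cost estimate. The sketch below uses only the two listed prices. The corpus size is a made-up example, and the function name is hypothetical.

```python
TEXT_PRICE_PER_MTOK = 0.12   # USD per million text tokens (listed above)
IMAGE_PRICE_PER_MTOK = 0.47  # USD per million image tokens (listed above)


def embedding_cost(text_tokens: int, image_tokens: int = 0) -> float:
    """Estimated embedding cost in USD for the given token counts."""
    total = text_tokens * TEXT_PRICE_PER_MTOK + image_tokens * IMAGE_PRICE_PER_MTOK
    return total / 1_000_000


# Hypothetical corpus: 50,000 documents of about 2,000 tokens each,
# plus 10 million image tokens.
cost = embedding_cost(50_000 * 2_000, 10_000_000)
print(f"${cost:.2f}")  # $16.70
```

In this example the text portion accounts for $12.00 and the image portion for $4.70.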

Target Users

  • Developers of enterprise search and RAG systems
  • Application developers requiring multimodal retrieval
  • Multilingual content processing teams
  • Enterprises needing deployment across multiple cloud platforms
  • Teams working on recommendation systems and content classification

Competitive Advantages

  • Highest MTEB benchmark score (65.2), indicating leading accuracy
  • Mature multimodal capabilities (unified embedding for text + images)
  • 128K ultra-long input support, capable of handling large documents
  • Flexible quantization and compression options, reducing storage costs
  • Multi-cloud platform deployment support (AWS, Azure, Oracle, GCP)
  • Integrates with Cohere Rerank, offering a one-stop solution for retrieval followed by reranking

Limitations

  • Closed-source commercial model, cannot be deployed locally
  • Text embedding price ($0.12 per million tokens) is relatively high
  • Image embedding price ($0.47 per million image tokens) is nearly four times the text price
  • Lacks customization flexibility compared to open-source models (e.g., BGE-M3)

Version Evolution

Version     Release Date    Key Features
Embed v2    2023            Basic multilingual embedding
Embed v3    Late 2023       Improved precision, int8/binary quantization
Embed v4    April 2025      Multimodal, 128K input, enhanced compression

Relationship with OpenClaw Ecosystem

Cohere Embed can serve as the embedding backend for OpenClaw when the highest retrieval precision is the goal. Its multimodal capabilities let OpenClaw retrieve image content as well as text. The 128K-token input limit lets users embed large documents without excessive chunking. Combined with Cohere Rerank, it supports a complete high-precision pipeline of retrieval followed by reranking.
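
The two-stage pattern described above can be sketched without any API calls. In this minimal illustration, hand-written 3-dimensional vectors stand in for real embeddings, and a word-overlap scorer stands in for a reranking model. All function names and data are hypothetical. A real pipeline would obtain the vectors from an embedding model and the second-stage scores from a reranker.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, doc_vecs, k):
    """Stage 1: rank documents by cosine similarity and keep the top k ids."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]


def rerank(query, candidates, texts, score_fn, top_n):
    """Stage 2: reorder the candidates with a more expensive relevance scorer."""
    ranked = sorted(candidates,
                    key=lambda d: score_fn(query, texts[d]), reverse=True)
    return ranked[:top_n]


def overlap_score(query, text):
    """Stand-in for a reranker: counts shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


# Toy corpus with stand-in vectors.
texts = {
    "a": "invoice totals for March",
    "b": "holiday photo album",
    "c": "March invoice payment terms",
}
doc_vecs = {"a": [0.9, 0.1, 0.0], "b": [0.0, 0.2, 0.9], "c": [0.8, 0.3, 0.1]}
query = "payment terms on the March invoice"
query_vec = [0.9, 0.1, 0.0]

candidates = retrieve(query_vec, doc_vecs, k=2)
final = rerank(query, candidates, texts, overlap_score, top_n=1)
print(candidates, final)  # ['a', 'c'] ['c']
```

The first stage narrows the corpus cheaply with vector similarity, and the second stage reorders only the shortlisted candidates, which is why the final answer here differs from the top vector match.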