Cohere Embed - Embedding Model
Basic Information
- Company/Brand: Cohere
- Country/Region: Canada (Toronto)
- Official Website: https://cohere.com
- Type: Multimodal Embedding Model
- Latest Version: Embed v4 (Released January 2026)
- Predecessor: Embed v3
Product Description
Cohere Embed is a multilingual, multimodal embedding model series developed by Cohere that converts text and images into semantic vector representations. The latest version, Embed v4, handles text, images, and interleaved text-and-image content, mapping all of them into a single unified vector space. It ranks at the top of the MTEB benchmark with a score of 65.2.
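As a rough illustration of how such vectors are used downstream (this is generic vector-search practice, not Cohere-specific): semantic closeness is typically scored with cosine similarity between embeddings.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (real Embed v4 vectors have 256-1536 dims).
query = [0.1, 0.9, 0.2, 0.0]
doc_close = [0.12, 0.85, 0.25, 0.05]  # semantically near the query
doc_far = [0.9, 0.05, 0.0, 0.4]       # semantically distant

assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```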
Core Features/Characteristics
- Multimodal Embedding: Supports embedding for both text and images, as well as interleaved content
- Multilingual Support: Excellent performance in both English and multilingual environments
- Flexible Dimensions: Configurable output dimensions ranging from 256 to 1536
- Multiple Quantization Formats: float, int8, uint8, binary, ubinary
- Matryoshka Embedding: Vectors can be truncated to smaller dimensions for further compression with minimal quality loss
- Ultra-Long Input: Supports input up to 128,000 tokens
- Base64 Image Input: Supports direct embedding of Base64 encoded images
- Search Type Optimization: Supports input_type parameter to specify query/document type
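A minimal sketch of what a request exercising these features might look like. The field names (input_type, embedding_types, output_dimension), the model name "embed-v4.0", and the Base64 data-URI image format follow my understanding of Cohere's public API; verify them against the current API reference before relying on them. The sketch only constructs request bodies and sends nothing.

```python
import base64
import json

def build_text_embed_request(texts, purpose="search_document", dims=1024):
    # Assumed request shape for Cohere's embed endpoint (confirm in the API docs).
    return {
        "model": "embed-v4.0",
        "texts": texts,
        "input_type": purpose,          # "search_query" or "search_document"
        "embedding_types": ["float"],   # also int8 / uint8 / binary / ubinary
        "output_dimension": dims,       # configurable in the 256-1536 range
    }

def build_image_embed_request(image_bytes, mime="image/png"):
    # Images are passed as Base64-encoded data URIs.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "embed-v4.0",
        "images": [f"data:{mime};base64,{b64}"],
        "input_type": "image",
        "embedding_types": ["float"],
    }

body = build_text_embed_request(["What is vector search?"], purpose="search_query")
print(json.dumps(body, indent=2))
```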
Technical Specifications
- Maximum Dimension: 1536
- Minimum Dimension: 256
- Maximum Input: 128,000 tokens
- Quantization Options: float / int8 / uint8 / binary / ubinary
- Storage Compression: Up to 48x compression with binary quantization
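The storage savings from quantization and dimension reduction are simple arithmetic; the sketch below works through a few illustrative combinations (the exact headline ratio depends on which dimension/format pair is taken as the baseline).

```python
def storage_bytes(dims, fmt):
    # Bytes needed to store one vector of `dims` dimensions in a given format.
    bits_per_value = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}
    return dims * bits_per_value[fmt] // 8

full = storage_bytes(1536, "float")          # 6144 bytes per vector (float32 baseline)
binary_full = storage_bytes(1536, "binary")  # 192 bytes
binary_small = storage_bytes(256, "binary")  # 32 bytes (binary + Matryoshka truncation)

print(full // binary_full, full // binary_small)  # prints: 32 192
```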
Business Model
- Text Embedding: $0.12 / million tokens
- Image Embedding: $0.47 / million image tokens
- Deployment Options:
- Direct API calls via Cohere API
- AWS Bedrock
- Azure AI
- Oracle Cloud
- Google Cloud
- Trial Quota: Free trial quota available
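Given the per-token prices above, estimating embedding cost is straightforward arithmetic; the sketch below assumes cost scales linearly with token count.

```python
TEXT_PRICE_PER_MTOK = 0.12   # USD per million text tokens (from the pricing above)
IMAGE_PRICE_PER_MTOK = 0.47  # USD per million image tokens (from the pricing above)

def embed_cost(text_tokens=0, image_tokens=0):
    # Linear cost model: tokens / 1M * price per million tokens.
    return (text_tokens / 1e6) * TEXT_PRICE_PER_MTOK + \
           (image_tokens / 1e6) * IMAGE_PRICE_PER_MTOK

# Embedding a 10M-token text corpus plus 2M image tokens:
print(f"${embed_cost(10_000_000, 2_000_000):.2f}")  # prints: $2.14
```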
Target Users
- Developers of enterprise search and RAG systems
- Application developers requiring multimodal retrieval
- Multilingual content processing teams
- Enterprises needing deployment across multiple cloud platforms
- Teams working on recommendation systems and content classification
Competitive Advantages
- Highest MTEB benchmark score (65.2), leading in retrieval accuracy
- Mature multimodal capabilities (unified embedding for text + images)
- 128K ultra-long input support, capable of handling large documents
- Flexible quantization and compression options, reducing storage costs
- Multi-cloud platform deployment support (AWS, Azure, Oracle, GCP)
- Seamless integration with the Cohere Rerank product, offering a one-stop retrieval + reranking solution
Limitations
- Closed-source commercial model, cannot be deployed locally
- Text embedding price ($0.12/MTok) is relatively high
- Image embedding price ($0.47/MTok) is expensive
- Lacks customization flexibility compared to open-source models (e.g., BGE-M3)
Version Evolution
| Version | Release Date | Key Features |
|---|---|---|
| Embed v2 | 2023 | Basic multilingual embedding |
| Embed v3 | Late 2023 | Improved precision, int8/binary quantization |
| Embed v4 | January 2026 | Multimodal, 128K input, enhanced compression |
Relationship with OpenClaw Ecosystem
Cohere Embed can serve as the embedding solution for OpenClaw deployments aiming for the highest retrieval precision. Its multimodal capabilities enable OpenClaw to retrieve not only text but also image content, and the 128K ultra-long input support allows users to embed large documents without excessive chunking. Combined with the Cohere Rerank product, it can form a complete, high-precision retrieve-then-rerank pipeline.
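The retrieve-then-rerank pipeline described above can be sketched end to end. The embedding and reranking steps below are toy stand-ins (word-set overlap and a phrase-match boost) rather than Cohere's actual scoring, since the point is the two-stage structure: a cheap vector retrieval over the whole corpus, then an expensive reranker over only the top candidates.

```python
def toy_embed(text):
    # Stand-in for an embedding call: bag-of-words set (real systems use Embed v4 vectors).
    return set(text.lower().split())

def toy_score(query_vec, doc_vec):
    # Stand-in similarity: word overlap (real systems use cosine similarity).
    return len(query_vec & doc_vec)

def retrieve_then_rerank(query, docs, k_retrieve=3, k_final=1):
    # Stage 1: cheap retrieval over the whole corpus, keep top k_retrieve.
    q = toy_embed(query)
    candidates = sorted(docs, key=lambda d: toy_score(q, toy_embed(d)),
                        reverse=True)[:k_retrieve]
    # Stage 2: an expensive reranker (e.g. Cohere Rerank) rescores only the
    # candidates; here the "reranker" simply boosts exact phrase matches.
    reranked = sorted(candidates,
                      key=lambda d: (query.lower() in d.lower(),
                                     toy_score(q, toy_embed(d))),
                      reverse=True)
    return reranked[:k_final]

docs = [
    "vector search with embeddings",
    "cooking pasta at home",
    "embedding models for search and retrieval",
]
print(retrieve_then_rerank("embedding search", docs))
```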