mirror of
https://github.com/geoffsee/predict-otron-9001.git
synced 2025-09-08 22:46:44 +00:00
update docs
# Embeddings Engine

A high-performance text embeddings service that generates vector representations of text using state-of-the-art models. This crate wraps the FastEmbed library (the `fastembed` crate) to provide embeddings through OpenAI-compatible API endpoints; it adapts the OpenAI specification only partially.

## Overview

The embeddings-engine provides a standalone service for generating text embeddings that can be used for semantic search, similarity comparisons, and other NLP tasks. It's designed to be compatible with OpenAI's embeddings API format.
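
To make the similarity-comparison use case concrete, here is a minimal sketch of scoring two embedding vectors with cosine similarity. The vectors are made-up placeholders rather than real model output, and `cosine_similarity` is a hypothetical helper written for this example, not something the crate provides.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors; real embeddings would come from the service.
query = [0.1, 0.2, 0.3]
document = [0.2, 0.4, 0.6]
print(cosine_similarity(query, document))
```

Because the two placeholder vectors point in the same direction, the score is approximately 1.0; orthogonal vectors would score approximately 0.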

## Features

- **OpenAI-Compatible API**: `/v1/embeddings` endpoint matching OpenAI's specification
- **FastEmbed Integration**: Powered by the FastEmbed library for high-quality embeddings
- **Multiple Model Support**: Support for various embedding models
- **High Performance**: Optimized for fast embedding generation
- **Standalone Service**: Can run independently or as part of the predict-otron-9000 platform

## Building and Running

### Prerequisites

- Rust toolchain
- Internet connection for initial model downloads

### Standalone Server

```bash
cargo run --bin embeddings-engine --release
```

The service will start on port 8080 by default.

## API Usage

### Generate Embeddings

**Endpoint**: `POST /v1/embeddings`

**Request Body**:
```json
{
  "input": "Your text to embed",
  "model": "nomic-embed-text-v1.5"
}
```

**Response**:
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.1, 0.2, 0.3, ...]
    }
  ],
  "model": "nomic-embed-text-v1.5",
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
```
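
A client usually needs only the vectors from this response. The sketch below, which assumes the response shape documented above, pulls them out in `index` order. `extract_embeddings` is a hypothetical helper, and the sample payload is illustrative rather than captured from a running service.

```python
import json


def extract_embeddings(response: dict) -> list[list[float]]:
    """Return the embedding vectors from a response body, sorted by `index`."""
    if response.get("object") != "list":
        raise ValueError("unexpected response object type")
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative payload in the documented shape (values are placeholders).
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]}
  ],
  "model": "nomic-embed-text-v1.5",
  "usage": {"prompt_tokens": 0, "total_tokens": 0}
}
""")

vectors = extract_embeddings(sample)
print(len(vectors), len(vectors[0]))  # prints: 1 3
```

Sorting by `index` keeps each vector aligned with its position in the response even if the entries arrive out of order.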

### Example Usage

**Using cURL**:
```bash
curl -s http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The quick brown fox jumps over the lazy dog",
    "model": "nomic-embed-text-v1.5"
  }' | jq
```
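
The same request can also be assembled with Python's standard library alone, which may be handy when the `openai` package is not installed. This sketch only constructs the request object; actually sending it requires a running server. `build_embeddings_request` is a hypothetical helper, and the base URL assumes the default port.

```python
import json
import urllib.request


def build_embeddings_request(base_url: str, text: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /v1/embeddings endpoint."""
    body = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_embeddings_request(
    "http://localhost:8080",  # assumed default host and port
    "The quick brown fox jumps over the lazy dog",
    "nomic-embed-text-v1.5",
)
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as resp:
#     payload = json.load(resp)
```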

**Using Python OpenAI Client**:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="dummy"  # Not validated but required by client
)

response = client.embeddings.create(
    input="Your text here",
    model="nomic-embed-text-v1.5"
)

print(response.data[0].embedding)
```

## Configuration

The service can be configured through environment variables:
- `SERVER_PORT`: Port to run on (default: 8080)
- `RUST_LOG`: Logging level (default: info)
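
As an illustration of how a client script might stay in sync with these settings, the sketch below derives an API base URL from `SERVER_PORT`, falling back to the documented default. `base_url_from_env` and the `localhost` host are assumptions made for this example; the service's own configuration code is not shown in this README.

```python
import os
from typing import Mapping


def base_url_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Build the API base URL from SERVER_PORT, defaulting to 8080."""
    port = int(env.get("SERVER_PORT", "8080"))
    if not 0 < port < 65536:
        raise ValueError(f"SERVER_PORT out of range: {port}")
    return f"http://localhost:{port}/v1"


print(base_url_from_env({}))                       # default port
print(base_url_from_env({"SERVER_PORT": "9090"}))  # overridden port
```

The returned string can be passed as `base_url` to an OpenAI-style client.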

## Integration

This service is designed to work seamlessly with the predict-otron-9000 main server, but can also be deployed independently for dedicated embeddings workloads.