Rate Limits
All API endpoints are protected by rate limiting to ensure fair usage and system stability. Rate limits are applied per API key and reset automatically after the time window expires.
Rate Limit Types
We apply three tiers of rate limiting, depending on the endpoint's computational cost:
1. Upload Rate Limits (Dual Protection)
Upload endpoints use dual rate limiting with both burst protection and sustained limits:
| Limit Type | Window | Requests | Purpose |
|---|---|---|---|
| Burst Protection | 10 seconds | 10 | Prevents rapid-fire abuse |
| Sustained Limit | 1 minute | 20 | Overall quota protection |
How it works:
- You can upload up to 10 files quickly (within 10 seconds)
- After that, you must wait for the 10-second window to reset
- Maximum 20 uploads total per minute
- Both limits reset independently
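The two windows above can be mirrored client-side to pace uploads before the API ever rejects them. A minimal sketch (the class name and structure are illustrative, not part of the API):

```javascript
// Client-side mirror of the dual upload limits: 10 per 10 seconds (burst)
// and 20 per 60 seconds (sustained). A request is allowed only when both
// windows have room; timestamps are pruned as the windows slide.
class DualWindowLimiter {
  constructor() {
    this.timestamps = [];
  }

  // Count requests made within the last `windowMs` milliseconds.
  countInWindow(windowMs, now) {
    return this.timestamps.filter((t) => t > now - windowMs).length;
  }

  // True if a new request would pass both the burst and sustained limits.
  canRequest(now = Date.now()) {
    return (
      this.countInWindow(10_000, now) < 10 &&
      this.countInWindow(60_000, now) < 20
    );
  }

  record(now = Date.now()) {
    this.timestamps.push(now);
    // Drop anything older than the longest window to bound memory.
    this.timestamps = this.timestamps.filter((t) => t > now - 60_000);
  }
}
```

Because the limits reset independently, a client that bursts 10 uploads must wait for the 10-second window, and can still only reach 20 total within the minute.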
Affected Endpoints:
- POST /v1/public/documents/upload-presigned
- POST /v1/public/documents/upload-presigned-finalize
- POST /v1/public/documents/upload-direct
- POST /v1/public/documents/upload-from-url
2. Search Rate Limits (Strict)
Search operations are computationally expensive and use strict rate limiting:
| Limit Type | Window | Requests |
|---|---|---|
| Search Operations | 1 minute | 10 |
Affected Endpoints:
- POST /v1/public/searches/dense
- POST /v1/public/searches/sparse
- POST /v1/public/searches/hybrid
- POST /v1/public/searches/builtin-rerank
3. Read Rate Limits (Generous)
Read operations are lightweight and have generous limits:
| Limit Type | Window | Requests |
|---|---|---|
| Read Operations | 1 minute | 100 |
Affected Endpoints:
- GET /v1/public/documents/:documentId
- GET /v1/public/documents
- GET /v1/public/documents/upload-presigned/:uploadId
- DELETE /v1/public/documents/:documentId
- GET /v1/public/searches/:searchId
- GET /v1/public/searches/:searchId/results
- GET /v1/public/searches/groups/:searchGroupId
- GET /v1/public/searches/groups/:searchGroupId/results
- GET /v1/public/searches/groups
Complete Rate Limit Reference
| Endpoint | Method | Burst Limit | Minute Limit | Notes |
|---|---|---|---|---|
| /v1/public/documents/upload-presigned | POST | 10 per 10s | 20 per min | Generate presigned URL |
| /v1/public/documents/upload-presigned/:uploadId | GET | - | 100 per min | Verify upload |
| /v1/public/documents/upload-presigned-finalize | POST | 10 per 10s | 20 per min | Finalize upload |
| /v1/public/documents/upload-direct | POST | 10 per 10s | 20 per min | Direct file upload |
| /v1/public/documents/upload-from-url | POST | 10 per 10s | 20 per min | Upload from URL |
| /v1/public/documents/:documentId | GET | - | 100 per min | Get document details |
| /v1/public/documents/:documentId | DELETE | - | 100 per min | Delete document |
| /v1/public/documents | GET | - | 100 per min | List documents |
| /v1/public/searches/dense | POST | - | 10 per min | Dense search |
| /v1/public/searches/sparse | POST | - | 10 per min | Sparse search |
| /v1/public/searches/hybrid | POST | - | 10 per min | Hybrid search |
| /v1/public/searches/builtin-rerank | POST | - | 10 per min | Builtin-rerank search |
| /v1/public/searches/:searchId | GET | - | 100 per min | Get search status |
| /v1/public/searches/:searchId/results | GET | - | 100 per min | Get search results |
| /v1/public/searches/groups/:searchGroupId | GET | - | 100 per min | Get group status |
| /v1/public/searches/groups/:searchGroupId/results | GET | - | 100 per min | Get group results |
| /v1/public/searches/groups | GET | - | 100 per min | List search groups |
Rate Limit Headers
All API responses include rate limit headers so you can track your usage:
RateLimit-Limit: 10
RateLimit-Remaining: 7
RateLimit-Reset: 1699884000

| Header | Description | Example |
|---|---|---|
| RateLimit-Limit | Maximum requests allowed in the window | 10 |
| RateLimit-Remaining | Requests remaining in current window | 7 |
| RateLimit-Reset | Unix timestamp when the limit resets | 1699884000 |
Convert reset time to human-readable:
const resetTime = new Date(headers["RateLimit-Reset"] * 1000);
console.log("Rate limit resets at:", resetTime.toLocaleString());

Rate Limit Exceeded (HTTP 429)
When you exceed a rate limit, the API returns HTTP 429 Too Many Requests with details about which limit you hit:
Upload Rate Limit (Burst)
{
"error": "Rate limit exceeded",
"message": "Too many requests too quickly. Maximum 10 uploads per 10 seconds allowed.",
"limit": 10,
"window": "10 seconds",
"retryAfter": "8",
"hint": "Slow down! You're making requests too fast. Wait a few seconds between uploads."
}

Upload Rate Limit (Sustained)
{
"error": "Rate limit exceeded",
"message": "Too many upload requests. Maximum 20 uploads per minute allowed per API key.",
"limit": 20,
"window": "1 minute",
"retryAfter": "23",
"hint": "You've used your minute quota. Wait for the limit to reset or implement client-side queueing."
}

Search Rate Limit
{
"error": "Rate limit exceeded",
"message": "Too many search requests. Maximum 10 searches per minute allowed.",
"limit": 10,
"window": "1 minute",
"retryAfter": "45",
"hint": "Search operations are computationally expensive. Please reduce query frequency."
}

Read Rate Limit
{
"error": "Rate limit exceeded",
"message": "Too many read requests. Maximum 100 reads per minute allowed.",
"limit": 100,
"window": "1 minute",
"retryAfter": "12"
}

Handling Rate Limits
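Each 429 body above carries a retryAfter value (a string of seconds). A minimal sketch that honors it, with a fallback delay when the field is missing (the function names and the 5-second fallback are illustrative):

```javascript
// Extracts the wait time in milliseconds from a 429 body like the
// examples above. retryAfter is a string of seconds; fall back to 5s
// if it is absent or unparsable.
function retryDelayMs(body) {
  const seconds = parseInt(body.retryAfter, 10);
  return (Number.isNaN(seconds) ? 5 : seconds) * 1000;
}

// Sketch of a wrapper that retries once after a 429, waiting exactly as
// long as the server asks. Assumes a global fetch (Node 18+ or browser).
async function fetchRespectingRetryAfter(url, options) {
  const response = await fetch(url, options);
  if (response.status !== 429) return response;
  const body = await response.json();
  await new Promise((resolve) => setTimeout(resolve, retryDelayMs(body)));
  return fetch(url, options);
}
```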
1. Respect the Headers
Always check rate limit headers before making requests:
async function makeRequest(url, options) {
const response = await fetch(url, options);
// Check remaining requests
const remaining = parseInt(response.headers.get("RateLimit-Remaining"));
const reset = parseInt(response.headers.get("RateLimit-Reset"));
console.log(`Requests remaining: ${remaining}`);
if (remaining === 0) {
const waitTime = Math.max(0, reset * 1000 - Date.now()); // clamp in case the window already reset
console.log(`Rate limit reached. Waiting ${waitTime}ms...`);
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
return response;
}

2. Implement Exponential Backoff
When you hit a rate limit, wait before retrying:
async function requestWithBackoff(url, options, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
const response = await fetch(url, options);
if (response.status !== 429) {
return response;
}
// Rate limited - wait before retry
const retryAfter = parseInt(response.headers.get("Retry-After") || "5");
const waitTime = retryAfter * 1000 * Math.pow(2, i); // Exponential backoff
console.log(
`Rate limited. Waiting ${waitTime}ms before retry ${i + 1}/${maxRetries}...`
);
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
throw new Error("Max retries reached");
}

3. Use Request Queuing
For bulk operations, implement a queue to respect rate limits:
class RateLimitedQueue {
constructor(limit, windowMs) {
this.limit = limit;
this.windowMs = windowMs;
this.queue = [];
this.processing = false;
}
async add(fn) {
return new Promise((resolve, reject) => {
this.queue.push({ fn, resolve, reject });
this.process();
});
}
async process() {
if (this.processing || this.queue.length === 0) return;
this.processing = true;
const timestamps = [];
while (this.queue.length > 0) {
const now = Date.now();
// Remove timestamps outside the window
const cutoff = now - this.windowMs;
while (timestamps.length > 0 && timestamps[0] < cutoff) {
timestamps.shift();
}
// Check if we can make a request
if (timestamps.length < this.limit) {
const { fn, resolve, reject } = this.queue.shift();
timestamps.push(now);
try {
const result = await fn();
resolve(result);
} catch (error) {
reject(error);
}
} else {
// Wait until oldest timestamp expires
const waitTime = timestamps[0] + this.windowMs - now;
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
}
this.processing = false;
}
}
// Usage for uploads (20 per minute)
const uploadQueue = new RateLimitedQueue(20, 60000);
// Queue upload requests
const results = await Promise.all(
files.map((file) => uploadQueue.add(() => uploadFile(file)))
);

4. Monitor Your Usage
Track your API usage to avoid hitting limits:
class RateLimitMonitor {
constructor() {
this.requests = {
uploads: [],
searches: [],
reads: [],
};
}
trackRequest(type) {
const now = Date.now();
this.requests[type].push(now);
// Clean old entries (older than 1 minute)
const cutoff = now - 60000;
this.requests[type] = this.requests[type].filter((t) => t > cutoff);
}
getUsage(type) {
const limits = {
uploads: 20,
searches: 10,
reads: 100,
};
const used = this.requests[type].length;
const limit = limits[type];
const remaining = limit - used;
return {
used,
limit,
remaining,
percentUsed: (used / limit) * 100,
};
}
canMakeRequest(type) {
const usage = this.getUsage(type);
return usage.remaining > 0;
}
}
// Usage
const monitor = new RateLimitMonitor();
async function uploadWithMonitoring(file) {
if (!monitor.canMakeRequest("uploads")) {
throw new Error("Upload rate limit would be exceeded");
}
const result = await uploadFile(file);
monitor.trackRequest("uploads");
const usage = monitor.getUsage("uploads");
console.log(
`Uploads: ${usage.used}/${usage.limit} (${usage.remaining} remaining)`
);
return result;
}

Rate Limit Increase Requests
Need higher limits for your use case? Contact us at support@floreal.ai with:
- Your API key (first 8 characters only)
- Current usage patterns (requests per minute/hour)
- Desired limits (what you need and why)
- Use case description (what you're building)
We review requests on a case-by-case basis and can provision higher limits for legitimate use cases.
Summary
| Operation Type | Limit | Window | Best Practice |
|---|---|---|---|
| Uploads | 10 burst, 20 sustained | 10s, 60s | Space uploads 3s apart |
| Searches | 10 | 60s | Queue searches with 6s delay |
| Reads | 100 | 60s | Cache aggressively |
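The spacing recommendations above follow directly from each limit: dividing the window by the allowed request count gives the minimum interval between evenly spaced requests. A small helper (the function name is illustrative):

```javascript
// Minimum interval (ms) between evenly spaced requests that stays
// under `limit` requests per `windowMs` window.
function safeIntervalMs(limit, windowMs) {
  return Math.ceil(windowMs / limit);
}

// From the summary table:
safeIntervalMs(20, 60_000); // uploads: 3000ms, i.e. space uploads 3s apart
safeIntervalMs(10, 60_000); // searches: 6000ms, i.e. a 6s delay between searches
```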
Key Takeaways:
- ✅ Always check the RateLimit-Remaining header
- ✅ Implement exponential backoff for 429 responses
- ✅ Use request queuing for bulk operations
- ✅ Cache read results to reduce API calls
- ✅ Poll status endpoints every 5-10 seconds, not faster
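The last takeaway can be sketched as a simple polling loop around a status endpoint such as GET /v1/public/searches/:searchId. The status values and the getStatus wrapper are assumptions for illustration, not part of the documented API:

```javascript
// Polls an async getStatus function every `intervalMs` until it reports a
// terminal state. A 5-second interval stays well inside the
// 100-reads-per-minute limit even with several concurrent polls.
async function pollUntilDone(getStatus, intervalMs = 5000, maxAttempts = 60) {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await getStatus();
    // "completed" / "failed" are assumed terminal states.
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Polling timed out");
}
```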