Rate Limits
All API endpoints are protected by rate limiting to ensure fair usage and system stability. Rate limits are applied per API key and reset automatically after the time window expires.
Rate Limit Types
We apply three tiers of rate limiting, depending on the endpoint's computational cost:
1. Upload Rate Limits (Dual Protection)
Upload endpoints use dual rate limiting with both burst protection and sustained limits:
| Limit Type | Window | Requests | Purpose |
|---|---|---|---|
| Burst Protection | 10 seconds | 10 | Prevents rapid-fire abuse |
| Sustained Limit | 1 minute | 20 | Overall quota protection |
How it works:
- You can upload up to 10 files quickly (within 10 seconds)
- After that, you must wait for the 10-second window to reset
- Maximum 20 uploads total per minute
- Both limits reset independently
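The two windows above can be mirrored client-side to pace uploads before the API ever rejects them. A minimal sketch (the class name and structure are illustrative, not part of the API):

```javascript
// Client-side mirror of the dual upload limits: 10 per 10 seconds (burst)
// and 20 per 60 seconds (sustained). A request is allowed only when both
// windows have room; timestamps are pruned as the windows slide.
class DualWindowLimiter {
  constructor() {
    this.timestamps = [];
  }

  // Count requests made within the last `windowMs` milliseconds.
  countInWindow(windowMs, now) {
    return this.timestamps.filter((t) => t > now - windowMs).length;
  }

  // True if a new request would pass both the burst and sustained limits.
  canRequest(now = Date.now()) {
    return (
      this.countInWindow(10_000, now) < 10 &&
      this.countInWindow(60_000, now) < 20
    );
  }

  record(now = Date.now()) {
    this.timestamps.push(now);
    // Drop anything older than the longest window to bound memory.
    this.timestamps = this.timestamps.filter((t) => t > now - 60_000);
  }
}
```

Because the limits reset independently, a client that bursts 10 uploads must wait for the 10-second window, and can still only reach 20 total within the minute.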
Affected Endpoints:
- POST /v1/public/documents/upload-presigned
- POST /v1/public/documents/upload-presigned-finalize
- POST /v1/public/documents/upload-direct
- POST /v1/public/documents/upload-from-url
2. Search Rate Limits (Strict)
Search operations are computationally expensive and use strict rate limiting:
| Limit Type | Window | Requests |
|---|---|---|
| Search Operations | 1 minute | 10 |
Affected Endpoints:
- POST /v1/public/searches/dense
- POST /v1/public/searches/sparse
- POST /v1/public/searches/hybrid
- POST /v1/public/searches/builtin-rerank
3. Read Rate Limits (Generous)
Read operations are lightweight and have generous limits:
| Limit Type | Window | Requests |
|---|---|---|
| Read Operations | 1 minute | 100 |
Affected Endpoints:
- GET /v1/public/documents/:documentId
- GET /v1/public/documents
- GET /v1/public/documents/upload-presigned/:uploadId
- DELETE /v1/public/documents/:documentId
- GET /v1/public/searches/:searchId
- GET /v1/public/searches/:searchId/results
- GET /v1/public/searches/groups/:searchGroupId
- GET /v1/public/searches/groups/:searchGroupId/results
- GET /v1/public/searches/groups
Complete Rate Limit Reference
| Endpoint | Method | Burst Limit | Minute Limit | Notes |
|---|---|---|---|---|
| /v1/public/documents/upload-presigned | POST | 10 per 10s | 20 per min | Generate presigned URL |
| /v1/public/documents/upload-presigned/:uploadId | GET | - | 100 per min | Verify upload |
| /v1/public/documents/upload-presigned-finalize | POST | 10 per 10s | 20 per min | Finalize upload |
| /v1/public/documents/upload-direct | POST | 10 per 10s | 20 per min | Direct file upload |
| /v1/public/documents/upload-from-url | POST | 10 per 10s | 20 per min | Upload from URL |
| /v1/public/documents/:documentId | GET | - | 100 per min | Get document details |
| /v1/public/documents/:documentId | DELETE | - | 100 per min | Delete document |
| /v1/public/documents | GET | - | 100 per min | List documents |
| /v1/public/searches/dense | POST | - | 10 per min | Dense search |
| /v1/public/searches/sparse | POST | - | 10 per min | Sparse search |
| /v1/public/searches/hybrid | POST | - | 10 per min | Hybrid search |
| /v1/public/searches/builtin-rerank | POST | - | 10 per min | Builtin-rerank search |
| /v1/public/searches/:searchId | GET | - | 100 per min | Get search status |
| /v1/public/searches/:searchId/results | GET | - | 100 per min | Get search results |
| /v1/public/searches/groups/:searchGroupId | GET | - | 100 per min | Get group status |
| /v1/public/searches/groups/:searchGroupId/results | GET | - | 100 per min | Get group results |
| /v1/public/searches/groups | GET | - | 100 per min | List search groups |
Rate Limit Headers
All API responses include rate limit headers so you can track your usage:
RateLimit-Limit: 10
RateLimit-Remaining: 7
RateLimit-Reset: 1699884000

| Header | Description | Example |
|---|---|---|
| RateLimit-Limit | Maximum requests allowed in the window | 10 |
| RateLimit-Remaining | Requests remaining in current window | 7 |
| RateLimit-Reset | Unix timestamp when the limit resets | 1699884000 |
Convert reset time to human-readable:
const resetTime = new Date(headers["RateLimit-Reset"] * 1000);
console.log("Rate limit resets at:", resetTime.toLocaleString());

Rate Limit Exceeded (HTTP 429)
When you exceed a rate limit, the API returns HTTP 429 Too Many Requests with details about which limit you hit:
Upload Rate Limit (Burst)
{
"error": "Rate limit exceeded",
"message": "Too many requests too quickly. Maximum 10 uploads per 10 seconds allowed.",
"limit": 10,
"window": "10 seconds",
"retryAfter": "8",
"hint": "Slow down! You're making requests too fast. Wait a few seconds between uploads."
}

Upload Rate Limit (Sustained)
{
"error": "Rate limit exceeded",
"message": "Too many upload requests. Maximum 20 uploads per minute allowed per API key.",
"limit": 20,
"window": "1 minute",
"retryAfter": "23",
"hint": "You've used your minute quota. Wait for the limit to reset or implement client-side queueing."
}

Search Rate Limit
{
"error": "Rate limit exceeded",
"message": "Too many search requests. Maximum 10 searches per minute allowed.",
"limit": 10,
"window": "1 minute",
"retryAfter": "45",
"hint": "Search operations are computationally expensive. Please reduce query frequency."
}

Read Rate Limit
{
"error": "Rate limit exceeded",
"message": "Too many read requests. Maximum 100 reads per minute allowed.",
"limit": 100,
"window": "1 minute",
"retryAfter": "12"
}

Handling Rate Limits
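Each 429 body above carries a retryAfter value (a string of seconds). A minimal sketch that honors it, with a fallback delay when the field is missing (the function names and the 5-second fallback are illustrative):

```javascript
// Extracts the wait time in milliseconds from a 429 body like the
// examples above. retryAfter is a string of seconds; fall back to 5s
// if it is absent or unparsable.
function retryDelayMs(body) {
  const seconds = parseInt(body.retryAfter, 10);
  return (Number.isNaN(seconds) ? 5 : seconds) * 1000;
}

// Sketch of a wrapper that retries once after a 429, waiting exactly as
// long as the server asks. Assumes a global fetch (Node 18+ or browser).
async function fetchRespectingRetryAfter(url, options) {
  const response = await fetch(url, options);
  if (response.status !== 429) return response;
  const body = await response.json();
  await new Promise((resolve) => setTimeout(resolve, retryDelayMs(body)));
  return fetch(url, options);
}
```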
1. Respect the Headers
Always check rate limit headers before making requests:
async function makeRequest(url, options) {
const response = await fetch(url, options);
// Check remaining requests
const remaining = parseInt(response.headers.get("RateLimit-Remaining"));
const reset = parseInt(response.headers.get("RateLimit-Reset"));
console.log(`Requests remaining: ${remaining}`);
if (remaining === 0) {
const waitTime = Math.max(0, reset * 1000 - Date.now()); // clamp in case the window already reset
console.log(`Rate limit reached. Waiting ${waitTime}ms...`);
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
return response;
}

2. Implement Exponential Backoff
When you hit a rate limit, wait before retrying:
async function requestWithBackoff(url, options, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
const response = await fetch(url, options);
if (response.status !== 429) {
return response;
}
// Rate limited - wait before retry
const retryAfter = parseInt(response.headers.get("Retry-After") || "5");
const waitTime = retryAfter * 1000 * Math.pow(2, i); // Exponential backoff
console.log(
`Rate limited. Waiting ${waitTime}ms before retry ${i + 1}/${maxRetries}...`
);
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
throw new Error("Max retries reached");
}

3. Use Request Queuing
For bulk operations, implement a queue to respect rate limits:
class RateLimitedQueue {
constructor(limit, windowMs) {
this.limit = limit;
this.windowMs = windowMs;
this.queue = [];
this.processing = false;
}
async add(fn) {
return new Promise((resolve, reject) => {
this.queue.push({ fn, resolve, reject });
this.process();
});
}
async process() {
if (this.processing || this.queue.length === 0) return;
this.processing = true;
const timestamps = [];
while (this.queue.length > 0) {
const now = Date.now();
// Remove timestamps outside the window
const cutoff = now - this.windowMs;
while (timestamps.length > 0 && timestamps[0] < cutoff) {
timestamps.shift();
}
// Check if we can make a request
if (timestamps.length < this.limit) {
const { fn, resolve, reject } = this.queue.shift();
timestamps.push(now);
try {
const result = await fn();
resolve(result);
} catch (error) {
reject(error);
}
} else {
// Wait until oldest timestamp expires
const waitTime = timestamps[0] + this.windowMs - now;
await new Promise((resolve) => setTimeout(resolve, waitTime));
}
}
this.processing = false;
}
}
// Usage for uploads (20 per minute)
const uploadQueue = new RateLimitedQueue(20, 60000);
// Queue upload requests
const results = await Promise.all(
files.map((file) => uploadQueue.add(() => uploadFile(file)))
);

4. Monitor Your Usage
Track your API usage to avoid hitting limits:
class RateLimitMonitor {
constructor() {
this.requests = {
uploads: [],
searches: [],
reads: [],
};
}
trackRequest(type) {
const now = Date.now();
this.requests[type].push(now);
// Clean old entries (older than 1 minute)
const cutoff = now - 60000;
this.requests[type] = this.requests[type].filter((t) => t > cutoff);
}
getUsage(type) {
const limits = {
uploads: 20,
searches: 10,
reads: 100,
};
const used = this.requests[type].length;
const limit = limits[type];
const remaining = limit - used;
return {
used,
limit,
remaining,
percentUsed: (used / limit) * 100,
};
}
canMakeRequest(type) {
const usage = this.getUsage(type);
return usage.remaining > 0;
}
}
// Usage
const monitor = new RateLimitMonitor();
async function uploadWithMonitoring(file) {
if (!monitor.canMakeRequest("uploads")) {
throw new Error("Upload rate limit would be exceeded");
}
const result = await uploadFile(file);
monitor.trackRequest("uploads");
const usage = monitor.getUsage("uploads");
console.log(
`Uploads: ${usage.used}/${usage.limit} (${usage.remaining} remaining)`
);
return result;
}

Rate Limit Increase Requests
Need higher limits for your use case? Contact us at support@floreal.ai with:
- Your API key (first 8 characters only)
- Current usage patterns (requests per minute/hour)
- Desired limits (what you need and why)
- Use case description (what you're building)
We review requests on a case-by-case basis and can provision higher limits for legitimate use cases.
Summary
| Operation Type | Limit | Window | Best Practice |
|---|---|---|---|
| Uploads | 10 burst, 20 sustained | 10s, 60s | Space uploads 3s apart |
| Searches | 10 | 60s | Queue searches with 6s delay |
| Reads | 100 | 60s | Cache aggressively |
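The spacing recommendations above follow directly from each limit: dividing the window by the allowed request count gives the minimum interval between evenly spaced requests. A small helper (the function name is illustrative):

```javascript
// Minimum interval (ms) between evenly spaced requests that stays
// under `limit` requests per `windowMs` window.
function safeIntervalMs(limit, windowMs) {
  return Math.ceil(windowMs / limit);
}

// From the summary table:
safeIntervalMs(20, 60_000); // uploads: 3000ms, i.e. space uploads 3s apart
safeIntervalMs(10, 60_000); // searches: 6000ms, i.e. a 6s delay between searches
```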
Key Takeaways:
- ✅ Always check the RateLimit-Remaining header
- ✅ Implement exponential backoff for 429 responses
- ✅ Use request queuing for bulk operations
- ✅ Cache read results to reduce API calls
- ✅ Poll status endpoints every 5-10 seconds, not faster
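The last takeaway can be sketched as a simple polling loop around a status endpoint such as GET /v1/public/searches/:searchId. The status values and the getStatus wrapper are assumptions for illustration, not part of the documented API:

```javascript
// Polls an async getStatus function every `intervalMs` until it reports a
// terminal state. A 5-second interval stays well inside the
// 100-reads-per-minute limit even with several concurrent polls.
async function pollUntilDone(getStatus, intervalMs = 5000, maxAttempts = 60) {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await getStatus();
    // "completed" / "failed" are assumed terminal states.
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Polling timed out");
}
```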