Rate Limits

Learn about how fulfillmenttools handles load and how your client should be designed to allow for seamless scaling


This page is outdated. Please go to our new documentation under https://docs.fulfillmenttools.com/documentation.

Currently the fulfillmenttools APIs have a rate limit of roughly 1000 requests per second. This limit can be raised on demand. We also reserve the right to impose a limit lower than 1000 requests per second in the future.

Consider a system that provides APIs as the primary way to interact with it. API calls ("requests") have to be issued from the client side in order to create, update, or read data on the platform. When the number of requests rises, the underlying infrastructure either needs to block calls (e.g. impose rate limits so that a fixed set of resources, such as a single instance, is sufficient) or it needs to scale with the load. fulfillmenttools decided for the latter: the platform scales with the load imposed by the clients using the API.

That means the usage of the API can, in general, scale with the business of our customers. To provide the necessary service level, the platform transparently scales the required services horizontally, which means that new containers which can answer requests have to be created. This is where the HTTP response code 429: Too many requests comes into play, potentially well before 1000 requests per second are reached, as you will learn in the following.

Scaling behavior under load

Low Load Phase

When there is low load, the provided resources are at a minimum. During such a low load phase, a single instance is enough to handle all incoming calls. That makes sense: when there is nothing to do, why provide resources that are idling?

Entering a High Load Phase

At the beginning of a high load phase we also enter the scale up phase. On the server side, one or more instances are started in order to provide the resources needed to handle the current load and the load anticipated for the near future. This happens fully automated and, of course, without any manual effort.

When entering a scale up phase, the API may respond to some requests with HTTP response code 429 (Too many requests). This does not necessarily mean that you have reached a rate limit; it means that the current call could not be processed due to a lack of resources.

When you receive this response to a call, an additional instance is already starting. However, you need to re-issue the request in order to have it processed.

In order to issue retries in case of the described HTTP response code 429: Too many requests, we suggest implementing a retry mechanism based on the Exponential Backoff algorithm (see Best Practice: Use Exponential Backoffs below).

Over time, the needed resources are provided and take over some of the load. In this example, two instances are now able to handle incoming requests.

"Will all my requests fail with '429: Too many requests' during a scale up phase?!"

Of course not. Since the typical usage pattern of an API does not show sudden increases in load that double or even quadruple the number of requests, the better part of all requests is answered successfully by the existing instance(s).

However, best practice is that a client should definitely be able to cope with the situation depicted above (see Best Practice: Use Exponential Backoffs).

During a High Load Phase

During times of high traffic and high request counts, the provided resources handle the incoming calls. If the load rises further, additional instances are provided and the behavior is similar to the one described in Entering a High Load Phase. However, the percentage of affected calls decreases further, as the ratio of operationally available instances rises over time.

Leaving a High Load Phase

When the usage of the API drops again and the load lowers, instances that are no longer needed are shut down to save resources. This happens completely transparently to the clients issuing requests. After the load is gone, a single instance is once again enough to handle the load in this example.

"When does the system scale / add more instances?"

There is no definitive answer to this question. It depends on multiple parameters, such as the complexity of the call, the required CPU or memory, the number and type of parallel calls that need to be processed, and the number of currently available instances.

There is also a limit to the number of concurrent requests that one instance is able to handle. Whenever an instance is about to reach this value, another instance is created.

So the true answer to the question is (unfortunately): it depends. But the good news is that, following our Best Practice below, this should not be an issue for any given client.

Best Practice: Use Exponential Backoffs

tl;dr: Exponential backoff, in the context of web calls, is the idea of re-issuing failed requests at a decreasing rate over time, i.e. with increasing delays between retries.

One example: A request is issued and is answered with 429: Too many requests.

The first retry is then done after 1 second. If the answer is still 429, the next retry is issued after 3 seconds. If the answer is still 429 after that call, the next request is done after 10 seconds, and so on.

This pattern continues until either the service answers with a positive response code or a (client-induced) time limit is reached, which would trigger fail-over strategies.

The example above is, by all means, an extreme example. There are good chances that a request reaches an instance with available resources after the first call already. However, it merely serves as an example for exponential backoff in action.
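To make the pattern concrete, the following is a minimal sketch of such a retry loop in TypeScript. It assumes a runtime with a global fetch (e.g. Node 18+); the endpoint, delays, and attempt count are illustrative values, not values prescribed by the platform:

```typescript
// Minimal sketch of an exponential backoff retry loop for 429 responses.
// Delays and the number of attempts are arbitrary example values.
const RETRY_DELAYS_MS = [1_000, 3_000, 10_000, 30_000];

async function fetchWithBackoff(url: string, init?: RequestInit): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, init);

    // Anything other than 429 is handed back to the caller.
    if (response.status !== 429) {
      return response;
    }

    // Give up once the client-defined retry budget is exhausted (fail-over strategy).
    if (attempt >= RETRY_DELAYS_MS.length) {
      throw new Error(`Request to ${url} still answered with 429 after ${attempt} retries`);
    }

    // Wait before re-issuing the request.
    await new Promise((resolve) => setTimeout(resolve, RETRY_DELAYS_MS[attempt]));
  }
}
```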

This approach might sound like over-complicating things. This is in fact not the case.

On the contrary: implementing such mechanisms adds substantial resilience to the system, as your client is aware of, and able to cope with, the situation that the functionality of a remote service is unavailable for a brief period of time. This can and will - rarely, but still - happen in any distributed system!

Luckily, you do not need to implement exponential backoff yourself. There are powerful libraries out there that do a close-to-perfect job for this problem. Here are some libraries that we found helpful, but of course you are free to choose another one or write your own implementation:

Library | Programming Language | Link
Resilience4J | Java | https://github.com/resilience4j/resilience4j
exponential-backoff | Node | https://www.npmjs.com/package/exponential-backoff
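For illustration, usage of the Node library from the table above could look roughly like the following sketch. The endpoint is hypothetical, authentication headers are omitted, and the exact option names should be verified against the package documentation:

```typescript
import { backOff } from "exponential-backoff";

// Hypothetical call against some API endpoint, retried with exponential backoff.
async function getWithBackoff(url: string): Promise<Response> {
  return backOff(
    async () => {
      const res = await fetch(url);
      if (res.status === 429) {
        // Throwing makes the library schedule a retry.
        throw new Error("429: Too many requests");
      }
      return res;
    },
    {
      numOfAttempts: 5,    // give up after 5 attempts in total
      startingDelay: 1000, // first retry after roughly 1 second
      timeMultiple: 3,     // delays grow roughly as 1s, 3s, 9s, ...
    }
  );
}
```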

Best practice: Apply retries for other HTTP status codes

Once you have an exponential retry mechanism in place, it makes sense to also leverage it for other HTTP status codes. Among these codes are:

  • 500 Internal Server Error

  • 502 Bad Gateway

  • 503 Service Unavailable

  • 504 Gateway Timeout

  • 408 Request Timeout

This again adds resilience to the system and allows for automated recovery from error states caused by connectivity issues or temporary downtime.
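Building on the earlier sketch, the retry decision can be expressed as a predicate over the status codes listed above (plus 429) instead of checking for 429 only:

```typescript
// Status codes for which a retry with exponential backoff is reasonable.
const RETRYABLE_STATUS_CODES = new Set([408, 429, 500, 502, 503, 504]);

function isRetryable(response: Response): boolean {
  return RETRYABLE_STATUS_CODES.has(response.status);
}
```

In the earlier fetchWithBackoff sketch, the check against 429 would then simply be replaced by a call to isRetryable(response).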
