Service Limits for Momento Cache

Momento Cache protects itself and its customers by limiting the service resources any one account can consume. Every account, cache, and topic has service limits, which we call "guardrails" (like those on a curvy mountain road), to keep operations running smoothly. This page outlines the default service limits:

| Momento Cache limits | Value |
| --- | --- |
| API rate per cache (data plane) | 100 operations/s |
| API rate per customer (control plane) | 5 operations/s |
| Throughput per cache | 1 MB/s |
| Maximum item size | 1 MB |
| Max cache count (per account) | 10 |
| Time to live (TTL) | 1 day |
| Per collection (CDT) element size limit | 128 KB |
| Permissions per API key or token (hard limit) | 10 |
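As a rough illustration of how a client might check a write against these defaults before sending it, here is a minimal Python sketch. The names `DefaultLimits` and `check_item` are hypothetical and not part of any Momento SDK. The sketch assumes that "1MB" and "128KB" are binary units (1024-based) and that the 1 day TTL value is a maximum; the table above does not state either, so adjust the constants if your account differs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultLimits:
    """Default guardrails from the table above (hypothetical helper type)."""
    # Assumption: "1MB" is read as 1024 * 1024 bytes.
    max_item_bytes: int = 1024 * 1024
    # Assumption: the "1 day" TTL value is a maximum.
    max_ttl_seconds: int = 24 * 60 * 60
    # Assumption: "128KB" is read as 128 * 1024 bytes.
    max_collection_element_bytes: int = 128 * 1024


def check_item(size_bytes: int, ttl_seconds: int,
               limits: DefaultLimits = DefaultLimits()) -> list[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if size_bytes > limits.max_item_bytes:
        problems.append(
            f"item is {size_bytes} bytes, limit is {limits.max_item_bytes}")
    if ttl_seconds > limits.max_ttl_seconds:
        problems.append(
            f"TTL is {ttl_seconds} s, limit is {limits.max_ttl_seconds}")
    return problems


if __name__ == "__main__":
    # A 2 MiB item with a one-hour TTL exceeds only the size limit.
    for problem in check_item(2 * 1024 * 1024, 3600):
        print(problem)
```

Because these are soft limits that can be raised, keeping them in one configurable object rather than hard-coding them makes it easy to update the check after a limit increase.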

Region Availability

If your preferred provider or Region is not listed, contact us and let's talk.

Region Availability (AWS)

| Region Name | Region |
| --- | --- |
| US East (N. Virginia) | us-east-1 |
| US East (Ohio) | us-east-2 |
| US West (Oregon) | us-west-2 |
| Europe (Ireland) | eu-west-1 |
| Europe (Frankfurt) | eu-central-1 |
| Asia (Mumbai) | ap-south-1 |
| Asia (Tokyo) | ap-northeast-1 |
| Asia (Singapore) | ap-southeast-1 |

Soft limits and support

The limits on this page are soft limits that can be adjusted unless specifically marked as hard limits. If you need a limit adjusted, please reach out to Momento Support. Include your login email, the name of the cache(s) to be altered, the cloud provider and Region where the cache is located (e.g., AWS eu-west-1), and which limits from the list you’d like increased.

Operations

Service limits are based on the number of operations performed per second. Some cache data plane APIs can perform multiple operations in a single request.

Since multi-element operations are more efficient, these APIs are charged at a discounted 2:1 ratio: every two elements count as one operation toward the limiter, rounded up. For example, a SetAddElements request adding one or two elements costs one operation, while one adding three or four elements costs two operations, and so on.
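The 2:1 discount can be sketched in a few lines of Python. The function name `multi_element_cost` is hypothetical and only mirrors the arithmetic described above. The text does not say how an empty request is charged, so this sketch rejects element counts below one rather than guess.

```python
def multi_element_cost(element_count: int) -> int:
    """Operations charged for a multi-element request under the 2:1 discount.

    Every two elements count as one operation, rounded up.
    """
    if element_count < 1:
        # The charge for an empty request is not documented here.
        raise ValueError("element_count must be at least 1")
    # Integer ceiling division: 1-2 elements -> 1, 3-4 elements -> 2, ...
    return (element_count + 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, "elements ->", multi_element_cost(n), "operation(s)")
```

Integer arithmetic is used instead of `math.ceil(n / 2)` so the result never passes through floating point.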

The table below describes how the number of operations is calculated for each cache API.

| API Name | Multi-Element API | Operations |
| --- | --- | --- |
| Set | | 1 |
| Get | | 1 |
| Delete | | 1 |
| Increment | | 1 |
| Ping | | 1 |
| ItemGetType | | 1 |
| KeyExists | | 1 |
| KeysExist | Yes | Number of keys in request / 2 |
| SetIfNotExists | | 1 |
| UpdateTtl | | 1 |
| IncreaseTtl | | 1 |
| DecreaseTtl | | 1 |
| ItemGetTtl | | 1 |
| DictionaryFetch | Yes | Number of fields in response / 2, or 1 if dictionary is not found |
| DictionaryGetField | | 1 |
| DictionaryGetFields | Yes | Number of fields in request / 2 |
| DictionaryIncrement | | 1 |
| DictionaryRemoveField | | 1 |
| DictionaryRemoveFields | Yes | Number of fields in request / 2 |
| DictionarySetField | | 1 |
| DictionarySetFields | Yes | Number of fields in request / 2 |
| DictionaryLength | | 1 |
| ListFetch | Yes | Number of elements in response / 2, or 1 if list is not found |
| ListConcatenateBack | Yes | Number of elements in request / 2 |
| ListConcatenateFront | Yes | Number of elements in request / 2 |
| ListLength | | 1 |
| ListPopBack | | 1 |
| ListPopFront | | 1 |
| ListPushBack | | 1 |
| ListPushFront | | 1 |
| ListRemoveValue | | 1 |
| ListRetain | | 1 |
| SetAddElement | | 1 |
| SetAddElements | Yes | Number of elements in request / 2 |
| SetFetch | Yes | Number of elements in response / 2, or 1 if set is not found |
| SetRemoveElement | | 1 |
| SetRemoveElements | Yes | Number of elements in request / 2 |
| SetContainsElement | | 1 |
| SetContainsElements | Yes | Number of elements in request / 2 |
| SetLength | | 1 |
| SortedSetPutElement | | 1 |
| SortedSetPutElements | Yes | Number of elements in request / 2 |
| SortedSetFetchByRank | Yes | Number of elements in response / 2, or 1 if sorted set is not found |
| SortedSetFetchByScore | Yes | Number of elements in response / 2, or 1 if sorted set is not found |
| SortedSetGetScore | | 1 |
| SortedSetGetScores | Yes | Number of elements in request / 2 |
| SortedSetRemoveElement | | 1 |
| SortedSetRemoveElements | Yes | Number of elements in request / 2 |
| SortedSetGetRank | | 1 |
| SortedSetIncrementScore | | 1 |
| SortedSetLength | | 1 |
| SortedSetLengthByScore | | 1 |
Note: To further reduce the number of operations charged against your account, look into setting the read concern header to Express. This reduces the charged operations count to 0.8x the default value and reduces latencies for frequently accessed keys.
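To make the 0.8x factor concrete, here is a small Python sketch of the arithmetic. The function name `express_charged_operations` is hypothetical, and the sketch assumes the factor is applied as a plain multiplier on the default operation count; this page does not specify rounding or which requests qualify, so treat the result as an estimate only.

```python
from fractions import Fraction

# 0.8x expressed exactly as 4/5 to avoid floating-point rounding.
EXPRESS_FACTOR = Fraction(4, 5)


def express_charged_operations(default_operations: int) -> Fraction:
    """Estimated charge with the Express read concern.

    Assumption: the 0.8x factor is a plain multiplier on the default
    count; any rounding applied by the service is not documented here.
    """
    if default_operations < 0:
        raise ValueError("default_operations must not be negative")
    return default_operations * EXPRESS_FACTOR


if __name__ == "__main__":
    # 10 default-cost operations are estimated at 8 charged operations.
    print(express_charged_operations(10))
```

Using `Fraction` keeps the estimate exact for any input, which is convenient when summing many small charges.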