A Chunked-Object Pattern for Multi-Region Large Payload Storage in Managed NoSQL Databases
By: Manideep Reddy Chinthareddy
Potential Business Impact:
Stores big files in databases without delays.
Many managed key-value and NoSQL databases, such as Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Firestore, enforce strict maximum item sizes (e.g., 400 KB in DynamoDB). This constraint creates significant architectural challenges for applications that require low-latency, multi-region access to objects exceeding these limits. The standard industry recommendation is to offload payloads to object storage (e.g., Amazon S3) while retaining a pointer in the database. Although cost-efficient, this "pointer pattern" introduces network overhead and exposes applications to non-deterministic replication lag between the database and the object store, creating race conditions in active-active architectures. This paper presents a "chunked-object" pattern that persists large logical entities as sets of ordered chunks within the database itself. We precisely define the pattern and provide a reference implementation using Amazon DynamoDB Global Tables. The design generalizes to any key-value store with per-item size limits and multi-region replication. We evaluate the approach using telemetry from a production system processing over 200,000 transactions per hour. Results demonstrate that the chunked-object pattern eliminates cross-system replication lag hazards and reduces p99 cross-region time-to-consistency for 1 MB payloads by keeping data and metadata within a single consistency domain.
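The core idea of the chunked-object pattern can be illustrated with a minimal Python sketch. This is not the paper's reference implementation: the key layout (`chunk#NNNNNN`, `manifest`), the 350 KB chunk size, the SHA-256 checksum, and the function names `put_object` and `get_object` are all assumptions made for illustration, and a plain in-memory dictionary stands in for the database table. The sketch splits a payload into ordered chunks that each fit under an assumed 400 KB item limit, writes a manifest last, and treats an object as readable only once every chunk listed in the manifest is present.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Assumed chunk size: leaves headroom under a 400 KB per-item limit,
# since key and attribute names also count toward item size.
CHUNK_SIZE = 350 * 1024

# Hypothetical stand-in for a table keyed by (partition key, sort key).
Store = Dict[Tuple[str, str], dict]


def put_object(store: Store, object_id: str, payload: bytes,
               chunk_size: int = CHUNK_SIZE) -> int:
    """Write payload as ordered chunks plus a manifest; return the chunk count."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    for index, data in enumerate(chunks):
        # Zero-padded index keeps chunks in order under a lexicographic sort key.
        store[(object_id, f"chunk#{index:06d}")] = {"data": data}
    # The manifest is written last so that readers can tell when an object
    # is complete (an assumed convention, not taken from the paper).
    store[(object_id, "manifest")] = {
        "chunk_count": len(chunks),
        "size": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    return len(chunks)


def get_object(store: Store, object_id: str) -> Optional[bytes]:
    """Reassemble the payload, or return None if it is not yet fully visible."""
    manifest = store.get((object_id, "manifest"))
    if manifest is None:
        return None
    parts = []
    for index in range(manifest["chunk_count"]):
        item = store.get((object_id, f"chunk#{index:06d}"))
        if item is None:
            # A missing chunk is treated as "not yet consistent".
            return None
        parts.append(item["data"])
    payload = b"".join(parts)
    if hashlib.sha256(payload).hexdigest() != manifest["sha256"]:
        return None
    return payload


if __name__ == "__main__":
    table: Store = {}
    blob = bytes(1024 * 1024)  # 1 MB payload
    print(put_object(table, "order-123", blob))   # number of chunks written
    print(get_object(table, "order-123") == blob)
```

Because the chunks and the manifest live in the same table, a reader in another region only has to wait on one replication stream; in this sketch a partially replicated object simply reads as absent until all of its items have arrived.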