Advanced DynamoDB: Leveraging Streams and TTL for Real-Time Processing

Unlock advanced, real-time patterns in DynamoDB. Learn how to use DynamoDB Streams to trigger Lambda functions on data changes and how to implement Time To Live (TTL) for automatic data expiration.

DynamoDB is more than just a simple key-value database; it's a powerful engine for building high-performance, event-driven applications. Two of its most potent features for creating such systems are DynamoDB Streams and Time To Live (TTL).

By understanding and combining these features, you can unlock advanced, real-time data processing patterns without managing any servers.

What are DynamoDB Streams?

A DynamoDB Stream is a time-ordered sequence of item-level modifications (creates, updates, and deletes) in a DynamoDB table. Think of it as a changelog for your table. When you enable a stream on a table, DynamoDB captures every modification and records it in the stream.
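
Each stream record describes a single modification. Below is an abridged sketch of one record; the field names follow the documented record structure, while the table, keys, and values are hypothetical, and attribute values arrive in DynamoDB's typed format (for example, {'S': ...} for strings and {'N': ...} for numbers):

    # Abridged stream record for an update to a hypothetical Products table
    {
        'eventID': '4f5ea7...',
        'eventName': 'MODIFY',                 # INSERT, MODIFY, or REMOVE
        'eventSource': 'aws:dynamodb',
        'dynamodb': {
            'Keys':     {'productId': {'S': 'prod-001'}},
            'NewImage': {'productId': {'S': 'prod-001'}, 'price': {'N': '19.99'}},
            'OldImage': {'productId': {'S': 'prod-001'}, 'price': {'N': '24.99'}},
            'StreamViewType': 'NEW_AND_OLD_IMAGES'
        }
    }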

This is incredibly powerful because you can configure the stream to trigger an AWS Lambda function whenever a new event appears. This allows you to react to data changes in near real-time.

A Practical Example: Replicating Data to a Search Index

Imagine you have a Products table and you want to keep a search index in Amazon OpenSearch Service up to date. You can use a DynamoDB Stream to trigger a Lambda function that replicates the changes.

  1. Enable the Stream: In your Products table settings, enable DynamoDB Streams. A common configuration is New and old images (NEW_AND_OLD_IMAGES), which provides both the new version of the item and its previous version in each stream record.

  2. Create a Lambda Function: Write a Python Lambda function to process the stream records.

    from boto3.dynamodb.types import TypeDeserializer
    
    deserializer = TypeDeserializer()
    
    def handler(event, context):
        """Processes DynamoDB stream records and replicates them."""
        for record in event['Records']:
            if record['eventName'] in ('INSERT', 'MODIFY'):
                # Convert the DynamoDB-typed image into a plain Python dict
                new_image = {
                    key: deserializer.deserialize(value)
                    for key, value in record['dynamodb']['NewImage'].items()
                }
                # In a real application, you would index this document
                # in your OpenSearch cluster here.
                print(f"Indexing item: {new_image}")
                
            elif record['eventName'] == 'REMOVE':
                # Only the key attributes are needed to delete from the index
                old_keys = record['dynamodb']['Keys']
                print(f"Deleting item from index: {old_keys}")
    
  3. Configure the Trigger: In the Lambda console, add a trigger and select your DynamoDB table's stream as the source. (Steps 1 and 3 can also be scripted; see the boto3 sketch after this list.)
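
If you prefer code over the console, steps 1 and 3 can be done with a few boto3 calls. This is a minimal sketch; the function name is a hypothetical placeholder, and the Lambda function's execution role also needs permission to read from the stream:

    import boto3
    
    dynamodb = boto3.client('dynamodb')
    lambda_client = boto3.client('lambda')
    
    # Step 1: enable the stream with both new and old item images
    response = dynamodb.update_table(
        TableName='Products',
        StreamSpecification={
            'StreamEnabled': True,
            'StreamViewType': 'NEW_AND_OLD_IMAGES'
        }
    )
    stream_arn = response['TableDescription']['LatestStreamArn']
    
    # Step 3: point the stream at the Lambda function as an event source
    lambda_client.create_event_source_mapping(
        EventSourceArn=stream_arn,
        FunctionName='replicate-to-opensearch',  # hypothetical function name
        StartingPosition='LATEST',
        BatchSize=100
    )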

Now, every time a product is created, updated, or deleted in your Products table, your Lambda function will be invoked almost instantly, keeping your search index in near real-time sync with the table.

What is Time To Live (TTL)?

Time To Live (TTL) is a mechanism that allows you to define a per-item timestamp to determine when an item is no longer needed. DynamoDB will automatically delete items from your table once their specified TTL timestamp has expired, at no extra cost.

This is perfect for managing transient data like session state, temporary caches, or logs.

How to Use TTL

  1. Enable TTL on a Table: In your table settings, enable TTL and specify the attribute name that will hold the expiration timestamp (e.g., ttl). This can also be done programmatically; see the boto3 sketch after step 2.

  2. Set the TTL Attribute: When you write an item to the table, add an attribute with the name you specified. The value of this attribute must be a Unix epoch timestamp in seconds.

    import boto3
    from datetime import datetime, timedelta
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('MySessionTable')
    
    # TTL expects a Number attribute holding a Unix epoch time in seconds.
    # Set the item to expire in 1 hour.
    expiration_time = int((datetime.now() + timedelta(hours=1)).timestamp())
    
    table.put_item(
        Item={
            'sessionId': 'session-abc-123',
            'username': 'eric.wilson',
            'ttl': expiration_time
        }
    )
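
And here is a minimal sketch of step 1, enabling TTL programmatically, assuming the table name from the example above:

    import boto3
    
    client = boto3.client('dynamodb')
    
    # Tell DynamoDB which attribute holds each item's expiration time
    client.update_time_to_live(
        TableName='MySessionTable',
        TimeToLiveSpecification={
            'Enabled': True,
            'AttributeName': 'ttl'
        }
    )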
    

DynamoDB's TTL process typically deletes expired items within 48 hours of expiration. It's a background process, so it's not instantaneous.
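
Because deletion is eventual, reads can still return items that have expired but have not yet been removed. A common safeguard is to filter them out at read time; here is a minimal sketch using the session table from the example above:

    import time
    import boto3
    from boto3.dynamodb.conditions import Attr
    
    table = boto3.resource('dynamodb').Table('MySessionTable')
    
    # Exclude items whose TTL has passed but which the background
    # process has not yet deleted.
    response = table.scan(
        FilterExpression=Attr('ttl').gt(int(time.time()))
    )
    live_sessions = response['Items']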

Combining Streams and TTL for Powerful Patterns

Here's where things get interesting. When TTL deletes an item, the deletion event is published to the table's DynamoDB Stream. This allows you to take action when an item expires.

A classic use case is data archival. Imagine you want to keep your hot data in a DynamoDB table for 90 days and then archive it to S3 for long-term storage.

  1. Set a ttl attribute on your items to 90 days in the future.
  2. Enable a DynamoDB Stream on your table.
  3. Create a Lambda function that is triggered by the stream.
  4. In the Lambda function, check that the event is a REMOVE event and that the record's userIdentity field has a type of Service and a principalId of dynamodb.amazonaws.com. This indicates the deletion was performed by the TTL service rather than by an application.
  5. If it was a TTL deletion, take the OldImage from the stream record and write it to an S3 bucket, as in the sketch after this list.
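
Here is a minimal sketch of such a function. It assumes the stream is configured to include old images, the bucket name is a hypothetical placeholder, and error handling is omitted:

    import json
    import boto3
    
    s3 = boto3.client('s3')
    BUCKET = 'my-archive-bucket'  # hypothetical bucket name
    
    def handler(event, context):
        """Archives items deleted by the TTL service to S3."""
        for record in event['Records']:
            identity = record.get('userIdentity', {})
            is_ttl_delete = (
                record['eventName'] == 'REMOVE'
                and identity.get('type') == 'Service'
                and identity.get('principalId') == 'dynamodb.amazonaws.com'
            )
            if is_ttl_delete:
                # OldImage holds the item as it was before TTL deleted it
                old_image = record['dynamodb']['OldImage']
                s3.put_object(
                    Bucket=BUCKET,
                    Key=f"archive/{record['eventID']}.json",
                    Body=json.dumps(old_image)
                )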

This pattern creates a fully automated, serverless archival system with minimal code and no maintenance.

Conclusion

DynamoDB Streams and TTL transform your database from a passive data store into an active, event-generating engine. By leveraging these features, you can build sophisticated, real-time, and event-driven systems for a wide range of use cases—from data replication and archival to complex business workflow automation—all in a completely serverless fashion.