DynamoDB doesn’t actually store data in relational tables; it uses a key-value and document model, meaning your "migration" is more of a re-architecting.

Let’s walk through migrating a simple relational database table, say users, to DynamoDB.

Here’s our hypothetical users table in a relational database:

CREATE TABLE users (
    user_id INT PRIMARY KEY,
    username VARCHAR(255) UNIQUE,
    email VARCHAR(255) UNIQUE,
    registration_date DATE,
    last_login TIMESTAMP,
    profile_data JSON
);

And some sample data:

user_id username email registration_date last_login profile_data
101 alice alice@example.com 2023-01-15 2024-03-10 09:30:00 {"bio": "Loves dogs"}
102 bob bob@example.com 2023-02-20 2024-03-11 14:00:00 {"bio": "Enjoys cats"}
103 charlie charlie@test.com 2023-03-01 2024-03-09 11:00:00 {"bio": "Likes birds"}

Designing the DynamoDB Table

In DynamoDB, every table needs a Partition Key and optionally a Sort Key. This combination uniquely identifies an item. We also need to consider Secondary Indexes for querying attributes other than the primary key.

For our users table, a good primary key would be user_id. Since user_id is already the primary key in our relational table, this maps directly. We don’t strictly need a sort key if user_id is sufficient to identify each user.

DynamoDB Table Definition:

  • Table Name: Users
  • Primary Key:
    • Partition Key: userId (Number)

Now, how do we query by username or email? These are unique in our relational table. In DynamoDB, we use Global Secondary Indexes (GSIs) for this.

Global Secondary Index 1 (for username lookup):

  • Index Name: UsernameIndex
  • Partition Key: username (String)
  • Sort Key: (None)
  • Projected Attributes: ALL (This means all attributes of the indexed items will be stored in the index, allowing us to retrieve them directly. For performance and cost, you might project only specific attributes.)

Global Secondary Index 2 (for email lookup):

  • Index Name: EmailIndex
  • Partition Key: email (String)
  • Sort Key: (None)
  • Projected Attributes: ALL

Other Attributes: registrationDate (String, ISO 8601 format), lastLogin (String, ISO 8601 format), profileData (Map).

The Migration Process

  1. Create the DynamoDB Table: You can do this via the AWS Console, AWS CLI, or SDKs. Using the AWS CLI:

    aws dynamodb create-table \
        --table-name Users \
        --attribute-definitions \
            AttributeName=userId,AttributeType=N \
            AttributeName=username,AttributeType=S \
            AttributeName=email,AttributeType=S \
        --key-schema \
            AttributeName=userId,KeyType=HASH \
        --global-secondary-indexes \
        '[
            {
                "IndexName": "UsernameIndex",
                "KeySchema": [
                    {"AttributeName": "username", "KeyType": "HASH"}
                ],
                "Projection": {
                    "ProjectionType": "ALL"
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5
                }
            },
            {
                "IndexName": "EmailIndex",
                "KeySchema": [
                    {"AttributeName": "email", "KeyType": "HASH"}
                ],
                "Projection": {
                    "ProjectionType": "ALL"
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5
                }
            }
        ]' \
        --provisioned-throughput \
            ReadCapacityUnits=5,WriteCapacityUnits=5 \
        --region us-east-1
    

    Note: ProvisionedThroughput values are examples. You’d set these based on expected load.

  2. Export Data from Relational Database: Export your users table data into a format like CSV or JSON. For CSV, it might look like this:

    user_id,username,email,registration_date,last_login,profile_data
    101,alice,alice@example.com,2023-01-15,2024-03-10T09:30:00Z,"{""bio"": ""Loves dogs""}"
    102,bob,bob@example.com,2023-02-20,2024-03-11T14:00:00Z,"{""bio"": ""Enjoys cats""}"
    103,charlie,charlie@test.com,2023-03-01,2024-03-09T11:00:00Z,"{""bio"": ""Likes birds""}"
    

    Notice the JSON string is escaped for CSV.

  3. Import Data into DynamoDB: The most efficient way to do this for large datasets is using the AWS Data Pipeline or AWS Glue. For smaller datasets, you can use a custom script with the AWS SDK. Here’s a conceptual Python example using boto3:

    import boto3
    import csv
    import json
    
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('Users')
    
    with open('users.csv', 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            # Convert types and format dates
            item = {
                'userId': int(row['user_id']),
                'username': row['username'],
                'email': row['email'],
                'registrationDate': row['registration_date'], # Store as string, query with date functions if needed
                'lastLogin': row['last_login'], # Store as string
                'profileData': json.loads(row['profile_data']) # Parse JSON string into a DynamoDB Map
            }
            try:
                response = table.put_item(Item=item)
                print(f"Put item succeeded for userId: {row['user_id']}")
            except Exception as e:
                print(f"Error putting item for userId {row['user_id']}: {e}")
    
    

    This script reads the CSV, transforms the data (converting user_id to an integer, parsing profile_data JSON), and then uses put_item to insert each user into DynamoDB.

Querying in DynamoDB

  • Get user by userId (Primary Key):

    response = table.get_item(Key={'userId': 101})
    item = response.get('Item')
    print(item)
    
  • Get user by username (using GSI UsernameIndex):

    response = table.query(
        IndexName='UsernameIndex',
        KeyConditionExpression=boto3.dynamodb.conditions.Key('username').eq('alice')
    )
    items = response.get('Items')
    print(items)
    
  • Get user by email (using GSI EmailIndex):

    response = table.query(
        IndexName='EmailIndex',
        KeyConditionExpression=boto3.dynamodb.conditions.Key('email').eq('bob@example.com')
    )
    items = response.get('Items')
    print(items)
    

Key Differences and Considerations

  • Schema: DynamoDB is schema-less (or schema-on-read), but you define attribute types at the table/index level. Items in the same table can have different attributes.
  • Relationships: DynamoDB is not designed for complex relational joins. You’ll often denormalize data or use multiple queries to reconstruct related information.
  • Cost: DynamoDB is priced based on provisioned or on-demand throughput (reads/writes) and storage. GSIs incur additional costs for storage and throughput.
  • Data Types: DynamoDB has specific data types (String, Number, Binary, Boolean, Null, List, Map, Set). You need to map your relational types accordingly. JSON from your relational DB becomes a Map in DynamoDB. Dates and timestamps are often stored as ISO 8601 strings.
  • Scalability: DynamoDB scales horizontally. You manage capacity (provisioned or on-demand). Relational databases typically scale vertically or with more complex sharding strategies.

The most overlooked aspect of relational-to-NoSQL migration is how to model relationships. Instead of joins, you’ll often embed related data or use application-level joins that involve multiple DynamoDB lookups, which requires careful data modeling to avoid performance bottlenecks and excessive costs.

After migrating your primary table and indexes, you’ll likely need to consider how to handle foreign key relationships and implement them using DynamoDB’s patterns.

Want structured learning?

Take the full Dynamodb course →