Monday, July 15, 2024

How Scopely scaled “MONOPOLY GO!” for millions of players around the globe with Amazon DynamoDB

This is a guest post co-written with Brandon Cuff, Principal Engineer at Scopely.

Scopely is the video game company behind “MONOPOLY GO!”, the biggest casual mobile game launched in history. “MONOPOLY GO!” makes extensive use of Amazon DynamoDB for workloads that require high scale, performance, and availability. This includes services such as player accounts, player inventory, and friend lists. Following the launch of the mobile game last year, the player engagement was immediately exceptional. While early retention beat expectations, the remarkable long-term stickiness of the experience has challenged the mobile casual game market, with over 8 million people playing every day of the week—representing more than 70% of the game’s 10 million+ total daily players.

In this post, we show you how Amazon DynamoDB enabled us to quickly respond to our rapid growth with consistent game performance and availability. We also describe how we improved the availability and performance of our matchmaking service with DynamoDB after facing challenges at scale with other solutions.

Background

Founded in 2011, Scopely is a leading video game company, home to the most successful mobile game ever launched in the U.S., “MONOPOLY GO!,” and many top-grossing, award-winning franchises including beloved favorites, “Stumble Guys,” “Star Trek Fleet Command,” “MARVEL Strike Force,” and “Yahtzee With Buddies,” among others. With global operations across North America, Central America, Europe, Middle East, Africa, and Asia, and additional game studio partners across four continents, our teams create, publish, and live-operate immersive games that empower players to play their own way, wherever, whenever, however they desire; on mobile, PC, console, and beyond. We are fueled by our world-class team and a proprietary technology platform Playgami that supports one of the most diversified portfolios in the mobile games industry. Recognized multiple times as one of Fast Company’s “World’s Most Innovative Companies” and most recently as a 2024 TIME100 “Most Influential Company In the World,” Scopely has emerged as the #1 mobile games publisher in the U.S., and top 5 in the world outside China, due to our ability to create long-lasting game experiences that players enjoy for years.

In April 2023, we launched “MONOPOLY GO!,” which shattered records as the fastest ever casual mobile game to reach $1 billion in revenue, just seven months after launch. The game is played by millions of people around the world and has achieved more than 150 million downloads.

Scopely has long used DynamoDB for other successful games, but never at the scale that “MONOPOLY GO!” achieved. We generally use DynamoDB to store any state that scales with the number of players in our games, including player accounts, inventory, and social graph. We have continued using DynamoDB for our games because:

We can scale out as far as we need without facing performance degradation as we would with a relational database.
We don’t have to manage a cluster ourselves. Instead of relying on a dedicated operations team to manage capacity planning, perform upgrades or patches, manage backups, and tune performance, we benefit from DynamoDB’s fully-managed, serverless architecture.

Shortly after launch, we saw more than 5 times the amount of traffic we expected in our best-case estimates, which was great! However, operating at this scale uncovered new challenges in our architecture that we hadn’t encountered before in previous games.

Matchmaking

Seeing your game enjoyed by an audience far larger than expected and watching the community grow around it is an amazing feeling. However, operating at an unexpected scale can present unexpected challenges, such as scaling our player matchmaking service.

“MONOPOLY GO!” is a social game where players both cooperate and compete. As with most social games, it’s often far more fun to play with friends and people within your broader social network. Your friends and their friends are a major component in how we identify groups of players that will have the most fun playing together.

As our player base expanded, the number of friend connections grew exponentially. While the database we initially used to power our friends service performed well past our initial player count estimates, it was not designed for the limitless scaling we were used to with DynamoDB. As a result, with our huge influx of players, we saw a ceiling approaching where we could no longer effectively scale our resources to meet demand. After evaluating several database options, we ultimately decided to migrate our matchmaking service to DynamoDB. Our choice was influenced by many of the reasons we love DynamoDB, including its consistent performance at any scale.

Since our requirements were more complex than simple key-value lookups, we had to think about which data model would best balance cost and performance. We first considered a read-optimized solution that used DynamoDB Streams and AWS Lambda functions to build and update materialized views of newly added friendships. This solution would allow an efficient single query to retrieve everyone within two hops of a player in the social graph. We decided against this solution because the Lambda updates significantly increased our write throughput.

Instead, we chose a data model that helped us meet our performance requirements while reducing our write volume. In this solution, we first query to find a player’s friends, and then run multiple queries in parallel to find the friends of those friends. DynamoDB’s ability to handle large numbers of simultaneous queries with consistent performance meant we were able to easily meet our SLAs. While this design pattern results in many queries, especially for popular players, it was more cost effective than the write-heavy materialized view approach. For the distribution of reads and writes in our application, the cost savings through reduced writes more than offset the slight increase in read cost.

Buddy list schema:

{
  "UserID": {"S": "exampleUserId"},
  "Friends": {"L": [{"S": "friendId1"}, {"S": "friendId2"}, {"S": "friendId3"}, {"S": "friendId4"}, {"S": "friendId5"}]}
}

After we get the friend lists of all friends, we sort mutual friends by their connectedness. Then we pull their user documents to apply other filters, such as whether they are active and whether they are eligible for the event.
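The two-hop lookup and ranking described above can be sketched roughly as follows. This is a minimal illustration rather than Scopely's implementation: the in-memory BUDDY_LISTS dictionary and the get_friends function are hypothetical stand-ins for single-key reads against the buddy list table, and "connectedness" is assumed here to mean the number of mutual friends.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the buddy list table:
# partition key (UserID) -> list of friend IDs.
BUDDY_LISTS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave", "erin"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol"],
    "erin": ["bob"],
}


def get_friends(user_id):
    """Stand-in for reading one buddy list item by its key."""
    return BUDDY_LISTS.get(user_id, [])


def friends_of_friends(user_id, fetch=get_friends, max_workers=16):
    """Return (candidate, mutual_friend_count) pairs, most connected first."""
    friends = fetch(user_id)  # first hop: one read
    if not friends:
        return []
    # Second hop: one read per friend, issued in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        second_hop = list(pool.map(fetch, friends))
    # Skip the player and anyone who is already a direct friend.
    exclude = set(friends) | {user_id}
    mutual_counts = Counter(
        candidate
        for friend_list in second_hop
        for candidate in set(friend_list)
        if candidate not in exclude
    )
    # Assumed ranking: more mutual friends first, ties broken by ID.
    return sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = friends_of_friends("alice")
```

The number of reads grows with the size of a player's friend list, which is the read-cost trade-off accepted in exchange for avoiding the write amplification of materialized views.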

User Document:

{
  "UserID": {"S": "exampleUserId"},
  "LastActivity": {"N": "1711497017"},
  "TimeZone": {"N": "-7"},
  "RollBalance": {"N": "50"}
}
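As a rough illustration of the filtering step, the sketch below converts a user document from DynamoDB's typed attribute format into plain values and applies two checks. The thresholds (max_idle_seconds, min_rolls) and the eligibility rule itself are hypothetical; the post does not describe the actual criteria.

```python
def parse_user_document(item):
    """Convert a DynamoDB-typed user document into plain Python values."""
    return {
        "UserID": item["UserID"]["S"],
        "LastActivity": int(item["LastActivity"]["N"]),  # Unix epoch seconds
        "TimeZone": float(item["TimeZone"]["N"]),         # UTC offset in hours
        "RollBalance": int(item["RollBalance"]["N"]),
    }


def is_match_candidate(user, now, max_idle_seconds=7 * 24 * 3600, min_rolls=1):
    """Hypothetical filter: recently active and holding at least min_rolls."""
    recently_active = now - user["LastActivity"] <= max_idle_seconds
    eligible = user["RollBalance"] >= min_rolls  # assumed eligibility rule
    return recently_active and eligible


example_item = {
    "UserID": {"S": "exampleUserId"},
    "LastActivity": {"N": "1711497017"},
    "TimeZone": {"N": "-7"},
    "RollBalance": {"N": "50"},
}
user = parse_user_document(example_item)
```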

On-demand mode

When our team launched “MONOPOLY GO!,” we weren’t able to project how much usage we’d have in the initial weeks and months. DynamoDB provides two capacity modes that are each optimized for different traffic patterns. Provisioned mode lets us define how much read and write throughput our tables can scale to, which is useful for workloads with predictable application traffic. On-demand mode provides pay-as-you-go pricing, scales to zero, and automatically scales tables to adjust for capacity. On-demand mode is useful for workloads that have unpredictable application traffic.

Since we couldn’t predict “MONOPOLY GO!”’s traffic, we launched the game with all of our tables in on-demand mode. Using on-demand mode meant we could accommodate our unexpected traffic without issue and provide players with a seamless experience right from launch. Having the option to change our tables from on-demand mode to provisioned mode helped us cost-optimize high traffic tables once we learned more about their traffic patterns.

Using on-demand mode was also useful for:

Tables that had spiky workloads.
Tables that were only used for short periods of time during in-game events.

While on-demand tables were useful for launching the game, we faced a few challenges over time that we’ve since learned to solve for:

Prewarming
“Hot second”

Prewarming

While on-demand tables scaled efficiently as our game grew, we foresaw challenges scaling for predictable high-traffic events, such as our popular partner events. During partner events, traffic sometimes grew by more than 100% over previous peaks. DynamoDB documentation states that on-demand capacity mode instantly accommodates up to double the previous peak traffic on a table. To prepare for these predictable surges, we “prewarmed” our tables by switching them into provisioned mode and setting WCU and RCU values higher than the traffic we expected. This enabled us to read and write double the amount after switching back to on-demand mode. Had we not prewarmed the tables, DynamoDB would still have scaled them, but we would have seen some throttling as additional partitions were created behind the scenes.
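The prewarming sequence can be sketched as two UpdateTable requests. The helper names, table name, and capacity numbers below are illustrative assumptions; the request fields (TableName, BillingMode, ProvisionedThroughput) are those of the DynamoDB UpdateTable API, and the resulting dictionaries could be passed to a boto3 client's update_table call.

```python
def needs_prewarm(previous_peak, expected_peak):
    """On-demand absorbs up to double the previous peak; beyond that, prewarm."""
    return expected_peak > 2 * previous_peak


def prewarm_requests(table_name, read_units, write_units):
    """Build the two UpdateTable requests used to prewarm a table.

    The first switches the table to provisioned mode with high capacity;
    the second switches it back to on-demand mode.
    """
    to_provisioned = {
        "TableName": table_name,
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
    to_on_demand = {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",
    }
    return to_provisioned, to_on_demand


# Illustrative numbers only: previous peak of 100,000 WCU, event expected at 250,000.
should_prewarm = needs_prewarm(previous_peak=100_000, expected_peak=250_000)
to_provisioned, to_on_demand = prewarm_requests("ExampleTable", 300_000, 300_000)

# With boto3 (not imported here), the requests would be applied roughly as:
#   client = boto3.client("dynamodb")
#   client.update_table(**to_provisioned)
#   ... wait until the table status returns to ACTIVE ...
#   client.update_table(**to_on_demand)
```

The table must finish the first update before the second is sent, and DynamoDB limits how often a table's billing mode can be switched, so it is worth checking the current quotas before scheduling this ahead of an event.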

Hot Second

One of our tables was mostly idle until a specific time of day, when our game clients would all perform an action that incurred a write to the table at exactly 10:00:00 a.m. (to the second). We had prewarmed the table based on the consumed WCU we had previously observed in Amazon CloudWatch metrics.

The next day at 10 a.m. we saw the same load spike. Consumed WCU was a bit less than we had prepared for, but we still saw a large number of throttling errors. Because we hadn’t hit the table’s maximum WCU capacity, we suspected an uneven key distribution (a hot key). We enabled Contributor Insights and waited for the spike to happen again.

The following day at 10 a.m. we saw the same behavior, as expected, but Contributor Insights didn’t reveal any keys that were particularly hot. We worked with AWS and were able to determine that DynamoDB throttles on a smaller time window than CloudWatch reports. CloudWatch reports the average WCU over a period of 1 minute, but DynamoDB will throttle you if you consume too much capacity over a much smaller window, such as 1 second. The traffic was all arriving in the first few seconds of the minute and the rest of the minute was idle, so we really needed 20 times the capacity we thought we did to handle the spike within a few seconds without throttling errors.
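The gap between the one-minute average and the per-second rate can be worked through with a small calculation. The numbers below are illustrative assumptions, not Scopely's actual traffic: 600,000 writes arriving evenly over the first 3 seconds of a minute, with the remaining 57 seconds idle.

```python
# Illustrative numbers only (assumptions, not measured values).
writes_in_minute = 600_000
burst_seconds = 3

# Per-second write counts: a short burst followed by an idle remainder.
per_second = [writes_in_minute // burst_seconds] * burst_seconds
per_second += [0] * (60 - burst_seconds)

# What a one-minute average suggests the table needs.
average_rate = sum(per_second) / 60

# What the table actually has to absorb in its busiest second.
peak_rate = max(per_second)

# How far the minute-level view understates the required capacity.
ratio = peak_rate / average_rate
```

With these assumed numbers the one-minute average is 10,000 writes per second while the busiest second carries 200,000, a factor of 20, which is the kind of gap described above.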

While DynamoDB handles most spikes quite well through on-demand mode, and burst capacity in provisioned mode, having every single client write at the exact same time is almost always going to present a challenge. We found that spreading those writes over a larger window allows us to be more efficient. We were pleased to see that DynamoDB was able to handle those huge traffic spikes after we pre-warmed the table. If you have a well-balanced partition key and smoothing spikes isn’t an option, you may be fine simply pre-warming to a sufficiently large size.
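One way to spread such writes is to give each client a deterministic delay within a window. The sketch below is an assumption about how this could be done, not a description of Scopely's approach; the function names, the player ID format, and the 300-second window are all hypothetical.

```python
import hashlib


def write_offset_seconds(player_id, window_seconds):
    """Deterministic per-player delay in the range [0, window_seconds)."""
    digest = hashlib.sha256(player_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_seconds


def peak_writes_per_second(player_ids, window_seconds):
    """Count writes landing in each second and return the busiest second."""
    buckets = [0] * window_seconds
    for player_id in player_ids:
        buckets[write_offset_seconds(player_id, window_seconds)] += 1
    return max(buckets)


players = [f"player-{i}" for i in range(10_000)]

# Everyone writes in the same second versus spread over five minutes.
peak_all_at_once = peak_writes_per_second(players, window_seconds=1)
peak_spread = peak_writes_per_second(players, window_seconds=300)
```

Hashing the player ID keeps each player's slot stable from day to day, whereas a random delay would give a similar spread without that stability. Either way, the peak per-second write rate drops roughly in proportion to the window length.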

The following chart shows what the incident might have looked like if we had per-second metrics:

Conclusion

DynamoDB’s serverless architecture and virtually limitless scale helped Scopely grow “MONOPOLY GO!” to 10 million daily players with zero DevOps engineers and zero DBAs. At peak traffic, “MONOPOLY GO!” drove 2.1 million writes/second to DynamoDB. Along our journey, DynamoDB helped us unblock the scaling limitations of our prior matchmaking service infrastructure. By leveraging a data model that uses multiple queries to identify users’ friends and their friends, DynamoDB delivered millions of friend matches with low latency and high availability. On-demand tables were also critical to our growth, accommodating unknown traffic spikes as our user base grew. Pre-warming our tables helped us support partner events and other known traffic patterns. We also found that spreading our writes over a larger window of time vs all-at-once reduced throttling for use cases where multiple clients were writing to one table at the same time. Achieving these milestones in just a few months without DynamoDB would have presented a meaningful challenge to our growth.

To learn more about Scopely’s “MONOPOLY GO!” architecture, refer to Building and Scaling MONOPOLY GO! from GDC Vault. To learn more about DynamoDB, visit the DynamoDB product website and DynamoDB developer guide.

About the authors

Brandon Cuff is a Principal Engineer at Scopely. Brandon played a key role in scaling the game to its global audience, with particular focus on game architecture, .NET performance, and the development of resiliency and observability. Since joining Scopely in 2014, Brandon has driven core initiatives in the evolution of Scopely’s Playgami GameMAKER technology, including the migration from Windows to Linux, adoption of autoscaling and immutable deployment, and numerous architectural changes to enable the success of many of Scopely’s games, including, most recently, the launch of “MONOPOLY GO!”

Jason Laschewer is an Outbound Product Manager on the Amazon DynamoDB team. Jason has held a number of Business Development roles in non-relational databases at AWS. Outside of work, Jason enjoys seeing live music, cooking, and spending time with his wife and three children in NY.

Masood Rahim is a Technical Account Manager at AWS working with games customers. He is responsible for helping customers with operational concerns related to scaling, performance and security. Masood is passionate about building and supporting large scale workloads to solve business problems in the cloud by leveraging his background in infrastructure and start-ups.

John Terhune is a Solutions Architect specializing in Amazon DynamoDB. In previous roles John has focused on both support and architecture for strategic customers, helping design and launch new applications. Beyond databases, John enjoys traveling with his wife and kids to exciting new places.

Read More: AWS Database Blog
