Friday, April 19, 2024

Recommend top trending items to your users using the new Amazon Personalize recipe

Amazon Personalize is excited to announce the new Trending-Now recipe to help you recommend items gaining popularity at the fastest pace among your users.

Amazon Personalize is a fully managed machine learning (ML) service that makes it easy for developers to deliver personalized experiences to their users. It enables you to improve customer engagement by powering personalized product and content recommendations in websites, applications, and targeted marketing campaigns. You can get started without any prior ML experience, using APIs to easily build sophisticated personalization capabilities in a few clicks. All your data is encrypted to be private and secure, and is only used to create recommendations for your users.

User interests can change based on a variety of factors, such as external events or the interests of other users. It’s critical for websites and apps to tailor their recommendations to these changing interests to improve user engagement. With Trending-Now, you can surface items from your catalog that are rising in popularity faster than other items, such as trending news, popular social content, or newly released movies, which helps users discover items that are engaging their peers. Amazon Personalize also lets you define the time period over which trends are calculated to suit your business context, with options of 30 minutes, 1 hour, 3 hours, or 1 day, based on the most recent interactions data from users.

In this post, we show how to use this new recipe to recommend top trending items to your users.

Solution overview

Trending-Now identifies the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset. You can specify the time interval by providing a trend discovery frequency when you create your solution.
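To make the idea concrete, the following is a minimal sketch of ranking items by the increase in interactions between the previous and the latest time window. It is an illustration only, not the algorithm Amazon Personalize runs internally; the function name trending_items, the sample events, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter


def trending_items(interactions, now, window_seconds, top_k=10):
    """Rank items by the increase in interactions between two adjacent windows.

    interactions: iterable of (item_id, unix_timestamp) pairs.
    Hypothetical helper for illustration; not the service's actual algorithm.
    """
    current_start = now - window_seconds
    previous_start = now - 2 * window_seconds
    current, previous = Counter(), Counter()
    for item_id, ts in interactions:
        if current_start <= ts < now:
            current[item_id] += 1
        elif previous_start <= ts < current_start:
            previous[item_id] += 1
    # Increase in interaction count from the previous window to the latest one.
    increase = {item: current[item] - previous[item] for item in current}
    # Highest increase first; ties are broken by item ID to keep output stable.
    ranked = sorted(increase, key=lambda item: (-increase[item], item))
    return ranked[:top_k]


now = 10_000
window = 1_800  # 30 minutes
events = [
    ("A", 9_900), ("A", 9_800), ("A", 9_700),  # 3 interactions in the latest window
    ("A", 7_000),                              # 1 in the previous window
    ("B", 9_500), ("B", 9_400),                # 2 in the latest window
    ("B", 7_500), ("B", 7_600), ("B", 7_700),  # 3 in the previous window
    ("C", 9_000),                              # new item, 1 in the latest window
]
print(trending_items(events, now, window))  # ['A', 'C', 'B']
```

Item A gains two interactions, the new item C gains one, and B loses one, so A ranks first even though B has more interactions overall.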

The Trending-Now recipe requires an interactions dataset, which contains a record of the individual user and item events (such as clicks, watches, or purchases) on your website or app along with the event timestamps. You can use the parameter Trend discovery frequency to define the time intervals over which trends are calculated and refreshed. For example, if you have a high-traffic website with rapidly changing trends, you can specify 30 minutes as the trend discovery frequency. Every 30 minutes, Amazon Personalize looks at the interactions that have been ingested successfully and refreshes the trending items. This recipe also allows you to capture and surface any new content that has been introduced in the last 30 minutes and has seen a higher degree of interest from your user base than any preexisting catalog items. For any trend discovery frequency greater than 2 hours, Amazon Personalize automatically refreshes the trending item recommendations every 2 hours to account for new interactions and new items.

Datasets that have low traffic but use a 30-minute value can see poor recommendation accuracy due to sparse or missing interactions data. The Trending-Now recipe requires that you provide interaction data for at least two past time periods (a time period is your chosen trend discovery frequency). If interaction data doesn’t exist for the last two time periods, Amazon Personalize replaces the trending items with popular items until the required minimum data is available.
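Before choosing a short frequency, it can help to check whether your own data covers the last two time periods. The sketch below is a local approximation under stated assumptions: covers_two_periods and popular_items are hypothetical helpers, and the fallback shown is a plain interaction count, which may differ from how Amazon Personalize selects popular items.

```python
from collections import Counter


def covers_two_periods(timestamps, now, window_seconds):
    """Return True if both of the last two windows contain an interaction.

    timestamps: list of UNIX epoch seconds (a list, because it is scanned twice).
    """
    latest_start = now - window_seconds
    previous_start = now - 2 * window_seconds
    has_latest = any(latest_start <= ts < now for ts in timestamps)
    has_previous = any(previous_start <= ts < latest_start for ts in timestamps)
    return has_latest and has_previous


def popular_items(interactions, top_k=10):
    """Illustrative fallback: items ordered by total interaction count."""
    counts = Counter(item_id for item_id, _ in interactions)
    return [item_id for item_id, _ in counts.most_common(top_k)]


events = [("A", 9_900), ("B", 9_800), ("B", 9_700), ("B", 1_000)]
now, window = 10_000, 1_800

if covers_two_periods([ts for _, ts in events], now, window):
    print("Enough data to compute trends")
else:
    # No interaction falls in the previous window, so fall back to popularity.
    print("Falling back to popular items:", popular_items(events))
```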

The Trending-Now recipe is available for both custom dataset groups and video-on-demand domain dataset groups. In this post, we demonstrate how to tailor your recommendations to fast-changing trends in user interest with the new Trending-Now recipe, using a media use case with a custom dataset group. The following diagram illustrates the solution workflow.

For example, in video-on-demand applications, you can use this feature to show what movies are trending in the last 1 hour by specifying 1 hour for your trend discovery frequency. For every 1 hour of data, Amazon Personalize identifies the items with the greatest rate of increase in interactions since the last evaluation. Available frequencies include 30 minutes, 1 hour, 3 hours, and 1 day.

Prerequisites

To use the Trending-Now recipe, you first need to set up Amazon Personalize resources on the Amazon Personalize console. Create your dataset group, import your data, train a solution version, and deploy a campaign. For full instructions, see Getting started.

For this post, we have followed the console approach to deploy a campaign using the new Trending-Now recipe. Alternatively, you can build the entire solution using the SDK approach with this provided notebook. For both approaches, we use the MovieLens public dataset.

Prepare the dataset

Complete the following steps to prepare your dataset:

Create a dataset group.
Create an interactions dataset using the following schema:

{
  "type": "record",
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "TIMESTAMP", "type": "long" }
  ],
  "version": "1.0"
}

Import the interactions data to Amazon Personalize from Amazon Simple Storage Service (Amazon S3).
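If you prefer the SDK to the console, the schema from the steps above can be registered programmatically. The following is a minimal sketch that assumes personalize_client is a boto3 client created with boto3.client("personalize"); the helper name create_interactions_schema and the schema name you pass in are illustrative.

```python
import json

# The same Avro schema used for the interactions dataset in this post.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def create_interactions_schema(personalize_client, schema_name):
    """Register the schema and return its ARN.

    personalize_client is assumed to be boto3.client("personalize").
    """
    response = personalize_client.create_schema(
        name=schema_name,
        schema=json.dumps(INTERACTIONS_SCHEMA),  # the API expects a JSON string
    )
    return response["schemaArn"]
```

The returned ARN is what you would then supply when creating the interactions dataset in your dataset group.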

For the interactions data, we use ratings history from the movies review dataset, MovieLens.

Use the following Python code in a Jupyter notebook to curate the interactions dataset from the MovieLens public dataset.

import pandas as pd

# Download and unzip the MovieLens 25M dataset (lines starting with ! are notebook shell commands).
data_dir = "blog_data"
!mkdir $data_dir
!cd $data_dir && wget http://files.grouplens.org/datasets/movielens/ml-25m.zip
!cd $data_dir && unzip ml-25m.zip
dataset_dir = data_dir + "/ml-25m/"

# Drop the rating column and rename the remaining columns to match the schema.
interactions_df = pd.read_csv(dataset_dir + "ratings.csv")
interactions_df = interactions_df.drop(columns=["rating"])
interactions_df = interactions_df.rename(columns={"userId": "USER_ID", "movieId": "ITEM_ID", "timestamp": "TIMESTAMP"})

# Write the curated interactions to a CSV file for import.
interactions_file = "curated_interactions_training_data.csv"
interactions_df.to_csv(interactions_file, index=False)

The MovieLens dataset contains the user ID, item (movie) ID, and rating for each interaction between a user and an item, along with the time the interaction took place (a timestamp given as UNIX epoch time). The dataset also contains movie title information to map the movie ID to the actual title and genres. The following table is a sample of the dataset.

USER_ID   ITEM_ID   TIMESTAMP    TITLE                                       GENRES
116927    1101      1105210919   Top Gun (1986)                              Action|Romance
158267    719       974847063    Multiplicity (1996)                         Comedy
55098     186871    1526204585   Heal (2017)                                 Documentary
159290    59315     1485663555   Iron Man (2008)                             Action|Adventure|Sci-Fi
108844    34319     1428229516   Island, The (2005)                          Action|Sci-Fi|Thriller
85390     2916      953264936    Total Recall (1990)                         Action|Adventure|Sci-Fi|Thriller
103930    18        839915700    Four Rooms (1995)                           Comedy
104176    1735      985295513    Great Expectations (1998)                   Drama|Romance
97523     1304      1158428003   Butch Cassidy and the Sundance Kid (1969)   Action|Western
87619     6365      1066077797   Matrix Reloaded, The (2003)                 Action|Adventure|Sci-Fi|Thriller|IMAX

The curated dataset includes USER_ID, ITEM_ID (movie ID), and TIMESTAMP to train the Amazon Personalize model. These are the required fields to train a model with the Trending-Now recipe. The following table is a sample of the curated dataset.

USER_ID   ITEM_ID   TIMESTAMP
48953     529       841223587
23069     1748      1092352526
117521    26285     1231959564
18774     457       848840461
58018     179819    1515032190
9685      79132     1462582799
41304     6650      1516310539
152634    2560      1113843031
57332     3387      986506413
12857     6787      1356651687

Train a model

After the dataset import job is complete, you’re ready to train your model.

On the Solutions tab, choose Create solution.
Choose the new aws-trending-now recipe.
In the Advanced configuration section, set Trend discovery frequency to 30 minutes.
Choose Create solution to start training.
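The console steps above have an SDK counterpart. The sketch below only builds the request parameters, so it runs without AWS credentials. It assumes that the trend discovery frequency is passed as the trend_discovery_frequency feature transformation parameter and that the recipe ARN follows the usual arn:aws:personalize:::recipe/ pattern; confirm both against the Amazon Personalize Developer Guide. The helper name and the dataset group ARN are hypothetical.

```python
# Assumed recipe ARN for aws-trending-now; verify in the Developer Guide.
TRENDING_NOW_RECIPE_ARN = "arn:aws:personalize:::recipe/aws-trending-now"

# Frequencies listed in this post.
ALLOWED_FREQUENCIES = {"30 minutes", "1 hour", "3 hours", "1 day"}


def trending_now_solution_params(name, dataset_group_arn, frequency="30 minutes"):
    """Build keyword arguments for a create_solution call (hypothetical helper)."""
    if frequency not in ALLOWED_FREQUENCIES:
        raise ValueError(f"Unsupported trend discovery frequency: {frequency!r}")
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": TRENDING_NOW_RECIPE_ARN,
        "solutionConfig": {
            # Assumption: the frequency is a feature transformation parameter.
            "featureTransformationParameters": {
                "trend_discovery_frequency": frequency
            }
        },
    }


params = trending_now_solution_params(
    "trending-now-solution",
    "arn:aws:personalize:us-east-1:111122223333:dataset-group/example",  # placeholder
)
print(params["solutionConfig"])

# With credentials configured, the parameters would be used roughly like this:
#   personalize = boto3.client("personalize")
#   solution_arn = personalize.create_solution(**params)["solutionArn"]
#   personalize.create_solution_version(solutionArn=solution_arn)
```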

Create a campaign

In Amazon Personalize, you use a campaign to make recommendations for your users. In this step, you create a campaign using the solution you created in the previous step and get the Trending-Now recommendations:

On the Campaigns tab, choose Create campaign.
For Campaign name, enter a name.
For Solution, choose the solution trending-now-solution.
For Solution version ID, choose the solution version that uses the aws-trending-now recipe.
For Minimum provisioned transactions per second, leave it at the default value.
Choose Create campaign to start creating your campaign.
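The same campaign can be created with the SDK. This is a minimal sketch that assumes personalize_client is boto3.client("personalize"); the helper names, the polling interval, and the default of 1 for the minimum provisioned TPS are choices made for this example.

```python
import time


def create_trending_campaign(personalize_client, name, solution_version_arn, min_tps=1):
    """Create a campaign for the given solution version and return its ARN."""
    response = personalize_client.create_campaign(
        name=name,
        solutionVersionArn=solution_version_arn,
        minProvisionedTPS=min_tps,
    )
    return response["campaignArn"]


def wait_until_active(personalize_client, campaign_arn, poll_seconds=30, max_polls=120):
    """Poll describe_campaign until the campaign is ACTIVE.

    Returns True when active, False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        description = personalize_client.describe_campaign(campaignArn=campaign_arn)
        status = description["campaign"]["status"]
        if status == "ACTIVE":
            return True
        if status == "CREATE FAILED":
            raise RuntimeError(f"Campaign creation failed: {campaign_arn}")
        time.sleep(poll_seconds)
    return False
```

Polling matters because a campaign cannot serve recommendations until its status is ACTIVE.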

Get recommendations

After you create or update your campaign, you can get a recommended list of trending items, sorted from highest to lowest. On the Personalization API tab of the campaign (trending-now-campaign), choose Get recommendations.

The following screenshot shows the campaign detail page with results from a GetRecommendations call that includes the recommended items and the recommendation ID.

The results from the GetRecommendations call include the IDs of recommended items. The following table is a sample after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.

ITEM_ID   TITLE
356       Forrest Gump (1994)
318       Shawshank Redemption, The (1994)
58559     Dark Knight, The (2008)
33794     Batman Begins (2005)
44191     V for Vendetta (2006)
48516     Departed, The (2006)
195159    Spider-Man: Into the Spider-Verse (2018)
122914    Avengers: Infinity War – Part II (2019)
91974     Underworld: Awakening (2012)
204698    Joker (2019)
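The ID-to-title mapping can be approximated as follows. This is a sketch rather than the notebook's code: it assumes the response has the shape returned by the GetRecommendations API (an itemList of entries with an itemId, plus a recommendationId), and the sample response and recommendation ID below are made up for illustration.

```python
def titles_for_recommendations(response, titles_by_item_id):
    """Pair each recommended item ID with its title, preserving rank order."""
    return [
        (item["itemId"], titles_by_item_id.get(item["itemId"], "(unknown title)"))
        for item in response["itemList"]
    ]


# Hypothetical response, shaped like a GetRecommendations result.
sample_response = {
    "itemList": [{"itemId": "356"}, {"itemId": "318"}, {"itemId": "58559"}],
    "recommendationId": "RID-example",
}

# In practice this lookup would be built from the MovieLens movies file.
titles = {
    "356": "Forrest Gump (1994)",
    "318": "Shawshank Redemption, The (1994)",
    "58559": "Dark Knight, The (2008)",
}

for item_id, title in titles_for_recommendations(sample_response, titles):
    print(item_id, title)

# With credentials configured, the response would come from a call such as:
#   runtime = boto3.client("personalize-runtime")
#   response = runtime.get_recommendations(campaignArn=campaign_arn, numResults=10)
```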

Get trending recommendations

After you create a solution version using the aws-trending-now recipe, Amazon Personalize will identify the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset.

Now let’s provide the latest interactions to Amazon Personalize to calculate the trending items. We can provide the latest interactions using real-time ingestion by creating an event tracker or through a bulk data upload with a dataset import job in incremental mode. In the notebook, we have provided sample code to individually import the latest real-time interactions data into Amazon Personalize using the event tracker.
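For the real-time path, a single interaction is sent through the event tracker with a PutEvents call. The sketch below builds the request without calling AWS; the helper name, the event type "watch", and the generated session ID are assumptions for illustration, and the tracking ID comes from the event tracker you create.

```python
import time
import uuid


def build_put_events_request(tracking_id, user_id, item_id,
                             event_type="watch", sent_at=None, session_id=None):
    """Build keyword arguments for a put_events call (hypothetical helper)."""
    return {
        "trackingId": tracking_id,
        "userId": str(user_id),
        # A session groups events from one visit; a random ID is used if none is given.
        "sessionId": session_id or str(uuid.uuid4()),
        "eventList": [
            {
                "eventType": event_type,
                "itemId": str(item_id),
                # Time of the interaction, in UNIX epoch seconds.
                "sentAt": int(sent_at if sent_at is not None else time.time()),
            }
        ],
    }


request = build_put_events_request(
    "example-tracking-id", 20371, 153, sent_at=1_700_000_000, session_id="session-1"
)
print(request["eventList"])

# With credentials configured, the request would be sent roughly like this:
#   events_client = boto3.client("personalize-events")
#   events_client.put_events(**request)
```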

For this post, we provide the latest interactions as a bulk data upload with a dataset import job in incremental mode. Use the following Python code to generate dummy incremental interactions, then upload them using a dataset import job.

import time

import pandas as pd

# A few USER_ID values chosen at random for the incremental interactions.
users_list = ['20371', '63409', '54535', '119138', '58953', '82982', '19044', '139171', '98598', '23822', '112012', '121380', '2660', '46948', '5656', '68919', '152414', '31234', '88240', '40395', '49296', '80280', '150179', '138474', '124489', '145218', '141810', '82607']
# A few ITEM_ID values chosen at random for the incremental interactions.
items_list = ['153', '2459', '1792', '3948', '2363', '260', '61248', '6539', '2407', '8961']

# Start one hour in the past and advance one second per interaction.
time_epoch = int(time.time()) - 3600

# Every user interacts with every item, repeated 10 times.
rows = []
for _ in range(10):
    for user_id in users_list:
        for item_id in items_list:
            time_epoch += 1
            rows.append([user_id, item_id, time_epoch])

inc_df = pd.DataFrame(rows, columns=["USER_ID", "ITEM_ID", "TIMESTAMP"])

incremental_file = 'interactions_incremental_data.csv'
inc_df.to_csv(incremental_file, index=False)

We generated these interactions synthetically by randomly selecting a few values for USER_ID and ITEM_ID and creating interactions between those users and items with recent timestamps. The following table contains the randomly selected ITEM_ID values that are used for generating incremental interactions.

ITEM_ID   TITLE
153       Batman Forever (1995)
260       Star Wars: Episode IV – A New Hope (1977)
1792      U.S. Marshals (1998)
2363      Godzilla (Gojira) (1954)
2407      Cocoon (1985)
2459      Texas Chainsaw Massacre, The (1974)
3948      Meet the Parents (2000)
6539      Pirates of the Caribbean: The Curse of the Bla…
8961      Incredibles, The (2004)
61248     Death Race (2008)

Upload the incremental interactions data by selecting Append to current dataset (or use incremental mode if using the APIs), as shown in the following screenshot.
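When using the APIs, incremental mode corresponds to the importMode parameter of CreateDatasetImportJob. The following sketch builds the request parameters only; the helper name is hypothetical, and the job name, ARNs, and S3 location in the example are placeholders that you would replace with your own values.

```python
def incremental_import_params(job_name, dataset_arn, s3_uri, role_arn):
    """Build keyword arguments for create_dataset_import_job in incremental mode."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("The data location must be an Amazon S3 URI (s3://...)")
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "dataSource": {"dataLocation": s3_uri},
        "roleArn": role_arn,
        # INCREMENTAL appends the new records instead of replacing existing data.
        "importMode": "INCREMENTAL",
    }


import_params = incremental_import_params(
    "trending-now-incremental-import",                                   # placeholder
    "arn:aws:personalize:us-east-1:111122223333:dataset/example/INTERACTIONS",  # placeholder
    "s3://example-bucket/interactions_incremental_data.csv",             # placeholder
    "arn:aws:iam::111122223333:role/ExamplePersonalizeRole",             # placeholder
)
print(import_params["importMode"])

# With credentials configured, the job would be started roughly like this:
#   personalize = boto3.client("personalize")
#   personalize.create_dataset_import_job(**import_params)
```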

After the import job for the incremental interactions dataset is complete, wait for one trend discovery frequency period (30 minutes in this example) so that the new interactions are reflected in the recommendations.

Choose Get recommendations on the campaign API page to get the latest recommended list of items that are trending.

Now we see the latest list of recommended items. The following table contains the data after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.

ITEM_ID   TITLE
260       Star Wars: Episode IV – A New Hope (1977)
6539      Pirates of the Caribbean: The Curse of the Bla…
153       Batman Forever (1995)
3948      Meet the Parents (2000)
1792      U.S. Marshals (1998)
2459      Texas Chainsaw Massacre, The (1974)
2363      Godzilla (Gojira) (1954)
61248     Death Race (2008)
8961      Incredibles, The (2004)
2407      Cocoon (1985)

The preceding GetRecommendations call includes the IDs of recommended items. The recommended ITEM_ID values all come from the incremental interactions dataset that we provided to the Amazon Personalize model. This is not surprising, because these are the only items that gained interactions in the most recent time period in our synthetic dataset.
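This observation is easy to verify in code. The following sketch compares the recommended IDs from the table above with the ITEM_ID values used to generate the incremental interactions; the function name unexpected_items is made up for this example.

```python
def unexpected_items(recommended_ids, incremental_ids):
    """Return recommended IDs that are not part of the incremental dataset."""
    incremental = set(incremental_ids)
    return [item_id for item_id in recommended_ids if item_id not in incremental]


# Recommended items, in the order returned after the incremental import.
recommended = ["260", "6539", "153", "3948", "1792", "2459", "2363", "61248", "8961", "2407"]
# ITEM_ID values used to generate the incremental interactions.
items_list = ["153", "2459", "1792", "3948", "2363", "260", "61248", "6539", "2407", "8961"]

leftover = unexpected_items(recommended, items_list)
print("All recommendations come from the incremental data:", not leftover)
```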

You have now successfully trained a Trending-Now model to generate item recommendations that are becoming popular with your users and tailor the recommendations according to user interest. Going forward, you can adapt this code to create other recommenders.

You can also use filters along with the Trending-Now recipe to differentiate the trends between different types of content, like long vs. short videos, or apply promotional filters to explicitly recommend specific items based on rules that align with your business goals.

Clean up

Make sure you clean up any unused resources you created in your account while following the steps outlined in this post. You can delete filters, recommenders, datasets, and dataset groups via the AWS Management Console or using the Python SDK.

Summary

The new aws-trending-now recipe from Amazon Personalize helps you identify the items that are rapidly becoming popular with your users and tailor your recommendations for the fast-changing trends in user interest.

For more information about Amazon Personalize, see the Amazon Personalize Developer Guide.

About the authors

Vamshi Krishna Enabothala is a Sr. Applied AI Specialist Architect at AWS. He works with customers from different sectors to accelerate high-impact data, analytics, and machine learning initiatives. He is passionate about recommendation systems, NLP, and computer vision areas in AI and ML. Outside of work, Vamshi is an RC enthusiast, building RC equipment (planes, cars, and drones), and also enjoys gardening.

Anchit Gupta is a Senior Product Manager for Amazon Personalize. She focuses on delivering products that make it easier to build machine learning solutions. In her spare time, she enjoys cooking, playing board/card games, and reading.

Abhishek Mangal is a Software Engineer for Amazon Personalize and works on architecting software systems to serve customers at scale. In his spare time, he likes to watch anime and believes ‘One Piece’ is the greatest piece of story-telling in recent history.

Read more: AWS Machine Learning Blog
