Saturday, April 20, 2024

Redis Pipeline: How to Publish and Subscribe Data from Redis to Your Destination

Looking to build a Redis pipeline? Before we dig in and build our pipeline, let’s discuss what Redis is, its benefits, and common Redis pipeline use cases.

What is Redis?

Redis, which stands for Remote Dictionary Server, is an open source (BSD licensed), in-memory data structure store used as a database, cache, message broker, and streaming engine. Redis delivers sub-millisecond response times, enabling millions of requests per second for real-time applications in industries like ad-tech, financial services, gaming, healthcare, and IoT.

 

Redis is one of the most popular open source engines today. Because of its fast performance, Redis is a popular choice for caching, session management, gaming, leaderboards, real-time analytics, geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps.

Benefits of Redis

Performance

All Redis data resides in memory, which enables low-latency, high-throughput data access. Unlike traditional databases, in-memory data stores don’t require a trip to disk, reducing engine latency to microseconds. Because of this, in-memory data stores can support an order of magnitude more operations and faster response times. The result is average read and write operations that take less than a millisecond and support for millions of operations per second.

Flexible Data Structures

Redis offers a wide variety of data structures:

Strings – text or binary data up to 512MB in size
Lists – a collection of Strings in the order they were added
Sets – an unordered collection of strings with the ability to intersect, union, and diff other Set types
Sorted Sets – Sets ordered by a value
Hashes – a data structure for storing a list of fields and values
Bitmaps – a data type that offers bit level operations
HyperLogLogs – a probabilistic data structure to estimate the unique items in a data set
Streams – a log data structure, useful as a message queue
Geospatial – longitude/latitude-based entries, useful for maps and “nearby” queries

Simplicity and Ease-of-use

Redis enables you to write traditionally complex code with fewer and simpler lines. With Redis, you write fewer lines of code to store, access, and use data in your applications. Over a hundred open source clients are available for Redis developers. Supported languages include Java, Python, PHP, C, C++, C#, JavaScript, Node.js, Ruby, R, Go, and many others.

Replication and Persistence

Redis employs a primary-replica architecture and supports asynchronous replication where data can be replicated to multiple replica servers. This provides improved read performance (as requests can be split among the servers) and faster recovery when the primary server experiences an outage. For persistence, Redis supports point-in-time backups (copying the Redis data set to disk).

High Availability and Scalability

Redis offers a primary-replica architecture in a single node primary or a clustered topology. This allows you to build highly available solutions providing consistent performance and reliability. When you need to adjust your cluster size, various options to scale up and scale in or out are also available. This allows your cluster to grow with your demands.

Open Source

Redis is an open source project supported by a vibrant community.

Popular Redis Use Cases

Caching
Chat, messaging, and queues
Gaming leaderboards
Session store
Rich media streaming
Geospatial
Machine Learning
Real-time analytics

Redis Pub/Sub implements a messaging system in which senders (in Redis terminology, publishers) send messages and receivers (subscribers) receive them. The link over which messages are transferred is called a channel.

In Redis, a client can subscribe to any number of channels.
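The delivery semantics described above can be sketched with a toy in-memory model. This is not a Redis client: the class name `InMemoryPubSub` and its methods are hypothetical and only illustrate two behaviors of Redis Pub/Sub, namely that one subscriber can listen on several channels and that publishing returns the number of subscribers that received the message (messages published while nobody is subscribed are not stored).

```python
from collections import defaultdict


class InMemoryPubSub:
    """Toy model of Redis Pub/Sub delivery semantics (not a Redis client)."""

    def __init__(self):
        # channel name -> list of subscriber inboxes
        self._channels = defaultdict(list)

    def subscribe(self, *channels):
        """Register one subscriber on any number of channels; return its inbox."""
        inbox = []
        for channel in channels:
            self._channels[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        """Deliver to current subscribers and return how many received it."""
        receivers = self._channels.get(channel, [])
        for inbox in receivers:
            inbox.append((channel, message))
        return len(receivers)


broker = InMemoryPubSub()
missed = broker.publish("ch2", "sent before anyone subscribed")  # nobody listening yet
inbox = broker.subscribe("ch1", "ch2")  # one subscriber, two channels
delivered = broker.publish("ch2", '{"a":2077941584}')
print(missed, delivered, inbox)
```

The return value of `publish` in this sketch plays the same role as the `(integer) 1` reply that redis-cli prints after a PUBLISH command.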

Connect to Redis and publish message on a channel

redis-cli -u redis://localhost:6379/0

localhost:6379> publish ch2 '{"a":2077941584}'

(integer) 1
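The same publish step can be done from application code. The sketch below assumes the redis-py client (`pip install redis`), which the original walkthrough does not use; the helper name `publish_json` is hypothetical. It accepts any object with a redis-py style `publish(channel, message)` method, so it can be exercised without a running server.

```python
import json


def publish_json(client, channel, payload):
    """Serialize payload as compact JSON and publish it on the channel.

    `client` is expected to behave like redis.Redis: publish(channel, message)
    returns the number of subscribers that received the message.
    """
    message = json.dumps(payload, separators=(",", ":"))
    return client.publish(channel, message)


# Usage against a real server (assumes redis-py is installed and Redis is
# listening on the URI used in this article):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379/0")
#   publish_json(client, "ch2", {"a": 2077941584})
```

Serializing in code avoids the quoting problems that come up when typing JSON by hand at the redis-cli prompt.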

Connect to Redis and consume message from a channel

redis-cli -u redis://localhost:6379/0

localhost:6379> subscribe ch2

Reading messages... (press Ctrl-C to quit)

1) "subscribe"
2) "ch2"
3) (integer) 1
1) "message"
2) "ch2"
3) "{\"a\":2988789}"
1) "message"
2) "ch2"
3) "{\"a\":2077941584}"
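The transcript above shows two kinds of replies: a subscribe confirmation, then one entry per published message. A minimal sketch of handling both in code follows, again assuming the redis-py client, whose `PubSub.listen()` yields dictionaries with `type`, `channel`, and `data` keys. The helper name `iter_json_messages` is hypothetical.

```python
import json


def iter_json_messages(events):
    """Yield (channel, decoded JSON) for each published message.

    `events` is any iterable of dicts shaped like the ones redis-py's
    PubSub.listen() yields, for example
    {"type": "message", "channel": b"ch2", "data": b'{"a":1}'}.
    Subscribe confirmations (type "subscribe") are skipped.
    """
    for event in events:
        if event.get("type") != "message":
            continue
        channel, data = event["channel"], event["data"]
        # redis-py returns bytes unless decode_responses=True was set.
        if isinstance(channel, bytes):
            channel = channel.decode("utf-8")
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        yield channel, json.loads(data)


# Usage against a real server (assumes redis-py and a local Redis):
#   import redis
#   pubsub = redis.Redis.from_url("redis://localhost:6379/0").pubsub()
#   pubsub.subscribe("ch2")
#   for channel, body in iter_json_messages(pubsub.listen()):
#       print(channel, body)
```

Because the function only needs an iterable of event dictionaries, it can be tested with a plain list before pointing it at a live subscription.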

Publish and Subscribe Message: Redis Pipeline Creation in StreamSets

Publish to Redis:

Configuration:

Redis tab:

URI: redis://localhost:6379/0
Mode : Publish
Channel : ch2

Data Format:

Data Format: JSON

 

Consumer:

Configuration:

Redis tab:

URI: redis://localhost:6379/0
Mode : Publish
Channel : ch2

Data Format:

Data Format: JSON

You can also view the data in the cli:

localhost:6379> subscribe ch2
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "ch2"
3) (integer) 1
1) "message"
2) "ch2"
3) "{\"city\":\"NY\",\"latitude_longitude\":{\"latitude\":\"37.7749\",\"longitude\":\"-122.4194\"},\"lst\":[\"one\",\"two\"]}"

Redis Destination (Batch Mode)

Say I have a JSON file with the content below:

{"city": "city", "city_name": "San Francisco",
"State": "state", "state_name": "CA",
"Country": "country", "country_name": "USA",
"other": "other",
"latitude_longitude":
[{"latitude": "37.7749", "longitude": "-122.4194"}]
}

In the Redis configuration, set the mode to “Batch” and define the keys and values with the type String:

 

Preview the Redis pipeline:

 

And view the data in the Redis cli:

redis-cli -h localhost -p 6379
localhost:6379> keys *
1) "NY"
localhost:6379> keys *
1) "country"
2) "state"
3) "NY"
4) "city"
localhost:6379> 
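To make the batch write concrete, here is a rough sketch of what writing string keys from one record amounts to. The field pairing is an assumption: it is inferred from the key names in the listing above (the value of "city" becomes a key, and so on), and pairing each key with the matching "_name" field is a guess about the configuration. The helper `write_string_keys` is hypothetical and expects a redis-py style client with a `set(key, value)` method.

```python
# Record taken from the sample JSON file in this section.
RECORD = {
    "city": "city", "city_name": "San Francisco",
    "State": "state", "state_name": "CA",
    "Country": "country", "country_name": "USA",
    "other": "other",
    "latitude_longitude": [{"latitude": "37.7749", "longitude": "-122.4194"}],
}

# Assumed (key field, value field) pairs; not confirmed by the article.
FIELD_PAIRS = [
    ("city", "city_name"),
    ("State", "state_name"),
    ("Country", "country_name"),
]


def write_string_keys(client, record, field_pairs):
    """For each pair, SET the key named by one field to the value of the other."""
    written = []
    for key_field, value_field in field_pairs:
        key, value = record[key_field], record[value_field]
        client.set(key, value)  # same effect as: SET <key> <value>
        written.append(key)
    return written


# Usage against a real server (assumes redis-py and a local Redis):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379/0")
#   write_string_keys(client, RECORD, FIELD_PAIRS)
```

Under these assumptions the call creates the keys city, state, and country, which matches the three new entries in the `keys *` output.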

Create a Map object for batch mode:

Sample JSON file:

{"city": "NY",
"latitude_longitude":
{"latitude": "37.7749", "longitude": "-122.4194"}
}

In the Redis configuration, enter the key and value, and set the type to Hash.

 

A key named “NY” is created in Redis:

localhost:6379> keys *

1) "NY"
2) "country"
3) "state"
4) "city"

localhost:6379> HGETALL NY
1) "latitude"
2) "37.7749"
3) "longitude"
4) "-122.4194"
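The HGETALL output corresponds to a hash whose fields come from the nested map in the record. A minimal sketch of the equivalent write, assuming the redis-py client and its `hset(name, mapping=...)` call, is shown here. The helper name `write_hash` and its arguments are hypothetical.

```python
def write_hash(client, record, key_field, value_field):
    """Store the map at record[value_field] as a Redis hash named record[key_field]."""
    key = record[key_field]
    fields = record[value_field]
    if not isinstance(fields, dict) or not fields:
        raise ValueError(f"{value_field!r} must be a non-empty map to be stored as a hash")
    # redis-py: hset(name, mapping={...}) sets several hash fields in one call.
    client.hset(key, mapping=fields)
    return key


# Usage against a real server (assumes redis-py and a local Redis):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379/0")
#   record = {"city": "NY",
#             "latitude_longitude": {"latitude": "37.7749", "longitude": "-122.4194"}}
#   write_hash(client, record, "city", "latitude_longitude")
```

Each entry of the nested map becomes one field of the hash, which is why HGETALL returns the latitude and longitude as alternating field and value lines.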

Create a List object in JSON:

{"city": "NY",
"latitude_longitude":
{"latitude": "37.7749", "longitude": "-122.4194"},
"lst": ["one", "two"]
}

 

Now the key named after the city holds the list values. The existing “NY” hash is deleted first so the key can be recreated as a list:

 

localhost:6379> DEL NY
(integer) 1

localhost:6379> keys *

1) "NY"
2) "country"
3) "state"
4) "city"

localhost:6379> TYPE NY
list
localhost:6379> lrange NY 0 1
1) "two"
2) "one"
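The LRANGE output lists "two" before "one", the reverse of the order in the JSON file. That is consistent with each value being pushed onto the head of the list, as LPUSH does. Whether the pipeline uses LPUSH internally is an assumption here, not something the article states. A minimal sketch with a hypothetical helper `write_list`, assuming a redis-py style client with `lpush(name, *values)`:

```python
def write_list(client, record, key_field, value_field):
    """Push every item of record[value_field] onto the list named record[key_field]."""
    key = record[key_field]
    values = list(record[value_field])
    if not values:
        raise ValueError(f"{value_field!r} must contain at least one item")
    # LPUSH inserts each value at the head, so the last value pushed is read first.
    client.lpush(key, *values)
    return key


# Usage against a real server (assumes redis-py and a local Redis):
#   import redis
#   client = redis.Redis.from_url("redis://localhost:6379/0")
#   client.delete("NY")  # the key must not still hold the earlier hash
#   write_list(client, {"city": "NY", "lst": ["one", "two"]}, "city", "lst")
#   client.lrange("NY", 0, 1)
```

Deleting the key first matters because Redis rejects a list command on a key that currently holds a hash with a WRONGTYPE error.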

You can see from this Redis pipeline example how to publish, subscribe, and batch-write data to Redis in your smart data pipelines. StreamSets aims to bridge the gap between the ultimate control of hand coding and the ease and repeatability of a graphical interface.

With StreamSets you can:

Quickly build, deploy, and scale streaming, batch, CDC, ETL and ML pipelines
Handle data drift automatically, keeping jobs running even when schemas and structures change
Deploy, monitor, and manage all your data pipelines – across hybrid and multi-cloud – from a single dashboard 

Try smart data pipelines out yourself with StreamSets, a fully cloud-based, all-in-one DataOps platform. Sign up now and start building pipelines for free.

The post Redis Pipeline: How to Publish and Subscribe Data from Redis to Your Destination appeared first on StreamSets.
