Sela. | Cloud Better.

Introduction to Google Cloud Platform Pub/Sub

Google Cloud Pub/Sub offers excellent scalability, making it capable of handling a significant volume of messages per second with remarkable speed. This scalability feature makes it an ideal choice for applications that require high throughput and low latency, particularly for real-time data processing.

Vikrant Barde | Tech Lead | Sela. India

Google Cloud Platform Pub/Sub

Google Cloud Pub/Sub is a messaging service that lets applications communicate asynchronously. It offers a versatile, dependable way to move data between services and suits continuous data processing, real-time streaming analytics, and event-driven architectures. Because publishers and subscribers are decoupled, each side can scale and fail independently while data keeps flowing between them.

 


To ensure reliable message delivery, Google Cloud Pub/Sub stores and delivers messages through a distributed system. Each message is replicated synchronously across multiple zones, so the failure of a single machine or zone does not cause message loss.

Critical features of Pub/Sub

At-least-once delivery

Synchronous, cross-zone message replication and per-message receipt tracking ensure at-least-once delivery at any scale.

Open

Open APIs and client libraries in seven languages (like Java, Go, Python, Node.js) support cross-cloud and hybrid deployments.

Exactly-once processing

Dataflow supports reliable, expressive, exactly-once processing of Pub/Sub streams.

No provisioning, auto-everything

Pub/Sub does not have shards or partitions. Just set your quota, publish, and consume.

Compliance and security

Pub/Sub is a HIPAA-compliant service, offering fine-grained access controls and end-to-end encryption.

Google Cloud–native integrations

Google Cloud Pub/Sub integrates with other Google Cloud services, including Cloud Functions, Cloud Dataflow, Gmail update events, and Cloud Storage.

Third-party and OSS integrations

Pub/Sub provides third-party integrations with Splunk and Datadog for logs and Striim and Informatica for data integration. Additionally, OSS integrations are available through Confluent Cloud for Apache Kafka and Knative Eventing for Kubernetes-based serverless workloads.

Seek and replay

Rewind your backlog to any point in time or to a snapshot so that messages can be reprocessed. Fast-forward to discard outdated data.
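
The effect of a seek can be illustrated with a toy in-memory model. This is only a sketch, not the Pub/Sub client library: the `ReplayableSubscription` class and its methods are hypothetical, publish times are plain integers, and the model assumes acknowledged messages are still retained so that they can be replayed.

```python
class ReplayableSubscription:
    """Hypothetical model of a subscription that supports seeking by time."""

    def __init__(self):
        self.log = []       # retained (publish_time, data) pairs, in publish order
        self.acked = set()  # indexes into self.log that count as acknowledged

    def publish(self, publish_time, data):
        self.log.append((publish_time, data))

    def ack_all(self):
        self.acked = set(range(len(self.log)))

    def seek(self, timestamp):
        # Messages published before the timestamp count as acknowledged;
        # everything from the timestamp onward becomes deliverable again.
        self.acked = {i for i, (t, _) in enumerate(self.log) if t < timestamp}

    def pending(self):
        return [d for i, (_, d) in enumerate(self.log) if i not in self.acked]


sub = ReplayableSubscription()
for t, d in [(10, "a"), (20, "b"), (30, "c")]:
    sub.publish(t, d)
sub.ack_all()

sub.seek(20)                       # rewind: "b" and "c" are delivered again
replayed = sub.pending()

sub.seek(31)                       # fast-forward: skip everything published so far
after_fast_forward = sub.pending()
```

Seeking to an earlier time replays part of the backlog, while seeking past the newest message discards it.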

Filtering

Pub/Sub can filter messages based on attributes to reduce delivery volumes to subscribers.
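
As a rough illustration of attribute-based filtering, the sketch below keeps only messages whose attributes match a set of required values. Real Pub/Sub filters are expressions configured on the subscription; the `matches` and `deliver` helpers and the dictionary-based filter here are simplifying assumptions made for this example.

```python
def matches(attributes, required):
    """Return True when every required key/value pair is present."""
    return all(attributes.get(k) == v for k, v in required.items())


def deliver(messages, required):
    """Keep only the messages whose attributes satisfy the filter."""
    return [m for m in messages if matches(m["attributes"], required)]


messages = [
    {"data": b"a", "attributes": {"region": "eu", "kind": "order"}},
    {"data": b"b", "attributes": {"region": "us", "kind": "order"}},
    {"data": b"c", "attributes": {"region": "eu", "kind": "refund"}},
]

# Only the first message carries both required attribute values.
eu_orders = deliver(messages, {"region": "eu", "kind": "order"})
```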

Cost-effective

Google Cloud Pub/Sub uses a pay-as-you-go pricing model based on the volume of data published and delivered.

Critical Concepts of Pub/Sub

These are some of the key concepts related to Google Cloud Pub/Sub. Understanding these concepts is essential for building reliable and scalable messaging systems using Pub/Sub.

Topics

A topic is a named resource to which messages can be published by publishers. It represents a specific stream of data. Topics can have multiple subscriptions, and each subscription receives a copy of every message published on the topic.

Subscriptions

A subscription is a named resource that represents the stream of messages from a single, specific topic, to be delivered to the subscribing application. Each subscription is associated with a single topic, but multiple subscriptions can be associated with the same topic.
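
The relationship between topics and subscriptions can be illustrated with a small in-memory model. This is a sketch only, not the Pub/Sub client library: the `Topic` and `Subscription` classes and their methods are hypothetical names chosen for illustration.

```python
from collections import deque


class Subscription:
    """Hypothetical in-memory stand-in for a Pub/Sub subscription."""

    def __init__(self, name):
        self.name = name
        self.backlog = deque()  # messages waiting to be delivered


class Topic:
    """Hypothetical in-memory stand-in for a Pub/Sub topic."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = []

    def subscribe(self, name):
        # Each subscription belongs to exactly one topic.
        sub = Subscription(name)
        self.subscriptions.append(sub)
        return sub

    def publish(self, data):
        # Every attached subscription gets its own copy of the message.
        for sub in self.subscriptions:
            sub.backlog.append(data)


orders = Topic("orders")
billing = orders.subscribe("billing")
shipping = orders.subscribe("shipping")
orders.publish(b"order-1001")
```

After the publish call, both the `billing` and the `shipping` backlog hold their own copy of the message.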

Messages

Messages are units of data that are exchanged between publishers and subscribers. A message consists of a payload and optional attributes that describe the message.
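
A message can be pictured as a small record holding a binary payload and a set of string attributes. The `Message` dataclass below is a hypothetical illustration, not the type used by the Pub/Sub client libraries, and the sensor fields are invented sample data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Hypothetical message: a bytes payload plus optional attributes."""

    data: bytes
    attributes: dict = field(default_factory=dict)


# The payload carries the data itself; the attributes describe it.
msg = Message(
    data=b'{"temp": 21.5}',
    attributes={"sensor": "kitchen", "unit": "celsius"},
)

# Attributes are optional, so a payload-only message is also valid.
bare = Message(data=b"ping")
```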

Publishers

Publishers are applications that create and send messages to a topic in Pub/Sub. A publisher can be any application that can make HTTP requests.

Subscribers

Subscribers are applications that receive messages from subscriptions. A subscriber can be any application that can make HTTP requests.

 

Acknowledgments

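A subscriber must explicitly acknowledge each message it receives from a subscription. If a message is not acknowledged before its acknowledgment deadline expires, Pub/Sub delivers it again. This is how Pub/Sub guarantees at-least-once delivery, and it means subscribers should be prepared to handle an occasional duplicate.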
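
The interplay between acknowledgments and redelivery can be sketched with a toy model in which a message that is not acknowledged within a deadline becomes eligible for delivery again. The `AckSubscription` class, its methods, and the integer clock are assumptions made for this example, not the real Pub/Sub API.

```python
class AckSubscription:
    """Hypothetical model of acknowledgment deadlines and redelivery."""

    def __init__(self, ack_deadline):
        self.ack_deadline = ack_deadline
        self.outstanding = {}  # message id -> (data, time of last delivery or None)

    def add(self, msg_id, data):
        self.outstanding[msg_id] = (data, None)

    def pull(self, now):
        """Return messages never delivered, or whose ack deadline has expired."""
        batch = []
        for msg_id, (data, sent_at) in list(self.outstanding.items()):
            if sent_at is None or now - sent_at >= self.ack_deadline:
                self.outstanding[msg_id] = (data, now)
                batch.append((msg_id, data))
        return batch

    def ack(self, msg_id):
        # An acknowledged message is never delivered again.
        self.outstanding.pop(msg_id, None)


sub = AckSubscription(ack_deadline=10)
sub.add("m1", "hello")
sub.add("m2", "world")

first = sub.pull(now=0)    # both messages are delivered
sub.ack("m1")              # only m1 is acknowledged
early = sub.pull(now=5)    # m2 is still within its deadline
retry = sub.pull(now=10)   # m2's deadline expired, so it is redelivered
```

Because `m2` was delivered twice, the subscriber's handling of it needs to tolerate duplicates.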

Dead-letter topics

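A dead-letter topic is a topic to which Pub/Sub forwards messages that a subscriber has repeatedly failed to acknowledge. Once a message exceeds the maximum number of delivery attempts configured on the subscription, it is moved to the dead-letter topic, where messages that could not be processed, such as those with invalid message data, can be inspected and handled separately.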
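
A minimal sketch of the dead-letter idea follows: a message is retried up to a maximum number of delivery attempts and set aside if the handler keeps failing. The `process_with_dead_letter` function, the `handler`, and the default of five attempts are illustrative assumptions, not the behavior of a specific client library.

```python
def process_with_dead_letter(messages, handler, max_delivery_attempts=5):
    """Deliver each message until the handler succeeds or attempts run out.

    Returns a (processed, dead_lettered) pair of lists.
    """
    processed, dead_lettered = [], []
    for msg in messages:
        for _ in range(max_delivery_attempts):
            try:
                handler(msg)
            except Exception:
                continue  # no acknowledgment, so the message is delivered again
            processed.append(msg)
            break
        else:
            # Every attempt failed: forward the message to the dead-letter list.
            dead_lettered.append(msg)
    return processed, dead_lettered


def handler(msg):
    # Hypothetical handler that only accepts numeric payloads.
    if not msg.isdigit():
        raise ValueError("cannot parse " + msg)


ok, dead = process_with_dead_letter(["1", "oops", "3"], handler)
```

Setting unprocessable messages aside keeps one bad message from being retried indefinitely while the rest of the backlog continues to flow.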

Message retention

Pub/Sub retains messages for a configurable retention period. This gives subscribers time to process messages even if they are temporarily offline or cannot keep up with the incoming message rate.

Ordering

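Pub/Sub supports both ordered and unordered message delivery. Delivery is unordered by default. To receive messages in order, enable message ordering on the subscription and publish related messages with the same ordering key; messages that share an ordering key are then delivered in the order in which Pub/Sub received them.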
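
Ordering keys can be pictured as one queue per key: order is preserved within a key, while nothing is promised about the relative order of different keys. The `group_by_ordering_key` helper and the sample events below are hypothetical and only model that idea.

```python
from collections import defaultdict, deque


def group_by_ordering_key(messages):
    """Place each (ordering_key, data) pair in a per-key FIFO queue."""
    queues = defaultdict(deque)
    for key, data in messages:
        queues[key].append(data)
    return queues


published = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "add-to-cart"),
    ("user-2", "logout"),
    ("user-1", "checkout"),
]

# Each queue preserves publish order for its own key only.
queues = group_by_ordering_key(published)
```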

Push

With push delivery, a subscription is configured with the URL of an endpoint owned by the subscriber. When a message is published to the topic, Pub/Sub sends it to that endpoint as an HTTPS POST request, and the endpoint acknowledges the message by returning a success status code. This lets subscribers receive messages in near real time without running their own polling loop.
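
On the receiving side, a push endpoint typically has to unpack a JSON request body in which the payload is base64-encoded. The sketch below assumes the commonly documented wrapped envelope shape (`message.data`, `message.attributes`, `message.messageId`, `subscription`); the `decode_push_request` helper and the project and subscription names are hypothetical.

```python
import base64
import json


def decode_push_request(body):
    """Extract payload, attributes, and message ID from a push request body."""
    envelope = json.loads(body)
    message = envelope["message"]
    data = base64.b64decode(message.get("data", ""))
    return data, message.get("attributes", {}), message.get("messageId")


# A sample body shaped like a push delivery (all values are made up).
sample_body = json.dumps({
    "message": {
        "data": base64.b64encode(b"hello").decode("ascii"),
        "attributes": {"origin": "demo"},
        "messageId": "123",
    },
    "subscription": "projects/demo-project/subscriptions/demo-sub",
})

data, attributes, message_id = decode_push_request(sample_body)
```

In a real endpoint this function would run inside the web framework's request handler, which would then return a success status code to acknowledge the message.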

Pull

With a pull, subscribers explicitly request messages from Pub/Sub. The subscriber polls Pub/Sub at regular intervals to see if there are any new messages available. If there are, Pub/Sub returns one or more messages to the subscriber. This mechanism requires the subscriber to manage the polling process, which can be more complex than push, but it also provides more control over when messages are received.
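
The polling loop that a pull subscriber manages can be sketched as follows. The `pull` and `drain` functions and the in-memory `backlog` are stand-ins invented for this example; a real subscriber would call the Pub/Sub API and would usually wait between empty polls instead of stopping.

```python
from collections import deque


def pull(backlog, max_messages):
    """Return up to max_messages messages from the backlog."""
    batch = []
    while backlog and len(batch) < max_messages:
        batch.append(backlog.popleft())
    return batch


def drain(backlog, max_messages, handle):
    """Poll until a pull comes back empty; return the number of polls made."""
    polls = 0
    while True:
        batch = pull(backlog, max_messages)
        polls += 1
        if not batch:
            return polls
        for msg in batch:
            handle(msg)


backlog = deque(["m1", "m2", "m3", "m4", "m5"])
seen = []
polls = drain(backlog, max_messages=2, handle=seen.append)
```

The `max_messages` parameter is where the extra control comes from: the subscriber decides how much work to take on per request.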

How Does Pub/Sub Work?

 

 

  1. A publisher sends a message.
  2. The message is written to storage.
  3. Pub/Sub sends an acknowledgment to the publisher that it has received the message and guarantees its delivery to all attached subscriptions.
  4. At the same time as writing the message to storage, Pub/Sub delivers it to subscribers.
  5. Subscribers receive the message.
  6. Subscribers send an acknowledgment to Pub/Sub that they have processed the message.
  7. Once at least one subscriber for each subscription has acknowledged the message, Pub/Sub deletes the message from storage.
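
The steps above can be sketched as a toy in-memory broker. This is a simplified model under stated assumptions, not how Pub/Sub is implemented: the `Broker` class, its methods, and the integer message IDs are hypothetical.

```python
import itertools


class Broker:
    """Hypothetical model of the publish, store, deliver, ack, delete cycle."""

    def __init__(self, subscription_names):
        self._ids = itertools.count(1)
        self.storage = {}  # message id -> data
        self.pending = {}  # message id -> subscriptions that have not acked yet
        self.subscription_names = set(subscription_names)

    def publish(self, data):
        msg_id = next(self._ids)
        self.storage[msg_id] = data  # step 2: write to storage
        self.pending[msg_id] = set(self.subscription_names)
        return msg_id                # step 3: acknowledge to the publisher

    def deliver(self, subscription):
        # Steps 4-5: hand over every message this subscription has not acked.
        return [
            (msg_id, self.storage[msg_id])
            for msg_id, subs in self.pending.items()
            if subscription in subs
        ]

    def ack(self, subscription, msg_id):
        # Step 6: the subscriber acknowledges the message.
        self.pending[msg_id].discard(subscription)
        if not self.pending[msg_id]:
            # Step 7: every subscription has acked, so delete the message.
            del self.pending[msg_id]
            del self.storage[msg_id]


broker = Broker(["billing", "shipping"])
msg_id = broker.publish("order-1")           # step 1
billing_batch = broker.deliver("billing")
broker.ack("billing", msg_id)
stored_after_one_ack = msg_id in broker.storage
broker.ack("shipping", msg_id)
stored_after_all_acks = msg_id in broker.storage
```

The message stays in storage until the last subscription acknowledges it, which mirrors step 7.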

Pub/Sub Patterns

Many to One

 

 

A subscriber can receive messages from multiple publishers. The subscriber just needs to consume from multiple subscriptions.

Many to Many

 

 

In this pattern, multiple publishers publish messages to a topic, and Pub/Sub balances the workload by distributing the messages among the subscribers.
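
Workload distribution can be pictured as spreading a stream of messages over several workers so that each message is handled by only one of them. The round-robin policy in the sketch below is an assumption chosen for simplicity; Pub/Sub does not promise any particular assignment order, and the `distribute` helper and worker names are hypothetical.

```python
from itertools import cycle


def distribute(messages, subscribers):
    """Assign each message to exactly one subscriber, round-robin."""
    assignments = {name: [] for name in subscribers}
    for msg, name in zip(messages, cycle(subscribers)):
        assignments[name].append(msg)
    return assignments


# Five messages, possibly from several publishers, shared by two workers.
assignments = distribute(
    ["m1", "m2", "m3", "m4", "m5"],
    ["worker-a", "worker-b"],
)
```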

One to Many

 

In this pattern, messages are published on a topic, and multiple subscribers receive a copy of each message. This pattern is useful when you need to distribute messages to multiple recipients, such as broadcasting updates to a large number of users.

Pub/Sub Usage Example

Pub/Sub plays a crucial role in the periodic data collection architecture described below. Here we process data from storage objects (blobs/files) and BigQuery and store the refined data in Cloud SQL. The refined data in Cloud SQL is then made available to clients via microservices. This is only a basic, high-level scenario; with the right GCP resources, you can design a better solution.

 

Conclusion

To conclude, Google Cloud Pub/Sub is a robust messaging service with a wide range of use cases. Its scalability, reliability, flexibility, and security features make it a strong fit for applications that demand high throughput and low latency. With Pub/Sub, developers can build resilient applications that handle large message volumes in real time.