In this piece, we’ll explain how the notification service works, the notification problems we had to solve, and what work still needs to be done. That work is about two-thirds complete; the remaining third deals with internal code issues, and those changes will not be visible on the client side.

How notifications work in production now

We have a vast number of microservices, some of which require push notifications: for example, WMS house inventory, Workhouse workflow, Job Coordinator, and Lamanche. Lamanche is a time-tracking application that everyone at Paragon uses at least twice a day, at the start and end of each workday. All of these microservices must send notifications to their users.

You have likely seen notification solutions, be it in Jira or elsewhere, where a notification bell sits at the top of the application window. The bell delivers news, insights, or any other messages you have subscribed to. The Notificator service, which will be described in greater detail below, powers that ‘notification bell’ functionality.

Let’s look at how sending notifications works on the client side, as it’s a simple process. By the client, we mean a mobile device or a browser. The client opens a WebSocket to a backend service called Paragon Communicator, which communicates with the client through that WebSocket. To receive messages after opening the connection with the Communicator, we need to send a specific filter, as shown in the example below.

In JavaScript it will look different, of course, but the gist is the same. The filter is a sort of pseudo-AST describing the message properties we want to match, for example PropertyName = "SourceId", together with the logical operation to apply to that property and its value:

{
    PropertyName = "Name",
    Values = new[] { "UserStatusEvent" },
    ConditionOperator = ConditionOperator.Equal
},
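
To put that fragment in context, here is a minimal sketch of what a complete filter could look like. The SubscriptionFilter and FilterCondition type names, and the second condition, are hypothetical, for illustration only; only PropertyName, Values, and ConditionOperator appear in the real example above.

// Hypothetical container types for the filter.
public enum ConditionOperator { Equal, NotEqual }

public class FilterCondition
{
    public string PropertyName { get; set; }
    public string[] Values { get; set; }
    public ConditionOperator ConditionOperator { get; set; }
}

public class SubscriptionFilter
{
    public FilterCondition[] Conditions { get; set; }

    // Builds the condition shown above plus a second, hypothetical one.
    public static SubscriptionFilter Example() => new SubscriptionFilter
    {
        Conditions = new[]
        {
            new FilterCondition
            {
                PropertyName = "Name",
                Values = new[] { "UserStatusEvent" },
                ConditionOperator = ConditionOperator.Equal
            },
            new FilterCondition
            {
                PropertyName = "SourceId",
                Values = new[] { "42" }, // hypothetical source id
                ConditionOperator = ConditionOperator.Equal
            }
        }
    };
}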

That’s everything on the client side: callbacks are registered, and messages are received.
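
To make the flow concrete, here is a hedged sketch of a client connecting to the Communicator with the .NET SignalR client. The hub URL and the "Notification" callback name are assumptions; the SetSubscription call is real and is described in the FAQ at the end of this piece.

using System;
using Microsoft.AspNetCore.SignalR.Client;

// The filter from the example above, serialized to JSON on the client side.
var filterJson =
    "{\"Conditions\":[{\"PropertyName\":\"Name\",\"Values\":[\"UserStatusEvent\"],\"ConditionOperator\":\"Equal\"}]}";

var connection = new HubConnectionBuilder()
    .WithUrl("https://communicator.example.com/hub") // hypothetical URL
    .Build();

// Register a callback for incoming messages; the method name is an assumption.
connection.On<string>("Notification", json => Console.WriteLine($"Received: {json}"));

await connection.StartAsync();

// The first request after connecting: register the filter with the Communicator.
await connection.InvokeAsync("SetSubscription", filterJson);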

We’d like to reiterate that when we make the remaining changes and address the outstanding code issues, nothing will change on the client side, and we won’t break backward compatibility.

Let’s see how it works on the backend. We’ll explain how notifications work on the Lamanche backend, since its logic is trivial, which will suffice for the purposes of this article. In contrast, if we used WMS (or any other Paragon service), we’d have to explain extraneous details associated with that specific service, such as the details of the plan, changes in the statuses of the plans, and so on.

Back to Lamanche. Every time a user starts or stops tracking, an event is sent to generate a push notification. Even though the user can opt out of receiving notifications, the event is sent anyway.

When the event is sent, a message is formed and published to a particular exchange in RabbitMQ, the message broker. That exchange is bound to a queue in the Paragon Event Filter service, another backend service that listens to the queue and shares subscription data with the Communicator. When the client sends a filter, the filter is saved to Redis, which both the Event Filter and the Communicator can access.
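
Publishing from the service side might look like the sketch below, assuming the standard RabbitMQ .NET client. The System.E.Fanout.Events exchange name appears later in this piece; the host, the payload shape, and the producer declaring the exchange are illustrative assumptions.

using System.Text;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost" }; // hypothetical host
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

// The fanout exchange that services publish their events to.
channel.ExchangeDeclare("System.E.Fanout.Events", ExchangeType.Fanout, durable: true);

// Hypothetical payload; the real header/data envelope is described near the end of this piece.
var body = Encoding.UTF8.GetBytes("{\"Name\":\"UserStatusEvent\",\"SourceId\":\"42\"}");
channel.BasicPublish(exchange: "System.E.Fanout.Events", routingKey: "", basicProperties: null, body: body);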

The Event Filter takes those events and looks for subscribers (clients) who sent a filter like the one above and whose conditions the message satisfies. If there are no such subscribers, the Event Filter simply discards the event without saving it anywhere, without retries or re-queuing. Small artifacts indicating that such a message existed remain in the logs, but nothing more than that.

If there is a subscriber, the Event Filter forwards the event via RabbitMQ to the Communicator, which holds the socket connection to the client. The Communicator picks up the message, looks for the client who subscribed to receive the notification, and, upon finding such a client, writes the message to that client’s socket.

The problem: creating a storage service for messages

Even though we sent all messages, some could have disappeared along the way. And what if we wanted to resend a message or analyze the sent notifications? Since we didn’t store those messages anywhere, this was impossible. Some traces of them were left in the logs, but inspecting logs for statistical purposes is not practical. Therefore, we needed to develop a storage service where all those messages could be securely stored for a specified amount of time. If we wanted to analyze those messages or resend them, we could go back to that service.

The storage service serves as an intermediary between the Communicator and the final service, such as Lamanche in our example. That was how Paragon Notificator was born.

It’s important not to confuse the Communicator with the Notificator: the former sends messages to clients, while the Notificator stores those messages.

The following diagram shows how Notificator works:

Diagram to illustrate how notification works in production

As you can see in the diagram, at the top left there is a client (for illustrative purposes, it’s Lamanche). The Kafka client, shown on the left, has not yet been finalized, so we’re still using RabbitMQ for now. Lamanche sends a message to the message broker, and the Notificator reads that message (the Notificator also has public REST APIs that can be used to read notifications).

The Notificator sends the message via RabbitMQ to the Communicator, which forwards it to the client via the WebSocket. To communicate over WebSockets, the Communicator uses Microsoft’s SignalR framework, a powerful framework that can maintain a connection not only via WebSocket but also via HTTP long polling or forever frame (a hidden iframe).

How Paragon Notificator works

Now, a few words on how Paragon Notificator works. The service doesn’t do anything special and offers no sophisticated functionality apart from receiving, storing, and sending messages, all of which happens through the message broker.

The following tables show how the Notificator stores its data.

Notificator service database structure

From the three tables above, you can see that there are notifications in the form of JSON, the list of recipients, and acknowledgments that confirm reading or receiving a notification (whichever a user chooses). Those acknowledgments come in through the Notificator’s REST endpoints.
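
As a rough sketch, the three tables could map to entities like the ones below. All names and column choices here are assumptions; the only certainty is that notifications are stored as JSON alongside recipients and acknowledgments.

using System;

// Hypothetical entity shapes for the Notificator's three tables.
public class Notification
{
    public Guid Id { get; set; }
    public string PayloadJson { get; set; }   // the notification body, stored as JSON
    public DateTime CreatedAt { get; set; }
}

public class Recipient
{
    public Guid NotificationId { get; set; }  // which notification this row belongs to
    public Guid UserId { get; set; }
}

public class Acknowledgment
{
    public Guid NotificationId { get; set; }
    public Guid UserId { get; set; }
    public string Kind { get; set; }          // "received" or "read", whichever the user chooses
    public DateTime AcknowledgedAt { get; set; }
}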

What follows is an example of a REST endpoint that accepts a NotificationId and thus eliminates the need to pass around the actual notification.

Code snippets that show how the Notificator works

For example, we have the following test for Lamanche:

In the above test, the connection to the Communicator opens, the filter gets registered (an example filter was given at the beginning of the article), and two actions are performed in Lamanche: pressing the start and stop buttons, or

Manager.Start();
Manager.Stop();
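
A hedged reconstruction of such a test is sketched below. Manager.Start() and Manager.Stop() are the calls shown above (stubbed here so the sketch compiles); the hub URL, the callback name, and the timing are assumptions.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;
using Xunit;

// Stub standing in for Lamanche's real tracking manager.
public static class Manager
{
    public static void Start() { }
    public static void Stop() { }
}

public class LamancheNotificationTests
{
    private const string FilterJson =
        "{\"Conditions\":[{\"PropertyName\":\"Name\",\"Values\":[\"UserStatusEvent\"],\"ConditionOperator\":\"Equal\"}]}";

    [Fact]
    public async Task StartAndStop_ProduceTwoEvents()
    {
        var received = new List<string>();

        var connection = new HubConnectionBuilder()
            .WithUrl("https://communicator.example.com/hub") // hypothetical URL
            .Build();
        connection.On<string>("Notification", json => received.Add(json)); // hypothetical callback name
        await connection.StartAsync();
        await connection.InvokeAsync("SetSubscription", FilterJson);

        Manager.Start();  // the user starts tracking
        Manager.Stop();   // the user stops tracking

        await Task.Delay(TimeSpan.FromSeconds(2)); // crude wait for the pushes to arrive
        Assert.Equal(2, received.Count);
    }
}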

As a result, in addition to the notifications that settle in the logs, we also get entries in two PostgreSQL tables: notifications and recipients.

Additional entries in the PostgreSQL tables for notifications and recipients

OnHandler shows the event that came via the WebSocket.

We performed two operations, ‘start’ and ‘stop,’ and received two corresponding events.

There’s nothing of particular interest here, except perhaps the From and To fields, where the latter is empty because we’ve started the work but have not yet finished it.

Start event data line with the empty "to" field because we haven't finished the work yet

The second event has both From and To fields.

Correspondingly, those two events now appear in both the Recipients and Notifications tables. However, in the above screenshot you’ll see four registered events where there should only be two (a bug that has yet to be fixed).

As part of this task, we’ve also refactored the Event Filter, and its functionality (filtering by subscriptions and events) is now being migrated to the Communicator. Once the migration is complete, we can get rid of the Event Filter. There are still a few things to finish in the Communicator for this to work, but we are close.

Frequently asked questions

How is the “read” status implemented?

Through the REST API. The Notificator has public REST endpoints. For example, if you call the endpoint below and pull the resource, the notification changes its status to “read” and a record is written to the Acknowledgments table in the database.
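
For illustration, marking a notification as read from a client could look like the sketch below. The route is a placeholder, not the real one; only the NotificationId-based acknowledgment is described here.

using System;
using System.Net.Http;
using System.Threading.Tasks;

var notificationId = Guid.NewGuid(); // stand-in for a real notification id

using var http = new HttpClient { BaseAddress = new Uri("https://notificator.example.com") };

// Hypothetical route; calling it writes a row into the Acknowledgments table.
var response = await http.PostAsync($"/api/notifications/{notificationId}/read", content: null);
response.EnsureSuccessStatusCode();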

Can a client manage the lifetime of those notifications?

No. Notifications have an expiration field, but it’s not currently used. This is something that has not been developed yet.

For how long are notifications stored?

Currently, we have no restrictions on the storage period for notifications.

Can a client opt-out of saving notifications to the database?

Nothing prevents a client from publishing directly to the Communicator and bypassing the Notificator.

Previously, the events were published to the exchange called System.E.Fanout.Events, which, as you can see from the screenshot below, is still there.

System E Fanout events example

The Notificator now listens to that exchange, whereas previously the Event Filter did. The Notificator sends the event onward via the exchange named Communicator.E.Fanout.Events. The Notificator also has an additional exchange, Notificator.E.Fanout.Acknowledgments, for notification confirmations, but currently no service listens to it.

Thus, if a client wants to bypass the Notificator and use the Communicator directly, all they have to do is publish to the exchange Communicator.E.Fanout.Events.
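
In code, the bypass is the same publishing sketch shown earlier, pointed at the Communicator’s exchange instead, so the message is delivered but never stored:

using System.Text;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost" }; // hypothetical host
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

// Publishing here skips the Notificator, so nothing is written to its database.
var body = Encoding.UTF8.GetBytes("{\"Name\":\"UserStatusEvent\"}");
channel.BasicPublish(exchange: "Communicator.E.Fanout.Events", routingKey: "", basicProperties: null, body: body);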

How do you guarantee the delivery of the notification (at most once)?

Currently, we have not implemented that functionality. However, as previously mentioned, we use RabbitMQ, which offers eventual consistency by design and, without going further into technical details, can lose notifications without consequence. If our push notification service fails to deliver a message to the recipient, nothing critical will happen. However, we could write a piece of code that checks whether the message has been delivered and tie it into the acknowledgments; that’s just not something we consider a main priority at this time.
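
Such a check is not implemented in the real service; the sketch below shows one way it could work, polling the Acknowledgments table and letting the caller re-publish the stored notification if nothing arrives in time. Everything here is an assumption.

using System;
using System.Threading.Tasks;

public class DeliveryChecker
{
    // Hypothetical query against the Acknowledgments table; stubbed out here.
    private Task<bool> AcknowledgmentExistsAsync(Guid notificationId) =>
        Task.FromResult(false);

    // Polls for an acknowledgment until the deadline passes.
    public async Task<bool> WasDeliveredAsync(Guid notificationId, TimeSpan deadline)
    {
        var waited = TimeSpan.Zero;
        var step = TimeSpan.FromSeconds(5);
        while (waited < deadline)
        {
            if (await AcknowledgmentExistsAsync(notificationId))
                return true;
            await Task.Delay(step);
            waited += step;
        }
        return false; // the caller could re-publish the notification stored in the Notificator
    }
}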

Besides, large companies, such as Apple, will throw out duplicate messages and send only one notification to the user even if you try to send a hundred.

Therefore, we currently cannot guarantee delivery, but that situation will improve once we implement Kafka.

Does that mean Paragon Notificator handles unimportant data, so it’s okay if it gets lost?

We won’t lose the notifications. We build lists such as the example below, which are stored in the Notificator for as long as it’s alive. If the Notificator dies, then we’ll lose those notifications, but in that case we have a bigger problem than a lost message. Theoretically, even if we lose a push notification, we’ll still be able to build such a list from what’s stored in the Notificator:

Besides, our system is designed in such a way that even if a push was missed, we still have a pull channel.

Does it mean that a client can send a request to the Communicator with a specified filter and will receive the notifications according to that filter?

Yes, that’s correct. The filter is compiled on the client side in the form of JSON. The JavaScript client then negotiates, over HTTP, an upgrade to the WebSocket: it receives a 101 response and starts working through the WebSocket. The first request upon connection is SetSubscription, with which the JSON gets transferred, processed by SignalR, deposited in Redis, used by the Event Filter, and so forth, as described earlier.
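
On the server side, the SetSubscription handler might look roughly like this. The method name is real; the hub class, the Redis key scheme, and the connection string are assumptions.

using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR;
using StackExchange.Redis;

// Hypothetical hub inside the Communicator.
public class CommunicatorHub : Hub
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("localhost"); // hypothetical connection string

    // The first call a client makes after the WebSocket upgrade.
    public async Task SetSubscription(string filterJson)
    {
        // Save the filter keyed by the connection so the Event Filter
        // (and, after the migration, the Communicator itself) can read it.
        var db = Redis.GetDatabase();
        await db.StringSetAsync($"subscription:{Context.ConnectionId}", filterJson);
    }
}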

Are notification types fixed?

Let me show an example:

Example of a fixed notification structure

We have a fixed header, but the content can be absolutely anything, and the subscriber knows what they have subscribed to. In other words, there’s a logical expression that, at runtime, parses any JSON it receives; if the JSON satisfies the conditions, it sends the message to the recipient.

In the example above, two messages are published. Each message consists of two parts: a header, which is required and contains system information such as the version, the publisher, and the name of the contract, and then the data itself, which in this case is event.base. All of it is then serialized to JSON and sent.

An example of a possible fix to be implemented in a notification’s header
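
A minimal sketch of that envelope, assuming property names based on the description above (the exact schema is not shown here):

using System.Text.Json;

// Hypothetical payload; the header fields follow the description above.
var message = new Message<object>
{
    Header = new MessageHeader { Version = "1", Publisher = "Lamanche", Contract = "event.base" },
    Data = new { From = "2021-06-01T09:00:00Z", To = (string?)null }
};
var json = JsonSerializer.Serialize(message); // serialized and sent as one JSON document

// Hypothetical envelope types: a required header plus arbitrary data.
public class MessageHeader
{
    public string Version { get; set; }
    public string Publisher { get; set; }
    public string Contract { get; set; }
}

public class Message<TData>
{
    public MessageHeader Header { get; set; }
    public TData Data { get; set; }
}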