Server performance in web applications: is reactivity the key to success?

Written by Darius Has (Java Software Engineer), in the November 2023 issue of Today Software Magazine.

Introduction

Whether delving into abstract concepts such as optimal algorithms for graph traversal or sorting a list or exploring concrete scenarios such as software solutions for managing banking transactions or implementing a flawless eCommerce platform enabling seamless online order placement, one certainty prevails in the programming world: performance.

In Web applications and distributed systems, numerous techniques exist to enhance performance, such as incorporating redundancy within our services through horizontal scaling, parallelisation, and distributed computing, along with monitoring, logging, or tracing the execution of incoming requests.

However, is there a risk of focusing on these aspects prematurely? Can we achieve better service performance before considering them? Do we still have room for improvement?

Evidently, there is no definitive answer, but we can agree that there’s no one-size-fits-all solution ensuring absolute performance in all scenarios.

However, in this article, my aim is to explore this topic and identify potential areas for improvement, starting by introducing a different design paradigm for Web applications: the reactive approach.

Of course, you don’t have to take my word for it when I say that reactive programming can be a valuable solution in improving the performance of Web servers.

To support this claim, I will endeavor to provide a strong argument in the following manner. How? I’ll stick to what I do best and implement two REST Web servers: one reactive and one blocking, based on a traditional approach that follows the “one thread per request” model. I’ll then compare them to showcase differences, similarities, advantages, and limitations between the reactive and traditional approaches, focusing on a context with a high number of simultaneous users performing I/O-intensive operations.

Reactivity. Buzzword or a new paradigm with great potential?

The Reactive Manifesto defines a set of principles, concepts, and design values for ensuring the reactive nature of a software service:

  • Elasticity represents the concept whereby a system remains responsive under a varying volume of requests or concurrent users, increasing performance when more users use the service concurrently and, conversely, automatically scaling back when demand diminishes. While the first thought for achieving elasticity might be to scale the system, horizontally or vertically, scalability will always be limited by the bottlenecks that exist within the developed system.

  • Resilience denotes a system’s ability to maintain its responsiveness in the face of failures. The concept is closely linked to elasticity: failures tend to occur when no component reacts to fluctuations in data volume. Since failures are always possible, keeping the system adaptive makes resilience an essential principle in designing and implementing a Web service.

  • Message-based communication is the means through which the components of a distributed system can communicate while taking elasticity and resilience into account. It promotes decoupling and isolates the communicating parties while still enabling scalability: services exchange information asynchronously, without being blocked until the message is processed or sent by the other party.

  • Responsiveness essentially represents the goal of a reactive system. Undoubtedly, latency is a critical characteristic of responsiveness: the higher the latency, the more responsiveness suffers.

All these concepts contribute to defining reactivity.

As a programming paradigm that aids in implementing a reactive system, we can talk about reactive programming, which focuses on efficiently managing events and state changes by processing them in an asynchronous, non-blocking manner.

Like any other programming paradigm, it forms a foundation for implementing software solutions, particularly those characterized by reactivity.

Why would we choose to build a reactive system? 

The goal is to have scalable, decoupled, easily extensible systems with as low latency as possible, and we might believe that we know how to achieve this, having done it many times without considering reactivity.

However, there are aspects that should not be overlooked when it comes to reactive services, aspects that could justify choosing this approach in certain scenarios, including:

  • Component isolation. The way components communicate inevitably leads to their decoupling, which is a strong indicator of a well-designed system. Furthermore, this isolation adds extensibility to the solution, once again confirming the maturity of the system’s architecture.

  • Backpressure. Defined as a feedback mechanism that reacts to a substantial increase in data flow, it is essential in a reactive system and a significant aid in terms of resilience. It plays a pivotal role in helping the system gracefully handle the work it has to do, particularly amidst a multitude of requests (a minimal sketch follows after this list).

  • Resource management efficiency. Reactive programming promises efficient management of resources such as memory and processing power. This can be attributed to the fact that a reactive system relies on a limited group of threads optimised for asynchronous processing. Many reactive solutions include an event-loop processor that captures events leading to state changes; its sole purpose is to manage the incoming workload and distribute it to other threads, resulting in more efficient thread utilisation and fewer threads overall. As a result, concurrency and synchronization issues can also be managed more efficiently.

  • Higher throughput in the context of I/O-intensive applications. In other words, the ability to respond more rapidly within a high volume of concurrent I/O operations (such as database operations), making it a candidate to consider for real-time applications with numerous concurrent users.
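
As a concrete illustration of backpressure, here is a minimal sketch using Project Reactor (one of the libraries presented in the next section). The limitRate strategy and the numbers are arbitrary choices for the example, not something prescribed by the Reactive Manifesto.

import reactor.core.publisher.Flux;

import java.time.Duration;

public class BackpressureSketch {

    public static void main(String[] args) throws InterruptedException {
        Flux.range(1, 10_000)                    // a fast producer of 10,000 elements
            .limitRate(100)                      // backpressure: request at most 100 elements at a time upstream
            .delayElements(Duration.ofMillis(1)) // simulate a slower, asynchronous consumer
            .subscribe(i -> System.out.println("processed " + i));

        // The pipeline runs asynchronously, so keep the JVM alive long enough to observe some output.
        Thread.sleep(2_000);
    }
}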

Reactive libraries in the Java ecosystem

In what follows, I intend to stay within my area of expertise and present certain libraries that can be used in developing reactive services in Java. The goal is, of course, not to reinvent the wheel and to understand the robust and mature solutions available.

Thus, among the most well-known reactive libraries or frameworks are:

  • RxJava was for a long time the standard library for reactive programming in the Java ecosystem and an influence on many other libraries. In its API, we can observe how a solution based on well-known design patterns, including Observer and Iterator, leads to reactive data streams we can work with. There are two major categories of objects we interact with: the Observable, the object that emits data and whose state is of interest to the Observer, which wants to be notified when that state changes. Data streams can therefore be treated as observable objects and manipulated using operators similar to operations in functional programming (map, filter, zip, etc.).

  • Akka is a toolkit designed to aid in the development of distributed, non-blocking applications in an environment where increased concurrency is prevalent. Developed in Scala following the principles of the Reactive Manifesto [1], it is based on the actor model, which represents isolated entities communicating with each other through messages.

  • Project Reactor is the foundation that underlies, among many others, the reactive suite in Spring. It is implemented as an umbrella over a reactive streams solution and introduces two major categories of observable objects (or Publishers): Mono and Flux. Understanding these two is essential, so this is a proper moment to distinguish between them: although both represent reactive streams that produce elements in a non-blocking manner, they differ in the number of elements they can produce. A Flux can produce zero, one, or multiple elements, whereas a Mono can produce at most one element (a short example follows below).
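
To illustrate the operator style shared by these libraries, below is a minimal, self-contained Project Reactor example. It is purely didactic and unrelated to the banking service built later in the article.

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class ReactorBasics {

    public static void main(String[] args) {
        // A Mono produces at most one element.
        Mono<String> greeting = Mono.just("hello")
                .map(String::toUpperCase);

        // A Flux produces zero, one, or many elements and supports
        // functional-style operators such as filter and map.
        Flux<Integer> evenSquares = Flux.range(1, 10)
                .filter(n -> n % 2 == 0)
                .map(n -> n * n);

        // Nothing is emitted until someone subscribes.
        greeting.subscribe(System.out::println);    // HELLO
        evenSquares.subscribe(System.out::println); // 4, 16, 36, 64, 100
    }
}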

Real-world scenario: creating a reactive REST service using Java, Spring WebFlux, and Spring Data R2DBC

It’s time to move on to practical implementation. Next, we’ll see how to implement a reactive server using Java 17, Spring WebFlux, and Spring Data R2DBC, a component that provides integration of reactive drivers with relational databases.

For our service, we’ll be using the reactive driver specific to PostgreSQL. We’ll note that the server for our service will run on Netty, which is specialized in handling asynchronous HTTP requests.

Before delving into the necessary steps to implement an HTTP service that doesn’t block the caller until processing is complete, I’d like to review the scenario we’ll consider both in constructing the service and in comparing it with a classic blocking REST service that follows a ’one thread per request’ model.

Therefore, we’ll aim to implement a server for a banking institution, capable of recording new accounts and transactions, searching for transactions within a given interval, and calculating the balance of an account based on the recorded transaction history.

Additionally, we’ll explore how to handle and propagate various custom exceptions, highlighting the reactive API for manipulating data streams.

There are several steps involved in developing the reactive server that facilitates performing CRUD operations for banking transactions.

Adding dependencies

In addition to the dependencies on Spring WebFlux and Spring Data R2DBC, I’ve added Lombok, to reduce boilerplate code, and Flyway, for migration scripts.

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-data-r2dbc'
    implementation 'org.springframework.boot:spring-boot-starter-webflux'
    implementation 'org.flywaydb:flyway-core'
    implementation 'org.postgresql:postgresql'
    runtimeOnly 'org.postgresql:r2dbc-postgresql'
    compileOnly 'org.projectlombok:lombok:1.18.28'
    annotationProcessor 'org.projectlombok:lombok:1.18.28'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'io.projectreactor:reactor-test'
}
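
With the dependencies in place, the reactive PostgreSQL driver still has to be pointed at a database. In a Spring Boot project this is normally done through the spring.r2dbc.* properties, but an explicit configuration could look like the sketch below; the host, port, database name, and credentials are placeholders for this example.

import io.r2dbc.postgresql.PostgresqlConnectionConfiguration;
import io.r2dbc.postgresql.PostgresqlConnectionFactory;
import io.r2dbc.spi.ConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.r2dbc.config.AbstractR2dbcConfiguration;

@Configuration
public class R2dbcConfig extends AbstractR2dbcConfiguration {

    // Sketch only: the connection details are placeholders; Spring Boot can
    // auto-configure an equivalent ConnectionFactory from spring.r2dbc.* properties.
    @Override
    @Bean
    public ConnectionFactory connectionFactory() {
        return new PostgresqlConnectionFactory(
                PostgresqlConnectionConfiguration.builder()
                        .host("localhost")
                        .port(5432)
                        .database("bank")
                        .username("postgres")
                        .password("postgres")
                        .build());
    }
}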

Entities

In our service, we’ll have two models: Account and Transaction. The relationship between them is a one-to-many type. Below we can see a possible implementation of the entities and their relationship with specific Spring Data R2DBC annotations.

@Getter
@Setter
@NoArgsConstructor
@Table("account")
public class Account {

    @Id
    private Long id;

    @Column("account_number")
    private String accountNumber;

    @Column("account_holder")
    private String accountHolder;

    // Not persisted by R2DBC (kept transient) and excluded from JSON output.
    @Transient
    @JsonIgnore
    @MappedCollection(idColumn = "account_number")
    private List<Transaction> transactions;
}

@Data
@NoArgsConstructor
@Table(name = "transaction")
public class Transaction {

    @Id
    Long id;

    @Column("account_number")
    String accountNumber;

    @Column("date")
    LocalDate transactionDate;

    @Column("transaction_details")
    String transactionDetails;

    @Column("processed_date")
    LocalDate processedDate;

    @Column("withdrawal_amount")
    Double withdrawalAmount;

    @Column("deposit_amount")
    Double depositAmount;
}

Reactive repository

By using Spring Data R2DBC (Reactive Relational Database Connectivity), we can construct a ReactiveCrudRepository to which we can add new query methods whose implementations are derived automatically from their names, a common practice in other solutions within the Spring Data suite. The main difference is its reactive nature, which means the return types are Mono or Flux.

@Repository
public interface TransactionRepository extends ReactiveCrudRepository<Transaction, Long> {

    Flux<Transaction> findAllBy(Pageable pageable);

    Flux<Transaction> findAllByAccountNumber(String accountNumber, Pageable pageable);

    Mono<Void> deleteAllByAccountNumber(String accountNumber);

    Flux<Transaction> findAllByAccountNumberAndTransactionDateBetween(String accountNumber, LocalDate after, LocalDate before, Pageable pageable);

    Mono<Long> countByAccountNumber(String accountNumber);

    Mono<Long> countByAccountNumberAndTransactionDateBetween(String accountNumber, LocalDate after, LocalDate before);
}

Reactive service

In what follows, we have the implementation of a method that saves a transaction, chaining multiple database operations in a reactive manner to demonstrate the power of the API at our disposal. We can observe how we access the element contained in a Mono through flatMap and then save the transaction. Furthermore, we return a custom exception when no account exists for the account number of the transaction we are trying to save.

@Transactional
public Mono<Transaction> createTransaction(Transaction transaction) {
    return accountRepository
            .findByAccountNumber(transaction.getAccountNumber())
            .flatMap(account -> transactionRepository.save(transaction))
            .switchIfEmpty(Mono.error(new AccountNotFoundException()));
}
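
The scenario also mentions calculating an account balance from the recorded transaction history. That method is not part of the listings in this article, but a minimal sketch, assuming the TransactionRepository shown earlier and treating missing amounts as zero, could look like this:

public Mono<Double> calculateBalance(String accountNumber) {
    return transactionRepository
            .findAllByAccountNumber(accountNumber, Pageable.unpaged())
            // Fold the stream of transactions into a single value without blocking.
            .reduce(0.0, (balance, transaction) ->
                    balance
                            + (transaction.getDepositAmount() != null ? transaction.getDepositAmount() : 0.0)
                            - (transaction.getWithdrawalAmount() != null ? transaction.getWithdrawalAmount() : 0.0));
}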

Reactive controller

Finally, we have the implementation of a POST method related to the previously presented service method. We notice many common elements with a classic implementation for a blocking REST service (such as the standard annotations @PostMapping and @RequestBody). However, the interesting aspect comes from the way in which, through the Fluent API of Reactor Core, we can handle certain thrown exceptions and return different HTTP status codes.

@PostMapping
public Mono<ResponseEntity<Void>> createTransaction(@RequestBody Transaction transaction) {
    return transactionService
            .createTransaction(transaction)
            .then(Mono.just(ResponseEntity.ok().build()))
            .onErrorResume(AccountNotFoundException.class,
                    error -> Mono.just(ResponseEntity.notFound().build()));
}
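
For completeness, a read endpoint for the ’transactions within a given interval’ use case from our scenario could follow the same pattern, returning a Flux and letting WebFlux stream the results. The sketch below assumes a hypothetical service method named findTransactionsBetween, which does not appear in the listings above:

@GetMapping("/{accountNumber}")
public Flux<Transaction> findTransactions(
        @PathVariable String accountNumber,
        @RequestParam @DateTimeFormat(iso = DateTimeFormat.ISO.DATE) LocalDate after,
        @RequestParam @DateTimeFormat(iso = DateTimeFormat.ISO.DATE) LocalDate before) {
    // Hypothetical service method that would delegate to
    // findAllByAccountNumberAndTransactionDateBetween from the repository.
    return transactionService.findTransactionsBetween(accountNumber, after, before);
}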

Event-loop vs. thread per request. When to go reactive?

The answer emerges during development and while trying to identify and fix potential issues. To put the reactive implementation to the test, I propose a case study comparing two REST services, one reactive and one blocking, in two different scenarios.

The scenario remains the same: we have a service capable of performing CRUD operations with banking transactions, bank accounts, and possibly even aggregating some data (for example, calculating a balance from the current transactions).

It’s worth noting that I used JMeter to create bulk requests simulating the service’s concurrent usage by several users, JProfiler for service profiling, and pgAdmin to collect specific metrics related to the database connection.

That being said, in order to achieve the highest degree of isolation, we focus on testing the two services running in Docker containers, without considering aspects such as horizontal scaling through redundancy, load balancing, or other enhancements that could make the services more ’production-ready’.

Small number of concurrent users. Computationally intensive operations

The testing context involves a relatively small number of concurrent users, where both services are required to compute results in memory. To achieve this, we simulated the calculation of an account’s total by summing the amounts from its transaction history.

The graph for this test (Figure 1 below) is particularly interesting because we observe that neither the response time nor the throughput differs significantly between the two services.

A possible interpretation could be that in these circumstances, the benefits of a reactive system are not evident, and the blocking service can cope successfully.

[Figure 1: response time and throughput for a small number of concurrent users performing computationally intensive operations]

Large number of concurrent users. I/O-intensive operations

The number of concurrent users poses an issue, so it’s worth seeing what happens when it increases significantly. For a clear overview, the graph below (Figure 2) contains metrics regarding the average response time per call and the throughput in the context of many HTTP requests, such as creating new transactions, adding new accounts, and finding all transactions within a specific time interval.

It’s worth noting that each concurrent user performed 3 operations, so the number of requests is actually 3 times higher for each of the tested cases.

The response time of the blocking service grows exponentially; we hit an upper limit of approximately 9,000-10,000 concurrent users (30,000 requests) that could be processed, after which the service started producing errors such as closed connections or, worse, crashed completely.

In contrast, the reactive server meets expectations in the context of many users. It shows good throughput and reasonable response times, and during the tests we observed a concurrent-user handling limit nearly double that of the blocking service, around 17,000 users, responding to 54,000 concurrent requests with a virtually nonexistent error rate.

[Figure 2: average response time per call and throughput for a large number of concurrent users performing I/O-intensive operations]

Below, I’ve attached a screenshot from the JMeter tests, where we can observe other interesting metrics such as the error rate, which predominantly remained zero, and even the maximum response time for each individual call.

[Figure 3: JMeter results, including the error rate and the maximum response time for each individual call]

Overview

Beyond comparing response time or throughput metrics with JMeter, another interesting aspect is highlighted by the profiling of both services and database monitoring. Using JProfiler throughout the tests, I observed, as expected, a significant difference in both the number of threads created and the number of threads concurrently active or blocked during processing.

The figures below illustrate this, noting that the peaks in various metrics coincide with moments of testing with a high number of users.

Thus, we observe that the event-loop-based architecture of the reactive service maintains a consistently low thread count, around 30, with correspondingly few threads concurrently active during processing, unlike the blocking service, which, in times of high user volume, escalates the number of threads, most of which sit in a waiting state.

Other differences appear in the CPU load peaks, where the reactive server is more efficient, using the CPU for shorter periods, and in memory usage, which, though comparable, remains constant for the reactive service, indicating its ability to process a high volume of data continuously.

[Figures 4 and 5: JProfiler metrics (thread counts, CPU load, and memory usage) recorded for the two services during the tests]

Another significant metric was the number of database transactions, highlighting the massive difference between the reactive and the blocking database drivers used by the two services.

The capture below (Figure 6) shows that, following a test with 5,000 concurrent users, the number of database transactions per minute is twice as high in the reactive scenario.

[Figure 6: database transactions per minute after a test with 5,000 concurrent users]

Conclusions

Undoubtedly, the reactive paradigm generates significant interest, but it also brings a change in the way we think about, design, implement, and test software solutions.

It relies on asynchronicity to ensure the isolation of components, resulting in an extensible system that can scale and that avoids blocking execution while waiting for processing to complete.

Although the learning curve might be steeper, I have observed the convenience of using existing libraries to create adaptive, reactive solutions that handle one of the most significant challenges in software: processing a large stream of data effectively.

In conclusion, starting from our subject, I believe that efficiency and performance will continue to be essential topics in the programming domain.

It remains our responsibility to find the best solution for each specific situation, as we have seen in this article.

There will always be multiple approaches to various challenges, but if we were to summarise the relationship between reactivity and performance, we could say that although it appears promising, it is just another tool in our arsenal, far from being a solution designed to address problems in a context-agnostic manner.

Closing thoughts

  1. Reactive programming is interesting. Like any other paradigm, it involves a novel way of thinking.

  2. In this article, we only scratched the surface of the reactive world. For deeper insights, this book is a good source.

  3. We do not live in a utopia; in real-world applications, constraints often arise that frequently lead to the impossibility of implementing a fully reactive architecture.

  4. In the reactive world, many interesting problems emerge that would each deserve an article of their own: synchronization or data integration when a specific ordering is needed, data reprocessing (in the case of message-based communication), and even debugging, which is particularly interesting in its own right; monitoring and logging mechanisms are absolutely necessary for asynchronous systems that process in parallel.

  5. To understand the reactive architecture of Spring WebFlux, an excellent resource is Spring WebFlux under the hood. Here, we can visually see how the mechanism works behind the scenes and why an army of processing threads is unnecessary.

  6. Akka is a toolkit for complex asynchronous distributed applications that deserves a separate article on its own, but a good starting point is Akka introduction.