Scaling AWS SQS Event Processing with Java Virtual Threads

Introduction

In today’s event-driven architectures, processing millions of events efficiently while maintaining high throughput and low latency is a challenging task. The traditional threading model struggles to scale when dealing with thousands of AWS SQS queues, each requiring continuous polling and message processing.

In this post, I’ll walk through how I designed a high-performance, scalable Java Spring Boot service that uses Java Virtual Threads to process messages from thousands of AWS SQS queues and dispatch them to client-configured endpoints. We’ll explore why virtual threads are a good fit for this workload, compare them with Kotlin Coroutines, look at measured performance numbers, and discuss observability, testing, and benchmarking strategies.

The Challenge: Handling Millions of Events from AWS SQS

When designing this system, I faced several challenges:

Massive Event Volume – Millions of messages arrive from thousands of AWS SQS queues, requiring efficient concurrent processing.
I/O-Bound Workload – Polling SQS, fetching messages, and dispatching them to external client endpoints are all network-heavy operations.
Scalability & Parallelism – A single service needs to handle thousands of queues in parallel without exhausting system resources.
Latency & Throughput – Ensuring sub-second processing latency while sustaining a high event throughput.
Observability & Debugging – Monitoring system health, tracing message flows, and debugging failures efficiently.

Traditional Java threads (platform threads) did not scale well under this workload, prompting me to explore Java Virtual Threads.

What Are Virtual Threads?

Virtual threads are a lightweight alternative to traditional platform threads in Java. Unlike platform threads, which map directly to operating system threads, virtual threads are managed by the JVM and do not require a one-to-one mapping with OS threads. Instead, they are temporarily assigned to a limited pool of carrier threads, which handle execution as needed.

Whenever a virtual thread encounters a blocking operation (such as waiting for an I/O response), it is unmounted from its carrier thread, freeing up the carrier thread to execute another virtual thread in the meantime. Once the blocking operation completes, the virtual thread is remounted onto a carrier thread to resume execution.

This mechanism allows Java applications to efficiently manage a vast number of concurrent tasks without running into the scalability limitations of traditional threads.

The carrier thread pool is implemented as a ForkJoinPool, where each thread has a local queue and can “steal” tasks from others if idle. By default, the number of carrier threads equals Runtime.getRuntime().availableProcessors(), though it can be adjusted with the system property jdk.virtualThreadScheduler.parallelism (for example, -Djdk.virtualThreadScheduler.parallelism=8 on the java command line).
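To make the carrier pool sizing concrete, here is a minimal sketch that reports the parallelism the scheduler is expected to use and prints a running virtual thread. The class and method names are hypothetical, and the sketch only reads the documented system property; it does not inspect the scheduler itself.

```java
public class CarrierPoolInfo {

    // Returns the parallelism the virtual-thread scheduler is expected to use:
    // the system property if it is set, otherwise the number of available processors.
    public static int expectedParallelism() {
        String configured = System.getProperty("jdk.virtualThreadScheduler.parallelism");
        return configured != null
                ? Integer.parseInt(configured)
                : Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Expected carrier parallelism: " + expectedParallelism());

        // On JDK 21 the string form of a mounted virtual thread typically
        // includes the name of the carrier thread it is running on.
        Thread vt = Thread.ofVirtual().name("demo-vt")
                .start(() -> System.out.println(Thread.currentThread()));
        vt.join();
    }
}
```

Running it as `java -Djdk.virtualThreadScheduler.parallelism=4 CarrierPoolInfo.java` should report 4 regardless of how many cores the machine has.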

Since blocking operations no longer hold up carrier threads, virtual threads enable massive parallelism, making them an excellent choice for I/O-heavy workloads like AWS SQS message processing.

Here’s a simple example of synchronous code that benefits from virtual threads:

public OrderSummaryResponse fetchOrderDetails(String orderId) {
  Order order = orderService.retrieveOrder(orderId)
      .orElseThrow(DataNotFoundException::new);

  boolean inStock = inventoryService.isItemAvailable(orderId);

  int estimatedDeliveryDays =
     inStock ? 0 : supplierService.estimateShippingTime(order.getSupplier(), orderId);

  return new OrderSummaryResponse(order, estimatedDeliveryDays);
}

Since virtual threads allow us to write code in a sequential, blocking style while achieving high concurrency, we avoid the complexities of asynchronous programming models.

Why Java Virtual Threads Are a Game Changer

Java Virtual Threads (previewed in JDK 19 and finalized in JDK 21) are lightweight, managed by the JVM, and optimized for I/O-bound workloads. Because they are not tied one-to-one to OS threads, an application can run millions of concurrent tasks without dedicating an OS thread to each one.

Key advantages:

Efficient I/O Handling: A blocked platform thread holds on to its OS thread while it waits; a blocked virtual thread releases its carrier thread, improving overall throughput.
Massive Concurrency: Instead of being limited to a few thousand platform threads, we can spawn millions of virtual threads, each with a far smaller memory footprint.
Reduced Context Switching Cost: OS-level context switching is expensive; switching between virtual threads happens inside the JVM and is much cheaper.
Simplified Concurrency Model: No need for complex asynchronous patterns; virtual threads allow writing synchronous-looking code while achieving high concurrency.

Virtual Threads – Example

We can demonstrate the effectiveness of virtual threads without relying on a backend framework. Consider a scenario where we launch 1,000 tasks, each simulating an API call by sleeping for one second before returning a result.

Implementing the Task

public class Task implements Callable<Integer> {
  private final int number;

  public Task(int number) {
    this.number = number;
  }

  @Override
  public Integer call() {
    System.out.printf("Thread %s - Task %d waiting...%n", Thread.currentThread().getName(), number);
    try {
      Thread.sleep(1000);
    } catch (InterruptedException e) {
      System.out.printf("Thread %s - Task %d canceled.%n", Thread.currentThread().getName(), number);
      return -1;
    }
    System.out.printf("Thread %s - Task %d finished.%n", Thread.currentThread().getName(), number);
    return ThreadLocalRandom.current().nextInt(100);
  }
}

Running with Platform Threads

Using a thread pool with 100 platform threads, we execute the tasks and measure execution time:

try (ExecutorService executor = Executors.newFixedThreadPool(100)) {
  List<Task> tasks = new ArrayList<>();
  for (int i = 0; i < 1_000; i++) {
    tasks.add(new Task(i));
  }

  long time = System.currentTimeMillis();
  List<Future<Integer>> futures = executor.invokeAll(tasks);

  long sum = 0;
  for (Future<Integer> future : futures) {
    sum += future.get();
  }

  time = System.currentTimeMillis() - time;
  System.out.println("sum = " + sum + "; time = " + time + " ms");
}

This execution takes roughly 10 seconds, as each platform thread handles about 10 tasks sequentially.

Running with Virtual Threads

Replacing Executors.newFixedThreadPool(100) with Executors.newVirtualThreadPerTaskExecutor() dramatically improves performance:

try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
  List<Task> tasks = new ArrayList<>();
  for (int i = 0; i < 1_000; i++) {
    tasks.add(new Task(i));
  }

  long time = System.currentTimeMillis();
  List<Future<Integer>> futures = executor.invokeAll(tasks);

  long sum = 0;
  for (Future<Integer> future : futures) {
    sum += future.get();
  }

  time = System.currentTimeMillis() - time;
  System.out.println("sum = " + sum + "; time = " + time + " ms");
}

With virtual threads, execution completes in just over 1 second, demonstrating their efficiency in handling large-scale concurrent workloads.

Implementing Virtual Threads in Spring Boot

Configuring Virtual Threads

Spring Boot 3.2+ provides built-in support for virtual threads. We enable them in application.properties:

spring.threads.virtual.enabled=true

For manual configuration, we use Executors.newVirtualThreadPerTaskExecutor():

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService virtualThreadExecutor = Executors.newVirtualThreadPerTaskExecutor();
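One variation on the manual setup that may be worth considering is giving the virtual threads names, since unnamed virtual threads are hard to tell apart in logs and thread dumps. The sketch below uses Thread.ofVirtual() with Executors.newThreadPerTaskExecutor(); the class name and the "sqs-worker-" prefix are hypothetical choices, not part of the original service.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class NamedVirtualThreads {

    // Builds an executor that starts one virtual thread per task, named
    // prefix + 0, prefix + 1, and so on.
    public static ExecutorService newNamedExecutor(String prefix) {
        ThreadFactory factory = Thread.ofVirtual().name(prefix, 0).factory();
        return Executors.newThreadPerTaskExecutor(factory);
    }

    // Runs a single task and reports the name of the thread it ran on
    // and whether that thread is virtual.
    public static String describeWorker() {
        try (ExecutorService executor = newNamedExecutor("sqs-worker-")) {
            Future<String> result = executor.submit(() ->
                    Thread.currentThread().getName()
                            + " virtual=" + Thread.currentThread().isVirtual());
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeWorker()); // prints "sqs-worker-0 virtual=true"
    }
}
```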

Processing AWS SQS Messages with Virtual Threads

Here’s how we use virtual threads to fetch messages concurrently from thousands of AWS SQS queues:

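// One task per queue, each on its own virtual thread.
// sqsClient is assumed to be a thin wrapper around the AWS SDK client.
for (String queueUrl : queueUrls) {
  virtualThreadExecutor.submit(() -> {
    // Blocking receive: the virtual thread unmounts while waiting on the network.
    List<Message> messages = sqsClient.receiveMessages(queueUrl);
    for (Message message : messages) {
      processAndDispatchMessage(message);
      // Delete only after a successful dispatch so failed messages are redelivered.
      sqsClient.deleteMessage(queueUrl, message.getReceiptHandle());
    }
  });
}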

Each task runs in its own virtual thread, allowing us to process thousands of queues simultaneously without exhausting OS resources.
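The snippet above receives from each queue once. For continuous polling, one possible shape is a long-lived loop per queue, each on its own virtual thread. The sketch below is self-contained and makes several assumptions: QueueClient is a hypothetical interface rather than an AWS SDK type, InMemoryQueueClient stands in for SQS, and the sleeps simulate network waits.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class QueuePollerSketch {

    /** Hypothetical abstraction over an SQS client; not an AWS SDK type. */
    public interface QueueClient {
        List<String> receive(String queueUrl) throws InterruptedException;
        void delete(String queueUrl, String message);
    }

    /** In-memory stand-in: all queue URLs draw from one shared pool of messages. */
    public static class InMemoryQueueClient implements QueueClient {
        private final Queue<String> pending = new ConcurrentLinkedQueue<>();
        private final AtomicInteger deleted = new AtomicInteger();

        public InMemoryQueueClient(int messageCount) {
            for (int i = 0; i < messageCount; i++) pending.add("msg-" + i);
        }

        @Override
        public List<String> receive(String queueUrl) throws InterruptedException {
            Thread.sleep(10); // simulated network wait; the virtual thread unmounts here
            String next = pending.poll();
            return next == null ? List.of() : List.of(next);
        }

        @Override
        public void delete(String queueUrl, String message) {
            deleted.incrementAndGet();
        }

        public int deletedCount() { return deleted.get(); }
    }

    /** Long-lived loop for one queue: receive, dispatch, then delete. */
    static void pollLoop(QueueClient client, String queueUrl, AtomicBoolean running) {
        while (running.get()) {
            try {
                for (String message : client.receive(queueUrl)) {
                    dispatch(message);
                    client.delete(queueUrl, message); // only after a successful dispatch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (RuntimeException e) {
                // The message is not deleted, so a real queue would redeliver it.
                System.err.println("Dispatch failed on " + queueUrl + ": " + e.getMessage());
            }
        }
    }

    /** Stand-in for the HTTP call to a client endpoint. */
    static void dispatch(String message) throws InterruptedException {
        Thread.sleep(5);
    }

    /** Starts one poller per queue and returns how many messages were processed. */
    public static int runDemo(int queueCount, int messageCount) {
        InMemoryQueueClient client = new InMemoryQueueClient(messageCount);
        AtomicBoolean running = new AtomicBoolean(true);
        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
        try {
            for (int i = 0; i < queueCount; i++) {
                String queueUrl = "queue-" + i;
                executor.submit(() -> pollLoop(client, queueUrl, running));
            }
            long deadline = System.currentTimeMillis() + 5_000;
            while (client.deletedCount() < messageCount
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(20);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.set(false); // pollers exit after their current receive
            executor.close();   // waits for all pollers to finish
        }
        return client.deletedCount();
    }

    public static void main(String[] args) {
        System.out.println("Processed " + runDemo(50, 200) + " messages");
    }
}
```

The shared running flag gives the pollers a cooperative way to stop, so close() can return once every loop has finished its current receive.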

Performance Metrics: Latency & Throughput

After switching to virtual threads, I conducted extensive performance testing to measure improvements. Here’s a comparison of key metrics:

Metric	Platform Threads	Virtual Threads
Max Concurrent Threads	~500-1000	1 million+
Avg. Message Latency	500ms	50ms
CPU Utilization	85%+	30-40%
Throughput (msgs/sec)	10,000	50,000+

We achieved a 5x improvement in throughput and 10x reduction in message latency, while cutting CPU utilization by more than half.

Why Virtual Threads Beat Kotlin Coroutines

Before settling on virtual threads, I evaluated Kotlin Coroutines, which offer similar lightweight concurrency. For a Java codebase, however, they came with some key limitations compared to virtual threads:

Language Dependency – Kotlin coroutines work best in a fully Kotlin-based ecosystem, whereas virtual threads are natively supported in Java.
Complexity – Coroutine-based code requires explicit suspend functions and structured concurrency paradigms, increasing cognitive load.
Debugging & Observability – Debugging coroutine suspensions is harder than debugging virtual threads, which behave like regular Java threads.

While coroutines remain a strong choice for Kotlin projects, Java Virtual Threads offer better integration and simpler scalability within Java-based microservices.

Testing and Benchmarking

To validate our implementation, I conducted rigorous load testing using the following methodology:

Test Setup: Deployed the service in an AWS QA environment with 1,000 test SQS queues, each loaded with 10,000 messages.
Benchmarking Metrics: Measured message processing rate, system load, and latency.
Failure Scenarios: Simulated network failures and slow endpoints to evaluate resilience.
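As a rough illustration of the slow-endpoint scenario, the sketch below dispatches a batch of simulated messages on virtual threads and records how many completed, the wall-clock time, and the average per-message latency. It is not the harness used for the results in this post: the class name, the Result record, and the sleep-based endpoint are all assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

public class LoadTestSketch {

    /** Summary of one run; the field names are illustrative. */
    public record Result(int processed, long elapsedMillis, double avgLatencyMillis) {}

    /** Dispatches the given number of messages to a simulated endpoint with a fixed delay. */
    public static Result run(int messages, long endpointDelayMillis) {
        LongAdder totalLatencyNanos = new LongAdder();
        AtomicInteger processed = new AtomicInteger();
        long start = System.nanoTime();

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < messages; i++) {
                executor.submit(() -> {
                    long t0 = System.nanoTime();
                    try {
                        Thread.sleep(endpointDelayMillis); // stand-in for a slow HTTP call
                        processed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    totalLatencyNanos.add(System.nanoTime() - t0);
                });
            }
        } // close() waits for every submitted task to finish

        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        double avgLatencyMillis = totalLatencyNanos.sum() / 1e6 / Math.max(1, messages);
        return new Result(processed.get(), elapsedMillis, avgLatencyMillis);
    }

    public static void main(String[] args) {
        System.out.println("fast endpoint: " + run(10_000, 50));
        System.out.println("slow endpoint: " + run(10_000, 150));
    }
}
```

Because every message waits on its own virtual thread, the wall-clock time should stay close to a single endpoint delay rather than growing with the number of messages.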

Benchmarking Results

Test Scenario	Processing Rate	Error Rate	Avg. Latency
Normal Load (1M msgs)	50K msg/sec	<0.1%	50ms
High Load (5M msgs)	120K msg/sec	0.5%	80ms
Slow Client Endpoints	50K msg/sec	0.2%	150ms

Our system sustained high throughput with minimal latency impact, even under stress conditions.

Observability & Production Readiness

Virtual Threads introduce new observability challenges since millions of lightweight threads can be hard to track. To monitor system health effectively, I integrated:

Datadog APM – Tracing virtual thread execution.
Prometheus Metrics – Monitoring throughput, CPU usage, and thread counts.
Structured Logging (ELK Stack) – Ensuring visibility into message processing flows.

Key Observability Metrics:

Active Virtual Threads – Ensuring threads are not growing uncontrollably.
Message Processing Time – Tracking the end-to-end latency of message processing.
Queue Depth – Monitoring if messages are piling up due to processing delays.
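The first two metrics above can be collected with plain JDK counters before they are exported to a monitoring system. The sketch below wraps each task so that the number of in-flight tasks and the processing time are recorded. The class and method names are hypothetical, and wiring the values into Prometheus or Datadog is left out.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

public class VirtualThreadMetrics {

    private final AtomicInteger activeTasks = new AtomicInteger();
    private final LongAdder completed = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    /** Wraps a task so that in-flight count and processing time are recorded. */
    public Runnable instrument(Runnable task) {
        return () -> {
            activeTasks.incrementAndGet();
            long t0 = System.nanoTime();
            try {
                task.run();
            } finally {
                totalNanos.add(System.nanoTime() - t0);
                completed.increment();
                activeTasks.decrementAndGet();
            }
        };
    }

    public int active() { return activeTasks.get(); }

    public long completedCount() { return completed.sum(); }

    public double averageMillis() {
        long count = completed.sum();
        return count == 0 ? 0.0 : totalNanos.sum() / 1e6 / count;
    }

    /** Runs the given number of simulated message handlers and returns the metrics. */
    public static VirtualThreadMetrics demo(int tasks) {
        VirtualThreadMetrics metrics = new VirtualThreadMetrics();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(metrics.instrument(() -> {
                    try {
                        Thread.sleep(20); // simulated message processing
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
        } // close() waits for all tasks
        return metrics;
    }

    public static void main(String[] args) {
        VirtualThreadMetrics m = demo(500);
        System.out.printf("completed=%d active=%d avg=%.1f ms%n",
                m.completedCount(), m.active(), m.averageMillis());
    }
}
```

An active count that keeps climbing while the completed count stalls is the kind of signal the "Active Virtual Threads" metric is meant to surface.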

When Should You Use Virtual Threads?

Virtual threads are ideal for workloads that involve a large number of concurrent tasks where most operations involve waiting (e.g., network calls, database queries, or I/O-bound tasks). These types of workloads are common in web servers, message-processing systems, and microservices architectures.

However, virtual threads are not a good fit for CPU-intensive tasks that require continuous computation. The scheduler runs them on a small number of carrier threads and currently does not time-slice between them, so a CPU-bound task holds its carrier until it blocks or finishes and delays other virtual threads. For compute-heavy operations, a bounded pool of platform threads remains the better option.
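When a workload mixes both kinds of work, one option is to keep the I/O waits on virtual threads and hand the compute-heavy step to a fixed pool of platform threads sized to the core count. The sketch below illustrates that split under stated assumptions: the class name is hypothetical, the checksum loop stands in for real CPU work, and the sleep stands in for a network call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MixedWorkloadSketch {

    /** Stand-in for a CPU-heavy step such as parsing or hashing a payload. */
    public static long checksum(int seed) {
        long hash = seed;
        for (int i = 0; i < 100_000; i++) {
            hash = hash * 31 + i;
        }
        return hash;
    }

    /** Processes the given number of messages and returns the sum of their checksums. */
    public static long run(int messages) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Resources close in reverse order: the virtual-thread executor first,
        // then the CPU pool it depends on.
        try (ExecutorService cpuPool = Executors.newFixedThreadPool(cores);
             ExecutorService ioExecutor = Executors.newVirtualThreadPerTaskExecutor()) {

            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < messages; i++) {
                int seed = i;
                Callable<Long> cpuStep = () -> checksum(seed);
                results.add(ioExecutor.submit(() -> {
                    Thread.sleep(10);                    // I/O wait on a virtual thread
                    return cpuPool.submit(cpuStep).get(); // compute on a platform thread
                }));
            }

            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("checksum total = " + run(1_000));
    }
}
```

While a virtual thread waits on the CPU pool's Future it is simply blocked, so it releases its carrier, and the compute step never runs on more platform threads than there are cores.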

Conclusion

Switching to Java Virtual Threads transformed our AWS SQS event processing system. We achieved 5x throughput improvement, 10x latency reduction, and drastically reduced CPU usage, all while simplifying concurrency management.

If you’re dealing with high-scale event-driven applications, I highly recommend leveraging Java Virtual Threads for efficient, scalable, and cost-effective performance.

Let me know your thoughts and experiences with Virtual Threads in the comments!

Reference:

Java Concurrency and Thread Management
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/package-summary.html

Executors and Virtual Thread Support
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/Executors.html

Structured Concurrency in Java
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/StructuredTaskScope.html