Every team running Apache ActiveMQ in production eventually reaches the same conclusion: throughput is lower than expected, latency is inconsistent, or producers are blocked without an obvious reason. The broker logs show flow control events. Queue depth is climbing.
The root cause is almost never a hardware limitation. It is almost always a broker left at its defaults, a deliberately conservative configuration baseline. This guide walks through every tuning layer that matters, from producer-level async sends down to JVM GC configuration, with production-ready XML and code examples at each step.
The improvements are not marginal. Teams that have not previously tuned their broker routinely achieve 5-10x throughput gains from the changes in this guide. The “10x” in the title is not marketing; it reflects the documented gap between Apache ActiveMQ’s default configuration and a properly tuned deployment.
The Seven Tuning Layers: A Framework for Systematic Optimization
Random performance tuning produces random results. The framework that produces consistent, explainable gains is to work through the seven layers in order: broker transport, then producer, then consumer, then persistence, then memory management, then storage I/O, then JVM.
Each layer has a ceiling; work from the outermost (transport) to the innermost (JVM) so that gains at each layer compound rather than cancel out.
- Transport Layer — TCP vs NIO, connection concurrency limits
- Producer Layer — Async sends, compression, delivery mode
- Consumer Layer — Prefetch policy, acknowledge mode, concurrent consumers
- Persistence Layer — KahaDB journal tuning, concurrent store-and-dispatch
- Memory Management — systemUsage sizing, destination-level memory limits
- Storage I/O — Disk placement, mKahaDB sharding, fsync behavior
- JVM Layer — Heap sizing, GC algorithm, GC pause tuning
Layer 1: Transport – TCP vs NIO
By default, ActiveMQ listens on a plain TCP transport. TCP transport uses one thread per connection, which is perfectly adequate for tens of connections, but starts to hit practical limits once you reach hundreds or thousands of connections.
The NIO (Non-blocking I/O) transport uses Java NIO selectors to handle multiple connections on a small, bounded thread pool. For any deployment with more than ~100 concurrent client connections, switching to NIO removes a thread-creation and context-switching bottleneck that degrades throughput as the connection count grows.
```xml
<!-- activemq.xml: switch transport to NIO -->
<transportConnectors>
  <!-- Replace TCP with NIO for high connection-count deployments -->
  <transportConnector name="nio"
      uri="nio://0.0.0.0:61616?maximumConnections=5000&amp;wireFormat.maxFrameSize=104857600"/>
  <!-- Keep TCP for legacy clients that require it -->
  <transportConnector name="tcp"
      uri="tcp://0.0.0.0:61617?maximumConnections=1000"/>
</transportConnectors>
```
maximumConnections prevents a connection storm from consuming all broker resources. Set it to a value that reflects your actual expected peak concurrency with a reasonable safety margin, not unlimited.
wireFormat.maxFrameSize caps the size of a single message frame; the example above pins it explicitly to 100MB (104857600 bytes). That is sufficient for most workloads, but applications that send large binary payloads (documents, images) should size it to match their 99th-percentile message size.
For Artemis deployments this is less of a concern, as Artemis uses Netty non-blocking I/O across all acceptors by default. We covered this architectural difference in depth in our foundational post on ActiveMQ vs Artemis: the 2026 Definitive Guide.
Layer 2: Producer – Async Sends and Delivery Mode
This is where the single largest throughput gain lies for persistent workloads.
The Persistent Send Blocking Problem
When a producer sends a persistent message, ActiveMQ blocks the send call until the broker has written the message to the persistent store and returned an acknowledgment. This is JMS-spec-correct behavior for persistent delivery; the message is guaranteed durable before the producer moves on.
The cost: a single round-trip to the broker’s disk on every send. Under high throughput, these round-trips compound. Persistent delivery is roughly 20 times slower than non-persistent delivery in ActiveMQ, measured on equivalent hardware.
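To quantify that baseline before tuning, here is a minimal sketch that times a batch of blocking persistent sends. The broker URL, queue name, and message count are placeholders, not recommendations:

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class PersistentSendBaseline {
    public static void main(String[] args) throws JMSException {
        // Placeholder broker URL -- adjust for your environment
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("PERF.TEST"));
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);

            int count = 10_000;
            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                // Each send blocks until the broker confirms the journal write
                producer.send(session.createTextMessage("payload-" + i));
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("%d persistent sends in %d ms (%.0f msg/s)%n",
                    count, elapsedMs, count * 1000.0 / elapsedMs);
        } finally {
            connection.close();
        }
    }
}
```

Run it once on defaults, then re-run after each producer-layer change to see the delta on your own hardware.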
Fix 1: Enable Async Sends for Persistent Messages
useAsyncSend=true decouples the producer’s send acknowledgment from the disk write. The producer fires the message without waiting for disk confirmation. The broker still persists the message; the producer just does not block waiting for the receipt.
```java
// Client-side: enable async sends on the connection factory
ActiveMQConnectionFactory factory =
        new ActiveMQConnectionFactory("tcp://broker:61616");
factory.setUseAsyncSend(true);

// Or via URI parameter -- same effect
ActiveMQConnectionFactory uriFactory = new ActiveMQConnectionFactory(
        "tcp://broker:61616?jms.useAsyncSend=true");
```
The reliability trade-off: with async sends, a producer cannot detect a send failure as it occurs. If the broker fails between the send and the disk write, the message can be lost without the producer knowing.
For workloads where this tradeoff is acceptable (high-volume event streams, metrics, logs), async sends are the right default. For financial transactions or messages requiring guaranteed delivery acknowledgment, keep synchronous sends and accept the throughput cost.
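Async sends make failures asynchronous rather than invisible. A standard JMS ExceptionListener on the connection surfaces connection- and transport-level failures after the fact; it is not a per-message receipt, but it keeps the producer from flying completely blind. A minimal sketch, with the broker URL and the handling logic as placeholders:

```java
import javax.jms.Connection;
import javax.jms.JMSException;
import org.apache.activemq.ActiveMQConnectionFactory;

public class AsyncSendWithListener {
    public static void main(String[] args) throws JMSException {
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "tcp://broker:61616?jms.useAsyncSend=true"); // placeholder URL
        Connection connection = factory.createConnection();

        // Failures that a blocking send would have thrown inline now arrive
        // here, after the fact: log, alert, or trigger replay from the source
        connection.setExceptionListener((JMSException e) ->
                System.err.println("Connection-level failure: " + e.getMessage()));
        connection.start();
    }
}
```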
Fix 2: Non-Persistent Delivery Mode for Transient Workloads
For workloads where message loss on broker restart is acceptable, NON_PERSISTENT delivery eliminates disk I/O from the message path entirely:
```java
// Per-message delivery mode
MessageProducer producer = session.createProducer(destination);
producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

// Or broker-wide -- disables persistence for all messages
// activemq.xml: <broker persistent="false">
```
For topic messages in particular, the default behavior is surprising: even with no durable subscribers, a topic producer blocks waiting for broker acknowledgment. Setting useAsyncSend=true or switching to NON_PERSISTENT has a large impact on topic throughput even in fanout scenarios.
Fix 3: Enable Message Compression for Large Payloads
For messages with compressible payloads (JSON, XML, structured text), enabling broker-level compression reduces the bytes written to the journal and transferred over the wire, improving both throughput and storage efficiency:
```java
factory.setUseCompression(true);
```
Compression adds CPU overhead. Measure the tradeoff on your specific payload profile: for small messages (< 1KB), the CPU cost of compression typically outweighs the I/O savings.
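Whether compression pays off is measurable rather than guessable. A minimal sketch, assuming a local broker and a repetitive JSON-like payload (both placeholders), that times the same send loop with compression off and on:

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class CompressionProbe {
    public static void main(String[] args) throws JMSException {
        // Compare both modes against the same broker with a representative payload
        System.out.printf("compression off: %d ms%n", timeSends(false));
        System.out.printf("compression on:  %d ms%n", timeSends(true));
    }

    static long timeSends(boolean compress) throws JMSException {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616"); // placeholder URL
        factory.setUseCompression(compress);
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer =
                    session.createProducer(session.createQueue("PERF.COMPRESS"));
            // ~60KB of repetitive text -- a stand-in for a compressible JSON payload
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 2_000; i++) sb.append("{\"order\":\"4231\",\"status\":\"open\"},");
            String payload = sb.toString();

            long start = System.nanoTime();
            for (int i = 0; i < 1_000; i++) {
                producer.send(session.createTextMessage(payload));
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            connection.close();
        }
    }
}
```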
Layer 3: Consumer – Prefetch Policy and Acknowledge Mode
Prefetch Policy: The Most Misunderstood Setting
ActiveMQ pushes messages to consumers proactively rather than waiting for consumers to request them. The prefetch limit controls how many messages the broker streams to a consumer before waiting for acknowledgments.
Default prefetch values in ActiveMQ:
| Consumer Type | Default Prefetch |
| --- | --- |
| Queue consumer | 1000 |
| Topic consumer | 32766 |
| Durable topic subscriber | 100 |
| Queue browser | 500 |
The default queue prefetch of 1000 is designed for a single fast consumer. When multiple consumers share a queue for work distribution, a prefetch of 1000 means the first consumer to connect absorbs 1000 messages into its local buffer, leaving other consumers idle until that first consumer processes and acknowledges its backlog. Under uneven load distribution, this creates the appearance of a slow queue when the real problem is a lopsided prefetch.
Rule of thumb:
- Single consumer, fast processing → leave prefetch at default (1000) or increase
- Multiple consumers sharing a queue → reduce prefetch to 1 for even distribution
- Batch processing consumers → increase prefetch to match batch size
```java
// Via connection URI -- applies to all consumers on this connection:
// tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=10

// Per-consumer via destination options -- the most targeted approach
Queue queue = session.createQueue("ORDERS.QUEUE?consumer.prefetchSize=1");
MessageConsumer consumer = session.createConsumer(queue);
```
Optimized Acknowledge: Batch ACKs for AUTO_ACKNOWLEDGE Consumers
When consumers use AUTO_ACKNOWLEDGE mode, ActiveMQ can send acknowledgments back to the broker in batches rather than one-per-message. optimizeAcknowledge=true batches ACKs at 65% of the prefetch limit (or every 300ms if consumption is slow).
```java
factory.setOptimizeAcknowledge(true);
// Optional: control the ACK batch window (milliseconds)
factory.setOptimizeAcknowledgeTimeOut(300);
```
For high-throughput consumers processing thousands of messages per second, this reduces ACK round-trips by 35%+ – a meaningful throughput gain without any reliability tradeoff for AUTO_ACKNOWLEDGE workloads.
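The consumer layer also includes concurrency (see the checklist later in this guide): in plain JMS, concurrent consumption means one session and one consumer per processing thread, because a session dispatches to its listener on a single thread. A minimal sketch, assuming a queue named ORDERS.QUEUE and placeholder handling logic:

```java
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class ConcurrentConsumers {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://broker:61616"); // placeholder URL
        Connection connection = factory.createConnection();
        connection.start();

        int concurrency = 8; // size to match your processing thread pool
        for (int i = 0; i < concurrency; i++) {
            // One session per consumer: JMS sessions are single-threaded
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            // prefetchSize=1 so the broker distributes messages evenly across all eight
            Queue queue = session.createQueue("ORDERS.QUEUE?consumer.prefetchSize=1");
            MessageConsumer consumer = session.createConsumer(queue);
            consumer.setMessageListener(message ->
                    // Placeholder for your processing logic
                    System.out.println(Thread.currentThread().getName() + " handled a message"));
        }
        Thread.currentThread().join(); // block forever; a real service manages lifecycle
    }
}
```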
Layer 4: Persistence – KahaDB Tuning
KahaDB is ActiveMQ’s default persistence engine, a dual-structure system combining an append-only journal for writes with a B-tree index for message retrieval. Both components have tuning parameters that significantly affect throughput.
Core KahaDB Performance Configuration
```xml
<!-- activemq.xml: production-tuned KahaDB.
  journalMaxFileLength: larger journal files mean less file rollover overhead at high throughput.
  indexCacheSize: B-tree index page cache. Default 10000 pages x 4KB = ~40MB;
      increase for large message volumes, as large as your available heap allows.
  indexWriteBatchSize: write multiple index pages to disk in one batch (fewer fsync calls).
  concurrentStoreAndDispatchQueues/Topics: write to the journal AND dispatch to consumers
      in parallel. Enabled by default for queues; enable for topics too if latency matters.
  enableJournalDiskSyncs: KEEP ENABLED for financial/audit data; this is your durability
      guarantee. Disable only for maximum throughput on non-critical workloads.
  preallocationStrategy: pre-allocate journal files to avoid file system fragmentation.
-->
<persistenceAdapter>
  <kahaDB directory="${activemq.data}/kahadb"
          journalMaxFileLength="64mb"
          indexCacheSize="100000"
          indexWriteBatchSize="5000"
          concurrentStoreAndDispatchQueues="true"
          concurrentStoreAndDispatchTopics="true"
          enableJournalDiskSyncs="true"
          preallocationStrategy="zeros"/>
</persistenceAdapter>
```
concurrentStoreAndDispatchQueues: The Hidden Performance Multiplier
By default, KahaDB enables this for queues but not topics. It allows the broker to simultaneously write a message to the journal AND dispatch it to a waiting consumer, rather than writing first, then dispatching. For queues with live consumers, this halves the effective message latency on the dispatch path.
The cost: if the broker crashes between dispatch and disk write, the consumer may have processed a message that was never persisted, violating strict at-least-once delivery. For most workloads, this is acceptable, and the setting should be enabled. For regulated industries requiring strict persistence-before-dispatch guarantees, leave it disabled.
enableJournalDiskSyncs: The Reliability/Performance Dial
This is the single most impactful KahaDB parameter for throughput, and the one most dangerous to misuse. When true (the default), the broker calls fsync() on the journal file before returning an acknowledgment to the producer. This guarantees the message is physically on disk before the producer’s send returns.
Disabling it (enableJournalDiskSyncs=”false”) eliminates the fsync overhead, which can reduce write latency by an order of magnitude on systems where fsync is slow (RHEL 6+, certain SAN configurations, EBS on AWS). The durability cost: messages can be lost on a broker crash between the OS buffer write and the physical disk write.
Decision rule: Keep enableJournalDiskSyncs=true for any data that must survive a broker crash. Disable it only for workloads where replay from source is possible, and message loss on crash is acceptable.
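For completeness, a sketch of the relaxed variant, assuming a workload where replay from source is genuinely possible:

```xml
<!-- Sketch: throughput-optimized variant for replayable, non-critical data.
     Do NOT use this for messages that must survive a broker crash. -->
<persistenceAdapter>
  <kahaDB directory="${activemq.data}/kahadb"
          enableJournalDiskSyncs="false"/>
</persistenceAdapter>
```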
Layer 5: Memory Management – Sizing systemUsage Correctly
This is where most misconfigured production deployments fail. The default systemUsage values ship as conservative minimums — they are appropriate for a laptop test environment, not a production broker.
```xml
<!-- activemq.xml: production systemUsage sizing -->
<systemUsage>
  <systemUsage>
    <!-- Non-persistent messages held in memory before spooling to tempUsage.
         Default 64mb triggers flow control at very low message volumes;
         in production, size to ~20% of the broker JVM heap. -->
    <memoryUsage>
      <memoryUsage percentOfJvmHeap="20"/>
    </memoryUsage>
    <!-- Persistent message disk storage ceiling.
         CRITICAL: actual disk usage can exceed this value. Set it to ~70% of the
         intended maximum disk allocation to leave headroom. -->
    <storeUsage>
      <storeUsage limit="500 gb"/>
    </storeUsage>
    <!-- Non-persistent messages spooled to disk when memoryUsage is exhausted.
         Must be lower than the actual free disk space on the data volume. -->
    <tempUsage>
      <tempUsage limit="50 gb"/>
    </tempUsage>
  </systemUsage>
</systemUsage>
```
The storeUsage Trap
One of the most confusing systemUsage behaviors: in certain scenarios, actual disk storage used by ActiveMQ can exceed the configured storeUsage limit. Red Hat’s documentation explicitly recommends setting storeUsage to approximately 70% of the intended maximum disk storage to leave headroom for this overshoot behavior. For example, on a 1TB volume dedicated to broker data, configure storeUsage at roughly 700 gb. Setting storeUsage to exactly the available disk space risks filling the volume entirely, which causes I/O errors rather than graceful flow control.
Per-Destination Memory Limits
For brokers with many queues of varying priority, per-destination memory limits prevent a single high-volume queue from consuming the entire broker memory budget:
```xml
<destinationPolicy>
  <policyMap>
    <policyEntries>
      <!-- High-priority payments queues get more memory headroom -->
      <policyEntry queue="payments.>" memoryLimit="512mb" producerFlowControl="true"/>
      <!-- Standard queues get a bounded allocation -->
      <policyEntry queue=">" memoryLimit="64mb" producerFlowControl="true"/>
    </policyEntries>
  </policyMap>
</destinationPolicy>
```
Layer 6: Storage I/O – Disk Placement and KahaDB Sharding
The KahaDB journal is an append-only write log, designed to exploit sequential disk I/O, the fastest access pattern on both spinning disks and SSDs. That advantage is completely destroyed when the journal volume is shared with other I/O-heavy processes.
The rule is simple: put the KahaDB directory on a dedicated volume. Not shared with the OS, not shared with application logs, not shared with temp storage. Dedicated volume, dedicated I/O.
A real-world benchmark from a JBoss A-MQ production tuning effort illustrates the impact: sync write performance on a shared volume measured at 9.7 MB/sec. The same operation on a dedicated volume measured at 746 MB/sec, a 75x difference attributable entirely to I/O contention, not hardware capability.
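Applying the rule is just a directory setting. A sketch, assuming a dedicated volume mounted at /journal (the path here is an example, not a convention):

```xml
<!-- Point KahaDB at a dedicated mount -- no OS, logs, or temp I/O on this volume -->
<persistenceAdapter>
  <kahaDB directory="/journal/kahadb"/>
</persistenceAdapter>
```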
mKahaDB: Destination Sharding for Mixed Workloads
When a single broker serves destinations with wildly different consumption characteristics (some queues consumed in near real time, others accumulating large backlogs), a single KahaDB instance serializes all their I/O. KahaDB’s garbage collection (reclaiming journal files whose messages have all been acknowledged) is blocked by any destination still holding unacknowledged messages, which can cause journal growth and I/O spikes.
The solution is mKahaDB with per-destination stores, which isolates each destination’s journal GC from all others:
```xml
<!-- activemq.xml: mKahaDB destination sharding -->
<persistenceAdapter>
  <mKahaDB directory="${activemq.data}/kahadb">
    <filteredPersistenceAdapters>
      <!-- High-volume, fast-consuming queues: aggressive journal settings -->
      <filteredKahaDB queue="payments.>">
        <persistenceAdapter>
          <kahaDB journalMaxFileLength="64mb"
                  concurrentStoreAndDispatchQueues="true"
                  indexCacheSize="50000"/>
        </persistenceAdapter>
      </filteredKahaDB>
      <!-- Per-destination isolation for all remaining queues -->
      <filteredKahaDB perDestination="true">
        <persistenceAdapter>
          <kahaDB journalMaxFileLength="32mb"/>
        </persistenceAdapter>
      </filteredKahaDB>
    </filteredPersistenceAdapters>
  </mKahaDB>
</persistenceAdapter>
```
perDestination=”true” assigns each destination its own isolated KahaDB instance. This prevents a slow-consumer queue (or the DLQ) from blocking journal GC for the entire broker.
We covered DLQ accumulation as a KahaDB disk growth driver in our post on ActiveMQ Dead Letter Queue Management. mKahaDB sharding is the architectural solution to that problem at the storage level.
Layer 7: JVM – Heap Sizing and GC Algorithm
Broker GC pauses impact message latency directly. A 500ms stop-the-world pause is a 500ms message delivery stall, invisible in a benchmark but catastrophic in production when hundreds of consumers are waiting.
Recommended JVM Baseline for Production ActiveMQ
```bash
# activemq.conf (or wrapper.conf)

# Heap: set initial = max to prevent resize pauses.
# 4-8GB is appropriate for most enterprise deployments
ACTIVEMQ_OPTS="-Xms4g -Xmx4g"

# G1GC: designed for large heaps, low-pause, incremental collection
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:+UseG1GC"

# Target max GC pause of 20ms -- aggressive but achievable with G1
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:MaxGCPauseMillis=20"

# Region size: for a 4GB heap, 4-8MB regions are appropriate
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:G1HeapRegionSize=8m"

# GC logging -- essential baseline data for performance troubleshooting
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -Xlog:gc*:file=${ACTIVEMQ_DATA}/gc.log:time,uptime:filecount=5,filesize=20m"

# Ignore explicit System.gc() calls from application code
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:+DisableExplicitGC"
```
KahaDB Index Cache vs. Heap Pressure
One frequently overlooked interaction: KahaDB’s indexCacheSize allocates pages in the JVM heap. At the default of 10,000 pages × 4KB/page, that is ~40MB – negligible. At indexCacheSize=100000, that is ~400MB of heap committed to the index cache. Factor this into your heap sizing: if your broker has a 4GB heap and you allocate 400MB to KahaDB index cache, your effective working heap for message buffers and connection state is 3.6GB.
For very large message volumes where the index does not fit in cache, consider increasing the heap before increasing indexCacheSize. A well-sized heap that prevents GC pressure is more valuable than a larger cache that causes frequent collection.
The Complete Performance Tuning Checklist
Apply these in order. Measure after each group before proceeding to the next.
Transport (Layer 1)
- [ ] Switch to NIO transport for > 100 concurrent connections
- [ ] Set maximumConnections appropriate to peak concurrency
- [ ] Increase wireFormat.maxFrameSize if sending large message payloads
Producer (Layer 2)
- [ ] Enable useAsyncSend=true for throughput-sensitive persistent senders
- [ ] Set DeliveryMode.NON_PERSISTENT for truly transient workloads
- [ ] Enable useCompression=true for large compressible payloads
Consumer (Layer 3)
- [ ] Reduce queue prefetch to 1 for multi-consumer workload distribution
- [ ] Enable optimizeAcknowledge=true for AUTO_ACKNOWLEDGE consumers
- [ ] Tune concurrent consumer count to match processing thread pool size
Persistence (Layer 4)
- [ ] Set journalMaxFileLength to 64mb for high-throughput brokers
- [ ] Increase indexCacheSize (100000+ for large brokers)
- [ ] Enable concurrentStoreAndDispatchQueues=true
- [ ] Evaluate enableJournalDiskSyncs=false for non-critical workloads
Memory (Layer 5)
- [ ] Replace memoryUsage limit="64mb" with percentOfJvmHeap="20"
- [ ] Set storeUsage to 70% of available disk, not 100%
- [ ] Set per-destination memoryLimit for high-priority queues
Storage I/O (Layer 6)
- [ ] Move KahaDB to a dedicated volume (no shared I/O)
- [ ] Implement mKahaDB with perDestination="true" for mixed workloads
- [ ] Set preallocationStrategy="zeros" to reduce file system fragmentation
JVM (Layer 7)
- [ ] Set -Xms = -Xmx (no heap resize pauses)
- [ ] Switch to -XX:+UseG1GC with -XX:MaxGCPauseMillis=20
- [ ] Enable GC logging as a permanent baseline
Measuring What You’ve Tuned
Tuning without measurement is guessing. ActiveMQ ships with a built-in performance testing tool via the activemq-perf Maven plugin:
```bash
# Start the broker separately, then:

# Run producer test
mvn activemq-perf:producer -Dfactory.brokerURL=tcp://localhost:61616 \
    -Dfactory.userName=admin -Dfactory.password=admin

# Run consumer test
mvn activemq-perf:consumer -Dfactory.brokerURL=tcp://localhost:61616
```
The plugin generates XML performance reports that can be graphed over time. Run baseline tests before applying any tuning changes, and after each layer of changes, to isolate the contribution of each modification.
For ongoing visibility into production performance, rather than point-in-time benchmark runs, you need real-time metrics on enqueue rate, dequeue rate, consumer count, flow-control events, and KahaDB journal growth per destination.
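ActiveMQ exposes these destination counters as JMX MBean attributes. A minimal sketch that reads a few of them remotely, assuming JMX remote access is enabled on the broker and that the broker name, port, and queue name match your deployment:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class QueueMetricsProbe {
    public static void main(String[] args) throws Exception {
        // Assumes the broker's JMX connector is reachable on port 1099
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Broker and destination names are placeholders for your deployment
            ObjectName queue = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                  + "destinationType=Queue,destinationName=ORDERS.QUEUE");
            for (String attr : new String[]{"QueueSize", "EnqueueCount",
                                            "DequeueCount", "ConsumerCount"}) {
                System.out.printf("%s = %s%n", attr, mbs.getAttribute(queue, attr));
            }
        }
    }
}
```

Polling these attributes on a schedule, and alerting on a growing gap between EnqueueCount and DequeueCount, gives the ongoing visibility that a one-off benchmark cannot.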
From Default Config to 10x Throughput: The Path Is Clear
ActiveMQ’s default configuration ships tuned for correctness and safety. Every performance gain requires a deliberate, informed decision to trade some margin of safety for throughput: async sends trade synchronous acknowledgment visibility for producer speed; NIO trades per-connection resource isolation for concurrency scale; enableJournalDiskSyncs=false trades write-through durability for I/O throughput.
The job of performance tuning is not to disable safety features blindly. It is to understand each tradeoff, apply it where the workload actually warrants it, and measure the outcome. The seven-layer framework in this guide gives you the structure to do that systematically, not by trial and error.
MeshIQ helps enterprise teams apply this framework to their specific deployments — with the monitoring visibility to see what is actually limiting performance, and the expertise to configure each layer correctly for the first time.
Frequently Asked Questions
What are the highest-impact ActiveMQ performance tuning changes?
The highest-impact changes are: enable useAsyncSend=true for persistent producer sends, tune prefetch policy to match your consumer topology (reduce to 1 for multi-consumer queues), enable optimizeAcknowledge=true for batch-mode consumers, right-size systemUsage limits above the default memoryUsage=64mb, and tune KahaDB journalMaxFileLength and indexCacheSize for your message volume. These five changes alone routinely produce 3–5x throughput gains.
What is the default prefetch limit, and when should I change it?
The default queue consumer prefetch is 1000. For single-consumer queues with fast processing, this is reasonable. For multi-consumer workload distribution, reduce it to 1 – at 1000, the first consumer absorbs all prefetched messages while the others sit idle. Per-destination tuning using the consumer.prefetchSize destination option is the most targeted approach.
Does persistent delivery reduce throughput?
Significantly, yes — persistent delivery is approximately 20x slower than non-persistent due to mandatory fsync() on every send. Switching to NON_PERSISTENT delivery or enabling useAsyncSend=true (which decouples the producer’s blocking wait from the disk write) are the highest-leverage individual changes available for throughput-focused workloads.
What triggers producer flow control?
Flow control triggers when systemUsage limits are exceeded — typically memoryUsage (default 64mb) or storeUsage. Root causes: undersized limits, slow consumers causing message accumulation, or a KahaDB I/O bottleneck preventing timely acknowledgment and journal compaction. We cover troubleshooting producer flow control blocks in depth in our dedicated post in this series.
What JVM settings are recommended for a production ActiveMQ broker?
Use G1GC (-XX:+UseG1GC) with -XX:MaxGCPauseMillis=20, set initial and maximum heap equal to prevent resize pauses, and allocate 4–8GB heap for most enterprise deployments. Enable GC logging permanently as a baseline. We cover the full JVM tuning guide — including GC log analysis and heap dump interpretation — in our post on JVM Memory & Garbage Collection Tuning →.