---
title: "Data in Motion NYC: Messaging, Streaming, and Processing"
date: 2026-05-06
author: "meshIQ"
featured_image: "https://www.meshiq.com/wp-content/uploads/event_data-in-motion-nyc_060426.jpg"
---

# Data in Motion NYC: Messaging, Streaming, and Processing

## Agenda

- **11:00 AM – 11:20 AM:** Arrival & Networking
- **11:20 AM – 12:00 PM:** Data in Motion with Apache ActiveMQ® and Apache Beam | *JB Onofré, Principal Software Engineer, Dremio + Director, Apache Foundation*
- **12:00 PM – 1:00 PM:** Lunch & Networking (on the rooftop, weather permitting)
- **1:00 PM – 1:40 PM:** GraphFlow & Beam: Pythonic, Scalable GNN Pipelines | *Yogesh Tewari, Senior Cloud Data Engineer, Google*
- **1:40 PM – 2:00 PM:** Final Networking


## Abstracts

### Data in Motion with Apache ActiveMQ® and Apache Beam

*JB Onofré, Principal Software Engineer, Dremio + Director, Apache Foundation*

Modern data architectures demand more than batch processing — they require reliable, scalable, and flexible pipelines that can handle data as it moves. This session explores the powerful combination of **Apache ActiveMQ**, a battle-tested message broker for enterprise messaging, and **Apache Beam**, a unified programming model for both batch and streaming data processing.

We’ll walk through the fundamentals of integrating ActiveMQ as a durable message source and sink within Beam pipelines, enabling real-time, event-driven workflows across distributed systems. Attendees will learn how to build end-to-end pipelines that consume messages from ActiveMQ queues and topics; apply transformations, enrichments, and windowing strategies using Beam’s expressive API; and route results to downstream systems, all with portability across runners such as Apache Flink, Apache Spark, and Google Cloud Dataflow.
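As a rough illustration of the windowing idea mentioned above, here is a minimal plain-Python sketch of fixed-window assignment: each event is placed in the window whose start is its timestamp rounded down to a multiple of the window size. This is not Beam code and not taken from the talk; the function names and the sample events are hypothetical, chosen only to show the concept.

```python
from collections import defaultdict


def fixed_window_start(timestamp, size):
    """Return the start of the fixed window that contains `timestamp`."""
    return timestamp - (timestamp % size)


def count_per_window(events, size):
    """Count events per (window_start, key).

    `events` is an iterable of (timestamp_seconds, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        counts[(fixed_window_start(timestamp, size), key)] += 1
    return dict(counts)


# Hypothetical messages: (event time in seconds, destination name).
events = [
    (0, "orders"),
    (59, "orders"),
    (60, "orders"),
    (61, "payments"),
    (125, "orders"),
]

# Group into 60-second windows.
result = count_per_window(events, 60)
```

With 60-second windows, the events at 0 and 59 seconds share a window, while the event at 60 seconds starts the next one. In an actual Beam pipeline, the windowing transform and a runner would handle this assignment (plus late data and triggers) rather than hand-written code.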

Key topics include:

- ActiveMQ connectivity patterns in Beam (JMS I/O)
- Message acknowledgment and exactly-once semantics
- Schema handling and payload deserialization
- Scaling strategies for high-throughput messaging workloads
- Real-world use cases: event sourcing, CDC, and operational data pipelines
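To make the acknowledgment topic above more concrete, here is a small plain-Python sketch of one common pattern: acknowledge a message only after it has been written, and make the write idempotent by message ID so that redeliveries do not produce duplicates. This is a generic illustration under stated assumptions, not how Beam's JMS connector or ActiveMQ implement their guarantees; `IdempotentSink`, `consume`, and the sample deliveries are all hypothetical.

```python
class IdempotentSink:
    """Hypothetical sink that ignores messages it has already stored."""

    def __init__(self):
        self.seen_ids = set()
        self.records = []

    def write(self, message_id, payload):
        """Store the payload once per message ID; return True if stored."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.records.append(payload)
        return True


def consume(deliveries, sink):
    """Process deliveries and return the IDs acknowledged, in order.

    Each delivery is acknowledged only after the sink call returns, so a
    crash before the ack would lead to redelivery rather than data loss.
    """
    acked = []
    for message_id, payload in deliveries:
        sink.write(message_id, payload)
        acked.append(message_id)
    return acked


# "m1" is delivered twice, as can happen with at-least-once delivery.
deliveries = [("m1", "a"), ("m2", "b"), ("m1", "a"), ("m3", "c")]
sink = IdempotentSink()
acked = consume(deliveries, sink)
```

Here every delivery is acknowledged, including the duplicate, but the sink ends up with each payload exactly once. At-least-once delivery combined with an idempotent write is one way to get an exactly-once effect at the destination.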

Whether you’re modernizing a legacy messaging infrastructure or designing a new streaming architecture from scratch, this talk will give you practical patterns and insights to put data in motion — reliably and at scale.


### GraphFlow & Beam: Pythonic, Scalable GNN Pipelines

*Yogesh Tewari, Senior Cloud Data Engineer, Google*

Learn how GraphFlow, a modular Python toolkit, uses Apache Beam to build efficient, scalable data pipelines for Graph Neural Networks (GNNs). We’ll demonstrate how GraphFlow on Beam tackles large-scale graph data challenges, including distributed ingestion from cloud databases, scalable feature normalization, graph sampling, and online model inference.
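For readers unfamiliar with graph sampling, the following plain-Python sketch shows one simple variant, uniform neighbor sampling with a fixed fanout, which caps how many neighbors are kept per seed node. It does not use or reflect GraphFlow's API, which this abstract does not describe; the function name, the adjacency dictionary, and the fanout value are assumptions made for illustration.

```python
import random


def sample_neighbors(adjacency, seeds, fanout, rng):
    """Uniformly sample up to `fanout` neighbors for each seed node.

    `adjacency` maps a node to a list of its neighbors. Nodes with
    `fanout` or fewer neighbors keep all of them.
    """
    sampled = {}
    for node in seeds:
        neighbors = adjacency.get(node, [])
        if len(neighbors) <= fanout:
            sampled[node] = list(neighbors)
        else:
            # Sample without replacement so no neighbor repeats.
            sampled[node] = rng.sample(neighbors, fanout)
    return sampled


# Hypothetical toy graph.
adjacency = {
    "a": ["b", "c", "d", "e"],
    "b": ["a"],
    "c": [],
}

# A seeded generator keeps the sketch reproducible.
sampled = sample_neighbors(adjacency, ["a", "b", "c"], fanout=2, rng=random.Random(0))
```

Because each seed node is handled independently, this kind of step can be applied per element, which is one reason sampling fits a data-parallel model such as Beam's.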