Why Everyone’s Talking About Event Streaming

By Chris Latimer, vice president, product management, DataStax

There’s a lot of talk about the importance of streaming data and event-driven architectures right now. You may have heard the terms, but do you really know why they matter to so many enterprises? Streaming technologies unlock the ability to capture insights and take instant action on data flowing into your organization; they’re a critical building block for applications that respond in real time to user actions, security threats, and other events. In other words, they’re a key part of building great customer experiences and driving revenue.

Here’s a quick breakdown of what streaming technologies do, and why they’re so important to enterprises.

Data in motion

Organizations have gotten pretty good at creating a relatively complete view of so-called “data at rest” — the kind of information that’s often captured in databases, data warehouses, and even data lakes to be used immediately (in “real time”) or to fuel applications and analysis later.

Increasingly, data driven by activities, actions, and events happening in real time across an organization pours in from mobile devices, retail systems, sensor networks, and telecommunications call-routing systems.

While this “data in motion” might ultimately get captured in a database or other store, it’s extremely valuable while it’s still on the move. For a bank, data in motion might enable real-time fraud detection and an instant response. A retailer can make product recommendations based on a consumer’s search or purchase history the instant someone visits a web page or clicks on a particular item.

Consider Overstock, a U.S. online retailer. It must consistently deliver engaging customer experiences and derive revenue from in-the-moment monetization opportunities. In other words, Overstock sought the ability to make lightning-fast decisions based on data arriving in real time (generally, brands have 20 seconds to connect with customers before they move on to another website).

“It’s like a self-driving car,” says Thor Sigurjonsson, Overstock’s head of data engineering. “If you wait for feedback, you’re going to drive off the road.”

The event-driven architecture

To maximize the value of its data as it’s created, instead of waiting hours, days, or even longer to analyze it once it’s at rest, Overstock needed a streaming and messaging platform that would enable it to employ real-time decision-making: delivering personalized experiences and recommending products likely to be well received by customers at exactly the right moment (really fast, in other words).

Data messaging and streaming is a key part of an event-driven architecture, which is a software architecture or programming approach built around the capture, communication, processing, and persistence of events—mouse clicks, sensor outputs, and the like.

Stream processing means acting on a continuous series of data points as they arrive from systems that constantly generate “events.” The ability to query this non-stop stream, find anomalies, recognize that something important has happened, and act on it quickly and meaningfully is what streaming technology enables.
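As a minimal illustration, here’s a Python sketch of querying a continuous stream for anomalies. The event feed and thresholds are hypothetical stand-ins; in production the events would arrive from a streaming platform rather than an in-memory list.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(events, window_size=100, threshold=3.0):
    """Yield events that deviate sharply from the recent stream."""
    window = deque(maxlen=window_size)
    for value in events:
        if len(window) >= 30:  # wait until there's a minimal baseline
            mu, sigma = mean(window), stdev(window)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield value  # act immediately: alert, block, recommend
        window.append(value)

# Hypothetical feed: steady sensor readings with one obvious outlier.
readings = [20.1, 19.8, 20.3] * 20 + [999.0]
for anomaly in detect_anomalies(readings):
    print(f"anomaly detected: {anomaly}")
```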

This contrasts with batch processing, where an application stores data after ingesting it, processes it, and then stores the processed result or forwards it to another application or tool. Processing might not start until after, say, 1,000 data points have been collected. That’s too slow for applications that require reactive engagement at the point of interaction (a sketch contrasting the two approaches follows the list below).

It’s worth pausing to break that idea down:  

  • The point of interaction could be a system making an API call, or a mobile app.
  • Engagement is defined as adding value to the interaction. It could be giving a tracking number to a customer after they place an order, a product recommendation based on a user’s browsing history, or a billing authorization or service upgrade.
  • Reactive means the engagement action happens in real-time or near-real-time; this translates to hundreds of milliseconds for human interactions, while machine-to-machine interactions that occur in an energy utility’s sensor network, for example, might not require such a near-real-time response.
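To make the contrast concrete, here’s a minimal sketch in Python. The handlers and event shape are hypothetical; the point is that the batch version stays silent until a full bucket accumulates, while the reactive version engages on every event as it arrives.

```python
import time

BATCH_SIZE = 1000

def batch_handler(event, buffer):
    """Batch style: nothing is processed until a full bucket accumulates."""
    buffer.append(event)
    if len(buffer) >= BATCH_SIZE:
        print(f"processing {len(buffer)} events, long after the first arrived")
        buffer.clear()

def reactive_handler(event):
    """Streaming style: engage at the point of interaction, per event."""
    started = time.perf_counter()
    offer = f"recommendation for {event['user']}"  # stand-in for a model call
    elapsed_ms = (time.perf_counter() - started) * 1000
    print(f"{offer} served in {elapsed_ms:.3f} ms")

buffer = []
for i in range(3):
    event = {"user": f"user-{i}", "action": "click"}
    batch_handler(event, buffer)   # silent: still waiting for 997 more events
    reactive_handler(event)        # responds immediately
```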

When a message queue isn’t enough

Some enterprises have recognized that they need to derive value from their data in motion and have assembled their own event-driven architectures from a variety of technologies, including message-oriented middleware systems like Java Message Service (JMS) or message queue (MQ) platforms.

But these platforms were built on the premise that the data they processed was transient and should be discarded as soon as each message had been delivered. That throws away a highly valuable asset: data that’s identifiable as arriving at a particular point in time. Time-series information is critical for applications that involve asynchronous analysis, like machine learning; data scientists can’t build machine-learning models without it. A modern streaming system needs to not only pass events along from one service to another, but also store them in a way that retains their value for later use.
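As an illustration, here’s a minimal sketch using the Apache Pulsar Python client (pulsar-client; Pulsar itself is introduced below). A reader replays retained events from the beginning of a topic with their publish timestamps intact, which is exactly what asynchronous analysis like machine-learning training needs. The broker URL and topic name are assumptions, and the topic’s namespace is assumed to have a retention policy configured.

```python
import pulsar

# Assumes a local broker and a namespace configured to retain messages.
client = pulsar.Client('pulsar://localhost:6650')
reader = client.create_reader(
    'persistent://public/default/transactions',  # hypothetical topic
    pulsar.MessageId.earliest,  # replay from the start of retained history
)

training_rows = []
while reader.has_message_available():
    msg = reader.read_next()
    # Each event keeps its publish time, so the time series survives.
    training_rows.append((msg.publish_timestamp(), msg.data()))

client.close()
print(f"collected {len(training_rows)} time-stamped events")
```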

The system also needs to scale to terabytes of data and millions of messages per second. Older MQ systems were designed for neither.

Pulsar and Kafka: The old guard and the unified, next-gen challenger

As I touched upon above, there are a lot of choices available when it comes to messaging and streaming technology.

They include various open-source projects like RabbitMQ, ActiveMQ, and NATS, along with proprietary solutions such as IBM MQ or Red Hat AMQ. Then there are the two well-known, unified platforms for handling real-time data: Apache Kafka, a very popular technology that has become almost synonymous with streaming; and Apache Pulsar, a newer streaming and message queuing platform.

Both of these technologies were designed to handle the high throughput and scalability that many data-driven applications require.

Kafka was developed by LinkedIn to facilitate data communication between different services at the professional networking company, and it became an open-source project in 2011. Over the years it has become a standard for enterprises looking to derive value from real-time data.

Pulsar was developed by Yahoo! to solve the messaging and data problems faced by applications like Yahoo! Mail; it became a top-level Apache open-source project in 2018. While still catching up to Kafka in popularity, it offers more features and functionality. It also carries a very important distinction: MQ solutions are solely messaging platforms, and Kafka handles only an organization’s streaming needs, whereas Pulsar handles both, making it the only unified platform available.

Pulsar can handle real-time, high-rate use cases like Kafka, but it’s also a more complete, durable, and reliable solution than the older platform. To get both streaming and queuing (an asynchronous communication pattern that lets applications talk to one another), a Kafka user would need to bolt on something like RabbitMQ or another solution. Pulsar, on the other hand, can handle many of the use cases of a traditional queuing system without add-ons.
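A minimal sketch with the Pulsar Python client makes the point: the same topic can serve as a work queue or as an ordered stream, depending only on the subscription type. The broker URL, topic, and subscription names are assumptions.

```python
import pulsar

client = pulsar.Client('pulsar://localhost:6650')
topic = 'persistent://public/default/orders'  # hypothetical topic

# Queuing semantics: a Shared subscription spreads messages across
# competing consumers, like a traditional work queue.
worker = client.subscribe(
    topic, 'order-workers',
    consumer_type=pulsar.ConsumerType.Shared,
)

# Streaming semantics: an Exclusive subscription hands one consumer
# the full, ordered stream (Exclusive is also Pulsar's default).
streamer = client.subscribe(
    topic, 'order-analytics',
    consumer_type=pulsar.ConsumerType.Exclusive,
)

try:
    msg = worker.receive(timeout_millis=1000)
    worker.acknowledge(msg)
except Exception:
    pass  # timed out: no orders waiting

client.close()
```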

Pulsar carries other advantages over Kafka, including higher throughput, better scalability, and geo-replication, which is particularly important when a data center or cloud region fails. Geo-replication enables an application to publish events to another data center without interruption, preventing the app from going down—and preventing an outage from affecting end users. (Here’s a more technical comparison of Kafka and Pulsar).

Wrapping up

In the case of Overstock, Pulsar was chosen as the retailer’s streaming platform. With it, the company built what Sigurjonsson, its head of data engineering, describes as an “integrated layer of data and connected processes governed by a metadata layer supporting deployment and utilization of integrated reusable data across all environments.”

In other words, Overstock now has a way to understand and act upon real-time data organization-wide, enabling the company to impress its customers with magically fast, relevant offers and personalized experiences.

As a result, teams can reliably transform data in flight in a way that is easy to use and requires less data engineering. This makes it that much easier to delight their customers—and ultimately drive more revenue.

To learn more about DataStax, visit us here.

About Chris Latimer

Chris Latimer is a technology executive whose career spans over twenty years in a variety of roles including enterprise architecture, technical presales, and product management. He is currently Vice President of Product Management at DataStax where he is focused on building the company’s product strategy around cloud messaging and event streaming. Prior to joining DataStax, Chris was a senior product manager at Google where he focused on APIs and API Management in Google Cloud. Chris is based near Boulder, CO, and when not working, he is an avid skier and musician and enjoys the never-ending variety of outdoor activities that Colorado has to offer with his family.

