Information about Pivotal's Spring-XD and how it differs from DataTorrent
Spring-XD is something that Pivotal is pushing in its Big Data stack of tools. While it may seem similar to DataTorrent (DT), the two have different goals. I had a conversation with DT’s team, and here’s what I learned.
Spring-XD and DT have very different focuses:
Spring-XD’s focus is to make diverse and competing components work together in an easy-to-use stack.
DT’s focus is real-time, low-latency processing and big data computation, and DT offers a platform developed natively for that purpose. The same could not be accomplished by combining diverse existing components that were each architected with their own goals and are only loosely coupled in a single stack.
IBM’s InfoSphere Streams would be closer to a competitor, yet it still doesn’t compare to DT in terms of scalability and throughput. DT runs natively on Hadoop and is built to work as a Hadoop component. DT’s focus on real-time goes much deeper than that of Spring-XD (and other similar stacks). Most of the other technologies use streaming mainly to ingest data (read, filter, store), whereas DT is aiming to be the next-generation big data computational model that will take away some of the importance of batch processing. DT sees batch as becoming commoditized, with real-time being the value add and differentiator for the next 3-4 years.
Here’s an interesting read about LinkedIn’s Samza (which is built on Kafka and YARN) and how it differs from DT:
https://groups.google.com/forum/#!search/malhar-users/malhar-users/64FNPqz_qEQ/N4qosKOrZLwJ