UpStream Case Study

Background
UpStream provides a cloud service for marketing performance management. It enables marketers to plan, measure and optimize campaigns across all marketing channels. UpStream uses statistical modeling and text mining techniques to make data actionable and provide immediate business value. The idea for UpStream grew out of planning and analytics solutions that the founders had implemented for premier brands and developed in close partnership with those customers.

UpStream is multi-channel and handles a wide variety of customer data, including online and offline sales transactions, web logs, and catalog, email and direct mail logs. 

To support the marketing requirements of the world’s premier brands, the company needed to build a platform that could handle big data, scale cost-effectively and be extended easily. Some prototyping had been done in SAS, but SAS required expensive hardware in order to scale. Hadoop, on the other hand, met customer requirements far better. It allowed the company to scale using commodity hardware and take advantage of cloud resources, making it the obvious choice as the base technology. However, the team quickly realized that it needed a way to handle dozens of disparate data sources feeding Hadoop, speed time to market and make developers productive quickly.

Solution
UpStream chose Cascading for use with Hadoop to streamline data manipulation, allow development of reusable components and bring its new application to market faster. UpStream also uses Cascading to plan and manage complicated jobs executed on Hadoop clusters. Because Cascading will now be a part of the UpStream product, the company will leverage Level 3 support from Concurrent, Inc. To instantly provision as much or as little capacity as needed for customer jobs, UpStream chose the Rackspace Cloud. Each customer job is run separately, with the data kept segregated.

Cascading made coding for Hadoop easier and more repeatable, and it was a much better fit for UpStream than raw MapReduce or tools like Pig and Hive, which lacked the flexibility the company needed. Many of the typical data manipulation scenarios had already been thought through by Cascading’s authors, which accelerated the company’s development process. Cascading also made it easy to create reusable components. This reusability was key, since the company wanted to use one tool for many customers and turn around new features quickly. Finally, Cascading allows other developers to look at the existing code, understand what it is meant to do and make changes or reuse the code elsewhere.

During product development, UpStream ported existing prototype code from SAS to Hadoop/Cascading. The company called on Concurrent to train its developers and analysts on Cascading so that they could become productive quickly.

Benefits
Cascading delivers faster time to market for new products and features by allowing multiple developers to work on the same code base and create reusable components. Developers can also write in Java and don’t have to think in MapReduce, which saves valuable development time. Cascading also provides far more flexibility than either Pig or Hive and doesn’t require developers to learn a new syntax, making development quick and easy.

Customers of UpStream will benefit from Cascading’s contributions to the revenue attribution and customer-level response modeling modules. These modules identify the most profitable channels for every customer and guide marketers on where to spend promotional dollars most efficiently. Using Cascading, UpStream is able to evaluate data from multiple marketing and purchasing channels to understand customer-level response and generate actionable customer lists for each channel.

“Cascading is an important part of our UpStream product,” noted Brandon Mason, VP of Product, UpStream. “Our customers are global brands who need to analyze huge volumes of data to guide their marketing activities. With Cascading, we were able to develop a high volume data manipulation and analysis tool quickly and efficiently. The reusable components we created with Cascading will also speed time to market for future products and features.”

In the future, UpStream plans to leverage Cascading to develop new products and add features to existing ones. Cascading is now the company’s internal standard tool for data manipulation, a key piece of its arsenal for building products that create business value from data.
