Define complex data flows and create sophisticated data-oriented frameworks. These frameworks can be Maven-compatible libraries or Domain Specific Languages (DSLs) for scripting.
Create and test rich functionality before tackling complex integration problems. Integration points can be developed and tested before plugging them into a production data flow.
Use Cascading in conjunction with Riffle lifecycle annotations to schedule unit-of-work operations from any third-party application.
Develop applications based on data workflows and assemblies, and let Cascading automatically create MapReduce jobs that get deployed for execution.
Flexible Source & Sink Taps
Easily read and write data from any source and in any format, and change the location of files based on the deployment environment.
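As a rough illustration of environment-dependent locations, the sketch below resolves a data path from a deployment environment name in plain Java. It does not use the Cascading Tap API; the class name, method name, environment names, and root URIs are all hypothetical and exist only for this example.

```java
import java.util.Map;

// Plain-Java sketch only: TapLocationSketch is a hypothetical name,
// not a Cascading class, and the root URIs below are made up.
final class TapLocationSketch {

    // Hypothetical mapping from deployment environment to a storage root.
    private static final Map<String, String> ROOTS = Map.of(
            "local", "file:///tmp/data",
            "production", "hdfs://namenode/data");

    // Build the full location for a relative path in the given environment.
    static String resolveLocation(String environment, String relativePath) {
        String root = ROOTS.get(environment);
        if (root == null) {
            throw new IllegalArgumentException("Unknown environment: " + environment);
        }
        return root + "/" + relativePath;
    }

    public static void main(String[] args) {
        // The same relative path resolves to a different location per environment.
        System.out.println(resolveLocation("local", "logs/input.txt"));
        System.out.println(resolveLocation("production", "logs/input.txt"));
    }
}
```

In a real application the resolved location would be handed to whichever source or sink the flow uses, so the pipe assembly itself stays unchanged between environments.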
Validate and debug your pipe-assemblies, custom operations, and data flows before deploying the application into production.
Standard Relational Operations
Apply relational data manipulation operations like Select and Join to unstructured data using Pipe operations such as Each, GroupBy, CoGroup, and Every.
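To make the relational analogy concrete, here is a minimal plain-Java sketch of what select, group-and-count, and join look like on raw log lines. It is an analogy only and does not call the Cascading API: the class and method names are hypothetical, and the comments merely indicate which Pipe operation each step loosely corresponds to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java analogy only: none of these names are Cascading classes.
final class RelationalSketch {

    // Loosely like an Each with a filter: keep only the lines that match (a "select").
    static List<String> selectErrors(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.contains("ERROR")) {
                kept.add(line);
            }
        }
        return kept;
    }

    // Loosely like GroupBy followed by Every with a count:
    // group lines by their first whitespace-delimited token and count each group.
    static Map<String, Integer> countByHost(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String host = line.trim().split("\\s+")[0];
            counts.merge(host, 1, Integer::sum);
        }
        return counts;
    }

    // Loosely like a CoGroup doing an inner join: pair each host's count
    // with its owner, dropping hosts that have no owner entry.
    static List<String> joinWithOwners(Map<String, Integer> counts, Map<String, String> owners) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String owner = owners.get(entry.getKey());
            if (owner != null) {
                joined.add(entry.getKey() + "\t" + entry.getValue() + "\t" + owner);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "web1 ERROR disk full",
                "web2 INFO ok",
                "web1 ERROR timeout",
                "db1 ERROR lock");
        Map<String, Integer> counts = countByHost(selectErrors(lines));
        for (String row : joinWithOwners(counts, Map.of("web1", "alice", "db1", "bob"))) {
            System.out.println(row);
        }
    }
}
```

The point of the sketch is that the input never needs a schema up front: the lines are split into fields only at the step that needs them.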
Include XML Operations
Use XML operations to tidy up HTML and XML data streams when analyzing them as part of a workflow.
Call Cascading APIs from any Java-compatible scripting language to instantiate Cascading classes and create pipes, assemblies, and flows.