At the Apache Pig Hackathon, Twitter open sourced "Ambrose," a tool that helps authors of large-scale data workflows keep track of the overall status of a workflow and visualize its progress.
In the screenshot below, we see the Ambrose UI for a workflow compiled from a single Pig script. "The circular chord diagram in the upper left highlights dependencies between jobs. As a job's status changes, the color of its arc in the diagram changes. Statistics for the job most recently started are displayed to the right of the chord diagram. Summary information and status of all jobs is displayed in the table beneath these two views," explains Chris Aniszczyk, Manager of Open Source at Twitter.
"In its current form Ambrose is still early in development and has a growing list of features we'd love to add, but we've open sourced it to develop Ambrose in the open and get community feedback."
"At the moment it only works with Pig; however, the framework is extensible and allows support for other runtimes. We plan to support Cascading and Scalding, but we welcome patches for other runtimes as well. Ambrose also relies on a number of other great open-source projects including Jetty, D3.js, and Twitter Bootstrap," adds Aniszczyk.
If you're interested in working on and evolving data visualization tools like Ambrose, you can download Ambrose from GitHub.