November 8th & 9th, 2024
Love SeaGL and want to help out? Get Involved

RecordStream: Slicing and dicing data on the Unix command line

SeaGL 2013

Hadoop is fantastic if you need to crunch a lot of data. RDBMSes and data warehouses are great if your data is well-structured and easy to import. But many times you just want simple, ad-hoc processing.

Creating a one-off report of popular URLs, broken down by region? Getting a feel for a new dataset? Analyzing logs during an attack? These are all well suited to something lightweight, interactive, and fast.

RecordStream (recs) is an MIT licensed collection of tools for inputing, slicing, dicing, summarizing, and outputting JSON-formatted data. It is a powerful and extensible addition (and sometimes replacement) to the standard cat, cut, grep, sort and friends.

Presenters

Eli Lindsey

Eli Lindsey