A set of command-line utilities for manipulating large tabular data files: files of numeric and text data commonly found in machine learning, data mining, and similar environments. The tools cover filtering, sampling, statistics, joins, and more.
These tools are especially useful when working with large data sets. They run faster than other tools providing similar functionality, often by significant margins. See Performance Studies for comparisons with other tools.
They perform data manipulation and statistical calculations on tab-delimited data. They are intended for large files: larger than is ideal for loading entirely into memory in an application like R, but not so big as to necessitate moving to Hadoop or similar distributed compute environments. The features supported are useful both for standalone analysis and for preparing data for use in R, Pandas, and similar toolkits.
From eBay.
A tool from Facebook that parses the output from a command and presents a UI to select files and directories; can be used to apply a command to a set of interactively selected files or to move across directories.
PathPicker accepts a wide range of input -- output from git commands, grep results, searches -- pretty much anything.
After parsing the input, PathPicker presents you with a nice UI to select which files you're interested in. After that you can open them in your favorite editor or execute arbitrary commands.
Executes SQL-like queries on CSV/TSV tabular data files; each tabular file is treated as a database table; supports all SQL constructs (WHERE, GROUP BY, JOIN).
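The "file as table" idea can be sketched in a few lines of Python. This is only an illustration using the standard csv and sqlite3 modules, not how the tool itself is implemented; the helper load_table, the table names, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def load_table(conn, name, text, delimiter="\t"):
    """Load delimited text (first row is the header) into a SQLite table."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    columns = ", ".join(f'"{col}"' for col in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{name}" ({columns})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', body)


# Hypothetical sample data standing in for two TSV files on disk.
SALES_TSV = "region\tamount\nnorth\t10\nsouth\t5\nnorth\t7\n"
MANAGERS_TSV = "region\tmanager\nnorth\tAda\nsouth\tBob\n"

conn = sqlite3.connect(":memory:")
load_table(conn, "sales", SALES_TSV)
load_table(conn, "managers", MANAGERS_TSV)

# Values are loaded as text, so numeric columns are cast explicitly.
query = """
    SELECT m.manager, SUM(CAST(s.amount AS INTEGER)) AS total
    FROM sales AS s
    JOIN managers AS m ON m.region = s.region
    WHERE CAST(s.amount AS INTEGER) > 5
    GROUP BY m.manager
    ORDER BY m.manager
"""
result = conn.execute(query).fetchall()
print(result)
```

Each file becomes one table, so WHERE, GROUP BY, and JOIN work across files the same way they would across tables in a regular database.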
Utility that allows users to choose one option from a set of choices using an interface with fuzzy search functionality.
A Python script that
1) receives input lines from stdin or a file,
2) lists the input lines and waits for input that filters/selects the line(s),
3) outputs the selected line(s) to stdout.
Can be used to add interactivity to many regular shell commands.
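The three steps can be sketched as a plain, non-interactive filter. This is a simplification under stated assumptions: the real script is interactive, whereas here the query is passed in as a string, and the names select_lines and run are hypothetical.

```python
import io


def select_lines(lines, query):
    """Return the lines that contain every whitespace-separated query term."""
    terms = query.lower().split()
    return [line for line in lines if all(term in line.lower() for term in terms)]


def run(source, query, sink):
    # Step 1: receive input lines from a file-like source (stdin or an open file).
    lines = source.read().splitlines()
    # Step 2: narrow the list; an interactive tool would do this as you type.
    chosen = select_lines(lines, query)
    # Step 3: write the selected lines to the output stream.
    for line in chosen:
        sink.write(line + "\n")
    return chosen


# In-memory streams stand in for stdin and stdout in this sketch.
source = io.StringIO("src/main.py\nsrc/util.py\nREADME.md\n")
sink = io.StringIO()
run(source, "src util", sink)
print(sink.getvalue(), end="")
```

Because input and output are ordinary streams, a filter shaped like this can sit in the middle of a shell pipeline, which is what makes the interactive version useful with regular commands.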
(JSON Query?) is a sed-like processor for JSON data; can be used to process JSON files and data streams and perform operations such as those allowed by cat, sed, grep and awk on regular text files.
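To make the analogy concrete, here is a rough Python sketch of grep-like and awk-like operations on a stream of JSON documents. It uses only the standard json module and says nothing about the tool's own filter syntax; the helper names and the sample stream are hypothetical.

```python
import json

# Hypothetical stream: one JSON document per line.
STREAM = '{"name": "a", "size": 3}\n{"name": "b", "size": 12}\n'


def parse_stream(text):
    """Parse one JSON document per non-empty line (cat-like: just read it all)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def where(records, predicate):
    """Keep only the records matching a predicate (grep-like)."""
    return [record for record in records if predicate(record)]


def pluck(records, field):
    """Extract one field from each record that has it (awk-like column pick)."""
    return [record[field] for record in records if field in record]


records = parse_stream(STREAM)
big_names = pluck(where(records, lambda r: r["size"] > 10), "name")
print(big_names)
```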
(Generic Colouriser) can be configured to parse a given text stream and colorize it according to regular expressions written in configuration files; different patterns can be associated with file types.
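Regexp-driven colouring can be sketched as follows. This is a minimal illustration, assuming an in-code rule list instead of the tool's configuration files; RULES and colorize are hypothetical names, and the colours are standard ANSI escape sequences.

```python
import re

RESET = "\033[0m"

# Hypothetical rules: each pairs a regular expression with an ANSI colour code.
RULES = [
    (re.compile(r"\bERROR\b"), "\033[31m"),               # red
    (re.compile(r"\b\d+\.\d+\.\d+\.\d+\b"), "\033[36m"),  # cyan, IPv4-like text
]


def colorize(line, rules=RULES):
    """Wrap every match of every rule in its colour code followed by a reset."""
    for pattern, color in rules:
        # Rules run in order on the already-coloured text, so they should
        # not match the escape sequences inserted by earlier rules.
        line = pattern.sub(lambda m, c=color: f"{c}{m.group(0)}{RESET}", line)
    return line


print(colorize("ERROR from 10.0.0.1"))
```

Keeping the rules as data rather than code is what allows a different rule set to be chosen per command or file type.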
(FuZzy Finder) is a general-purpose command-line finder with fuzzy search/filter capabilities; good integration with vim.
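A common way to think about fuzzy filtering is subsequence matching: a candidate matches if the query's characters appear in it in order, not necessarily adjacent. The sketch below shows only that idea, with a naive shortest-first ranking; it is not the tool's actual matching or scoring algorithm, and both function names are hypothetical.

```python
def fuzzy_match(query, candidate):
    """True if the characters of query occur in candidate in order (case-insensitive)."""
    remaining = iter(candidate.lower())
    # `ch in remaining` advances the iterator, so each character must be
    # found after the position where the previous one matched.
    return all(ch in remaining for ch in query.lower())


def fuzzy_filter(query, candidates):
    """Return matching candidates, shortest first (a naive ranking)."""
    matches = [c for c in candidates if fuzzy_match(query, c)]
    return sorted(matches, key=len)


paths = ["src/main.py", "setup.py", "README.md", "samples/map.py"]
print(fuzzy_filter("smp", paths))
```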