Search: [csv] - Toolleeo's Links

Miller - CSV/TSV and other formats toolkit

csv · file_management · software · opensource · homepage · data_science · tools

Mon Jul 20 16:00:23 2020

http://johnkerl.org/miller/doc/index.html

structured-text-tools - A list of command line tools for manipulating structured text data

tools · list · text-processing · markdown · csv

Sun May 31 19:49:26 2020

https://github.com/dbohdan/structured-text-tools

CleverCSV - A Python package for handling messy CSV files

CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or generate Python code to import it.
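
As a rough illustration of what dialect detection means, here is a minimal sketch that uses the standard library's csv.Sniffer, not CleverCSV itself. Since CleverCSV is described as a drop-in replacement for the csv package, this is presumably the kind of code it is meant to improve on. The sample data and the helper name read_with_detected_dialect are invented for the example.

```python
import csv
import io

# Invented sample: semicolon-delimited, quoted fields, one field containing a semicolon.
SAMPLE = (
    'name;city;note\n'
    '"Ada";"London";"first; programmer"\n'
    '"Alan";"Wilmslow";"logic"\n'
)

def read_with_detected_dialect(text):
    """Guess the dialect from the text itself, then parse every row with it."""
    # Restrict the candidate delimiters to a few common ones.
    dialect = csv.Sniffer().sniff(text, delimiters=";,\t|")
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows

delimiter, rows = read_with_detected_dialect(SAMPLE)
print(repr(delimiter))
for row in rows:
    print(row)
```

The standard sniffer is heuristic and can guess wrong on messier input, which is the gap a better detector would aim to close.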

csv · library · coding_lang:python · opensource · software · data_mining · algorithm

Tue Jan 14 05:45:51 2020

https://github.com/alan-turing-institute/CleverCSV

xsv - Doing a SQL join with CSV files

How to combine data spread over two CSV files, like separate tables in a normalized relational database.
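
The linked article does this with xsv on the command line. As a language-neutral illustration of the same idea, here is a small sketch of an inner join over two CSV inputs in Python. The table contents, column names, and the inner_join helper are all invented for the example.

```python
import csv
import io

# Invented sample data standing in for two CSV files.
PEOPLE_CSV = "id,name\n1,Ada\n2,Alan\n3,Grace\n"
ORDERS_CSV = "order_id,person_id,item\n10,1,book\n11,3,compiler\n12,1,pen\n"

def inner_join(left_rows, right_rows, left_key, right_key):
    """Hash join: index the left table by key, then stream the right table."""
    index = {}
    for row in left_rows:
        index.setdefault(row[left_key], []).append(row)
    joined = []
    for right in right_rows:
        # Rows with no match on the left are dropped (inner join semantics).
        for left in index.get(right[right_key], []):
            joined.append({**left, **right})
    return joined

people = csv.DictReader(io.StringIO(PEOPLE_CSV))
orders = csv.DictReader(io.StringIO(ORDERS_CSV))
result = inner_join(people, orders, "id", "person_id")
for row in result:
    print(row["name"], row["item"])
```

Only the left table is held in memory, so it makes sense to pass the smaller file as the left side.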

tutorial · article · #cli-app · csv · database · terminal · programming

Wed Jan 1 14:55:27 2020

https://www.johndcook.com/blog/2019/12/31/sql-join-csv-files/

csv2tex

Converts CSV files into LaTeX tables.
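
To give an idea of what such a conversion involves, here is a minimal sketch in Python. It is not csv2tex's implementation: the function names, the left-aligned column layout, and the (partial) escape table are assumptions made for the example.

```python
import csv
import io

# A partial list of characters that need escaping in LaTeX text mode.
LATEX_ESCAPES = {"&": r"\&", "%": r"\%", "_": r"\_", "#": r"\#", "$": r"\$"}

def escape_latex(cell):
    """Escape the special characters listed above, leaving the rest untouched."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in cell)

def csv_to_latex(text):
    """Render CSV text as a tabular environment; the first row is the header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [r"\begin{tabular}{" + "l" * len(header) + "}", r"\hline"]
    lines.append(" & ".join(escape_latex(c) for c in header) + r" \\")
    lines.append(r"\hline")
    for row in body:
        lines.append(" & ".join(escape_latex(c) for c in row) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)

table = csv_to_latex("item,share\nR&D,40%\nops,60%\n")
print(table)
```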

latex · csv · conversion · terminal · software · opensource · command_line

Sat Nov 9 20:53:02 2019

https://github.com/O2-AC/csv2tex/blob/master/README.md

Data Package specifications

A Data Package consists of:

Metadata that describes the structure and contents of the package
Resources such as data files that form the contents of the package

The Data Package metadata is stored in a "descriptor". This descriptor is what makes a collection of data a Data Package. The structure of this descriptor is the main content of the specification below.

In addition to this descriptor a data package will include other resources such as data files. The Data Package specification does NOT impose any requirements on their form or structure and can therefore be used for packaging any kind of data.

The data included in the package may be provided as:

Files bundled locally with the package descriptor
Remote resources, referenced by URL
"Inline" data (see below) which is included directly in the descriptor
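
The three ways of providing data can be sketched as a small descriptor built in Python. This is an illustration under assumptions, not an excerpt from the specification: the package name, resource names, paths, and URL are invented, and resource_kind is a hypothetical helper.

```python
import json

# Every name, path, and URL here is invented for illustration.
descriptor = {
    "name": "example-package",
    "resources": [
        # A file bundled locally, referenced by a relative path.
        {"name": "local-table", "path": "data/table.csv"},
        # A remote resource, referenced by URL.
        {"name": "remote-table", "path": "https://example.com/table.csv"},
        # Inline data, included directly in the descriptor.
        {"name": "inline-table", "data": [{"id": 1, "label": "a"}]},
    ],
}

def resource_kind(resource):
    """Classify a resource as 'inline', 'remote', or 'local' (hypothetical helper)."""
    if "data" in resource:
        return "inline"
    if resource["path"].startswith(("http://", "https://")):
        return "remote"
    return "local"

kinds = [resource_kind(r) for r in descriptor["resources"]]
# Serialized form, as it might be stored in a datapackage.json file.
descriptor_json = json.dumps(descriptor, indent=2)
print(kinds)
```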

homepage · data · file_format · csv · json

Fri Sep 6 14:11:31 2019

https://frictionlessdata.io/specs/data-package/

frictionlessdata | A Python library for working with Data Packages

python · library · machine_learning · file_format · json · csv · dataset · data · source_code · coding_lang:python

Fri Sep 6 14:07:40 2019

https://github.com/frictionlessdata/datapackage-py

TSV Utilities - Command line tools for large, tabular data files

This is a set of command line utilities for manipulating large tabular data files: files of numeric and text data commonly found in machine learning, data mining, and similar environments. The tools cover filtering, sampling, statistics, joins, and more.

These tools are especially useful when working with large data sets. They run faster than other tools providing similar functionality, often by significant margins. See Performance Studies for comparisons with other tools.

They perform data manipulation and statistical calculations on tab-delimited data. They are intended for large files: larger than ideal for loading entirely into memory in an application like R, but not so big as to necessitate moving to Hadoop or similar distributed compute environments. The features supported are useful both for standalone analysis and for preparing data for use in R, Pandas, and similar toolkits.

From eBay.
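
The point about files too large to load entirely into memory can be illustrated with a streaming, per-key summary over tab-delimited input. This is a sketch in Python, not how TSV Utilities works internally; the column names, sample data, and sum_by_key helper are invented.

```python
import csv
import io
from collections import defaultdict

def sum_by_key(lines, key_col, value_col):
    """Stream tab-delimited lines and total value_col per distinct key_col."""
    # QUOTE_NONE: treat quote characters as ordinary data, as plain TSV does.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    k, v = header.index(key_col), header.index(value_col)
    totals = defaultdict(float)
    for row in reader:  # one row in memory at a time
        totals[row[k]] += float(row[v])
    return dict(totals)

# Invented sample; in practice `lines` would be an open file object.
SAMPLE_TSV = "color\tweight\nred\t1.5\nblue\t2.0\nred\t0.5\n"
totals = sum_by_key(io.StringIO(SAMPLE_TSV), "color", "weight")
print(totals)
```

Memory use grows with the number of distinct keys rather than with the number of rows.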

csv · file_management · filter · command_line · opensource · software · coding_lang:d · #cli-app

Sun Sep 1 19:56:38 2019

https://github.com/eBay/tsv-utils

q - Run SQL-like queries on CSV/TSV files

Executes SQL-like queries on CSV/TSV tabular data files. Each file is treated as a database table, with support for SQL constructs such as WHERE, GROUP BY, and JOIN.
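
The idea of treating a tabular file as a database table can be sketched with Python's built-in sqlite3 module. This illustrates the concept only and says nothing about how q is implemented; the table name, sample data, and load_csv_as_table helper are invented.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a CSV file.
SALES_CSV = "region,amount\nnorth,10\nsouth,5\nnorth,7\n"

def load_csv_as_table(conn, table, text):
    """Create a table named after the header row and insert every data row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)

conn = sqlite3.connect(":memory:")
load_csv_as_table(conn, "sales", SALES_CSV)
# Values arrive as text, so cast explicitly before aggregating.
totals = conn.execute(
    "SELECT region, SUM(CAST(amount AS INTEGER)) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```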

#cli-app · text-processing · software · opensource · source_code · tools · terminal · search · filter · csv · SQL · category:text_processing

Sun Aug 25 14:54:03 2019

http://harelba.github.io/q/

Turn Vim Into Excel: Tips for Editing Tabular Data

The author tried to edit data in spreadsheet programs.

This post illustrates how to use Vim to edit tabular data, along with a few things that make it more pleasant. It assumes the files being edited are in tab-separated values (TSV) format.

"But what about CSV files?" Just. Don't.

Do: convert your CSV to TSV and back for editing.
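
One way to do that round trip is sketched below in Python using the standard csv module. The convert helper and the sample data are invented for the example, and it assumes fields contain no tabs or newlines; a field with a tab would be quoted in the TSV output, which plain tab-based editing would not handle.

```python
import csv
import io

def convert(text, src_delim, dst_delim):
    """Re-serialize delimited text with a different delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=src_delim)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=dst_delim, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Invented sample: one field contains a comma, so it is quoted in the CSV form.
csv_text = 'name,motto\nAda,"first, foremost"\n'
tsv_text = convert(csv_text, ",", "\t")     # edit this form in Vim
round_trip = convert(tsv_text, "\t", ",")   # convert back when done
print(tsv_text)
print(round_trip)
```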

csv · text_manipulation · vim · tutorial · post · article

Mon Dec 3 03:07:04 2018

http://alangrow.com/blog/turn-vim-into-excel-tips-for-tabular-data-editing

A data cleaner's cookbook - About

This is version 1 of a cookbook that will help you check whether a data table (defined on the data tables page) is properly structured and free from formatting errors, inconsistencies, duplicates and other data headaches.

csv · formatting · tutorial · article · guidelines

Thu Aug 16 13:50:26 2018

https://www.polydesmida.info/cookbook/index.html

xsv - A fast CSV command line toolkit written in Rust

xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files. Commands should be simple, fast and composable:

Simple tasks should be easy.
Performance trade offs should be exposed in the CLI interface.
Composition should not come at the expense of performance.

#cli-app · coding_lang:rust · file_management · csv · software · opensource

Thu Aug 16 13:17:17 2018

https://github.com/BurntSushi/xsv