Journal entry from 20 August 1999, introducing his new blog about The Lord of the Rings.
- Increased height for a better reading experience
- Adapted to reading code
- code-specific ligatures
- weights with matching italics
- free & open source
Facebook AI has developed the first neural network that uses symbolic reasoning to solve advanced mathematics problems.
Speaking as a maintainer of Mercurial and an avid user of Python, I feel like the experience of making Mercurial work with Python 3 is worth sharing because there are a number of lessons to be learned.
Parsr is a minimal-footprint document (image, PDF) cleaning, parsing and extraction toolchain that generates readily available, organized and usable data for data scientists and developers.
It provides users with clean, structured and label-enriched information sets for ready-to-use applications, ranging from data entry and document analysis automation to archival and many others.
Currently, Parsr can perform:
- Document Hierarchy Regeneration - Words, Lines and Paragraphs
- Headings Detection
- Table Detection and Reconstruction
- Lists Detection
- Text Order Detection
- Named Entity Recognition (Dates, Percentages, etc)
- Key-Value Pair Detection (for the extraction of specific form-based entries)
- Page Number Detection
- Header-Footer Detection
- Link Detection
- Whitespace Removal
Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.
Main features:
- Train new vocabularies and tokenize, using today's most used tokenizers.
- Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
- Easy to use, but also extremely versatile.
- Designed for research and production.
- Normalization comes with alignments tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
- Does all the pre-processing: Truncate, Pad, add the special tokens your model needs.
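The alignment tracking and pre-processing steps listed above can be pictured with a toy sketch. This is plain Python and not the library's API: the function names, the whitespace splitting and the special token strings are all assumptions made up for illustration.

```python
# Toy illustration only: none of these names come from the tokenizers library.
PAD, CLS, SEP = "[PAD]", "[CLS]", "[SEP]"  # hypothetical special tokens


def tokenize_with_offsets(text):
    """Split on whitespace, lowercase each token (the "normalization"),
    and record the (start, end) span of each token in the ORIGINAL text."""
    tokens, offsets = [], []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)  # locate the word at or after pos
        end = start + len(word)
        tokens.append(word.lower())
        offsets.append((start, end))
        pos = end
    return tokens, offsets


def post_process(tokens, max_len):
    """Truncate, wrap in special tokens, then pad to exactly max_len.
    Assumes max_len >= 2 so both special tokens fit."""
    body = tokens[: max_len - 2]
    out = [CLS] + body + [SEP]
    return out + [PAD] * (max_len - len(out))


text = "Hello  World again"
tokens, offsets = tokenize_with_offsets(text)
encoded = post_process(tokens, max_len=6)

# Each offset maps a normalized token back to the original substring.
originals = [text[start:end] for start, end in offsets]
```

Because the offsets index into the untouched input, the original casing and spacing can be recovered for any token even though the tokens themselves were normalized.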
CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or generate Python code to import it.
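For comparison, here is a minimal sketch of dialect detection using only the standard library's csv.Sniffer, which is the baseline such a replacement improves on. It does not use CleverCSV itself, and the sample data is invented for illustration.

```python
import csv
import io

# Invented sample: semicolon-delimited rather than comma-delimited.
sample = "name;age;city\nAnn;34;Oslo\nBob;27;Rome\n"

# Ask the standard-library sniffer to guess the dialect,
# restricting the candidate delimiters to ';' and ','.
dialect = csv.Sniffer().sniff(sample, delimiters=";,")

# Parse the data with the detected dialect.
rows = list(csv.reader(io.StringIO(sample), dialect))
```

The standard sniffer works on tidy input like this, but its heuristics can fail on messier files, which is the situation the entry above is about.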
Python haters always say that one of the reasons they don't want to use it is that it's slow. Well, whether a specific program is fast or slow, regardless of the programming language used, depends very much on the developer who wrote it and on their skill and ability to write optimized, fast programs.
So let's prove some people wrong and see how we can improve the performance of our Python programs and make them really fast!
Karate Club is an unsupervised machine learning extension library for NetworkX.
Karate Club consists of state-of-the-art methods to do unsupervised learning on graph structured data. To put it simply, it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. Implemented methods cover a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences, workshops, and pieces from prominent journals.
Graphical A* simulation.
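As a rough idea of what such a simulation animates, here is a minimal A* sketch on a small grid. It is not taken from the linked project: the grid layout, the 4-connected movement, the unit step cost and the Manhattan heuristic are all assumptions chosen for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Return the shortest 4-connected path from start to goal as a list of
    (row, col) cells, or None if the goal is unreachable. Cells equal to 1
    are walls; every step costs 1."""

    def h(cell):
        # Manhattan distance: admissible for 4-connected unit-cost grids.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {}
    best_g = {start: 0}

    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if g > best_g[current]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    came_from[(nr, nc)] = current
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
```

The wall in the middle row forces the path around the right-hand side, which is the kind of detour a graphical view makes easy to see.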
Nota is a nice terminal calculator with rich notation rendering. It is designed for your quick calculations and therefore provides you with a tiny and beautiful language so you can express your ideas easily. Nota is all about beauty and ASCII art.
When you learn C, you gain a basic understanding of how these languages flow and how they run, although each of them brings its own changes that make it unique. So, if you’re interested in programming, C is a great place to start.
Broot, a new way to navigate directory trees on Linux, made in Rust.
101+ coding interview problems with detailed solutions, test cases, and program analysis
Fostering reliability, maintainability, compactness, and good performance in code has been a constant quest for programmers and language designers over the years. It's rare that one technique can give you all of the above benefits, concurrently. But intelligent use of associative arrays can do that. If you haven't yet tapped into the power of associative arrays, you might want to give the issues involved some thought. The issues are widely applicable to a variety of programming tasks, cutting across all major languages.
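Two small examples of what "intelligent use of associative arrays" can look like, sketched in Python, where the built-in dict plays that role. The function names and the operation table are hypothetical and are not taken from the linked article.

```python
def word_counts(text):
    """Count word frequencies with a dict instead of parallel lists."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# A dispatch table: the dict replaces a growing if/elif chain,
# so adding an operation is a one-line change.
OPERATIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def apply_operation(name, a, b):
    """Look up the operation by name and apply it to a and b."""
    try:
        return OPERATIONS[name](a, b)
    except KeyError:
        raise ValueError(f"unknown operation: {name}") from None


counts = word_counts("the cat and the hat")
result = apply_operation("mul", 6, 7)
```

Both versions are shorter than their list-scanning or branching equivalents, and lookups stay fast as the number of keys grows.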
Any C structure can be stored in a hash table using uthash. Just add a UT_hash_handle to the structure and choose one or more fields in your structure to act as the key. Then use these macros to store, retrieve or delete items from the hash table.
A fast, efficient universal vector embedding utility package.