Search: [research] - Toolleeo's Links

Last fifty years of integer linear programming: a focus on recent practical advances

optimization · research · review · paper

Sun Jul 27 14:30:35 2025 * · permalink

·

https://inria.hal.science/hal-04776866v1

The unbearable slowness of being: Why do we live at 10 bits/s? - ScienceDirect

This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at bits/s. The stark contrast between these numbers remains unexplained and touches on fundamental aspects of brain function: what neural substrate sets this speed limit on the pace of our existence?

article · paper · neural_networks · science · research

Tue Dec 31 07:37:57 2024 * · permalink

·

https://www.sciencedirect.com/science/article/abs/pii/S0896627324008080

New Transformer architecture modifications from NVIDIA researchers

nGPT: A hypersphere-based Transformer achieving 4-20x faster training and improved stability for LLMs.

AI · research

Sun Oct 20 19:55:56 2024 * · permalink

·

https://x.com/rohanpaul_ai/status/1847277918243754156

alphaXiv

alphaXiv is a platform where you can comment line-by-line on any arXiv paper and join the academic dialogue. You can also leave private notes, follow authors, and integrate with ORCID.

research · web · online

Sun Sep 29 15:43:57 2024 * · permalink

·

https://www.alphaxiv.org/

Large Language Model: world models or surface statistics?

Large Language Models (LLM) are on fire, capturing public attention by their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simplistic algorithm with massive amounts of data and computing power. They are trained by playing a guess-the-next-word game with itself over and over again. Each time, the model looks at a partial sentence and guesses the following word. If it makes it correctly, it will update its parameters to reinforce its confidence; otherwise, it will learn from the error and give a better guess next time.

article · science · research · NLP · large_language_models

Mon Jan 30 21:37:16 2023 * · permalink

·

https://thegradient.pub/othello/

Unpaywall

research · paper · bibliography · list · search · open_access

Sat May 7 03:06:54 2022 * · permalink

·

https://unpaywall.org/

Reinforcement Learning as a fine-tuning paradigm

Reinforcement Learning (RL) should be better seen as a “fine-tuning” paradigm that can add capabilities to general-purpose pretrained models, rather than a paradigm that can bootstrap intelligence from scratch.

neural_networks · research · post

Thu Jan 13 04:37:43 2022 * · permalink

·

https://ankeshanand.com/blog/2022/01/08/rl-fine-tuning.html

The 10 Best Practices for Remote Software Engineering | Opinion | Communications of the ACM

Focusing on the human element of remote software engineer productivity.

software_engineering · guidelines · tips · article · research

Tue Apr 27 06:22:40 2021 * · permalink

·

https://m-cacm.acm.org/opinion/articles/252174-the-10-best-practices-for-remote-software-engineering/fulltext

100-nlp-papers - 100 Must-Read NLP Papers

NLP · paper · collection · list · research

Sat Sep 5 14:01:02 2020 * · permalink

·

https://github.com/mhagiwara/100-nlp-papers

Google 'BigBird' Achieves SOTA Performance on Long-Context NLP Tasks | Synced

BigBird is shown to dramatically improve performance across long-context NLP tasks, producing SOTA results in question answering and summarization.

NLP · article · research

Mon Aug 3 20:13:43 2020 * · permalink

·

https://syncedreview.com/2020/08/03/google-bigbird-achieves-sota-performance-on-long-context-nlp-tasks/

DeepDream: How Alexander Mordvintsev Excavated the Computer’s Hidden Layers | The MIT Press Reader

A Google researcher looks into the mind of a computer.

deep_learning · article · research · tech

Mon Aug 3 20:10:41 2020 * · permalink

·

https://thereader.mitpress.mit.edu/deepdream-how-alexander-mordvintsev-excavated-the-computers-hidden-layers/

Why general artificial intelligence will not be realized | Humanities and Social Sciences Communications

The modern project of creating human-like artificial intelligence (AI) started after World War II, when it was discovered that electronic computers are not just number-crunching machines, but can also manipulate symbols. It is possible to pursue this goal without assuming that machine intelligence is identical to human intelligence. This is known as weak AI. However, many AI researcher have pursued the aim of developing artificial intelligence that is in principle identical to human intelligence, called strong AI. Weak AI is less ambitious than strong AI, and therefore less controversial. However, there are important controversies related to weak AI as well. This paper focuses on the distinction between artificial general intelligence (AGI) and artificial narrow intelligence (ANI). Although AGI may be classified as weak AI, it is close to strong AI because one chief characteristics of human intelligence is its generality. Although AGI is less ambitious than strong AI, there were critics almost from the very beginning. One of the leading critics was the philosopher Hubert Dreyfus, who argued that computers, who have no body, no childhood and no cultural practice, could not acquire intelligence at all. One of Dreyfus’ main arguments was that human knowledge is partly tacit, and therefore cannot be articulated and incorporated in a computer program. However, today one might argue that new approaches to artificial intelligence research have made his arguments obsolete. Deep learning and Big Data are among the latest approaches, and advocates argue that they will be able to realize AGI. A closer look reveals that although development of artificial intelligence for specific purposes (ANI) has been impressive, we have not come much closer to developing artificial general intelligence (AGI). The article further argues that this is in principle impossible, and it revives Hubert Dreyfus’ argument that computers are not in the world.

AI · paper · research

Fri Jul 10 18:57:19 2020 · permalink

·

https://www.nature.com/articles/s41599-020-0494-4

Why Sleep Deprivation Kills

Going without sleep for too long kills animals but scientists haven’t known why. Newly published work suggests that the answer lies in an unexpected part of the body.

research · paper

Tue Jun 9 23:43:56 2020 · permalink

·

https://www.quantamagazine.org/why-sleep-deprivation-kills-20200604/

Unsupervised Translation of Programming Languages

A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. COBOL, Python 2) to a modern one. They typically rely on handcrafted rewrite rules, applied to the source code abstract syntax tree. Unfortunately, the resulting translations often lack readability, fail to respect the target language conventions, and require manual modifications in order to work properly. The overall translation process is timeconsuming and requires expertise in both the source and target languages, making code-translation projects expensive.
Although neural models significantly outperform their rule-based counterparts in the context of natural language translation, their applications to transcompilation have been limited due to the scarcity of parallel data in this domain. In this paper, we propose to leverage recent approaches in unsupervised machine translation to train a fully unsupervised neural transcompiler. We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy.
Our method relies exclusively on monolingual source code, requires no expertise in the source or target languages, and can easily be generalized to other
programming languages. We also build and release a test set composed of 852 parallel functions, along with unit tests to check the correctness of translations. We show that our model outperforms rule-based commercial baselines by a significant margin.

research · programming · machine_learning · neural_networks · paper

Tue Jun 9 23:34:08 2020 · permalink

·

https://arxiv.org/abs/2006.03511

TESPy -j Thermal Engineering Systems in Python

This package provides a powerful simulation toolkit for thermal engineering plants such as power plants, district heating systems or heat pumps.

simulation · energy · energy_efficiency · coding_lang:python · opensource · research · source_code

Thu Jun 4 10:16:26 2020 · permalink

·

https://github.com/oemof/tespy

GIMP-ML - Set of Machine Learning Python plugins for GIMP

graphics · software · neural_networks · opensource · research · source_code

Sun May 10 14:34:34 2020 * · permalink

·

https://github.com/kritiksoman/GIMP-ML/blob/master/README.md

cli-arxiv - CLI tool for exploring arXiv

CLI tool for exploring arXiv (inspired by karpathy's brilliant ArXiv Sanity Preserver)

The script will create data/pdf/, data/txt/ and data/summary/ directories to hold files downloaded from arXiv. I am also aware that this is a rather stupid way to implement a datastore but DBs seem a bit over the top. Text from PDFs are auto-converted on downloaded and are used to suggest future articles to the user. Downloading articles is idempotent.

software · tools · opensource · source_code · science · browser · #cli-app · search · research

Fri May 1 22:46:32 2020 * · permalink

·

https://github.com/knguyenanhoa/cli-arxiv

Zenodo

Zenodo is a free and open digital archive built by CERN and OpenAIRE, enabling researchers to share and preserve research output in any size, format and from all fields of research.

research · science · data · sharing · online · platform · service · webservice

Wed Apr 29 21:06:14 2020 * · permalink

·

https://about.zenodo.org/

The Illustrated FixMatch for Semi-Supervised Learning

Deep Learning has shown very promising results in the field of Computer Vision. But when applying it to practical domains such as medical imaging, lack of labeled data is a major challenge.

In practical settings, labeling data is a time consuming and expensive process. Though, you have a lot of images, only a small portion of them can be labeled due to resource constraints. In such settings, how can we leverage the remaining unlabeled images along with the labeled images to improve the performance of our model? The answer is semi-supervised learning.

FixMatch is a recent semi-supervised approach by Sohn et al. from Google Brain that improved the state of the art in semi-supervised learning(SSL). It is a simpler combination of previous methods such as UDA and ReMixMatch. In this post, we will understand the concept of FixMatch and also see how it got 78% median accuracy and 84% maximum accuracy on CIFAR-10 with just 10 labeled images.

machine_learning · research · techniques · science · article

Fri Apr 3 19:24:52 2020 * · permalink

·

https://amitness.com/2020/03/fixmatch-semi-supervised/

Gherkin Reference

Gherkin uses a set of special keywords to give structure and meaning to executable specifications. Each keyword is translated to many spoken languages; in this reference we’ll use English.

research · language · testing · tech

Tue Mar 17 13:23:49 2020 · permalink

·

https://cucumber.io/docs/gherkin/reference/