Large Language Models (LLMs) are on fire, capturing public attention with their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simple algorithm, massive amounts of data, and massive computing power. They are trained by playing a guess-the-next-word game with themselves over and over again. Each time, the model looks at a partial sentence and guesses the following word. If it guesses correctly, it updates its parameters to reinforce its confidence; otherwise, it learns from the error and makes a better guess next time.
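To make the guess-the-next-word game concrete, here is a minimal, self-contained sketch of next-word training, a toy bigram model in PyTorch. This is an illustration of the objective only, not any particular LLM's actual training code:

```python
# A toy sketch of the guess-the-next-word objective: a bigram model
# trained with cross-entropy (illustrative only, not a real LLM).
import torch
import torch.nn as nn

corpus = "the model looks at a partial sentence and guesses the following word".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# (context, next-word) pairs: here the context is just the previous word
xs = torch.tensor([idx[w] for w in corpus[:-1]])
ys = torch.tensor([idx[w] for w in corpus[1:]])

model = nn.Sequential(nn.Embedding(len(vocab), 32), nn.Linear(32, len(vocab)))
opt = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    opt.zero_grad()
    logits = model(xs)          # guess a distribution over the next word
    loss = loss_fn(logits, ys)  # penalize wrong guesses
    loss.backward()             # learn from the error...
    opt.step()                  # ...by updating the parameters
```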
In recent years, large-scale transformer-based language models have become the pinnacle of neural networks used in NLP tasks. They grow in scale and complexity every month, but training such models requires millions of dollars, top experts, and years of development. That's why only major IT companies have access to this state-of-the-art technology. Yet researchers and developers all over the world need access to these solutions; without new research, progress could wane. The only way to avoid this is by sharing best practices with the developer community.
We’ve been using the YaLM family of language models in our Alice voice assistant and in Yandex Search for more than a year now.
Interactively analyze NLP models for model understanding in an extensible, framework-agnostic interface.
BigBird is shown to dramatically improve performance across long-context NLP tasks, producing SOTA results in question answering and summarization.
LanguageTool is a free proofreading tool for English, German, Spanish, Russian, and more than 20 other languages.
The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in NLP algorithms, neural architectures, and distributed machine learning systems. The content is based on our past and potential future engagements with customers as well as collaboration with partners, researchers, and the open source community.
In this task, we take a first approach to training a conversational model.
A fast, efficient universal vector embedding utility package.
This article is designed to serve as a directory of NLP (natural language processing) software projects that anyone, even someone without ML experience, can build.
Python3 implementation of the Schwartz-Hearst algorithm for extracting abbreviation-definition pairs.
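For a feel of what the algorithm does, here is a simplified sketch of its core rule, matching the short form backwards through a candidate long form. This is not the linked implementation and omits the paper's handling of non-alphanumeric short-form characters:

```python
# Simplified illustration of the Schwartz-Hearst long-form match:
# every short-form character must appear in order (scanning backwards),
# and the first one must begin a word in the long form.
def best_long_form(short, candidate):
    s, c = len(short) - 1, len(candidate) - 1
    while s >= 0:
        ch = short[s].lower()
        while c >= 0 and (candidate[c].lower() != ch or
                          (s == 0 and c > 0 and candidate[c - 1].isalnum())):
            c -= 1
        if c < 0:
            return None          # no valid long form found
        s -= 1
        c -= 1
    return candidate[c + 1:]     # long form starts at the matched word

print(best_long_form("NLP", "methods of natural language processing"))
# -> "natural language processing"
```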
With the increasing number of scientific publications, analyzing the trends and the state of the art in a given scientific field is becoming a very time-consuming and tedious task. In response to urgent information needs that the existing systematic-review model does not serve well, several other review types have emerged, namely rapid reviews and scoping reviews.
The paper proposes an NLP-powered tool that automates most of the review process by automatically analyzing articles indexed in the IEEE Xplore, PubMed, and Springer digital libraries. We demonstrate the applicability of the toolkit by analyzing articles related to Enhanced Living Environments and Ambient Assisted Living, in accordance with the PRISMA methodology. The relevant articles were processed by the NLP toolkit to identify articles that contain up to 20 properties clustered into 4 logical groups.
The analysis showed increasing attention from the scientific community towards Enhanced and Assisted Living environments over the last 10 years and revealed several trends in the specific research topics that fall within this scope. The case study demonstrates that the NLP toolkit can ease and speed up the review process and surface valuable insights from the surveyed articles even without manually reading most of them. Moreover, it pinpoints the most relevant articles, those containing the most properties, and thereby significantly reduces the manual work, while also generating informative tables, charts, and graphs.
In research and news articles, keywords are an important component: they provide a concise representation of the article’s content. Keywords also play a crucial role in locating the article in information retrieval systems and bibliographic databases, and in search engine optimization. They also help categorize the article into the relevant subject or discipline.
Conventional approaches to keyword extraction involve manually assigning keywords based on the article’s content and the authors’ judgment. This takes considerable time and effort, and may not select the most appropriate keywords. With the emergence of Natural Language Processing (NLP), keyword extraction has become both effective and efficient.
In this article, we combine the two: we’ll apply NLP to a collection of articles (more on this below) to extract keywords.
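As a taste of what such a pipeline can look like, here is a minimal sketch of one common approach, ranking each document's terms by TF-IDF weight with scikit-learn. The article's own pipeline may well differ:

```python
# One common keyword-extraction approach (a sketch, not the article's
# exact pipeline): rank each document's terms by TF-IDF weight.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Keywords provide a concise representation of an article's content.",
    "NLP makes keyword extraction effective as well as efficient.",
]

vec = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
tfidf = vec.fit_transform(docs)
terms = vec.get_feature_names_out()

for i in range(len(docs)):
    row = tfidf[i].toarray().ravel()
    top = row.argsort()[::-1][:3]        # three highest-weighted terms
    print([terms[j] for j in top])
```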
Toolkit for Text Generation and Beyond (asyml/texar).
Natural language processing (NLP) is an exciting field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
Features
- Noun phrase extraction
- Part-of-speech tagging
- Sentiment analysis
- Classification (Naive Bayes, Decision Tree)
- Language translation and detection powered by Google Translate
- Tokenization (splitting text into words and sentences)
- Word and phrase frequencies
- Parsing
- n-grams
- Word inflection (pluralization and singularization) and lemmatization
- Spelling correction
- Add new models or languages through extensions
- WordNet integration
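A few of these features in action, using TextBlob’s documented API (after installing the package and fetching its corpora with `python -m textblob.download_corpora`):

```python
from textblob import TextBlob

blob = TextBlob("TextBlob makes common NLP tasks feel simple. I love it!")

print(blob.tags)          # part-of-speech tags, e.g. ('TextBlob', 'NNP')
print(blob.noun_phrases)  # noun phrases found in the text
print(blob.sentiment)     # Sentiment(polarity=..., subjectivity=...)
print(blob.words)         # tokenized words
print(blob.sentences)     # tokenized sentences
```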
“Don’t think of the overwhelming majority of the impossible.”
“Grew up your bliss and the world.”
“what we would end create, creates the ground and you are the one to warm it”
“look and give up in miracles”
All the quotes above were generated by a computer, using a program that consists of fewer than 20 lines of Python code.
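The program itself isn’t reproduced here, but quote generators of this size are typically word-level Markov chains. A plausible sketch in roughly that budget (`quotes.txt` is a hypothetical corpus file, and the approach is an assumption, not the article’s confirmed method):

```python
# A plausible reconstruction: a tiny word-level Markov chain generator.
import random
from collections import defaultdict

words = open("quotes.txt").read().split()
chain = defaultdict(list)
for a, b in zip(words, words[1:]):
    chain[a].append(b)               # record which words follow which

word = random.choice(words)
out = [word]
for _ in range(12):                  # sample a short "quote"
    word = random.choice(chain[word]) if chain[word] else random.choice(words)
    out.append(word)
print(" ".join(out))
```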