You’re likely reading this text in a browser. Press Ctrl+F (⌘+F on macOS) and search for the word "text" on this page. The browser will instantly show you how many times the word appears. Even in texts hundreds of times longer than this page, browsers can quickly find the desired substring. Today, we’ll look at the algorithms that make this possible.
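As a point of reference for what "find on page" has to compute, here is a minimal sketch of the most naive approach: slide the pattern along the text and compare at every position. This is a baseline for illustration only, not the algorithm any particular browser uses. The function name `count_occurrences` is hypothetical, and the match is case-sensitive, unlike the default in most browsers.

```python
def count_occurrences(haystack: str, needle: str) -> int:
    """Count occurrences of needle in haystack, including overlapping ones.

    Naive scan: try every start position and compare the slice.
    Worst-case cost is O(len(haystack) * len(needle)).
    """
    if not needle:
        # An empty pattern has no meaningful match count; treat it as zero.
        return 0

    count = 0
    n, m = len(haystack), len(needle)
    for start in range(n - m + 1):
        if haystack[start:start + m] == needle:
            count += 1
    return count


if __name__ == "__main__":
    page = "Search this text for the word text, then count text again."
    print(count_occurrences(page, "text"))
```

Faster algorithms avoid re-comparing characters they have already seen, which is what makes search over very long documents feel instant.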
I can’t get through a Zoom call, a conference talk, or an afternoon scroll through LinkedIn without hearing about vectors. Do you feel like the term vector is everywhere this year? It is. Vector actually means several different things, and that's confusing. It can mean AI data, GIS locations, digital graphics, a type of query optimization, and more. The terms and uses are related, sure. They all stem from the same original concept. However, their practical applications are quite different.
So “Vector” is my choice for this year’s name collision of the year.
The Story of Chaos Theory and Some Fun Facts About the Scientists.
With the advent of Llama 2, running strong LLMs locally has become more and more a reality. Its accuracy approaches OpenAI's GPT-3.5, which serves well for many use cases.
In this article, we will explore how we can use Llama 2 for Topic Modeling without the need to pass every single document to the model. Instead, we are going to leverage BERTopic, a modular topic modeling technique that can use any LLM for fine-tuning topic representations.
Have you noticed that Git is so integral to working with code that people hardly ever include it in their tech stack or on their CV at all? The assumption is you know it already, or at least enough to get by, but do you?
Git is a Version Control System (VCS), the ubiquitous technology that enables us to store, change, and collaborate on code with others.
Many Linux users have experienced a lasting sense of accomplishment after composing a particularly clever command that achieves multiple actions in just one line or that manages to do in one line what usually takes 10 clicks and as many windows in a graphical user interface (GUI). Aside from being the stuff of legend, one-liners are great examples of why the terminal is considered to be such a powerful tool.
An LLM is no black box but an ML model (based on neural networks) that predicts the ‘next’ token given the input prompt and the sequence of previously predicted tokens.
How is it able to get the context of the input? Multi-head attention lets the model focus on the important words relative to the other tokens in the input sentence. If you’re interested in the mathematics, you can read the blog below.
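To make the idea of "focusing on important words" concrete, here is a minimal sketch of a single attention head (scaled dot-product attention) in plain Python. It is an illustration under simplifying assumptions, not code from the linked post: the names `softmax` and `attention` are hypothetical, vectors are plain lists of floats, and the learned projection matrices of a real model are left out. Multi-head attention runs several such heads in parallel, each with its own learned projections, and combines their outputs.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    For each query, score every key by dot product, scale by sqrt(d),
    normalise with softmax, and return the weighted average of the values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    # Toy example: the query matches the first key far better than the
    # second, so the output leans heavily towards the first value vector.
    q = [[4.0, 0.0]]
    k = [[4.0, 0.0], [0.0, 4.0]]
    v = [[1.0, 0.0], [0.0, 1.0]]
    print(attention(q, k, v))
```

The weights play the role of "how much should this token look at that token"; tokens whose keys align with the query receive most of the weight.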
Over the summer, after finally getting around to learning Vim motions, I quickly fell down the Neovim rabbit hole and have been procrastinating work by tinkering away at my configurations ever since! This post shares the setup I have currently landed on to turn my Neovim editor into a supercharged workhorse.
Well, if you found this page and are interested in pass, you must already have your reasons to look into password managers. For me, it's a basic concept: use each password only once. Don't (ever) reuse passwords or passphrases for other services. If one service gets compromised, you won't automatically have to worry about your other services. This makes remembering passwords a bitch, especially if you don't just iterate through a handful of your favorite, easy-to-guess passwords. Speaking of which, yes, there are tools out there that can generate very good dictionaries based on a bit of social engineering. So you really should use generated passwords.
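For a sense of what "generated passwords" means in practice, here is a minimal sketch using Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source. It is an illustration only and not how pass itself generates passwords; the function name `generate_password` and the default length of 24 are assumptions made for this example.

```python
import secrets
import string


def generate_password(length: int = 24) -> str:
    """Return a random password of the given length.

    Characters are drawn independently and uniformly from ASCII letters,
    digits, and punctuation using a cryptographically secure generator.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # One fresh password per service, never reused.
    print(generate_password())
```

Because every character is chosen at random, such a password cannot be guessed from anything an attacker learns about you, which is exactly what dictionary tools built on social engineering rely on.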
Journal entry from 20 August 1999, introducing his new blog about The Lord of the Rings.
Because nothing is impossible to understand... if you explain it well!
Chris Siebenmann is a Unix sysadmin who now works at the Department of Computer Science, University of Toronto.
The web platform is the delivery mechanism of choice for a ton of software these days, either through the web browser itself or through Electron, but that doesn’t mean there isn’t a place for a good old-fashioned, straight-up desktop application in the picture.
Fortunately, it’s easier than ever to write a usable, pretty, and performant desktop app, using my language of choice (Rust) and the wildly successful cross-platform GUI framework GTK.
GUI prototyped using Glade.
We often have to write code using permissive programming languages like C and C++. They tend to generate hard-to-debug problems that can crash your applications. Thankfully, many compilers offer “sanitizers”. I discussed them in my post No more leaks with sanitize flags in gcc and clang. I strongly encourage the use of sanitizers as I think it is the modern way to write C and C++. When many people describe how impossibly difficult it is to build good software in C and C++, they often think about old-school bare metal C and C++ where the code does all sorts of mysterious things without any protection. Then they feel compelled to run their code in a debugger and to step through it manually. You should not write code this way! Get some tools! Sanitizers can catch undefined behaviour, memory leaks, buffer overflows, data races, and so forth.
Check out a cool project that leverages Stack Overflow Data and Google's Cloud AI to predict what tags would work best on Stack Overflow questions.
You’ve probably played with model trains, for instance with something like the Brio set shown below. And if you’ve built a layout with a model train set, you may well have wondered: is it possible for my train to use all the parts of my track?
These APIs, from niche but useful to just plain fun, should make any software developer smile.