Getting tired of Anglocentrism everywhere and especially in tech – the belief that particular features of the English language are somehow applicable to all other languages. One of the most egregious examples is text search and indexing. In English it's easy; in languages where words change depending on case or noun category, it's not so straightforward. Incidentally this makes English especially well-suited to training LLMs.
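A minimal sketch of the search problem, assuming a naive exact-token index (the helper names `build_index` and `search` and the sample sentences are made up for illustration, not taken from any real search engine). The English word "book" keeps one form across the three sentences, while the Polish "książka" shows up as "książkę" (accusative) and "książce" (locative), so an exact-match lookup misses two of the three documents:

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased, whitespace-separated token to the ids of the docs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of docs that contain exactly this token."""
    return sorted(index.get(word.lower(), set()))


# Illustrative sentences (assumption): each one mentions "a book".
english_docs = ["I read the book", "She gave me a book", "He writes about the book"]
polish_docs = ["To jest książka", "Czytam książkę", "Piszę o książce"]

english_hits = search(build_index(english_docs), "book")
polish_hits = search(build_index(polish_docs), "książka")

print(english_hits)  # [0, 1, 2]
print(polish_hits)   # [0]
```

Real systems usually put a stemmer or lemmatizer in front of the index to fold inflected forms together; that step is deliberately left out here to show what goes wrong without it.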
There's a funny thing you see in many scientific papers - especially #AI papers: The paper will prominently include a link to a GitHub repository with claims of code availability "soon" but when you go there (months after the paper was released) there's either just a placeholder or the paper text.
People use GitHub links to score brownie points for "doing open science", but most of it is just not there. Especially with statistical systems, you realize that you don't get the training data, you don't get the code, and you don't get the model weights; what you get are results and a "trust me bro".
ChatGPT has verbal tics because of digital colonialism
while "delve" is rarely used in British or American English, in Nigeria the word "is much more frequently used in business English" [...] "The people tasked with training the AIs therefore provided examples using their own language, which resulted in an AI system that writes slightly like English as spoken in Africa."
To this day, when somebody says "Something's wrong" my head immediately adds "perhaps a missing \item" #LaTeX #TexLaTeX
April 1, and sadly not a joke: Dictionary.com and Thesaurus.com get acquired:
https://finance.yahoo.com/news/ixl-learning-acquires-dictionary-com-121500032.html
April 12: The entire team of lexicographers at Dictionary.com gets laid off.
https://bsky.app/profile/korystamper.bsky.social/post/3kpxgzhx7eo2l
Computers as tools for humans are so useful exactly *because* they can’t think, and instead do tedious work like calculations or information storage and retrieval for humans in a *deterministic* way.
It took nearly 90 years of digital computers to make them powerful enough to run a wasteful algorithm that pretends to think (but doesn’t) and delivers bullshit non-deterministic results while using absurd amounts of computational and environmental resources.
People on Twitter are debating whether people using uncommon words like "delve" are trying to sound smarter than they are, or worse, are ChatGPT bots, because "normal" people don't talk like that.
You don't have to get upset, or embroiled in the debate. Not worth the time or attention. But I'll share some important context as your friendly neighborhood Nigerian 🙋🏿‍♂️
Many Nigerians have bigger English language vocabularies and a better command of grammar than the typical American or English person.
Damn, Microsoft even manages to ruin emojis: the monocle one is *SMILING* on Windows.
WTF, that's not the same meaning at all!
ONLY those idiots at Microsoft do this crap.
The barred K: Ꝃ, in names beginning with Ker, a Breton peculiarity that baffles genealogists. – Le Trésor du breton écrit
http://www.tresor-breton.bzh/2024/04/07/k-barre/
#liammoù #brezhoneg #LangueBretonne #Ꝃ
https://ewen.korr.bzh/liens/shaare/76E8ag