Christophe Cerisara
cerisara@mastodon.online

52 cognitive biases in French

A CC-BY licensed card deck of 52 cognitive biases in French.

olkihost.loria.fr/cerisara/52-

August 26, 2023
Christophe Cerisara
cerisara@mastodon.online

New opportunity in Nancy, France, on continual learning of large language models:

members.loria.fr/CCerisara/#ph

Don't hesitate to contact us for further details!

August 07, 2023
Christophe Cerisara
cerisara@mastodon.online

Translating to English before processing works better than multilingual processing; self-translation also works better, though slightly less well, and the gap might shrink with scale.

And if you're looking for a good open translation model, NLLB-200 is recommended by the authors:

arxiv.org/abs/2308.01223

August 04, 2023
Christophe Cerisara
cerisara@mastodon.online

Scaling laws for 4-bit quantization: arxiv.org/abs/2212.09720
It's better to use more parameters at 4 bits than fewer parameters at 16 bits.

Also, SpQR improves over QLoRA with good scaling laws:
arxiv.org/abs/2306.03078

August 04, 2023
Christophe Cerisara
cerisara@mastodon.online

Another great report on training and finetuning details for Llama2:
arxiv.org/abs/2307.09288

August 04, 2023