cross-posted from: https://feddit.online/c/technology/p/1229433/apertus-switzerland-government-release-a-fully-open-transparent-multilingual-language-l
"Apertus: a fully open, transparent, multilingual language model
EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus on 2 September, Switzerland’s first large-scale, open, multilingual language model — a milestone in generative AI for transparency and diversity.
Researchers from EPFL, ETH Zurich and CSCS have developed the large language model Apertus – it is one of the largest open LLMs and a basic technology on which others can build.
In brief: Researchers at EPFL, ETH Zurich and CSCS have developed Apertus, a fully open Large Language Model (LLM) – one of the largest of its kind. As a foundational technology, Apertus enables innovation and strengthens AI expertise across research, society and industry by allowing others to build upon it. Apertus is currently available through strategic partner Swisscom, the AI platform Hugging Face, and the Public AI network. …
The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.
AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face – a platform for AI models and applications – and deploy it for their own projects. Apertus is freely available in two sizes – featuring 8 billion and 70 billion parameters, the smaller model being more appropriate for individual usage. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications. …
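A quick back-of-the-envelope calculation makes the “smaller model for individual usage” point concrete. The sketch below only counts weights (parameters × bytes per weight); real usage adds KV cache and runtime overhead, so treat these as lower bounds:

```python
# Rough weight-only memory footprint of the two Apertus sizes
# at common precisions. Actual VRAM/RAM usage will be higher
# (KV cache, activations, runtime overhead).
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Bytes of weights in GB: params * bits / 8."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (8, 70):
    for name, bits in (("fp16", 16), ("q8", 8), ("q4", 4)):
        print(f"{params}B @ {name}: ~{weight_gb(params, bits):.0f} GB")
```

By this estimate the 8B model fits on a typical consumer GPU at 8-bit or 4-bit quantization, while the 70B model needs workstation- or server-class memory even when quantized.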
Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others. …
Furthermore, for people outside of Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI. “Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity,” says Joshua Tan, Lead Maintainer of the Public AI Inference Utility."
Did someone try it? What’s your experience?
I think I like it. But I’m having massive issues with it becoming very repetitive after a while. And I’m not sure if it’s the model or my sampler settings.
Found it not very compliant. I asked it about French politics and it fared worse than other 70B models I tested through OpenRouter.
Mistral 24B gave better answers.
I like the group behind it, though. EPFL and ETH are some of the best research institutes in Europe. Catching up with big corporate models is hard, and it is nice that they are giving it a shot. Don’t expect them to be there yet on the first try, though.
It’s a dense model so it’s unfortunately much slower than the newer MoE models.
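The speed gap has a simple explanation: token generation is roughly memory-bandwidth-bound, and a dense model must read all of its weights for every token, while an MoE only reads its active experts. A rough sketch — the active-parameter count for gpt-oss-120b (~5.1B) and the ~256 GB/s bandwidth figure for a Strix Halo-class machine are assumptions, not numbers from this thread:

```python
# Per-token generation speed estimate: bandwidth / bytes of weights
# read per token. Assumed figures: ~5.1B active params for gpt-oss-120b,
# ~256 GB/s memory bandwidth (Strix Halo-class APU). Both are rough.
def tokens_per_sec(active_params_b: float, bits: float, bandwidth_gbs: float) -> float:
    bytes_per_token = active_params_b * 1e9 * bits / 8
    return bandwidth_gbs * 1e9 / bytes_per_token

dense_70b = tokens_per_sec(70, 8, 256)   # dense: all 70B weights read each token
moe_120b = tokens_per_sec(5.1, 8, 256)   # MoE: only the active experts read
print(f"dense 70B @ q8: ~{dense_70b:.1f} tok/s")
print(f"MoE, 5.1B active @ q8: ~{moe_120b:.1f} tok/s")
```

Under these assumptions the MoE comes out roughly an order of magnitude faster per token, which matches the subjective experience reported below.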
I was also a little disappointed with its translation abilities, especially considering that it’s a Swiss model but it cannot properly do German <-> English translations.
I’m only tinkering with the 8B variant, so speed is alright. I hadn’t noticed yet. But yes, seems English to German leads to weird results. German to English seems to be fine, though. At least for the 2 texts I put in.
I tested the 70b model at Q8_K_XL quantization on Strix Halo and it’s not too slow to use for short queries but definitely much slower than something like gpt-oss-120b.
> English to German leads to weird results. German to English seems to be fine, though. At least for the 2 texts I put in.
Didn’t get to German to English before giving up; English to German was just too awful. Cases were constantly wrong, with very weird word choices and incorrect grammatical genders.
How are you running it? Would you be able to post your run arguments?
I’m running the 8B version with KoboldCPP and pretty much the default settings of the Min-P sampler. That tends to work very well with almost all other models and I’m not aware of any specific recommendations for Apertus…
We recommend setting temperature=0.8 and top_p=0.9 in the sampling parameters.
Try that. I believe those params are available in Kobold. If that doesn’t work, send me a sample of what you’re doing and I’ll try it out.
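Those settings can also be set per request via KoboldCPP’s HTTP API rather than in the UI. A minimal sketch of a generate request body — the field names follow the KoboldAI-style API that KoboldCPP exposes, and the `rep_pen` value is my own assumption for taming the repetition, so verify both against your KoboldCPP version:

```python
import json

# Apertus-recommended sampling settings (temperature=0.8, top_p=0.9)
# expressed as a KoboldCPP /api/v1/generate request body.
payload = {
    "prompt": "Translate into English: Der Zug hat Verspätung.",
    "max_length": 200,
    "temperature": 0.8,
    "top_p": 0.9,
    "rep_pen": 1.05,  # mild repetition penalty; assumption, tune to taste
}

body = json.dumps(payload)
print(body)

# To actually send it (requires a running KoboldCPP instance):
# import requests
# r = requests.post("http://localhost:5001/api/v1/generate", data=body,
#                   headers={"Content-Type": "application/json"})
```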