

But we can’t afford to pay. I don’t think open models like the one in the OP article would be developed and released for free to the public if doing so required a complex process of paying billions of dollars to rightsholders. That sort of licensing regime would favor a monopoly of centralized services run only by the biggest companies.












I don’t think this is entirely true. Yes, large foundation models have training costs that are beyond the reach of individuals, but plenty can be done at lower cost, or by a relatively small organization. I can’t find a direct price estimate for Apertus, and it looks like they used their own hardware, but it’s mentioned that they used ten million GPU hours on GH200 GPUs. I found a source online claiming a rental cost of $1.50 per hour for that hardware, so I think the cost of training this could be loosely estimated at something like 15 to 20 million dollars (ten million hours at $1.50 comes to $15 million).
That is a lot of money if you are one person, but it’s well over an order of magnitude smaller than the billions of dollars in settlements being paid so far by the biggest AI companies for their hasty unauthorized use of copyrighted materials. It’s easy to see how copyright and legal costs, rather than compute, could be the bottleneck that keeps smaller actors from participating.
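For anyone who wants to check the back-of-the-envelope math, here is a rough sketch. The GPU hours and the $1.50 rental rate are the figures quoted above; the $1 billion settlement figure is only an assumed placeholder for the low end of "billions", not a number from any specific case. Taken at face value the inputs give $15 million, the same ballpark as the loose estimate above.

```python
# Back-of-the-envelope comparison of training cost vs. legal cost.
# GPU_HOURS and RENTAL_USD_PER_GPU_HOUR come from the comment above;
# ASSUMED_SETTLEMENT_USD is a hypothetical lower bound for "billions".

GPU_HOURS = 10_000_000              # reported GPU hours for Apertus
RENTAL_USD_PER_GPU_HOUR = 1.50      # claimed GH200 rental price
ASSUMED_SETTLEMENT_USD = 1_000_000_000  # assumption: low end of "billions"


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Estimate compute cost as if all GPU time were rented at a flat rate."""
    return gpu_hours * usd_per_gpu_hour


training_cost = training_cost_usd(GPU_HOURS, RENTAL_USD_PER_GPU_HOUR)
settlement_to_training_ratio = ASSUMED_SETTLEMENT_USD / training_cost

print(f"Estimated training cost: ${training_cost / 1e6:.0f} million")
print(f"Assumed settlement is about {settlement_to_training_ratio:.0f}x that")
```

This ignores staff, storage, failed runs, and the fact that they owned the hardware, so treat it as a floor on the compute bill rather than a real budget.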
How would that even work though? Yes, copyright currently favors the wealthy, but that’s because the whole concept of applying property rights to ideas inherently favors the wealthy. I can’t imagine how it could be the opposite even in theory, and in practice it seems clear that any legislation codifying limitations on use and compensation for AI training will be drafted by lobbyists for large corporate rightsholders, at the obvious expense of everyone with an interest in free public ownership and use of AI technology.