A software developer and Linux nerd, living in Germany. I’m usually a chill dude, but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt; I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things as well.

  • 6 Posts
  • 971 Comments
Joined 4 years ago
Cake day: August 21st, 2021


  • I think kids find ways to play and tinker with stuff. I’d give them an office suite to practice writing letters or advertisements or whatever they come up with, something to draw… maybe not Gimp, because that’s not easy to use… I’ve seen people give their kids an instant messenger which connects to their dad/mom so they’re incentivised to type something. And then of course we have games. From Supertux, PlanetPenguin Racer and Tuxkart to commercial games. There are some kids’ games in the repos: Kartoffelknülch, drawing programs, programming languages to learn coding with puzzle pieces and blocks or by animating turtles. There are educational games too; at least my local library has some and I played some as a kid. But maybe at least try to balance the gaming, there’s so much more interesting stuff in computers. And then of course you could put some content into some directories. I think unrestricted internet access isn’t great at 6yo, and the computer will be empty without it, so idk. Maybe put some templates there, ideas what to draw, music or audiobooks or whatever fits the purpose…
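
    To give one concrete example of the blocks-and-turtles idea, Python’s built-in turtle module is already enough for a first “draw a square” exercise; this is just a sketch, any beginner-friendly environment does the same job:

```python
import turtle

pen = turtle.Turtle()
for _ in range(4):      # a square has four equal sides
    pen.forward(100)    # move 100 pixels forward
    pen.left(90)        # turn 90 degrees

turtle.done()           # keep the window open until it's closed
```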


  • Thanks.

    FIBO, which looks interesting in many ways.

    Indeed. Seems it has good performance, licensed training material… That’s all looking great. I wonder who has to come up with the JSON but I guess that’d be another AI and not my task. Guess I’ll put it on my list of things to try.

    It’s possible that there’ll be companies at some point who proudly train their models with renewable energy

    I said it in another comment, I think that’s a bit hypothetical. It’s possible. I think we should do it. But in reality we’re ramping up natural gas and coal. US companies hype small nuclear reactors, and some people have voiced concerns that China might want to take advantage of Russia’s situation to feed their insatiable demand for (fossil-fuel) energy. I mean, they also invest massively in solar. It just looks to me like we’re currently headed in the other direction overall, and it would take substantial change to turn that around some time in the future. So I file it more under wishful thinking.


  • Your experience with AI coding seems to align with mine. I think it’s awesome for generating boilerplate code, placeholders including images, and for quick mockups. Or for asking questions about some documentation. The more complicated it gets, the more it fails me. I’ve measured the time once or twice and I’m fairly sure it took longer than doing it by hand, though I didn’t do any proper scientific study. It was just similar tasks and me running a timer. I believe the more complicated maths and trigonometry I mentioned was me yelling at the AI for 90 or 120 minutes or so until it was close, and then I kept the surrounding code, deleted the maths part and wrote that myself. Maybe AI is going to become more “intelligent” in the future. I think a lot of people hope that’s going to happen. As of today, I think we need to pay close attention to whether it just fools us and is a big time and energy waster, or whether it’s actually a good fit for a given task.

    Local AI will likely have a long lasting impact as it won’t just go away.

    I like to believe that as well, but I don’t think there’s any guarantee they’ll continue to release new models. Sure, they can’t ever take Mistral-Nemo from us. But that’s going to be old and obsolete tech in the world of 2030 and dwarfed by any new tech then. So I think the question is more, are they going to continue? And I think we’re kind of picking up what the big companies dumped when battling and outcompeting each other. I’d imagine this could change once China and the USA settle their battle. Or multiple competitors can’t afford it any more. And they’d all like to become profitable one day. Their motivation is going to change with that as well. Or the AI bubble pops and that’s also going to have a dramatic effect. So I’m really not sure if this is going to continue indefinitely. Ultimately, it’s all speculation. A lot of things could possibly happen in the future.

    At what point is generative AI ethically and legally fine?

    If that’s a question about the development of AI in general, it’s an entire can of worms. And I suppose it’s also difficult to answer for your or my individual use. What part of the overall environmental footprint gets attributed to a single user? Even more difficult to answer with local models. Do the copyright violations the companies committed translate to the product and then to the user? And then, what impact do you have on society as a single person using AI for something? Does what you achieve with it outweigh all the cost?

    Firefox for realtime website translations

    Yes, I think that and text to speech and speech to text are massively underrated. Firefox Translate is something I use quite often and I can do crazy stuff with it like casually browse Japanese websites.


  • That’s not comparable. You can’t compare software or even research with a physical object like that. You need a dead cow for salami, if demand increases they have to kill more cows. For these models the training already happened, how many people use it does not matter.

    I really have to disagree here. Sure, today’s cow is already dead and turned into sausage. But the pack of salami I buy this week is going to make the supermarket order another pack next week, so what I’m really doing is having someone kill the next cow, or at least a tiny share of it, since I’m only having a few slices. The bigger picture is that I’m part of a large group of people creating the overall demand.

    And I think it’s at least questionable if and how this translates. It’s still part of generating demand for AI. Sure, it’s kind of a byproduct, but Meta directly invests additional research, alignment and preparation into these byproducts. And we’ve got an entire ecosystem around it with Huggingface, CivitAI etc. which cater to us; sometimes a substantial amount of their business is the broader AI community and not just researchers. They provide us with datacenters for storage, bandwidth and sometimes compute. So it’s certainly not nothing that gets added because of us. And despite it being immaterial, it has a proper effect on the world. It’s going to direct technology and society in some direction and have real-world consequences when used. The pollution during the process of creating this non-physical product is real. And Meta seems to pay attention; at least that’s what I got from everything that happened from LLaMA 1 to today. I think if and how we use it is going to affect what they do with the next iteration, similar to the salami-pack analogy. Of course it’s a crude image, and we don’t really know what would happen if we did things differently. Maybe it’d be the same, and then it’s down to the more philosophical question of whether it’s ethical to benefit from things that were made in an unethical way. Though that requires today’s use not to have any effect on future demand, like the Nazi example, where me using medicine is not going to bring back Nazi experiments in the future. And that’s not exactly the situation with AI. They’re still there and actively working on the next iteration. So the logic is more complicated than that.

    And I’m a bit wary because I have no clue about the true motive behind why Meta gifts us these things. It costs them money and they hand control to us, which isn’t exactly how large companies operate. My hunch is that it’s mainly the usual war: they’re showing off, and they accept cutting into their own business when it does more damage to OpenAI. And the Chinese are battling the USA… And we’re somewhere in the middle of it. Maybe we pick up the crumbs. Maybe we’re chess pieces being used/exploited in some bigger corporate battles. And I don’t think we’re emancipated with AI; we don’t own the compute necessary to properly shape it, so we might be closer to the chess pieces. I don’t want to start any conspiracy theory, but I think these dynamics are part of the picture. I personally don’t think there’s a general and easy answer to the question of whether it’s ethical to use these models. And reality is a bit messy.

    But you don’t have to. I can run small models on my NITRO+ RX 580 with 8 GB VRAM, which I bought 7 years ago. It’s maybe not the best experience, but it certainly “works”. Last time our house used external electricity was 34h ago.

    I think this is the common difference between theory and practice. What you do is commendable. In reality though, AI is in fact mostly made from coal and natural gas. And China and the US are ramping up dirty fossil-fuel electricity for AI. There’s hype around small nuclear reactors to satisfy the urgent demand for more electricity, and they’re a bit problematic with all the nuclear waste, due to how nuclear power plants scale. So yes, I think we could do better. And we should. But that’s kind of a theoretical point unless we actually do it.

    it makes sense to train new models on public domain and cc0 materials

    Yes, I’d like to see this as well. I suppose it’s a long way from pirating books, because with enough money and lawyers you’re apparently exempt from the law… to a proper, consensual use.


  • Thanks. That sounds reasonable. Btw, you’re not the only poor person around, I don’t even own a graphics card… I’m not a gamer, so I never saw any reason to buy one before I took an interest in AI. I’ll do inference on my CPU, and that’s connected to more than 8GB of memory. It’s just slow 😉 But I guess I’m fine with that. I don’t rely on AI, it’s just tinkering and I’m patient. And a few times a year I’ll rent some cloud GPU by the hour. Maybe one day I’ll buy one myself.
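
    For reference, CPU-only inference can look roughly like this with llama-cpp-python; the model path is a placeholder and the exact parameters depend on your model and machine:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Path is a placeholder; any quantized GGUF model that fits in RAM works.
llm = Llama(model_path="models/some-small-model.Q4_K_M.gguf",
            n_ctx=4096,      # context window
            n_threads=8)     # CPU threads only, no GPU involved

out = llm("Explain what a NAS is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```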


  • Sure. I’m all for the usual system design strategy with strong cohesion within one component and loose coupling on the outside to interconnect all of that. Every single household appliance should be perfectly functional on its own. Without any hubs or other stuff needed.

    For self-contained products, or ones without elaborate features, I kind of hate these external dependencies. I wouldn’t want to be without my NAS and the way I can access my files from my phone, computer or TV. But other than that, I think the TV and all the other electronics should work without being connected to other things.

    I mean, edge computing is mainly there to save cost and power. It doesn’t make sense to fit each of the devices with a high-end computer and maybe half a graphics card so they can all do AI inference. That’s expensive and you can’t have battery-powered devices that way. If they need internet anyway (and that’s the important requirement), just buy one GPU and let them all use that. They’ll fail without the network connection anyway, so it doesn’t matter, and this is easier to maintain and upgrade, probably faster and cheaper.
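
    As a sketch of what I mean by “one GPU for all of them”: each appliance just sends its request over the local network to whatever box holds the GPU. The endpoint and payload here are made up and would depend on the inference server actually running on that box:

```python
import requests

# Hypothetical inference box on the LAN; URL and payload format are made up
# and would depend on the actual server software running there.
EDGE_SERVER = "http://192.168.1.50:8080/v1/completions"

def ask_edge_server(prompt: str) -> str:
    resp = requests.post(EDGE_SERVER,
                         json={"prompt": prompt, "max_tokens": 64},
                         timeout=10)
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

# e.g. the washing machine asking what to do about a smoke alarm event
print(ask_edge_server("The kitchen smoke alarm went off. Should I pause the wash cycle?"))
```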

    A bit like me buying one NAS instead of one 10TB harddisk for the laptop, one for the phone, one for the TV… And then I can’t listen to the song on the stereo because it was sent to my phone.

    But my premise is that the voice stuff and AI features are optional. If they’re essential, my suggestion wouldn’t really work. I rarely see the need. I mean, in your example the smoke alarm could trigger and Home Assistant would send me a push notification on the phone. I’d whip it out and have an entire screen with status information and buttons to deal with the situation. I think that’d be superior to talking to the washing machine. I don’t have a good solution for the timer. One day my phone will do that as well. But mind you, your solution also needs the devices to communicate via one protocol and be connected. The washing machine would need to be informed by the kitchen, be clever enough to know what to do about it, and also tell the dryer next to it to shut up… So we’d need to design a smart home system. If the devices all connect to a coordinator, perfect. That could be the “edge” in edge computing. If not, it’d be some sort of decentralized system. And I’m not aware of any in existence. It’d be challenging to design and implement. And such systems tend to be problematic for innovation, because everything needs to stay compatible pretty much indefinitely. It’d be nice, though. And I can see some benefits if arbitrary things just connect, or stay separate, without having to buy into an entire ecosystem.




  • I think they should be roughly in a similar range for selfhosting?! They’re both power-efficient and probably have enough speed for the average task. There might be a few perks with the ThinkCentre Tiny. I haven’t looked it up, but I think you should be able to fit an SSD and a hard drive, and maybe swap the RAM if you need more. And they’re sometimes on sale somewhere and should be cheaper than a RasPi 5 plus the required extras.


  • I’m a bit below 20W. But I custom-built the computer a long time ago with an energy-efficient mainboard and a PicoPSU. I think other options for people who don’t need a lot of hard disks or a graphics card include old laptops or mini PCs. Those should idle at something like 10-15W. It stretches the definition of “desktop PC” a bit, but I guess you could place them on a desk as well 😉
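
    Back-of-the-envelope, the yearly cost at those idle levels is small anyway; assuming roughly 0.30 €/kWh (just an assumption, plug in your own tariff):

```python
# Rough yearly cost of an always-on box at a given idle draw.
# 0.30 EUR/kWh is an assumption; adjust to your own electricity price.
PRICE_PER_KWH = 0.30

for watts in (10, 15, 20):
    kwh_per_year = watts * 24 * 365 / 1000
    cost = kwh_per_year * PRICE_PER_KWH
    print(f"{watts:>2} W idle ~ {kwh_per_year:.0f} kWh/year ~ {cost:.0f} EUR/year")
```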


  • You just described your subjective experience of thinking.

    Well, I didn’t just do that. We have MRIs and have looked into the brain and we can see how it’s a process. We know how we learn and change by interacting with the world. None of that is subjective.

    I would say that the LLM-based agent thinks. And thinking is not only “steps of reasoning”, but also using external tools for RAG.

    Yes, that’s right. An LLM alone certainly can’t think. It doesn’t have a state of mind; it’s reset a few seconds after it did something and forgets about everything. It’s strictly tokens from left to right. And it also doesn’t interact with the world, which would otherwise have an impact on it. It’s limited to what we bake in during the training process from what’s on Reddit and other sources. So there are many fundamental differences here.

    The rest of it emerges from an LLM being embedded into a system. We provide tools to it, a scratchpad to write something down, we devise a pipeline of agents so it’s able to work something out and later return to it, something to wrap it up and not just output all the countless steps that came before. It’s all a bit limited by the representation, we have to cram everything into a context window, and it’s also limited to concepts it was able to learn during the training process.
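
    To make that concrete, something like the following is all I mean by the system around the model. The llm() function and the calculator tool are stand-ins; the point is just that the loop, the scratchpad and the tools live outside the LLM:

```python
# Minimal agent-loop sketch. llm() is a stand-in for whatever model you call;
# the scratchpad and the tool dispatch are ordinary code wrapped around it.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a local or remote model here")

def calculator(expression: str) -> str:
    return str(eval(expression))           # toy tool, don't feed it untrusted input

TOOLS = {"calc": calculator}

def run_agent(task: str, max_steps: int = 5) -> str:
    scratchpad = []                        # this, not the LLM, is the persistent "memory"
    for _ in range(max_steps):
        prompt = (f"Task: {task}\nNotes so far:\n" + "\n".join(scratchpad) +
                  "\nReply with 'calc: <expr>' to use the calculator, or 'final: <answer>'.")
        reply = llm(prompt).strip()
        if reply.startswith("final:"):
            return reply.removeprefix("final:").strip()
        if reply.startswith("calc:"):
            result = TOOLS["calc"](reply.removeprefix("calc:").strip())
            scratchpad.append(f"calc result: {result}")
        else:
            scratchpad.append(reply)       # let it write notes to itself
    return "gave up after max_steps"
```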

    However, those abilities are not in the LLM itself, but in the bigger thing we build around it. And it depends a bit on the performance of the system. As I said, the current “thinking” processes are more of a mirage, and I’m pretty sure I’ve read papers on how models don’t really use them to think. That aligns with what I see once I open the “reasoning” texts. Theoretically, the approach surely makes everything possible (within the limits of how much context we have and how much computing power we spend; that’s all finite in practice). But what kind of performance we actually get is an entirely different story. And we’re not anywhere close to proper cognition. We hope we’re eventually going to get there, but there’s no guarantee.

    The LLM can for sure make abstract models of reality, generalize, create analogies and then extrapolate.

    I’m fairly sure extrapolation is generally difficult with machine learning. There’s a lot of research on it and it’s just massively difficult to make machine learning models do it. Interpolation on the other hand is far easier. And I’ll agree. The entire point of LLMs and other types of machine learning is to force them to generalize and form models. That’s what makes them useful in the first place.
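
    A toy illustration of that interpolation/extrapolation gap, with plain curve fitting rather than an LLM: fit a polynomial to sin(x) on [0, 2π] and it’s fine inside that range, then watch it blow up outside:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2 * np.pi, 200)          # training data only covers [0, 2*pi]
y_train = np.sin(x_train)

coeffs = np.polyfit(x_train, y_train, deg=7)      # fit a degree-7 polynomial

for x in (np.pi, 3 * np.pi):                      # inside vs. outside the training range
    print(f"x={x:5.2f}  true={np.sin(x):+.3f}  model={np.polyval(coeffs, x):+.3f}")
```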

    It doesn’t even have to be an LLM. Some kind of generative or inference engine that produce useful information which can then be modified and corrected by other more specialized components and also inserted into some feedback loop

    I completely agree with that. LLMs are our current approach. And the best approach we have. They just have a scalability problem (and a few other issues). We don’t have infinite datasets to feed in or infinite compute, and everything seems to grow exponentially more costly, so maybe we can’t make them substantially more intelligent than they are today. We also don’t teach them to stick to the truth, or to be creative, or to follow any goals. We just feed in random (curated) text and hope for the best, with a bit of fine-tuning and reinforcement learning from human feedback on top. But that doesn’t rule out anything. There are other machine learning architectures with feedback loops that are way more powerful. They’re just too complicated to compute. We could teach AI about factuality and creativity and expose some control mechanisms to guide it. We could train a model with a different goal than just producing the next token so that it looks like text from the dataset. That’s all possible. I just think LLMs are limited in the ways I mentioned, and we need one of those hypothetical new approaches to get them anywhere close to the level a human can achieve… I mean, I frequently use LLMs. And they all fail spectacularly at computer programming tasks I do in 30 minutes. And I don’t see how they’d ever be able to do them, given the level of improvement we see as of today. I think that needs a radically new approach in AI.


  • Agreed.

    Those models could be easily trained with renewables alone but you know, capitalism.

    It’s really sad to read the articles about how they’re planning to bulldoze Texas, do fracking and all these massively invasive things, and then we also run a lot of the compute on coal and want more nuclear plants on top. That doesn’t really sound all that progressive and sophisticated to me.

    The thing is, those models are already out there and the people training them do not gain anything when people download their open weights/open source models for free for local use.

    You’re right. Though the argument doesn’t translate into anything absolute. I can’t buy salami in the supermarket and justify it by saying the cow is dead anyway and someone already sliced it up. It comes down to demand, and that’s really complex. Does Mark Zuckerberg really gift an open-weights model to me out of pure altruism? Is it ethical if I get some benefit out of the waste, or by-product, of some AI war/competition? It is certainly correct that we here don’t invest money in that form. However, that’s not the entire story either: we still buy graphics cards from Nvidia, and we also release some CO2 when doing inference, even if we didn’t pay for the training process. And they spend some extra compute to prepare those public models, so it’s not zero extra footprint, but it’s comparatively small.

    I’m not perfect, though. I’ll still eat salami from time to time. And I’ll also use my computer for things I like. Sometimes it serves a purpose and then it’s justified. Sometimes I’ll also do it for fun. And that in itself isn’t something that makes it wrong.

    I’m a huge fan of RAG because it cites where it got the information from

    Yeah, that’s really great and very welcome. Though I think it still needs some improvement on picking sources. If I use some research mode from one of the big AI services, it’ll randomly google things, but some weird blog post or a wrong reddit comment will show up on the same level as a reputable source. So it’s not really fit for those use-cases. It’s awesome to sift through documentation, though. Or a company’s knowledgebase. And I think those are the real use-cases for RAG.
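
    For what it’s worth, the core of that citation-style RAG is small enough to sketch; retrieval here is a naive word-overlap score and llm() is a stand-in, real systems use proper embeddings and rerankers:

```python
# Naive RAG sketch: score documents by word overlap, stuff the best ones into the
# prompt with their IDs, and ask the model to cite those IDs. llm() is a stand-in.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in a model here")

DOCS = {
    "manual-3.2": "The pump must be descaled every six months.",
    "wiki-intro": "Our knowledge base covers maintenance and troubleshooting.",
}

def retrieve(question: str, k: int = 2):
    q = set(question.lower().split())
    scored = sorted(DOCS.items(),
                    key=lambda kv: len(q & set(kv[1].lower().split())),
                    reverse=True)
    return scored[:k]

def answer(question: str) -> str:
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(question))
    return llm(f"Answer using only these sources and cite their IDs:\n{context}\n\nQ: {question}")
```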


  • Well, an LLM doesn’t think, right? It just generates text from left to right. Whereas I sometimes think for 5 minutes about what I know, what I can deduce from it, do calculations in my brain and carry the one over… We’ve taught LLMs to write down something that resembles what a human with a thought process would write down. But it’s frequently gibberish, or when I look at it, it writes something down in the “reasoning”/“thinking” step and then does the opposite. Or it omits steps and then proceeds to do them nonetheless, or the other way round. So it clearly doesn’t really do what it seems to do. It’s just a word the AI industry slapped on. It makes the models perform some percent better, and that’s why they did it.

    And I’m not a token generator. I can count the number of "R"s in the word “strawberry”. I can go back and revise the start of my text. I can learn in real time, and interacting with the world changes me. My brain is connected to eyes, ears, hands and feet; I can smell and taste… My brain can form abstract models of reality, try to generalize or make sense of what I’m faced with. I can come up with methods to extrapolate beyond what I know. I have goals in life, like pursuing happiness. Sometimes things happen in my head which I can’t even put into words; I’m not even limited to language in the form of words. So I think we’re very unalike.
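
    (The strawberry thing really is a one-liner once you’re outside of token space, which is kind of the point:)

```python
print("strawberry".count("r"))  # 3, trivially, because code sees characters, not tokens
```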

    You have a point in theory if we expand the concept a bit. An AI agent in the form of an LLM plus a scratchpad is proven to be Turing-complete. So that theoretical concept could do the same things a computer can do, or what I can do with logic. That theoretical form of AI doesn’t exist, though. It’s not what our current AI agents do. And there are probably more efficient ways to achieve the same thing than using an LLM.


  • Yeah, thanks as well, engaging discussion.

    What Godel proved is that there are some questions that can never be answered

    I think that’s a fairly common misconception. What Gödel proved was that there isn’t one single formal system in which we can derive everything that’s true. It doesn’t lead to the conclusion that questions can’t be answered at all. There is an infinite number of formal systems, and Gödel doesn’t rule out the possibility of proving something with one of the countless other, different systems, starting out from different axioms. And as I said, this is a limitation of formal logic systems, not of reality.
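
    Stated a bit more carefully, the first incompleteness theorem only says something like this (a sketch, not the formal statement):

```latex
\text{For every consistent, effectively axiomatized theory } T \text{ containing basic arithmetic:}\quad
\exists\, G_T \;:\; T \nvdash G_T \ \text{ and }\ T \nvdash \neg G_T
```

    The sentence G_T is only undecidable within that particular T; a different system with different axioms can prove it just fine.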

    uncomputability

    Yes, that’s another, distinct form of undecidability. There are decision problems that no algorithm can solve for all inputs in finite time.

    I think it is a bit of a moot point, as there are lots of impossible things. We have limited resources available, so we can only ever do things with what we have. Then we have things like locality, and I don’t even know what happens 15km away from me because I can’t see that far. Physics also sets boundaries. For example, we can’t measure things to perfection and can’t even do enough measurements for complex systems. And then I’m too heavy to fly on my own and can’t escape gravity. So no matter how we twist it, we’re pretty limited in what we can do. And we don’t really have to resort to logic problems for that.

    To me, it’s far more interesting to look at what that means for a certain given problem. We humans can’t do everything. The same applies to knowledge, physics calculations and AI. At the point we build it, it’s part of the real world and subject to the same limitations which apply to us as well. And that’s inescapable. You’re definitely right, there are all these limitations. I just don’t think they’re specific to anything in particular. But it certainly means we won’t ever build an AI which knows everything and can do everything. We also can’t ever simulate the entire universe. That’s impossible on all the levels we discussed.

    Its now proven that some enzymes use quantum tunneling to accelerate chemical reactions crucial for life.

    I mean, if quantum physics is the underlying mechanism of the universe, then everything “uses” quantum effects. It boils down to the question of whether that model is useful to describe some process. For example, if I drop a spoon in the kitchen, it always falls down towards the floor. There are quantum effects happening in all the involved objects. It’s just not useful to describe that with quantum physics; regular Newtonian gravity is better suited to tell me something about the spoon and my kitchen… Same with the enzymes and the human brain. They exist and are part of physics, and they do their thing. The only question is which model we use to describe them or predict something about them. That might be quantum physics in some cases and other physics models in other cases.

    I acknowledge there’s currently no direct experimental evidence for quantum effects in neural computation, and testing these hypotheses presents extraordinary challenges. But this isn’t “hiding God in the gaps.” It’s a hypothesis grounded in the demonstrated principles of quantum biology and chaos theory.

    It certainly sounds like the God of the gaps to me. Look at the enzyme example. We found out there’s something going on with temperature that we can’t correctly describe with our formulas. Then scientists proposed this is due to quantum tunneling and that has to be factored in… That’s science… On the other hand, no such thing has happened for the human brain. It seems to be perfectly fine to describe it with regular physics; it’s just too big/complex and involved to bridge the gap from what the neurons do to how the brain processes information. And then people claimed there’s God or chaos theory or quantum effects hidden inside. But those are wild, unfounded claims and opinion, not science. We’d need to see something which doesn’t add up, like what happened with the enzymes. Everything else is religious belief. (And it turns out we’ve already simulated the brain of a roundworm and a fruit fly, and at least Wikipedia tells me the simulations are consistent with biology… leading me to believe there’s nothing funny going on and it’s just a scalability problem.)


  • The current biggest quantum computer made by CalTech is 6100 qbits.

    Though in both the article you linked and the associated video, they clearly state they haven’t achieved entanglement yet. So it’s not a “computer”. It’s just 6,100 atoms that can be held in superposition. Which indeed is impressive. But they can’t compute anything with it; that’d require them to first do the research on how to entangle all those atoms.

    […] a continuous parameter space of possible states […]

    By the way, I think there is AI which doesn’t operate in a continuous space. It’s possible to have them operate in a discrete state-space. There are several approaches and papers out there.

    I see it somewhat differently. I view mathematics as our fundamentally limited symbolic representation of the universe’s operations at the microstate level […] Gödel’s Incompleteness Theorems and algorithmic undecidability […]

    Uh, I think we’re confusing maths and physics here. First of all, the fact that we can pose problems which are undecidable… or Gödel’s incompleteness theorem… tells us something about the theoretical concept of maths, not about the world. In the real world there is no barber who shaves exactly those people who don’t shave themselves. That’s a logic puzzle. We can formulate it and discuss it. But it’s not real. I mean, neither does Hilbert’s Hotel exist; in fact, in reality almost nothing is infinite (except what Einstein said 😆). So… Mathematics can describe a lot of possible and impossible things. It’s the good old philosophical debate about how there are fewer limits on what we can think of. But thinking about something doesn’t make it real. Similarly, if we can’t have a formal system which is non-contradictory and within which everything is derivable, then that’s just that. Maths might still describe reality perfectly and physical processes completely. I don’t think we have any reason to doubt that. In fact, maths seems to work exceptionally well in physics, from the smallest things up to universe scale.
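
    (Written out, the barber is just a one-line contradiction, which is exactly why no such barber can exist:)

```latex
\exists b\;\forall x\;\bigl(S(b,x)\leftrightarrow \neg S(x,x)\bigr)
\;\Rightarrow\; S(b,b)\leftrightarrow \neg S(b,b) \quad \text{(setting } x = b\text{), a contradiction.}
```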

    It’s true that in computer science we have things like the halting problem. And it’s also trivially true that physics can’t ever have a complete picture of the entire universe from within. Or look outside. But none of that tells us anything about the nature of cognition or AI. That’s likely just regular maths and physics.

    As far as I know, maths is just a logically consistent method to structure things, and to describe and deal with abstract concepts of objects. Objective reality is separate from that, and it’s unaffected by our ability to formulate non-existent concepts that we can’t fully tackle with maths because of the incompleteness theorem. But I’m not an expert on this, nor an epistemologist. So take what I say with a grain of salt.

    For me, an entity eligible for consideration as pseudo-sentient/alive must exhibit properties we don’t engineer into AI. […]

    Yes. And on top of the things you said, it’d need some state of mind which can change… which it doesn’t have, unless we count whatever we can cram into the context window. I’d expect a sentient being to learn, which again LLMs can’t do from interacting with the world. And usually sentient beings have some kind of thought process… And those “reasoning” modes are super weird and not a thought process at all. So I don’t see a reason to believe they’re close to sentience. They’re missing quite a few fundamentals.

    I suspect the difference between classical neural networks and biological cognition is that biology may leverage quantum processes, and possibly non-algorithmic operations. […]

    I don’t think this is the case. As far as I know, a human brain consists of neurons which roughly either fire or don’t fire. That’s a bit like a 0 or 1. That’s an oversimplification and not really true, but a human brain is closer to that than to an analog computer. And it certainly doesn’t use quantum effects. Yes, that has been proposed, but I think it’s mysticism and esotericism. Some people want to hide God in there and like to believe there is something mystical and special about sentience. But that’s not backed by science. Quantum coherence has long since collapsed at the scale of a brain cell. We’re talking about many trillions of atoms per single cell, and that immediately rules out quantum effects. If you ask me, it’s because a human brain has a crazy number of neurons and synapses compared to what we can compute. And they’re not just feed-forward in one direction, but properly interconnected in many directions with many neighbours. A brain is just vastly more complex and capable than a computer. And I think that’s why we can do cognitive tasks at a human level while a computer does them at the scale of a mouse brain, because that’s just the difference in capability. And it’d still miss the plasticity of the mouse brain and the animal’s ability to learn and adapt. I mean, we also don’t discuss a mosquito’s ability to dream or a mouse’s creativity in formulating questions. That’d be the same anthropomorphism.



  • Uh, I’m really unsure about it being an engineering task of a few years, if the solution is quantum computers. As of today, they’re fairly small. And scaling them to a usable size is the next science-fiction task. The groundwork hasn’t been done yet, and to my knowledge it’s still totally unclear whether quantum computers can even be built at that scale. But sure, if humanity develops vastly superior computers, a lot of tasks are going to get easier and more approachable.

    The stochastic parrot argument is nonsense IMO. Maths is just a method. Our brains and all of physics abide by maths. And sure, AI is maths as well, with the difference that we invented it. But I don’t think that tells us anything.

    And as for goals, I think that’s about how AlphaGo has the goal of winning Go tournaments. The hypothetical paperclip-maximizer has the goal of maximizing paperclip production… And an LLM doesn’t really have any real-world goal. It just generates the next token so the output looks like legible text. And then we embed it into some pipeline, but it was never trained to achieve the thing we use it for, whatever that might be. It’s just a happy accident if a task can be achieved by clever mimicry and a prompt which simply tells it to pretend it’s good at XY.

    I think it’d probably be better if a customer service bot was trained to want to provide good support. Or a chatbot like ChatGPT to give factual answers. But that’s not what we do. It’s not designed to do that.

    I guess you’re right. Many aspects of AI boil down to how much compute we have available. And generalizing and extrapolating past their training datasets has always been an issue with AI. They’re mainly good at interpolating, but we want them to do both. I need to learn a bit more about neural networks; I’m not sure where the limitations are. You said it’s a practical constraint. But is that really true for all neural networks? It sure is for LLMs and transformer models, because they need terabytes of text fed in during training, and that’s prohibitively expensive. But I suppose that’s mainly due to their architecture?! I mean, backpropagation and all the maths required to modify the model weights is some extra work. But does it have to be so much work that we just can’t do it while deployed, for any neural network?
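
    To make the “learn while deployed” question concrete for a small network: for something tiny like logistic regression, one gradient step per incoming example costs about as much as one extra forward pass, as in this numpy sketch. Whether that stays affordable at LLM scale is exactly the open question:

```python
import numpy as np

# Tiny online learner: logistic regression updated one example at a time.
# Each update is O(number of weights), i.e. roughly the cost of one forward pass.
rng = np.random.default_rng(0)
w, b, lr = np.zeros(4), 0.0, 0.1

def predict(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))      # sigmoid of a linear layer

for _ in range(1000):                              # "deployment": examples arrive one by one
    x = rng.normal(size=4)
    y = float(x[0] + x[1] > 0)                     # hidden rule the model should pick up
    p = predict(x)
    grad = p - y                                   # dLoss/dLogit for cross-entropy loss
    w -= lr * grad * x                             # gradient step through the single layer
    b -= lr * grad

print("learned weights:", np.round(w, 2))          # weights on x[0] and x[1] dominate
```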


  • I think you have a good argument here. But I’m not sure where it leads. Your argument applies to neural networks in general, and we’ve had those since the 1950s. We subsequently went through several “AI winters”, and now we have a newer approach which seems to lead somewhere. But I’ve watched Richard Sutton’s long take on LLMs and it’s not clear to me whether LLMs are going to scale past what we see as of today. Ultimately they have severe issues scaling, they’re still not aimed at true understanding or reasonable generalization (that’s just a weird side effect, when the main point is to generate plausible-sounding text, pictures, etc.), LLMs don’t have goals, they don’t learn while running, and they have all these weird limitations which make generative AI unlike other (proper) types of reinforcement learning. And these are fundamental limitations; I don’t think this can be changed without an entirely new concept.

    So I’m a bit unsure whether the current take on AI is the ultimate breakthrough. It might just as well be a dead end, and we might still need a hypothetical new concept to do proper reasoning and understanding for more complicated tasks…
    But with that said, there’s surely a lot of potential left in LLMs, no matter whether they scale past today or not. All sorts of interaction with natural language, robotics, automation… It’s certainly crazy to see what current AI is able to do, considering what a weird approach it is. And I’ll agree that we’re at surface level. Everything is still hyped to no end. What we’d really need to do is embed it into processes and the real world and see how it performs there. And that’d need to be a broad and scientific measurement. We occasionally get some studies on how AI helps companies, or wastes their developers’ time. But I don’t think we have a good picture yet.


  • The broader generative AI economy is a steaming pile of shit and we’re somehow part of it? I mean it’s nice technology and I’m glad I can tinker around with it, but boy is it unethical. From how datasets contain a good amount of pirated stuff, to the environmental impact and that we’ll do fracking, burn coal and all for the datacenters, to how it’s mostly an unsustainable investment hype and trillion-dollar merry-go-round. And then I’m not okay with the impact on society either, I can’t wait for even more slop and misinformation everywhere and even worse customer support.

    We’re somewhere low on the food chain, certainly not the main culprits. But I don’t think we’re disconnected from the reality out there either. My main take is: it depends on what we do with AI… Do we do the same unhealthy stuff with it, or do we help level the playing field so it’s not just the mega-corporations in control of AI? That’d be badly needed for some balance.

    Second controversial take: I think AI isn’t very intelligent. It regularly fails me once I give it real-world tasks. Take coding: it really doesn’t do a good job with the computer programming issues I have. I need to double-check everything and correct it 30 times until it finally gets maths and memory handling somewhat right (by chance), and that’s just more effort than coding it myself. And I’m willing to believe that transformer models are going to plateau, so I’m not sure that’s ever going to change.

    Edit: Judging by the votes, seems I’m the one with the controversial comment here. Care to discuss it? Too close to the truth? Or not factual? Or not a hot take and just the usual AI naysayer argument?