

Can anyone recommend a good video of this? I want to see something representative, not whatever YouTube’s algorithm feels like surfacing.
The awe and grandeur of Ocarina of Time… at the time.
Disco Elysium is the best literature I’ve ever played.
I still feel like I used to live in Skyrim. It was a place where I wanted to be and explore.
TF2/Halo CE multiplayer’s mix of competitive adrenaline and funny shenanigans.
Those are the game experiences which stuck with me.
Accept that quality matters more than velocity. Ship slower, ship working. The cost of fixing production disasters dwarfs the cost of proper development.
This has been a struggle my entire career. Sometimes the company listens; sometimes they don’t. It’s a worthwhile fight, but it’s a systemic problem caused by management prioritizing short-term profit over healthy business growth.
Adding something like a filter can, in principle, remove all sound in the filtered range, so this is technically possible.
With a lot of sounds this is practically very hard, because natural sounds have all kinds of artifacts, like reverb and harmonics, which may not be in the range you’re filtering.
Another thing to consider is that filters are often gradual: they are not perfect hard cutoffs to 0 dB. So you may need to change some settings and/or filter a bit more than you expect to get the coverage you want.
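A minimal sketch of that in Python with SciPy, in case it helps (the 300–900 Hz band, the filenames, and the filter order are made-up examples, not something from this thread):

import numpy as np
from scipy import signal
from scipy.io import wavfile

fs, audio = wavfile.read("input.wav")      # hypothetical mono file
audio = audio.astype(np.float64)
# Band-stop Butterworth: it rolls off gradually, so widen the band or
# raise the order if you need deeper attenuation right at the edges.
sos = signal.butter(8, [300.0, 900.0], btype="bandstop", fs=fs, output="sos")
filtered = signal.sosfiltfilt(sos, audio)  # zero-phase filtering
wavfile.write("output.wav", fs, filtered.astype(np.int16))

Even at 8th order, content sitting right at 300 or 900 Hz is only attenuated, not zeroed, which is the gradual-cutoff caveat above.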
I don’t know if this is still useful for you, but I tried this out, mostly because I wanted to make sure I wasn’t crazy. Here’s my gpt-oss setup running on cheap AMD Instinct VRAM:
./llama-server \
  --model {model}.gguf \
  --alias "gpt-oss-120b-mxfp4" \
  --threads 16 \
  -fa on \
  --main-gpu 0 \
  --ctx-size 64000 \
  --n-cpu-moe 0 \
  --n-gpu-layers 999 \
  --temp 1.0 \
  -ub 1536 \
  -b 1536 \
  --min-p 0.0 \
  --top-p 1.0 \
  --top-k 0 \
  --jinja \
  --host 0.0.0.0 \
  --port 11343 \
  --chat-template-kwargs '{"reasoning_effort": "medium"}'
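If you want to reproduce this: llama-server exposes an OpenAI-compatible /v1 endpoint, so a minimal client sketch could look like the following (the openai package and the usage readout are just one way to do it, not necessarily how these numbers were produced):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11343/v1", api_key="none")
resp = client.chat.completions.create(
    model="gpt-oss-120b-mxfp4",
    messages=[
        # For Test 2 below, prepend {"role": "system", "content": "..."}
        {"role": "user", "content": "how do i calculate softmax in python"},
    ],
)
print(resp.choices[0].message.content)
print(resp.usage.total_tokens)  # one way to tally the token counts below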
I trimmed the content because it wasn’t relevant but left roughly the shape of the replies to give a sense of the verbosity.
Test 1: With default system message
user prompt: how do i calculate softmax in python
What is softmax
1 python + numpy
...
quick demo
...
2 SciPy
...
...
...
8 full script
...
running the script
...
results
...
TL;DR
...
followup prompt: how can i GPU-accelerate the function with torch
1 why pytorch is fast
...
...
**[Headers 2,3,4,5,6,7,8,9]**
...
...
TL;DR
...
Recap
...
Table Recap
...
Common pitfalls
...
Going beyond float32
...
10 Summary
...
Overall: 6,393 tokens, including reasoning
Test 2: With this system prompt: You are a helpful coding assistant. Provide concise, to-the-point answers. No fluff. Provide straightforward explanations when necessary. Do not add emoji and only provide tl;drs or summaries when asked.
user prompt: how do i calculate softmax in python
Softmax calculation in Python
...
Key points
...
followup prompt: how can i GPU-accelerate the function with torch
GPU‑accelerated Softmax with PyTorch
...
What the code does
...
Tips for larger workloads
...
Overall: 1,103 tokens, including reasoning
Totally. I think gpt-oss is outright annoying with its verbosity. A system prompt will get around that.
You could use it to force resolution, HDR, VRR, and refresh rate. It also helped you isolate that this isn’t a compositor issue.
Qwen3 or Qwen3 Coder? Qwen3 comes in 235B, 30B, and smaller sizes; Qwen3 Coder comes in 30B or 480B sizes.
OpenRouter has multiple quant options and, for coding, I’d try to only use 8-bit integer quants or higher.
Claude also has a ton of sizes and deployment options with different capabilities.
As for reasoning, the newest DeepSeek V3.1 Terminus should be pretty good.
Honestly, all of these models should be able to help you with Docker up to a certain level. I would double-check how you connect to OpenRouter, make sure your hyperparameters are good, and make sure thinking/reasoning is enabled. Maybe try duck.ai and see if the models there match whatever you’re getting on OpenRouter.
Finally, not to be a hater, but LLMs are not intelligent. They cannot actually reason or think; they can probabilistically align with answers you want to see. Sometimes your issue might be too weird or new for them to give you a good answer. Even today, models will give you Docker Compose files with a version number at the top, a field that has been deprecated for over a year.
Edit: gpt-oss 120B should be cheap and capable enough, and it’s available on duck.ai.
I’m sure someone will give a better answer, but this smells like a UEFI/Secure Boot problem. Look in your BIOS and turn those off, or set them to Legacy or “Other OS”.
I see your edits and, dang, I was hoping we could move the needle. Another idea:
Try running a game through gamescope. You would add this line to the launch options in Steam’s per-game settings. Adjust to your resolution and desired refresh rate:
gamescope -W 1920 -H 1080 -r 60 -- %command%
https://github.com/ValveSoftware/gamescope
You can use it to troubleshoot all kinds of things.
Chiming in to say this is a very reasonable starting place, and I wanted to highlight to OP that this solution is 100% self-hosted.
Cetus-Lupeedus!
Fwiw, I’ve had some very similar problems with GPU performance on my very weird setup. I’m going to share what I know and if that helps you diagnose, great. If anyone has suggestions, please reply.
My setup
My problem areas are very similar:
Performance (fps) degradation during gaming. Games slow down while temps decrease; VRAM and RAM are nowhere near full utilization; the hard drive is near room temp; CPU load is minimal. I have a wattage tracker at the wall and can see wattage drop. This happens in Steam games across multiple versions of Proton/Proton GE.
Vulkan-based compute workloads (hence the weird GPU setup). Same deal: start a workload at 100% throughput and watch it drop to 30% over the span of a few minutes. This is with artificial benchmarks where I can control workload variables.
What I’ve found
LACT has helped; setting a card to “Highest Clocks” makes a meaningful difference.
On some games, simply switching to the desktop and back resets performance. Works on Deep Rock.
Simply running vulkaninfo resets performance to 100%. I often resort to: watch -n .5 vulkaninfo (a rough polling sketch follows this list).
X11 behaves better, but it’s not a complete fix.
On X11, setting the NVIDIA control panel to “Prefer Maximum Performance” makes a big difference.
I’m still figuring out how to lock the GPU p-states to maximum. I’ve tried locking clocks, but that’s not doing it.
OS power saving set to lowest power tanks performance, but between balanced and high there’s no impact on this problem.
Disabling anything related to PCIe power saving in the BIOS hasn’t made a difference.
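For anyone who wants to poke at the vulkaninfo workaround while watching clocks, here’s a rough Python sketch of the kind of polling I mean (assumes vulkaninfo and nvidia-smi are on PATH; the 0.5 s interval matches the watch command above):

import subprocess, time

QUERY = ["nvidia-smi", "--query-gpu=clocks.sm,power.draw,temperature.gpu",
         "--format=csv,noheader"]
while True:
    # Poking the Vulkan stack seems to reset performance (see above).
    subprocess.run(["vulkaninfo"], stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL)
    stats = subprocess.run(QUERY, capture_output=True, text=True).stdout.strip()
    print(time.strftime("%H:%M:%S"), stats)  # correlate clocks with fps drops
    time.sleep(0.5)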
Yep. That was bleak.
I’m not close to web dev so I don’t have context. Why is Tailwind bad?
Seems like a not-too-bloated alternative to React.
I’ve been using the new UI since release and it’s been good.
I haven’t seen this mentioned, but apart from 8K being expensive, requiring new production pipelines, being unwieldy for storage and bandwidth, unneeded, and not fixing existing problems with 4K, it requires MASSIVE screens to reap benefits.
There are several similar posts, but suffice to say, 8K content is only perceptible to average eyesight at living-room distances when screens are OVER 100 inches diagonal at the bare minimum. That’s roughly 7 feet wide.
Source: https://www.rtings.com/tv/reviews/by-size/size-to-distance-relationship
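If you want to sanity-check that against your own screen and couch, the math is simple. A rough sketch, assuming 16:9 panels and the usual ~60 pixels-per-degree limit of 20/20 vision (both simplifications; see the rtings link for the full treatment):

import math

def pixels_per_degree(diagonal_in, distance_in, horizontal_px):
    width = diagonal_in * 16 / math.hypot(16, 9)  # 16:9 panel width
    fov = 2 * math.degrees(math.atan(width / (2 * distance_in)))
    return horizontal_px / fov

# 100" TV viewed from 9 ft:
print(pixels_per_degree(100, 9 * 12, 3840))  # 4K: ~87 ppd, already past ~60
print(pixels_per_degree(100, 9 * 12, 7680))  # 8K: ~175 ppd, unresolvable headroom

By that yardstick you have to sit much closer or go much bigger than 100 inches before 8K’s extra pixels are even resolvable.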
It depends on what you mean.
To me, Ollama feels like it’s designed to be a developer-first, local LLM server with just enough functionality to get you to a POC, from where you’re intended to use someone else’s compute resources.
llama.cpp actually supports more backends, with continuous performance improvements and support for more models.
Wow! Mesmerizing