• 0 Posts
  • 14 Comments
Joined 2 years ago
Cake day: July 3rd, 2023

  • Also the Android TV app is AWESOME!

    What do you run Android TV on? Raspberry Pi? My cheapo solution has been to use an old Android phone that supports DP alt mode (USB-C to HDMI adapter) combined with a USB hub + generic air mouse/remote + customized launcher.

    It actually works surprisingly well. I installed FCast on it, so it even works like a Chromecast: if I’m watching a video on my phone with Grayjay, I can cast it to the phone hooked up to the TV and it starts playing automatically. The only thing stopping it from being perfect is that it can’t turn the TV on automatically. As a plus, since the phone has a battery it’s always powered on, so I don’t have to wait for anything to boot, and it uses relatively little power.

    … but overall it’s janky and finicky, and the OEM bloatware is probably spying on me, so I’ve been looking for alternatives that can match the good parts of this setup.

    I don’t like Raspberry Pis for this because they’re overpriced. I have a couple I could use, but I’m hoping to find a cheaper solution, and one I can recommend to friends and family when they ask. (The Android phone I’m using cost me a total of $15 on eBay.)




  • gamer@lemm.ee to Asklemmy@lemmy.ml · Why would'nt this work?
    +8 / −1 · 18 days ago

    This doesn’t account for blinking.

    If your friend blinks, he won’t see the light, and thus will be unable to verify whether the method works.

    But how does he know when to open his eyes? He can’t keep them open forever. Say you flash the light once, and that’s his signal to keep his eyes open. Okay, but how long do you wait before starting the experiment? If you do it immediately, he may not have enough time to react. If you wait too long, his eyes will dry out and he’ll blink.

    This is just not going to work. There are too many uncontrolled variables.


  • gamer@lemm.ee to Asklemmy@lemmy.ml · Superbowl sadness
    +16 / −1 · 18 days ago

    I’m seeing people say that the broadcaster (Fox Sports, of course) injected cheers into the broadcast for Trump and boos for Taylor Swift. I don’t want to spread misinfo, though, so does anyone know if it’s true, or if there’s a way to validate it (e.g., by analyzing the audio)?
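    For anyone who wants to try, here’s a rough sketch of one way it could be checked (purely my own idea, with hypothetical file names): line up the broadcast audio against an independent in-stadium recording of the same moment and compare crowd-noise levels. Injected cheers or boos would show up as loudness in the broadcast with no matching spike in the stadium recording.

    ```python
    # Sketch only: compare crowd-noise loudness between two recordings of the
    # same moment. "broadcast_clip.wav" and "stadium_clip.wav" are placeholders.
    import numpy as np
    import librosa

    def loudness_curve(path, hop=2048):
        y, sr = librosa.load(path, sr=22050, mono=True)
        rms = librosa.feature.rms(y=y, hop_length=hop)[0]  # frame-wise RMS energy
        times = librosa.frames_to_time(np.arange(len(rms)), sr=sr, hop_length=hop)
        return times, rms

    # Align both clips to the same event before comparing the curves.
    t_tv, rms_tv = loudness_curve("broadcast_clip.wav")
    t_st, rms_st = loudness_curve("stadium_clip.wav")
    print(f"broadcast peak RMS: {rms_tv.max():.4f}  stadium peak RMS: {rms_st.max():.4f}")
    ```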


  • So the dems are dumb for not fighting fascism with fascism? You can’t save democracy by destroying it. That “ends justify the means” thinking is why the republican party ended up this way. Why would republicans in congress want to disenfranchise themselves by installing a dictator? They’re just morons grasping for conspiracy theories to win elections, without thinking about the long term consequences. Maybe I’m naive, but save for the few actual lunatics (like MTG), I’m sure many republicans would turn on Trump the instant they felt they could get away with it, especially now



  • 96 GB+ of RAM is relatively easy, but for LLM inference you want VRAM. You can achieve that on a consumer PC by using multiple GPUs, although performance will not be as good as having a single GPU with 96 GB of VRAM. Swapping out to RAM during inference slows it down a lot.

    On architectures with unified memory (like Apple’s latest machines), the CPU and GPU share memory, so you can actually find systems with very large amounts of memory directly accessible to the GPU. A Mac Pro can be configured with up to 192 GB of unified memory, although I doubt it’d be worth it, as the GPU probably isn’t powerful enough.

    Also, the 83 GB number I gave was for a hypothetical 1-bit quantization of Deepseek R1, which (if it’s even possible) would probably be really shitty, maybe even shittier than Llama 7B.

    > but how can one enter TB zone?

    Data centers use NVLink to connect multiple Nvidia GPUs. Idk what the limits are, but you use it to combine multiple GPUs to pool resources much more efficiently and at a much larger scale than would be possible on consumer hardware. A single Nvidia H200 GPU has 141 GB of VRAM, so you could link them up to build some monster data centers.

    Nvidia also sells prebuilt machines like the HGX B200, which can have 1.4 TB of memory in a single system. That’s less than the 2.6 TB for unquantized Deepseek, but for inference-only applications you could definitely quantize it enough to fit within that limit with little to no quality loss… so if you’re really interested and really rich, you could probably buy one of those for your home lab.
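    For a rough sense of scale, here’s a back-of-the-envelope script (my own sketch; a real deployment also needs memory for the KV cache and activations, which this ignores) estimating how many H200s it would take just to hold Deepseek R1’s weights at different quantization levels:

    ```python
    # Sketch: GPUs needed just to hold the weights, ignoring KV cache,
    # activations, and interconnect overhead.
    def gpus_needed(params_billions, bits_per_param, vram_gb_per_gpu):
        model_gb = params_billions * 1e9 * bits_per_param / 8 / 1e9
        return model_gb, -(-model_gb // vram_gb_per_gpu)  # ceiling division

    for bits in (16, 8, 4):
        size, n = gpus_needed(671, bits, 141)  # Deepseek R1 on 141 GB H200s
        print(f"{bits:>2}-bit: ~{size:,.0f} GB -> {int(n)}x H200")
    ```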


  • If all you care about is response times, you can easily get there by using a smaller model. The quality of responses will be poor, though, and it’s not feasible to self-host a model like ChatGPT on consumer hardware.

    For some quick math: a small Llama model has 7 billion parameters. Unquantized, that’s 4 bytes per parameter (32-bit floats), meaning it requires 28 billion bytes (28 GB) of memory. You can make it fit in less memory with quantization, trading quality for lower memory usage (fewer than 32 bits per param, which reduces both precision and memory footprint).
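    Here’s that arithmetic as a tiny script (just the math above, nothing model-specific):

    ```python
    # Memory needed just for the weights: parameter count x bytes per parameter.
    params = 7e9  # small Llama-class model

    for bits in (32, 16, 8, 4):
        gb = params * bits / 8 / 1e9
        print(f"{bits:>2}-bit: {gb:5.1f} GB")  # 32-bit gives the 28 GB figure above
    ```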

    Inference performance will still vary a lot depending on your hardware, even if you manage to fit it all in VRAM. A 5090 will be faster than an iPhone, obviously.

    … But with a model competitive with ChatGPT, like Deepseek R1, we’re talking about 671 billion parameters. Even if you quantized down to a useless 1 bit per param, that’d be over 83 GB just to fit the model in memory (unquantized, it’s ~2.6 TB). Running inference over that many parameters requires serious compute too, much more than a 5090 could handle. Achieving that kind of performance gets into specialized high-end architectures, and it’s not something a typical prosumer could build (or afford).

    So the TL;DR is: no.



  • I think this comment encapsulates the problem well: laymen who aren’t involved in the process in any way (on either side) acting like armchair experts and passing harsh judgement. You’re making some very unfair assumptions based on age, rather than on the actual technical arguments.

    This is why people like Martin feel justified going on social media to publicly complain, because they know they’ll get a bunch of yesmen with no credible arguments to mindlessly harrass the developers they disagree with. It’s childish and unproductive, and while I’ve personally respected Martin as a developer for a long time, I don’t believe he’s mature enough to be involved in the Rust for Linux effort (tbf, he’s not the only Rust dev with this attitude). If the project fails, it will be because of this behavior, not because of the “old guys” being stubborn.