@Spedwell

Spedwell@lemmy.world · 7 days ago

404media is doing excellent work on tracking the non-consensual porn market and technology. Unfortunately, you don’t really see the larger, more mainstream outlets giving it the same attention beyond its effect on Taylor Swift.

Spedwell@lemmy.world · 11 days ago

Right concept, except you’re off in scale. A MULT instruction would exist in both RISC and CISC processors.

The big difference is that CISC tries to provide instructions to perform much more sophisticated subroutines. This video is a fun look at some of the most absurd ones, to give you an idea.
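To make the scale difference concrete, here is a rough toy sketch in Python (not a real ISA, just an illustration). It compares one "complex" block-copy instruction, loosely modeled on x86 `REP MOVSB`, against the same copy built from simple load/store/add/branch steps. The function names and the instruction counting are my own assumptions for the example; real CISC chips break such instructions into micro-ops internally, so don't read the counts as performance numbers.

```python
# Toy model only: memory is a Python list, "instructions" are just counted.

def cisc_block_copy(mem, src, dst, count):
    """One 'complex' instruction: copy `count` cells from src to dst.

    Loosely modeled on x86 REP MOVSB. Returns instructions issued (always 1).
    """
    for i in range(count):
        mem[dst + i] = mem[src + i]
    return 1


def risc_block_copy(mem, src, dst, count):
    """The same copy written as simple load/store/add/branch steps.

    Returns the number of simple instructions issued.
    """
    issued = 0
    i = 0
    while True:
        issued += 1            # BRANCH: compare counter, exit when done
        if i >= count:
            break
        reg = mem[src + i]     # LOAD one cell into a register
        issued += 1
        mem[dst + i] = reg     # STORE the register to the destination
        issued += 1
        i += 1                 # ADD: bump the counter
        issued += 1
    return issued


mem_a = [1, 2, 3, 4, 0, 0, 0, 0]
mem_b = list(mem_a)
cisc_count = cisc_block_copy(mem_a, 0, 4, 4)
risc_count = risc_block_copy(mem_b, 0, 4, 4)
```

Both versions leave memory in the same state; the only difference in this model is how much work a single instruction is allowed to describe.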

Spedwell@lemmy.world · 15 days ago

The current assumption made by these companies is that AI training is fair use, and is therefore legal regardless of license. There are still many ongoing court cases over this, but one case was already resolved in favor of the fair use position.

Spedwell@lemmy.world · 21 days ago

Ah, yes. The famously singular “westerners” who all 100% agreed with every foreign affairs policy of their government over the past century.

Spedwell@lemmy.world · edit-2 27 days ago

There is an episode of Tech Won’t Save Us (2024-01-25) discussing how weird the podcasting play was for Spotify. There is essentially no way to monetize podcasts at scale, primarily because podcasts do not have the same degree of platform lock-in as other media types.

Spotify spent the $100 million (or whatever the number was) to get Rogan exclusive, but for essentially every other podcast you can find a free RSS feed with skippable ads. Also their podcast player just outright sucks :/

Spedwell@lemmy.world · 27 days ago

Spin up c/notquitetheonion?

Spedwell@lemmy.world · 30 days ago

Errrrm… No. Don’t get your philosophy from LessWrong.

Here’s the part of the LessWrong page that cites Simulacra and Simulation:

Like “agent”, “simulation” is a generic term referring to a deep and inevitable idea: that what we think of as the real can be run virtually on machines, “produced from miniaturized units, from matrices, memory banks and command models - and with these it can be reproduced an indefinite number of times.”

This last quote does indeed come from Simulacra (you can find it in the third paragraph here), but it appears to have been quoted solely because when paired with the definition of simulation put forward by the article:

A simulation is the imitation of the operation of a real-world process or system over time.

it appears that Baudrillard supports the idea that a computer can just simulate any goddamn thing we want it to.

If you are familiar with the actual arguments Baudrillard makes, or simply read the context around that quote, it is obvious that this is misappropriating the text.

Spedwell@lemmy.world · edit-2 1 month ago

The reason the article compares to commercial flights is that your everyday reader knows planes’ emissions are large. It’s a reference point so people can weigh the ecological tradeoff.

“I can emit this much by either (1) operating the global airline network, or (2) running cloud/LLMs.” It’s a good way to visualize the cost of cloud systems without just citing tons-of-CO2/yr.

Downplaying that by insisting we look at the transportation industry as a whole doesn’t strike you as… a little silly? We know transport is expensive; it is moving tons of mass over hundreds of miles. The fact that computer systems even get close is an indication of the sheer scale of energy being poured into them.

Spedwell@lemmy.world · 1 month ago

Feel the same way. My Camry is a 2013—recent enough to have a simple display and Bluetooth, but old enough to predate the ‘modern’ infotainment systems.

Believe me, I plan to drive this car until the scrapyards run out of part donors.

Spedwell@lemmy.world · edit-2 1 month ago

concepts embedded in them

internal model

You used both phrases in this thread, but those are two very different things. It’s a stretch to say this research supports the latter.

Yes, LLMs are still next-token generators. That is a descriptive statement about how they operate. They just have embedded knowledge that allows them to generate sometimes meaningful text.
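For anyone unsure what "next-token generator" means mechanically, here is a deliberately tiny sketch: a bigram table that picks each next word based only on the previous one. This is an assumption-laden toy, nowhere near how an actual LLM is built (no neural network, no context beyond one word), and the function names are made up for the example. It only shows the generate-one-token-then-repeat loop.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words seen directly after it."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate(table, start, max_tokens, seed=0):
    """Repeatedly pick a next token given only the previous token."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(max_tokens - 1):
        candidates = table.get(output[-1])
        if not candidates:
            break  # nothing ever followed this word in the training text
        output.append(rng.choice(candidates))
    return output


table = build_bigram_table("the cat sat on the mat")
sample = generate(table, "cat", 3)
```

Whatever "knowledge" this toy has lives entirely in the table of what followed what, which is the (much cruder) analogue of the embedded knowledge mentioned above.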

Spedwell@lemmy.world · 1 month ago

330 micrograms per gram

That seems like… a lot. Way more than I expected or am comfortable thinking about.

Spedwell@lemmy.world · 1 month ago

It’s not really stupid at all. See the matrix code example from this article: https://spectrum.ieee.org/ai-code-generation-ownership

You can’t really know when the genAI is synthesizing from thousands of inputs or just outright reciting copyrighted code. Not kosher if it’s the latter.

Spedwell@lemmy.world · 2 months ago

Just curious, where does the Anti Commercial-AI bit come from? The page linked does not include that term in the title or summary, and from what I understand of the legal situation it wouldn’t make a difference to explicitly mention AI.

Spedwell@lemmy.world · 2 months ago

I get that there are better choices now, but let’s not pretend like a straw you blow into is the technological stopping point for limb-free computer control (sorry if that’s not actually the best option, it’s just the one I’m familiar with). There are plenty of things to trash talk Neuralink about without pretending this technology (or its future form) is meritless.

Spedwell@lemmy.world · 2 months ago

The issue on the copyright front is the same kind of professional standards and professional ethics that should stop you from just outright copying open-source code into your application. It may be very small portions of code, and you may never get caught, but you simply don’t do that. If you wouldn’t steal a function from a copyleft open-source project, you wouldn’t use that function when copilot suggests it. Idk if copilot has added license tracing yet (been a while since I used it), but absent that feature you are entirely blind to the extent to which its output is infringing on licenses. That’s a huge legal liability to your employer, and an ethical coinflip.

Regarding understanding of code, you’re right. You have to own what you submit into the codebase.

The drawbacks/risks of using LLMs or copilot have more to do with the fact that it generates the likely code, which means it’s statistically biased to generate whatever common and unnoticeably bugged logic exists in the average github repo it trained on. It will at some point give you code that you read and say “yep, looks right to me” and that actually has a subtle buffer overflow issue, or actually fails in an edge case, in a way that is just unnoticeable enough.

And you can make the argument that it’s your responsibility to find that (it is). But I’ve seen some examples thrown around on twitter of just slightly bugged loops; I’ve seen examples of it replicating known vulnerabilities; and we have that package name fiasco in that first article above.
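To give a feel for the kind of slightly bugged loop I mean, here is a hypothetical example I wrote myself (it is not one of the twitter examples, and not actual copilot output). It is in Python, so there is no buffer overflow to be had, but it shows the edge-case flavor: the buggy version passes any test where the input length is a multiple of the chunk size.

```python
def chunks_buggy(items, size):
    """Split items into lists of length `size`. Looks fine, is subtly wrong."""
    # Bug: this stop value silently drops the final partial chunk whenever
    # len(items) is not a multiple of size.
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]


def chunks_fixed(items, size):
    """Same idea, but the trailing partial chunk is kept."""
    return [items[i:i + size] for i in range(0, len(items), size)]


even = [1, 2, 3, 4, 5, 6]
odd = [1, 2, 3, 4, 5, 6, 7]
```

On `even` the two functions agree, so a quick sanity check passes. On `odd` the buggy one quietly loses the `7`, with no exception and no warning.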

If I ask myself “would I definitely have caught that?” the answer is only a maybe. If it replicates a vulnerability that existed in open-source code for years before it was noticed, do you really trust yourself to identify that the moment copilot suggests it to you?

I guess it all depends on stakes too. If you’re generating buggy JavaScript, who cares.

Spedwell@lemmy.world · 2 months ago

We should already be at that point. We have already seen LLMs’ potential to inadvertently backdoor your code and to inadvertently help you violate copyright law (I guess we do need to wait to see what the courts rule, but I’ll be rooting for the open-source authors).

If you use LLMs in your professional work, you’re crazy. I would never be comfortable opening myself up to the legal and security liabilities of AI tools.

Spedwell@lemmy.world · 2 months ago

That’s significantly worse privacy-wise, since Google gets a copy of everything.

A recovery email in this case was used to uncover the identity of the account-holder. Unless you’re using Proton Mail anonymously (if you’re replacing your personal Gmail, then probably not), you don’t need to consider the recovery email as a weakness.

Spedwell@lemmy.world · 2 months ago

I think it’s more the dual-use nature of defense technology. It is very realistic to assume the tech that defends you here is also going to be used in armed conflict (which, historically for the US, involves many civilian deaths). To present the technology without that critical examination, especially to a young audience like Rober’s, is irresponsible. It can help form the view that this technology is inherently good, by leaving the adverse consequences under-examined and out of view of the children watching this video.

Not that we need to suddenly start exposing kids to reporting on civilian collateral damage, wedding bombings, war crimes, etc… But if those are inherently part of this technology, then leaving them out overlooks a crucial outcome of developing these tools. Maybe we just shouldn’t advertise defense tech in kids’ media?

Spedwell@lemmy.world · 3 months ago

I’m still a big fan of D for personal projects, but I fear the widespread adoption ship has sailed at this point, and we won’t see the language grow anymore. It’s truly a beautiful, well-rounded language.

Also just recently a rather prominent contributor forked the entire compiler/language so we’re seeing more fragmentation :/

Spedwell@lemmy.world · 3 months ago

I don’t believe that explanation is more probable. If the NSA had the power to compel Apple to place a backdoor in their chip, it would probably be a proper backdoor. It wouldn’t be a side channel in the cache that is exploitable only in specific conditions.

The exploit page mentions that the Intel DMP is robust because it is more selective. So this is likely just a simple design error of making the system a little too trigger-happy.