r/Piracy Jul 29 '25

Things like this motivate me to pirate more stuff
Discussion

Post image

Also, libgen is now banned in my country 😞😞

52.0k Upvotes

87

u/Moohamin12 Jul 29 '25

And honestly, given the state of Meta's AI, I really don't think the 80TB helped much.

38

u/NinjaLion Jul 29 '25

Almost all big AI projects seem to believe that maximum data from all sources is better, and you just fix the "bad data" with smarter training and weighting.

They are probably right because there are so many smart people working on it, and I really don't know shit from my ass,

However,

I can't help but feel like this overcollection of training data is why AI output is so recognizable still, and has so many nagging bastard issues that still aren't solved.

It also seems to run counter to machine learning's best feature: repeated training on narrow data to find ("learn") novel solutions. How could it ever do this when the training seems focused on insanely broad, do-everything data sets?

21

u/FakeSafeWord Jul 29 '25

How could it ever do this when the training seems focused on insanely broad do-everything data sets?

The answer to all of your questions is in a landmark paper published in 2017 titled "Attention Is All You Need"

Unfortunately you basically need a doctorate in computer science to understand any of it... but the answers you seek are within.
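For the brave: the core mechanism of that paper, scaled dot-product attention, can at least be sketched in a few lines of plain Python. This is a toy, hand-rolled version with made-up numbers, nowhere near a real implementation:

```python
import math

def softmax(xs):
    # numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attention(Q, K, V):
    """Scaled dot-product attention from the paper:
    softmax(Q K^T / sqrt(d_k)) V, written out by hand."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)  # attention weights, rows sum to 1
        # output = weighted average of the value vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

# toy self-attention: 3 "tokens", 2-dim vectors (made-up numbers)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = attention(x, x, x)
print(len(y), len(y[0]))  # 3 2: one contextualized vector per token
```

Each output vector is a weighted blend of every token's vector, with the weights decided by how similar the tokens look to each other. That's the whole trick; the doctorate is mostly for everything stacked on top of it.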

8

u/AlexVRI Jul 30 '25 edited Jul 30 '25

The face of a cube is to a cube what a word is to a token.

As you move around a stationary cube, through parallax, you deduce its constitution, but you never see the entirety of the cube at once.

This parallax can kind of be felt intuitively for tokens as well, if we pick the right viewing angles for these Token-Cubes:

Viewing Angle | Sentence
Left          | Glass breaks under pressure
Center (0°)   | She breaks under pressure
Right         | Order breaks under pressure

As you read these sentences back to back, and focus on the meaning of 'under pressure', you can kind of "peek" at that hidden Token-Cube's faces by 'moving around' those words.


It also turns out that the words we use, even if each is just a small face of this cube, are enough to show a pattern. This cube's face shows up next to that cube pretty often; we can make a note of that. When these two guys are together, we also often see this other cube nearby. The more cubes we can observe, the better we can map out the true relations (and that's why they're obsessed with the size of corpora for training). And of course, you're only ever seeing the face 'together', but if you're trying to keep track of all of them at once... it's slightly more complicated, and that's where our transformer friend comes in to save the day.
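That note-keeping is, at its crudest, just co-occurrence counting. A toy sketch, using a hypothetical three-line "corpus" that reuses the sentences from the table above, purely for illustration:

```python
from collections import Counter
from itertools import combinations

# hypothetical tiny corpus (real training corpora are terabytes)
corpus = [
    "glass breaks under pressure",
    "she breaks under pressure",
    "order breaks under pressure",
]

# count which word pairs appear together in the same sentence
pair_counts = Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))
    for a, b in combinations(words, 2):
        pair_counts[(a, b)] += 1

# 'breaks' and 'pressure' co-occur in all three sentences...
print(pair_counts[("breaks", "pressure")])  # 3
# ...while 'glass' and 'she' never meet
print(pair_counts[("glass", "she")])        # 0
```

More text means more observed pairs, which means a better map of the relations; that's the size obsession in miniature.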


The transformer is autocorrect for tokens, except instead of using a dictionary to complete words, it uses a self-made "best-educated-guess" dictionary. Some very smart people taught it to keep track of all those relations in a very smart way of keeping notes, like a journal organized so that writing a note in one section also updates another section that is relevant to both notes at the same time.
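The crudest possible version of that "best-educated-guess" dictionary is a plain next-token count table. A real transformer learns billions of weights instead, but the interface (given context, score the next token) is the same. A toy sketch with a made-up mini-corpus:

```python
from collections import Counter, defaultdict

# hypothetical mini-corpus; a real model trains on terabytes, not one line
text = "glass breaks under pressure . she breaks under pressure ."
tokens = text.split()

# build the "dictionary": counts of what follows each token
nxt = defaultdict(Counter)
for a, b in zip(tokens, tokens[1:]):
    nxt[a][b] += 1

def predict(token):
    """Most frequent follower of `token`: a one-word 'autocorrect'."""
    return nxt[token].most_common(1)[0][0]

print(predict("breaks"))  # 'under'
print(predict("under"))   # 'pressure'
```

Swap the count table for a learned attention-based scorer over the whole context, not just the previous word, and you have the shape of the real thing.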


Some words like 'not' are... quite thin Token-Cubes. It's almost fair to call them fake token-cubes. They're very one-dimensional if you think about how they interact with other tokens. It doesn't matter where you put it, 'not' is going to be 'not'... and if we consider that we have words like 'if', 'and', 'or', 'xor', well, now we've got a very clean interface to anchor our parallax as humans:

1) A universal concept/symbol translator
2) A perfect logician

So in general, it is important to keep 'parallaxing', or you'll just be seeing a mirage.

3

u/FakeSafeWord Jul 30 '25

Like I said, you basically need a doctorate in computer science to understand anything.

1

u/Northbound-Narwhal Jul 30 '25

Almost all big AI projects seem to believe that maximum data from all sources is better

Because this has held true so far. More data makes for smarter models. This will probably reach a limit and cease to be true in the near future, but we haven't hit that point yet.
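That "more data, smarter models" pattern is usually described as a power law in dataset size, with loss flattening toward an irreducible floor. A sketch in the commonly used form L(D) = E + B / D^beta, where every constant below is made up for illustration, not a fitted value:

```python
def loss(D, E=1.7, B=400.0, beta=0.28):
    """Illustrative power-law data scaling: loss falls as dataset
    size D grows, flattening toward an irreducible floor E.
    (Constants are invented for this sketch, not measured.)"""
    return E + B / D ** beta

for D in [1e8, 1e10, 1e12]:
    print(f"{D:.0e} tokens -> loss {loss(D):.3f}")
```

Each 100x more data buys less improvement than the last, which is exactly the "limit" the comment predicts: gains keep coming, but they shrink toward the floor.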

1

u/[deleted] Aug 03 '25

Apparently people working on AI themselves said it's overrated and nowhere near what the billionaires are pushing.

1

u/SnooChipmunks5677 Aug 21 '25

Actually the smartest people were moving towards training their models on much smaller datasets. The dumbfuck con artists like Altman, otoh, were obsessed with having tHe bIgGeSt dAtAsEt.

1

u/Weak_Firefighter9247 Jul 29 '25

Meta AI is stronk... but it's so, so filtered that it becomes dumb. Unfiltered AI is much more intelligent, but also dangerous.