@Treczoks

Treczoks@lemmy.world · 9 hours ago

In politics and diplomacy, calling a decision “brave” has a very interesting meaning.

Treczoks@lemmy.world · 9 hours ago

This depends on what you are actually looking for, and how you are looking for it.

Do you really need pattern matching, or are you only looking for fixed strings? If it is the latter, other tools may be faster.

If you need a case-insensitive search on a data set with mixed upper and lower case, make a copy that is all upper or all lower case, and search there.

If you only search in certain columns, make a copy that only includes these.
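A rough sketch of those preprocessing ideas, in Python only for brevity. The sample data, the column layout, and the helper names `build_search_copy` and `find_fixed` are made up for illustration. The point is that the case folding and the column selection happen once, and each search afterwards is a plain fixed-string test.

```python
import csv
import io

def build_search_copy(rows, columns):
    """Return one lowercased line per row, keeping only the given column indexes."""
    return ["\t".join(row[i].lower() for i in columns) for row in rows]

def find_fixed(search_copy, needle):
    """Return the indexes of rows whose reduced copy contains the fixed string."""
    needle = needle.lower()
    return [i for i, line in enumerate(search_copy) if needle in line]

# Hypothetical sample data; a real data set would be read from a file.
SAMPLE = "id,name,city\n1,Alice Smith,Berlin\n2,BOB JONES,berlin\n3,Carol,Paris\n"
rows = list(csv.reader(io.StringIO(SAMPLE)))[1:]  # skip the header row

search_copy = build_search_copy(rows, columns=[2])  # only the city column
matches = find_fixed(search_copy, "BERLIN")
print(matches)
```

The returned indexes point back into the original rows, so the untouched data is still what you report from.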

Or import the data into a database.
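As one possible way to do that, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The table name, the columns, and the helpers `load_rows` and `find_by_city` are assumptions for the example, not anything from the original question. Note that SQLite's built-in `NOCASE` collation only folds ASCII letters.

```python
import sqlite3

def load_rows(rows):
    """Load (id, name, city) tuples into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    # Index with a case-insensitive collation so lookups need not scan every row.
    conn.execute("CREATE INDEX idx_city ON records (city COLLATE NOCASE)")
    conn.executemany(
        "INSERT INTO records (id, name, city) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn

def find_by_city(conn, city):
    """Return the ids of all records whose city matches, ignoring ASCII case."""
    cur = conn.execute(
        "SELECT id FROM records WHERE city = ? COLLATE NOCASE ORDER BY id",
        (city,),
    )
    return [row[0] for row in cur]

# Hypothetical sample data.
conn = load_rows([(1, "Alice", "Berlin"), (2, "Bob", "BERLIN"), (3, "Carol", "Paris")])
print(find_by_city(conn, "berlin"))
```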

Treczoks@lemmy.world · 2 days ago

We just celebrated 28 years of this development, so 1997. We have lived here since 2002.

Treczoks@lemmy.world · 4 days ago

First of all, he should drop Python for anything as resource intensive as such a simulation. And then think about how to optimize the algorithm.

Treczoks@lemmy.world · 5 days ago

As long as it finances the tax break for billionaires…

Treczoks@lemmy.world · 8 days ago

Maybe he should ask the Americans living close to the border how many cards he really has.

Treczoks@lemmy.world · 9 days ago

You might notice that not all PhDs work in academic environments. Actually most of them work in businesses.

Treczoks@lemmy.world · 9 days ago

This does not help when the deepfaker can’t be found.

Treczoks@lemmy.world · 9 days ago

I don’t think he really has a position of power. And it could well be that this costs him the chance for the crown.

Treczoks@lemmy.world · 10 days ago

FTFH: Orbán: “Ukraine is not NATO member and I ~~intend~~ was commanded by Putin to keep it that way”

Treczoks@lemmy.world · edit-2 12 days ago

The Iranians would have been terminally stupid if they hadn’t moved out anything that’s not bolted down (and even some that is) from the known locations in the days before the attacks. The IDF was openly demanding that the US bomb those sites, so they knew they were in the crosshairs. Even if they only wrapped the stuff up and buried it in the sand somewhere.

The US might have damaged the locations, but believing they damaged the program in any significant way is naive. On the contrary, Iran now has irrefutable proof that the US does not even care about its own secret services’ report that Iran had given up on the bomb (or at least was not actively working on it). Now they have the incentive to actually build it so they can use it as a deterrent and, if needed, in self-defence.

Treczoks@lemmy.world · 12 days ago

Hey, Donald, projecting again?

Treczoks@lemmy.world · 14 days ago

To the surprise of absolutely no one, except maybe the US government.

Treczoks@lemmy.world · 14 days ago

May they tear each other up in court.

Treczoks@lemmy.world · 14 days ago

Killing Khamenei would be a bad move. That’s how martyrs are made, and they are already dialed up to 11 in Iran, so there is no reason to pour even more oil on the fire.

Treczoks@lemmy.world · 16 days ago

The compressing and renumbering seems to be more common with embedded Chinese fonts. Space-wise it makes a lot of sense. But yes, mark and copy text, paste it into Word or Writer, and you get gibberish. Can’t verify the search, though. And, of course, Google Translate can’t do anything with it, either.

Treczoks@lemmy.world · 17 days ago

If you ever need to edit a PDF that way, just use Inkscape. It is way better than LO Draw for that.

Treczoks@lemmy.world · 17 days ago

It is not a curse. It does exactly what it is intended to do: create an archive of a document that is universally reproducible.

It is a very well designed cul-de-sac for exactly this purpose. Using it for anything else is calling for trouble.

Treczoks@lemmy.world · 17 days ago

The problem lies in the PDFs themselves. Inside them are objects that represent lines of glyphs, if you are lucky. A conversion tool can guess which of those lines belong together and produce the text.

It cannot know any intentions behind it, though. Take a numbered list. The first line is two line objects: the number plus the . or the ), and the first line of text. The conversion tool can now guess. As the line blocks with the numbers are all left of the line blocks with text, this could be a numbered list. Or it could be a table with two columns. Nothing in the PDF is giving any hints.
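To make that guessing concrete, here is a toy sketch. It assumes the text has already been extracted as fragments with a position, which is a simplification; the `Fragment` type, the tolerance value, and the function names are invented for the example and do not come from any PDF library.

```python
import re
from dataclasses import dataclass

@dataclass
class Fragment:
    """One extracted line object: position on the page plus its text."""
    x: float
    y: float
    text: str

# A marker like "1." or "12)" as described above.
LIST_MARKER = re.compile(r"^\d+[.)]$")

def group_into_rows(fragments, y_tolerance=2.0):
    """Group fragments whose baselines are close; PDF y grows upward."""
    rows = []
    for frag in sorted(fragments, key=lambda f: (-f.y, f.x)):
        if rows and abs(rows[-1][0].y - frag.y) <= y_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    return rows

def guess_structure(fragments):
    """Guess the layout; nothing in the input confirms the guess."""
    rows = [sorted(r, key=lambda f: f.x) for r in group_into_rows(fragments)]
    two_part = [r for r in rows if len(r) == 2]
    if not two_part:
        return "plain text"
    if all(LIST_MARKER.match(r[0].text) for r in two_part):
        return "numbered list (guess)"
    return "two-column table (guess)"

listing = [
    Fragment(50, 700, "1."), Fragment(70, 700, "First item"),
    Fragment(50, 685, "2."), Fragment(70, 685, "Second item"),
]
print(guess_structure(listing))
```

A real table whose first column happens to contain "1." and "2." would be classified as a list by this heuristic, which is exactly the ambiguity described above.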

And that is the easy part. This assumes that the document either uses default fonts, or keeps its embedded fonts untouched. If they use embedded fonts and a PDF optimizer that only embeds the used characters and renumbers them, any copy or conversion tool is bound to fail.
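A deliberately simplified model of that renumbering, not how any particular optimizer works: the used characters get new consecutive codes, and a copy tool that has no reverse mapping can only guess what the codes mean. All function names and the `0x40` offset are made up for illustration. In real PDFs a ToUnicode map can carry the reverse mapping, and extraction breaks when it is missing or wrong.

```python
def subset_font(text):
    """Assign new consecutive codes to the characters actually used."""
    code_for = {}
    for ch in text:
        if ch not in code_for:
            code_for[ch] = len(code_for) + 1
    return code_for

def encode(text, code_for):
    """The page content now stores renumbered glyph codes, not characters."""
    return [code_for[ch] for ch in text]

def naive_copy(codes):
    """A copy tool without a reverse mapping guesses; 0x40 is an arbitrary offset."""
    return "".join(chr(0x40 + c) for c in codes)

def copy_with_map(codes, code_for):
    """With the reverse mapping preserved, the text comes back intact."""
    to_unicode = {code: ch for ch, code in code_for.items()}
    return "".join(to_unicode[c] for c in codes)

code_for = subset_font("Hello")
codes = encode("Hello", code_for)
print(codes)
print(naive_copy(codes))
print(copy_with_map(codes, code_for))
```

The page still renders correctly either way, because rendering only needs the glyph shapes; it is copying, searching, and converting that depend on the mapping.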

The same goes for protected PDFs, where you simply cannot copy the text in the first place.

And then there are PDFs that just consist of scanned pages. Here you would need OCR software to get something readable out of them.

PDF is an archival output format, the end of a process. Not something to work from.

Always preserve the original file. Keep it safe. If you change tools, make sure you have a conversion path into something editable. The PDF is for giving away, nothing else.

Treczoks@lemmy.world · 17 days ago

Tough, getting a taste of their own medicine. Although the mullahs claim they were actually targeting something nearby, while the IDF just claims every hospital, school, or whatever to be Hamas HQ.