How can you remove the watermark of several PDFs using the terminal on Linux?

Moshpirit

New Member
Credits
46
I have the watermark picture as a PNG in a folder, along with the watermarked PDFs. Is there any way to remove this big PNG picture, which is currently embedded in these PDFs, using the terminal?
  • I did some research and found this post asking for a script with pdftk, but it doesn't seem to work here, since it seems to apply only to text watermarks.
  • Here is a Stack Exchange post using Ghostscript, but I'm not sure it would work for me, since I don't want to remove all pictures, just one picture (which appears on every page).
  • I extracted the watermark with LibreOffice Draw, so I have the sample watermark as a PNG. But if I try to remove all the watermarks manually, the formulas get seriously screwed up, and it's very hard work because there are a lot of PDFs.
  • Despeck works only in a very specific situation, and mine isn't it. Unfortunately, it seems the watermark has to be colored, and mine is not (it's grey).

Here's one of the documents; it's some notes from an old website (no longer available). I wanted to print and study them, but the watermark makes it very tedious.

I thought about a script that could remove images with large dimensions, or images that look like this extracted picture.
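
As a rough illustration of the "large images" idea, here is a minimal Python sketch that only lists big image objects; it does not remove anything. It assumes the PDF has already been rewritten in an uncompressed, readable form (for example with `qpdf --qdf --object-streams=disable in.pdf out.pdf` or `pdftk in.pdf output out.pdf uncompress`) and that image dictionaries store `/Width` and `/Height` as plain numbers. The function name `find_large_images` and the `min_side` threshold are made up for this example.

```python
import re

# Matches "N G obj", then captures everything up to the first "stream" or
# "endobj" keyword, i.e. the object's dictionary part.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)(?:\bstream\b|\bendobj\b)", re.S)


def find_large_images(pdf_bytes, min_side=1000):
    """Return (object_number, width, height) for image XObjects whose
    longer side is at least min_side pixels (hypothetical threshold)."""
    hits = []
    for match in OBJ_RE.finditer(pdf_bytes):
        header = match.group(3)
        # Only image XObjects are interesting here.
        if not re.search(rb"/Subtype\s*/Image\b", header):
            continue
        width = re.search(rb"/Width\s+(\d+)", header)
        height = re.search(rb"/Height\s+(\d+)", header)
        if not (width and height):
            continue
        w, h = int(width.group(1)), int(height.group(1))
        if max(w, h) >= min_side:
            hits.append((int(match.group(1)), w, h))
    return hits
```

A watermark that repeats on every page is often stored once and referenced from each page, so a single large hit is a reasonable candidate to inspect first.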
 

f33dm3bits

Gold Member
Gold Supporter
Credits
25,700
Exactly, that's the thing
It's basically extracting the PDF into an uncompressed format, then figuring out which object is setting the watermark and removing all instances of that object. It's going to take a while to figure out, though.
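
The last step described above (removing every instance of the object) could be sketched like this in Python. A PDF page paints an image XObject with the `Do` operator, e.g. `/Im0 Do`, so overwriting that operator with spaces of the same length stops the image from being drawn while leaving byte offsets and stream lengths unchanged. This assumes the file is already uncompressed and that the watermark is a separate XObject rather than an inline image; the name `Im0` and the helper `blank_out_xobject` are hypothetical, and the real name has to be looked up in the uncompressed file.

```python
import re


def blank_out_xobject(pdf_bytes, name):
    """Overwrite every '/<name> Do' paint operator with spaces.

    The replacement has exactly the same length as the match, so the
    file size, stream lengths and cross-reference offsets stay valid.
    """
    pattern = re.compile(rb"/" + re.escape(name) + rb"\s+Do\b")
    return pattern.sub(lambda m: b" " * len(m.group(0)), pdf_bytes)


# Example on a tiny fake content stream; "Im0" is an assumed name.
content = b"q 595 0 0 842 0 0 cm /Im0 Do Q\nBT /F1 12 Tf (Notes) Tj ET\n"
cleaned = blank_out_xobject(content, b"Im0")
```

The image data itself stays in the file; only the instruction that draws it is blanked. The same function could be applied to each PDF in a folder in a loop.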
 

Moshpirit

New Member
Credits
46
I'm sorry, but I have no idea how to do that. Could you show me how? Maybe I can edit one of the PDFs with LibreOffice so it's just one page with the picture in it. That should give us the info about the image, right? I saw this link, which seems similar to what you're saying, but I don't really know how to adapt it to this case (especially since it doesn't seem to figure out anything from the PDF).
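
One possible way to get the info about the image without hand-editing a page is to count how often each XObject is painted. In an uncompressed PDF, a name that is drawn about once per page is a likely watermark candidate. A small Python sketch, assuming the content streams are uncompressed; `count_paint_ops` is a hypothetical helper name:

```python
import re
from collections import Counter

# A PDF name ("/Something") followed by whitespace and the Do operator.
PAINT_RE = re.compile(rb"/([^\s/<>\[\]()]+)\s+Do\b")


def count_paint_ops(pdf_bytes):
    """Count how many times each XObject name is painted with 'Do'."""
    return Counter(
        m.group(1).decode("latin-1") for m in PAINT_RE.finditer(pdf_bytes)
    )
```

XObject names are only unique within a page's resources, so a frequent name is a hint to check in the file, not proof that it is the watermark.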
 

f33dm3bits

Gold Member
Gold Supporter
Credits
25,700
I was able to remove some watermarks. Basically, convert the PDF to a Word document, then use Google Docs to remove the watermark image:
1. Convert to a Word document: https://www.easepdf.com/pdf-to-word
2. Open it with Google Docs and remove the watermark image files: https://docs.google.com
3. Save it and convert it back to PDF: https://www.easepdf.com/word-to-pdf
 

Moshpirit

New Member
Credits
46
Great! I'll check it out, thanks a lot! Are the equations OK? That is the major problem with office software, but I hope it was just my LibreOffice config lacking some fonts.
 

f33dm3bits

Gold Member
Gold Supporter
Credits
25,700
I only tried the first 3-4 pages, and there weren't any equations there. I also tried the other methods to remove the watermark. It seems like it has gotten harder to remove watermarks, but I guess that's the whole point of watermarks. Maybe someone else on the forums has experience with it; I just tried searching and using what I found.
 