batch rename PDFs based on a specific form field

databasewolfy

New Member
What would be the best method to batch rename 500,000 PDFs based on a specific form field within the files? All of the files have an identical form field structure.
 


Rob

Administrator
Staff member
You could use the rename command; I've used it in the past. Its options differ between CentOS and Debian/Ubuntu, though (at least they did when I last used it).

Here's a tutorial on it that I wrote a while ago:
 

JasKinasis

Well-Known Member
I'd second Rob's suggestion. The rename command is almost certainly the best option here.

If you know what you're looking for in each file, it should be fairly simple to write a script that finds the field in each PDF and then renames that file using the rename command.
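To give a rough idea of what such a script might look like, here is a minimal sketch in Python rather than the rename command. It assumes the third-party pypdf package is installed and that the value you want lives in a form field whose name you already know ("CustomerID" in the usage note is a made-up name). The function names are my own, and it defaults to a dry run so nothing is renamed until you ask for it.

```python
import re
from pathlib import Path


def read_field(pdf_path, field_name):
    """Return the value of one form field, or None if it is missing or empty."""
    # Assumption: the third-party pypdf package is installed (pip install pypdf).
    from pypdf import PdfReader

    fields = PdfReader(str(pdf_path)).get_fields() or {}
    field = fields.get(field_name)
    value = field.get("/V") if field is not None else None
    return str(value) if value not in (None, "") else None


def safe_stem(value):
    """Turn a field value into a filesystem-friendly file stem."""
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", value.strip()).strip("._")
    return stem or None


def rename_by_field(pdf_path, field_name, reader=read_field, dry_run=True):
    """Rename one PDF after its field value; return the new path, or None if skipped."""
    pdf_path = Path(pdf_path)
    value = reader(pdf_path, field_name)
    stem = safe_stem(value) if value else None
    if stem is None:
        return None  # field missing or empty: leave the file alone
    target = pdf_path.with_name(stem + ".pdf")
    if target == pdf_path:
        return target  # already named after its field
    if target.exists():
        return None  # never overwrite another file
    if not dry_run:
        pdf_path.rename(target)
    return target
```

Something like rename_by_field("scan0001.pdf", "CustomerID") would report the proposed name without touching the file; pass dry_run=False once the output looks right. Skipping on a name collision is a deliberate choice here, since with 500,000 files duplicate field values are quite likely.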

And you have an extremely large number of files to rename, so you might also want to consider using something like GNU parallel to speed up the process by running many jobs at once.

If you use parallel, the more processor cores/threads you have available, the faster your script will complete.

But parallel does have a bit of a learning curve. So it's a case of weighing up the time cost of learning GNU parallel and writing a script that can leverage its power against the time taken to write a simpler, single-threaded script and the time it will take to execute!
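If the per-file work ends up in a Python function rather than a shell command, a similar fan-out can be had without GNU parallel. To be clear, the sketch below is not GNU parallel; it swaps in Python's standard concurrent.futures module. The helper name run_in_parallel and the chunk size of 256 are my own choices, not anything from the posts above.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def run_in_parallel(worker, items, max_workers=None, use_processes=True, chunksize=256):
    """Apply worker to every item using a pool; results keep the input order."""
    max_workers = max_workers or os.cpu_count() or 1
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=max_workers) as pool:
        # chunksize hands items to worker processes in batches, which cuts
        # overhead on very long lists; thread pools simply ignore it.
        return list(pool.map(worker, items, chunksize=chunksize))
```

With use_processes=True the worker has to be a plain top-level function, and the call should sit under an if __name__ == "__main__": guard so that worker processes can import the script safely. Parsing PDFs is mostly CPU work, so processes are probably the better fit; the thread option is there mainly for quick experiments.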
 
