Need to archive loads of data... looking for the best way of doing that

90okm

Hi!
There are a lot of finished projects which must be kept for the future.
Each project is a folder containing different kinds of data spread across several subfolder levels.
The average size of a project is 2-3 TB.
There are a lot of different files inside - tiny ones, big ones, and even huge ones.

The task is to send all of the projects to AWS S3 cloud storage.
The question is: which way is better and more convenient?
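For the raw upload step, I assume the AWS CLI would do the job - something like this (bucket name and paths are just placeholders):

# upload one project's archive parts to S3 (bucket/prefix are made up)
aws s3 cp xxx.tar.part00 s3://my-archive-bucket/projects/xxx/
# or sync a whole local staging folder in one go
aws s3 sync ./staging/xxx s3://my-archive-bucket/projects/xxx/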

I've got a couple of ideas so far:
1) pack each project into a multipart tar archive, 150 GB per part, so project "xxx" ends up as "xxx.tar00 xxx.tar01 xxx.tar02" etc.
2) the same, but with gzip compression added (the -z option in tar); the tarball comes out a bit smaller, but packing/unpacking is a bit slower (rough commands for both are sketched below)
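Roughly what I mean, with made-up names:

# variant 1: plain tar, split into 150 GB parts
tar -cvf - xxx | split -d -b 150G - xxx.tar.part
# variant 2: same, but gzip-compressed on the fly
tar -czvf - xxx | split -d -b 150G - xxx.tar.gz.part
# reassembling and unpacking later
cat xxx.tar.part* | tar -xvf -
cat xxx.tar.gz.part* | tar -xzvf -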

Maybe somebody has faced such a trivial task before and can suggest an efficient and convenient solution. Maybe there is a better option than tar...

And one more question on top of that:
there are different ways to make a multipart tar archive -
piping tar's output into the split command, like this: tar -cvf - xxx | split -d -b 150G - xxx.tar.

or using the built-in tar options like --tape-length=150G --file=xxx{00..50}.tar (sketched below).
But I heard that in that case there is a chance the archive will NOT unpack if the other computer has a different tar version, because different versions may use a different volume delimiter format or something like that )))
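For reference, the multi-volume variant I mean would look roughly like this (GNU tar; file names and the number of volumes are made up):

# create a multi-volume archive, starting a new file every 150 GB
tar -c --multi-volume --tape-length=150G --file=xxx.tar.{00..50} xxx/
# extract it again, listing the volumes in the same order
tar -x --multi-volume --file=xxx.tar.{00..50}

That portability worry is exactly why I'd want to test the extract step on the target machine before trusting it.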

Would love to hear from the professionals...
Thanks.
 


How does that saying go?

"Never underestimate the bandwidth of a station wagon loaded with tapes cruising down the highway."

By that, I mean: is a delivery company, or even the postal service, something you can consider?

Last I knew, services like Backblaze would also send you a fancy box of drives that acted like one giant drive, where you could pre-load your data for them to back up, rather than having you try to upload terabytes of data.
 
The transfer itself is not a problem...

The question was:
is tar the best way to archive this, or are there better options?

And the second question was about multipart tar archives:

using the built-in tar options like --tape-length=150G --file=xxx{00..50}.tar
I heard that in that case there is a chance the archive will NOT unpack if the other computer has a different tar version, because different versions may use a different volume delimiter format or something like that )))
 
I've used rsync for stuff like this. Last time we moved around 50 TiB that way. It took about a week.
rsync is great for this, since it can resume from where it left off if the transfer is interrupted for some reason.
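Something along these lines (host and paths are just placeholders):

# -a keeps permissions/timestamps, -P (--partial --progress) lets interrupted files resume
rsync -aP /data/projects/ backuphost:/archive/projects/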

/tony
 
