amazon web services - AWS S3 copy to bucket from remote location
There is a large dataset on a public server (~0.5 TB, split into multiple parts) that I would like to copy into my own S3 buckets. It seems that aws s3 cp only works with local files or files already based in S3 buckets?

How can I copy that file (either as a single file or in multiple parts) to S3? Can I use the AWS CLI, or do I need something else?
There's no way to upload directly to S3 from a remote location. However, you can stream the contents of the remote files through your machine and on up to S3. This means you will still have downloaded the entire 0.5 TB of data, but your computer only ever holds a tiny fraction of that data in memory at a time (and it is not persisted to disc either). Here is a simple implementation in JavaScript:
const request = require('request')
const async = require('async')
const AWS = require('aws-sdk')

const s3 = new AWS.S3()
const bucket = 'nyu_depth_v2'
const baseUrl = 'http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/'
const parallelLimit = 5

const parts = [
  'basements.zip', 'bathrooms_part1.zip', 'bathrooms_part2.zip', 'bathrooms_part3.zip',
  'bathrooms_part4.zip', 'bedrooms_part1.zip', 'bedrooms_part2.zip', 'bedrooms_part3.zip',
  'bedrooms_part4.zip', 'bedrooms_part5.zip', 'bedrooms_part6.zip', 'bedrooms_part7.zip',
  'bookstore_part1.zip', 'bookstore_part2.zip', 'bookstore_part3.zip', 'cafe.zip',
  'classrooms.zip', 'dining_rooms_part1.zip', 'dining_rooms_part2.zip', 'furniture_stores.zip',
  'home_offices.zip', 'kitchens_part1.zip', 'kitchens_part2.zip', 'kitchens_part3.zip',
  'libraries.zip', 'living_rooms_part1.zip', 'living_rooms_part2.zip', 'living_rooms_part3.zip',
  'living_rooms_part4.zip', 'misc_part1.zip', 'misc_part2.zip', 'office_kitchens.zip',
  'offices_part1.zip', 'offices_part2.zip', 'playrooms.zip', 'reception_rooms.zip',
  'studies.zip', 'study_rooms.zip'
]

// Upload up to `parallelLimit` parts at a time, streaming each HTTP response
// directly into S3 so nothing is buffered on disk.
async.eachLimit(parts, parallelLimit, (key, cb) => {
  s3.upload({
    Key: key,
    Bucket: bucket,
    Body: request(baseUrl + key)
  }, cb)
}, (err) => {
  if (err) console.error(err)
  else console.log('done')
})
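Once the script finishes, one quick way to sanity-check the transfer is to list what actually landed in the bucket and compare the keys against the parts array. A minimal sketch using the same aws-sdk v2 client (the bucket name below is simply the one assumed in the example above):

const AWS = require('aws-sdk')

const s3 = new AWS.S3()

// List the objects now in the target bucket and print each key with its size,
// so they can be checked against the list of source files.
s3.listObjectsV2({ Bucket: 'nyu_depth_v2' }, (err, data) => {
  if (err) return console.error(err)
  data.Contents.forEach((obj) => console.log(obj.Key, obj.Size))
})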