Intern:File Upload

PANGAEA File Upload Workflow

This is a short overview of the current state of file uploads and file archiving.

Upload limits

 * Via ticket: attachments up to 100 MB, at most 20 files
 * Via uploader: up to 15 GB per file, no limit on total volume (for curators, no limit on file size)
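
As a rough illustration of the limits above, the following sketch suggests an upload route for a set of files. It assumes that the 100 MB ticket limit applies per attachment and that sizes are counted in decimal units; neither is stated above. The function name `choose_upload_route` and the route labels are hypothetical.

```python
from pathlib import Path

MB = 1000 ** 2  # assumption: decimal units
GB = 1000 ** 3

TICKET_MAX_FILE_SIZE = 100 * MB   # assumption: limit applies per attachment
TICKET_MAX_FILES = 20
UPLOADER_MAX_FILE_SIZE = 15 * GB  # per file, for authors


def choose_upload_route(sizes):
    """Return 'ticket', 'uploader' or 'curator' for a list of file sizes in bytes."""
    if len(sizes) <= TICKET_MAX_FILES and all(s <= TICKET_MAX_FILE_SIZE for s in sizes):
        return "ticket"
    if all(s <= UPLOADER_MAX_FILE_SIZE for s in sizes):
        return "uploader"
    # At least one file exceeds the author limit of the uploader;
    # only curators have no file size limit.
    return "curator"


def sizes_in(folder):
    """Collect the sizes (in bytes) of all regular files directly inside a folder."""
    return [p.stat().st_size for p in Path(folder).iterdir() if p.is_file()]
```

For example, `choose_upload_route(sizes_in("my_submission"))` would suggest a route for all files in a local folder named `my_submission` (a placeholder name).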

Create data submission ticket
An author creates a submission request: https://www.pangaea.de/submit/

As a curator, check submissions: https://issues.pangaea.de/projects/PDI

Request “upload link” for large file uploads
If the files exceed the submission limit, click “Request file upload” as a curator.

Example: https://issues.pangaea.de/browse/PDI-24552

Upload files
The author follows the generated link and uploads the files, e.g.:

https://issues.pangaea.de/upload/?code=5ef052c2338474.86421682

The web page must stay open during the upload. Reloading the page cancels any running uploads, so wait until they have finished.

The author then clicks “Confirm file upload”, which adds a comment to the submission ticket.

Check uploaded files
As a curator, check the files and file names, e.g. for invalid characters.
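
The file name check can be partly automated. The sketch below lists file names that contain characters outside an allowed set. The allowed set (ASCII letters, digits, dot, underscore, hyphen) is an assumption for illustration only, not a documented PANGAEA rule, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: only ASCII letters, digits, dot, underscore and hyphen are accepted.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def invalid_names(names):
    """Return the names that contain at least one character outside the allowed set."""
    return [n for n in names if not ALLOWED_NAME.fullmatch(n)]


def check_folder(folder):
    """Check the names of all regular files directly inside a folder."""
    names = sorted(p.name for p in Path(folder).iterdir() if p.is_file())
    return invalid_names(names)
```

An empty result from `check_folder` means that no file name in the folder violates the assumed pattern.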

Uploaded files are stored on Isilon (this is the source) and can be opened and edited there, for example in the file explorer.

When using WinSCP, the files are located here:

/pangaea/ext/isilon/upload/

After editing (e.g. renaming a file), you can check the result in the upload overview:

https://issues.pangaea.de/upload/?code=5ef052c2338474.86421682

You can also connect to the server via SSH or FileZilla; see “Get file list for editorial” below.

Binary File Import

 * There are two basic options: (A) the files are on your PC, or (B) the files are in the upload tool. Variants C to F below cover other file locations.

A: Files are on your PC

 * Prepare the files to be uploaded in a single directory on your computer
 * Create an import file with a column listing the names of the files to be imported (without path)
 * Open 4D and select Import > Data > Open
 * Choose the import file
 * You are now asked to choose the folder on your PC that contains the files; select it and click OK
 * Do everything else as usual (transfer from issue, ...)
 * After the import, you can add further parameters by adding the binary object parameter to the lower half several times and then changing the copies to MD5 or file size
 * Done
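
The file list for option A can be generated rather than typed by hand. The sketch below writes one file name per row, without path, under a single column header. The header text `FILE_COLUMN_PLACEHOLDER` is a hypothetical stand-in for the column name expected by 4D, and the tab-delimited UTF-8 format is likewise an assumption.

```python
import csv
from pathlib import Path

# Hypothetical header: replace with the column name expected by the 4D import.
COLUMN_HEADER = "FILE_COLUMN_PLACEHOLDER"


def write_import_file(folder, import_file, header=COLUMN_HEADER):
    """Write a one-column import file listing the file names in `folder` (no paths)."""
    names = sorted(p.name for p in Path(folder).iterdir() if p.is_file())
    with open(import_file, "w", newline="", encoding="utf-8") as fh:
        # Assumption: tab-delimited text with plain "\n" line endings.
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow([header])
        writer.writerows([name] for name in names)
    return names
```

Write the import file outside the data folder so that it does not appear in a later run of its own file list.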

B: Files are in the upload tool

 * Go to the upload tool and click on 'download Import Matrix'
 * You now have the import file, in which every path starts with the upload tool prefix; it can be edited and completed with additional parameters or multiple Events
 * Open 4D and go to Import > Data > Open
 * Choose the import file
 * You are again asked to choose a folder, but this time click 'Abbrechen' (Cancel), because the files are already staged
 * Do everything else as usual (transfer from issue, ...)
 * After the import, you can add further parameters by adding the binary object parameter to the lower half several times and then changing the copies to MD5 or file size
 * Done
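
If you want to cross-check the MD5 and file size values of an imported file against a local copy, a minimal sketch along the following lines may help. It only uses the Python standard library; the function name `md5_and_size` is hypothetical, and nothing here describes how 4D itself derives these values.

```python
import hashlib


def md5_and_size(path, chunk_size=1024 * 1024):
    """Return (MD5 hex digest, size in bytes) of a file, read in chunks."""
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so that very large files do not have to fit into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size
```

Reading in chunks keeps memory use constant, which matters for uploads of several gigabytes.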

C: A combination of files in the upload tool and on PC
Files in the upload tool and files on your PC (A + B) can also be combined within one column of the upload matrix. Entries for files in the upload tool keep their path prefix, while files in a single directory on your computer are listed without prefix (file name only).

D: Files archived under /hs/platforms (WORM)
This is relevant for archiving raw data from expeditions stored at /hs/platforms (WORM).
 * For AWI platforms, the import files with lists of links are typically prepared by the AWI Data Logistics Support group and submitted in a ticket (user: o2a-ingest)
 * The header contains one or multiple parameters
 * Links pointing to /hs/platforms consist of a prefix followed by the "sensor path"

E: Files archived under AWI-Servers (e.g., hs/projects/)
This is relevant for archiving data from the AWI geophysics group.
 * The file folder on hs must be shared with the user pangaeaadm. The user manages this sharing via cloud.awi.de (seismicsea is already set up).
 * The header contains one or multiple parameters
 * Links pointing to /hs/gsys/ consist of a prefix followed by the "server path"

F: Files archived under isilon project folders (e.g., /isibhv/projects/p_mosaic_als/)
This is relevant for archiving data from AWI Projects.
 * The user pangaeaadm must be added by the project owner through cloud.awi.de (e.g., project p_mosaic_als).
 * The header contains one or multiple parameters
 * Links pointing to /isi/projects consist of a prefix followed by the complete path to the individual file

Access restrictions
Access restrictions can be set or removed in 4D during or after the import.

Exchanging files
Individual files cannot be exchanged. Instead, overwrite the old dataset with a complete, newly imported one.

Staging folders (Files archived under AWI-Servers, e.g., hs/gsys/, WORM)

 * Background information: the documentation on archiving and storing data in the HSM (hssrv1) is available in the RZ Confluence.
 * When accessing more than a few files, prior staging is strongly recommended, because it makes the processing much faster.
 * Connect to the AWI VPN.
 * Open a terminal window (e.g. Eingabeaufforderung (Command Prompt) on Windows or Terminal on Mac).
 * Connect to the server. Use your AWI user password; the first time, you may need to confirm that you trust the connection.
 * Alternatively, if your PC account user name is not identical to your AWI user name, add the short version of the AWI user name to the command.
 * A message starting with "Welcome to Ubuntu..." appears after a successful login.
 * Before staging a folder containing your files for import, you can inspect the content of your folder on hs with the list command.
 * Stage the entire folder.
 * If this fails with a "command not found" error, use the alternative command.
 * Check the staging status. As long as the "user" column of the report contains your user name or pangaeaadm, you need to wait (see Fig.).
 * After staging is completed, import the list of files (see sections D and E above).