cube2segy, mseed2segy — convert seismic data to SEG-Y format
cube2segy [-v | --verbose] [--include-pattern=PATTERN]... [--index-cache=FILE] [--output-dir=DIRECTORY [--force-overwrite] [--force-concat]] [--segy-format=FORMAT] {--project=FILE} {--shot-gather=FFIDs | --receiver-gather=CHANNELs} [--trace-length=DURATION] [--trace-offset=SHIFT] [--reduction-velocity=VELOCITY] file | directory ...
cube2segy [-h | --help] [--version]
mseed2segy [-v | --verbose] [--include-pattern=PATTERN]... [--index-cache=FILE] [--output-dir=DIRECTORY [--force-overwrite] [--force-concat]] [--segy-format=FORMAT] {--project=FILE} {--shot-gather=FFIDs | --receiver-gather=CHANNELs} [--trace-length=DURATION] [--trace-offset=SHIFT] [--reduction-velocity=VELOCITY] file | directory ...
mseed2segy [-h | --help] [--version]
The two GIPPtool utilities cube2segy and mseed2segy gather seismic data from one or more recorders and integrate the traces into seismic sections. Both programs are based on the same source code and differ only in the file format read as input. Therefore, unless noted otherwise, the following always applies to both utilities.
The programs support the creation of shot or receiver gathers. You can switch between the two gather modes by specifying --shot-gather or --receiver-gather at the command line. The option --segy-format is used to further specify the output file format (SEG-Y, Seismic Unix, ...).
To convert the recorded data into seismic sections, the program must know the setup of the experiment. This is done by providing a "project file" (command line option --project) containing the location of every shot and receiver point in the experiment, the times when the receivers were actually recording, and the times when the sources were triggered.
After parsing the project file, the program will index the seismic data contained in the files given at the command line. This step is necessary so that the program later knows which of the input files contain the traces required for writing the seismic section. Use the cube2segy utility to read from files in Cube format; mseed2segy expects miniSEED files as input. If a directory is given at the command line, the program searches recursively for input files inside that directory. The search can be restricted to files matching a pattern given by the --include-pattern option. If you plan to use this program repeatedly on the same dataset, you might also consider saving the index to a file between program runs (option --index-cache).
After all the required information is available, the utility will begin to prepare (internal) lists of time windows describing which data belong to the respective shot/receiver gather. This is usually done by taking the trigger time of the seismic source and then looking up which receivers were recording at that time. However, you can further influence the calculation of relevant time windows by applying a reduction velocity, changing the trace length and/or setting a trace offset. (See the --reduction-velocity, --trace-length and --trace-offset options.)
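The time window calculation described above can be sketched as follows. This is only an illustration, not the code used by the utilities: the function name trace_window is hypothetical, the reduction velocity is left out, and the defaults merely mirror the documented behavior (trace starts at shot time, one minute trace length).

```python
from datetime import datetime, timedelta

def trace_window(shot_time, trace_offset=0.0, trace_length=60.0):
    """Return (start, end) of the time window for one trace.

    trace_offset and trace_length are given in seconds, like the
    --trace-offset and --trace-length options.
    """
    start = shot_time + timedelta(seconds=trace_offset)
    end = start + timedelta(seconds=trace_length)
    return start, end

# Shot s21 from the example project file, starting 2 seconds before the shot.
shot = datetime(2005, 11, 17, 6, 5, 1, 170000)
start, end = trace_window(shot, trace_offset=-2.0)
print(start.isoformat(), end.isoformat())
```

Any receiver whose recording interval covers this window would then contribute one trace to the gather.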
Finally, the entries from the shot/receiver gather list are used to look up the respective input files in the previously created index of the available seismic data. The program then reads the necessary data snippets from the input files, converts them to SEG-Y or Seismic Unix format and writes the resulting seismic sections to the current working directory or to a separate directory (use option --output-dir).
The program largely follows the usual Unix command line syntax. Some of the command line options have two variants, one long and an additional short one (for convenience). These are shown below, separated by commas. However, most options only have a long variant. The '=' for options that take a parameter is required and cannot be replaced by whitespace.
-h, --help
Print a short summary of the available command line options and exit.
--version
Print the release information of the utility and exit.
-v, --verbose
This option increases the amount of information given to the user during the program execution. By default (i.e. without this option) the GIPPtool utilities only report warnings and errors. (See also the diagnostics section below.)
--include-pattern=PATTERN
Only read data from files whose filename matches the given PATTERN. Files with a name not matching the search PATTERN will be ignored. This option is quite useful to speed up recursive searches through large subdirectory trees and can be used more than once in the same command line.
You can use the two wildcard characters ('*', '?') when specifying a PATTERN (e.g. '*.pri?'). Alternatively, you can use a predefined filter called GIPP to exclude all files not following the usual GIPP naming convention.
The given search PATTERN is only applied to the filename part and not to the full pathname of a file.
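The following sketch illustrates what "applied to the filename part only" means. It uses Python's fnmatch module as a stand-in for the pattern matching of the utilities (fnmatch also understands bracket expressions, which the option does not document). The file paths are made up, and treating several patterns as alternatives is an assumption.

```python
import fnmatch
import os

def matches_any(path, patterns):
    """Return True if the filename part of path matches at least one pattern."""
    name = os.path.basename(path)  # the directory part is ignored
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)

# Hypothetical input files; only the first has a matching filename.
paths = ["data/e3168/051117.pri0", "data/e3168/notes.txt", "pri0/readme"]
selected = [p for p in paths if matches_any(p, ["*.pri?"])]
print(selected)
```

Note that "pri0/readme" is rejected although its directory name looks similar to the pattern, because only "readme" is compared.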
--index-cache=FILE
Enable caching of the (internal) input file index. Before this program can write seismic sections, it first must know which input file contains the necessary traces. For that purpose an index of all available input files is built first. However, since the content of all input files must be scanned for the index, this can be quite a time-consuming process. To avoid lengthy, repeated scans every time the program is started, the --index-cache option can be used to cache the index of all input files in a separate FILE.
It is the responsibility of the user to ensure that the cached index is up-to-date and (still) corresponds to the files and directories given as input at the command line. If in doubt, simply delete the cache manually and let the utility save a new cache file during the next program run.
If the cache FILE already exists, the index will be read from it. If the file does not exist (yet), the index will be written to it after scanning the input files. Already existing cache files will never be overwritten! By default (i.e. without this option) caching of the index is disabled.
--output-dir=DIRECTORY
Save the resulting seismic sections to this DIRECTORY. The directory must already exist and be writable! Already existing files in that directory will not be overwritten unless the option --force-overwrite is used as well.
--force-overwrite
If this option is used, already existing files in the output directory will be overwritten without mercy!
The default behavior however is not to overwrite already existing files. Instead a new file is created with an additional number in between filename and extension.
--force-concat
Use this option to concatenate all resulting shot/receiver gathers into a single file. This can be useful if you plan to import the gathers into other software packages for further processing. (You only need to import one single file instead of many different files.)
By default however, a new output file is created for every single gather created.
--segy-format=FORMAT
Select one of the following predefined output formats:
SEGY
Standard SEG-Y revision 1 (default).
SUOLD
Seismic Unix (old) native binary format.
SUXDR
Seismic Unix platform independent XDR format.
--project=FILE
Use this mandatory option to indicate the file describing the experiment setup. For a detailed description of the file format see the project file section below.
Sometimes, this file is also called the "master" or the "geometry" file.
--shot-gather=FFIDs
The resulting seismic sections are organized as shot gathers. If no ffid range is specified, the program will try to write a seismic section for every ffid found in the project (geometry) file (see project file section below).
Alternatively, you can also specify a comma separated list of single ffids or ranges of ffids. Use two dots between the first and last ffid to specify a range. Please note that ffid lists and ranges must not contain any space characters!
Example: To produce seismic sections of the shots with ffid 1, 4, 5 and 6 you could use --shot-gather=1,4..6.
--receiver-gather=CHANNELs
The resulting seismic sections are organized as receiver gathers. If no channel range is given, the program will try to write a seismic section for every channel found in the project (geometry) file (see project file section below).
Alternatively, you can also specify a comma separated list of single channels or ranges of channels. Use two dots between the first and last channel to specify a range. Please note that channel lists and ranges must not contain any space characters!
Example: To produce seismic sections for receivers with the channel ids 10, 11, 24, 25 and 26 you would use --receiver-gather=10,11,24..26.
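The list syntax shared by --shot-gather and --receiver-gather can be illustrated with a small sketch. The function expand_id_list is hypothetical and only mimics the documented syntax (commas, two dots for ranges, no spaces); it is not taken from the GIPPtools source.

```python
def expand_id_list(spec):
    """Expand a list such as '1,4..6' into [1, 4, 5, 6]."""
    if " " in spec:
        raise ValueError("lists and ranges must not contain space characters")
    ids = []
    for item in spec.split(","):
        if ".." in item:
            first, last = item.split("..")
            ids.extend(range(int(first), int(last) + 1))  # last id is included
        else:
            ids.append(int(item))
    return ids

print(expand_id_list("1,4..6"))        # [1, 4, 5, 6]
print(expand_id_list("10,11,24..26"))  # [10, 11, 24, 25, 26]
```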
--trace-length=DURATION
Length of the traces in the resulting seismic section. The DURATION is given in seconds. Fractions of seconds will be rounded to microsecond accuracy. If there are not enough samples in the input, the trace in the seismic section will be padded. The default trace length is one minute.
Example: Use --trace-length=120 to obtain two minute long seismic sections consisting of 12000 samples per trace (assuming input data recorded at 100 Hz). To produce 12001 samples per trace you would use --trace-length=120.01 as command line option.
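The arithmetic behind this example is simply duration times sample rate. A minimal sketch (the function name is hypothetical, and rounding to the nearest sample is an assumption of the sketch, not a documented property of the utilities):

```python
def samples_per_trace(duration_s, sample_rate_hz):
    """Number of samples covered by a trace of the given duration."""
    return round(duration_s * sample_rate_hz)

print(samples_per_trace(120, 100))     # 12000
print(samples_per_trace(120.01, 100))  # 12001
```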
--trace-offset=SHIFT
Use this option to shift the start time of the traces in the seismic section relative to the trigger time of the shot (as read from the project file). The SHIFT is given in seconds. If no time offset is given, the program will default to begin the trace as close as possible to shot time.
Example: To start the seismic section 2 seconds before the shot time use --trace-offset=-2. (Note the minus sign! No spaces!)
--reduction-velocity=VELOCITY
Add a time delay proportional to the distance between source and receiver to every trace in the seismic section. This factor, more commonly known as reduction velocity, is given in meters per second. By default no reduction velocity is applied.
The distance between source and receiver point is calculated using the coordinates in the project file (see project file section below). Obviously, applying a reduction velocity must fail if the project file only contains dummy / place holder coordinates!
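As a rough illustration, the delay for one trace is the source-receiver distance divided by the reduction velocity. The sketch below estimates the distance with the haversine formula on a spherical Earth and ignores elevation. This is an assumption made for the example; the manual does not state which distance formula the utilities use.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an assumption of this sketch

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def reduction_delay_s(distance, velocity):
    """Time delay in seconds for a distance (m) and reduction velocity (m/s)."""
    return distance / velocity

# Source s21 and receiver rp1 from the example project file listings.
d = distance_m(-33.1968, 22.0695, -33.2133, 22.0783)
print(d, reduction_delay_s(d, 6500.0))
```

With dummy coordinates (all 0.0) the distance, and therefore the delay, would be zero for every trace, which is why real coordinates are required.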
A project file plays an important role when building shot or receiver gathers as it contains a description of the experiment setup. Chiefly, these are geographic locations as well as time information.
Project files are simple text files where every (non-comment) line represents one source or receiver point of the experiment. The general syntax rules are:
Everything from a '#' character up to the end of line is considered to be a comment (and will be ignored by the program).
All empty lines are ignored as well.
Any sequence of space characters or tab-stops in a line containing (any) text will be interpreted as column separator! The use of spaces inside (column) strings is not supported.
The number and content of the different columns varies. Lines describing seismic sources ("shots") will e.g. contain the location and trigger time of the source. Receiver lines, however, describe where and when the recorders were operating. The following listing is an example describing three blasts (the "seismic sources") carried out during an experiment in South Africa.
# ----------------------------------------------------------------
#  name  lat/lon/elev          ffid  shot time                optional
# ----------------------------------------------------------------
S  s21   -33.1968 22.0695 579  1     2005-11-17T06:05:01.170  7.5
S  s32   -33.1882 22.0644 566  2     2005-11-17T06:36:29.593  5.0 10
S  s41   -33.1767 22.0592 540  3     2005-11-17T07:12:36.225  7.5
In detail, the columns of source point lines have the following meaning:
Every source point line must start with the character 'S' in the first text column. (The software uses this as an indicator to distinguish Source point lines from lines describing receiver points.) Capitalization does not matter.
The second "name" column contains an arbitrary text string that makes it easier for humans to work with this file. You can place a description of the source point here ("at_yellow_house"), mileage along the profile, a stake number or anything else you think might be helpful.
This column is only used for user feedback by the GIPPtools software and its content will not appear in the resulting seismic section. You can also just use the same dummy string for each source point line. However, the column must exist! (Otherwise the software will get out of sync and try to interpret the following longitude column as latitude, elevation values as longitudes, etc.)
The next three columns define the location of the source point (latitude, longitude and elevation in that order). Latitude and longitude should be given in decimal degrees. Latitudes south of the equator are negative as well as longitudes west of Greenwich. Elevation should be given in meters. If you don't have the coordinates of your source points (yet) use some dummy values (like 0.0). The coordinates given here are entered into the SEG-Y trace header.
The coordinates are also used to calculate the absolute (i.e. non-negative) distance between source and receiver, which is required when applying a seismic reduction velocity to the seismic section (option --reduction-velocity).
The sixth column is the Field File IDentification (FFID). Every source point must have a unique (positive integer) FFID assigned to it. The FFID will be entered into the resulting SEG-Y trace header. Usually seismic processing software uses this number to identify the recorded traces.
The trigger time of the seismic source goes into the seventh column. It consists of date and time information given in ISO-8601 format (example: 2005-11-17T16:05:01.170). All programs of the GIPPtools package resolve time down to microseconds.
Use 'T' or '_' to concatenate date and time. If you use a space character instead, the time information will be interpreted as the next (optional) column.
The date/time information is followed by a variable number of optional columns. Unlike the previous columns no place holder / dummy entry is needed if no value is available!
The intended use for these columns is to transport arbitrary additional information that may be required by further processing steps into the resulting seismic section (e.g. "amount of explosives used" or "water depth"). If optional columns are used, the value given is always interpreted as a 4 byte IEEE-754 floating point number. The value of the first optional column is entered at the end of the 240 byte long SEG-Y trace header (bytes #237 to #240). A following second optional value is placed right before the value of the first column (bytes #233 to #236). The third optional value again is placed before the second (at #229 to #232) and so on.
If you use too many optional values (there is no hard limit built into the software) you will begin to overwrite important fields in the SEG-Y trace header.
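The placement rule for optional values can be worked out with a few lines of code. The sketch assumes that each value occupies exactly four bytes and that the first value ends at byte 240 of the trace header; the function name is hypothetical.

```python
def optional_value_bytes(count, header_length=240, width=4):
    """Return the (first, last) byte numbers used by each optional value.

    Byte numbers are 1-based. The first optional value sits at the very
    end of the trace header, every further value directly in front of
    the previous one.
    """
    ranges = []
    for index in range(count):
        last = header_length - index * width
        ranges.append((last - width + 1, last))
    return ranges

print(optional_value_bytes(3))  # [(237, 240), (233, 236), (229, 232)]
```

Checking the returned ranges against the SEG-Y trace header layout shows quickly how many optional values fit before standard header fields are overwritten.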
Unlike the variable length source point lines, receiver point lines always contain ten values (columns) describing the equipment used during the measurement, when they were recording data and where they were located while doing so. The following listing again is an example.
# --------------------------------------------------------------
#  name  lat/lon/elev          chan  recorder  start       stop
# --------------------------------------------------------------
R  rp1   -33.2133 22.0783 598  1     e3168 p0  2005-11-13  2005-11-19
R  rp2   -33.2125 22.0780 598  2     e3168 p1  2005-11-13  2005-11-19
R  rp3   -33.2116 22.0777 597  3     e3168 p2  2005-11-14  2005-11-19
R  rp4   -33.2110 22.0775 597  4     e3185 p0  2005-11-14  2005-11-20
R  rp5   -33.2102 22.0773 595  5     e3185 p1  2005-11-14  2005-11-20
R  rp6   -33.2093 22.0769 596  6     e3185 p2  2005-11-14  2005-11-20
R  rp7   -33.2083 22.0765 594  7     e3130 p0  2005-11-15  2005-11-20
R  rp8   -33.2074 22.0763 594  8     e3130 p1  2005-11-15  2005-11-20
R  rp9   -33.2065 22.0760 593  9     e3130 p2  2005-11-15  2005-11-21
In detail, the columns of receiver point lines have the following meaning:
Every receiver point line must start with the character 'R' in the first column. (The software uses this as an indicator to distinguish Receiver point lines from lines describing source points.) Capitalization does not matter.
The second 'name' column is an arbitrary text string that makes it easier for humans to work with this file. You can place a description of the receiver point here ("close_to_big_tree"), mileage along the profile, a stake number or anything else you think might be helpful.
This column is only used for user feedback by the GIPPtools software and its content will not appear in the resulting seismic section. You can also just use the same dummy string for each receiver point line. However, the column must exist! (Otherwise the software will get out of sync and try to interpret the following longitude column as latitude, elevation values as longitudes, etc.)
The next three columns define the location of the receiver point (latitude, longitude and elevation in that order). Latitude and longitude should be given in decimal degrees. Latitudes south of the equator are negative as well as longitudes west of Greenwich. Elevation should be given in meters. If you don't have the coordinates of your receiver points (yet) use some dummy values (like 0.0).
The coordinates are also used to calculate the absolute (i.e. non-negative) distance between source and receiver, which is required when applying a seismic reduction velocity to the seismic section (option --reduction-velocity).
The sixth column is the 'channel' number. Each receiver point in the experiment must have a unique positive integer channel number assigned to it. The channel number will be entered into the resulting SEG-Y trace header. Usually seismic processing software uses this number to identify the recorded traces.
Do not confuse this (experiment/profile unique) channel number with the instrument recording channel (see column #8).
The next two columns are needed to locate the data in the recorded files. Column seven contains the name of the recorder unit used to record the data. At the GIPP this is usually a five character long string like "e3456" or "e6789" for EarthData Loggers (EDL) or "c0043" for Cubes.
Column eight is used to indicate the recording channel of the respective recording equipment. Possible values are 'p0' to 'p5' for the primary EDL recording channels and 's0' to 's5' for the secondary channels. (If you used a three channel unit in the field, obviously only values from 'p0' to 'p2' make sense.) For one channel Cubes use 'p0' as channel descriptor.
If you are unsure about the correct values to enter use the mseedinfo or cubeinfo utility to inspect your input data. Both tools can list the recorder unit and the recorder channel name.
The last two columns describe the begin and end of the recording. They consist of date and time information given in ISO-8601 format (example: 2005-11-17T16:05:01.170). Depending on your experiment setup it may be enough to give just date information. But if you also enter time of day information here, you should at least specify hour and minutes.
Use 'T' or '_' to concatenate date and time. If you use a space character instead, the time information will not be interpreted correctly.
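The syntax rules for source and receiver point lines can be summarized in a short parser sketch. It is meant as a reading aid for the file format described above, not as the parser used by the utilities; all function and key names are hypothetical.

```python
from datetime import datetime

def parse_time(text):
    """Parse ISO-8601 date/time; 'T' or '_' may join date and time."""
    return datetime.fromisoformat(text.replace("_", "T"))

def parse_project(text):
    """Split project file text into lists of source and receiver points."""
    sources, receivers = [], []
    for raw in text.splitlines():
        cols = raw.split("#", 1)[0].split()  # drop comments, split on blanks
        if not cols:
            continue  # empty or comment-only line
        kind = cols[0].upper()  # capitalization does not matter
        if kind not in ("S", "R"):
            raise ValueError(f"unknown line type: {cols[0]}")
        point = {"name": cols[1], "lat": float(cols[2]),
                 "lon": float(cols[3]), "elev": float(cols[4])}
        if kind == "S":
            point["ffid"] = int(cols[5])
            point["time"] = parse_time(cols[6])
            point["optional"] = [float(c) for c in cols[7:]]
            sources.append(point)
        else:
            if len(cols) != 10:
                raise ValueError("receiver lines need exactly ten columns")
            point["channel"] = int(cols[5])
            point["recorder"] = cols[6]
            point["recorder_channel"] = cols[7]
            point["start"] = parse_time(cols[8])
            point["stop"] = parse_time(cols[9])
            receivers.append(point)
    return sources, receivers

example = """
# two lines taken from the listings in this manual page
S s32 -33.1882 22.0644 566 2 2005-11-17T06:36:29.593 5.0 10
R rp1 -33.2133 22.0783 598 1 e3168 p0 2005-11-13 2005-11-19
"""
sources, receivers = parse_project(example)
print(sources[0]["ffid"], sources[0]["optional"], receivers[0]["recorder"])
```

A receiver line with a space between date and time would produce more than ten columns and be rejected by this sketch.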
The following environment variables can optionally be used to influence the behavior of the various GIPPtool utilities during startup.
GIPPTOOLS_HOME
This environment variable is used to find the location of the GIPPtools installation directory. In particular, the Java class files that make up the GIPPtools are expected to be in the java subdirectory of GIPPTOOLS_HOME.
GIPPTOOLS_JAVA
The utilities of the GIPPtools are written in the programming language Java and consequently need a Java Runtime Environment (JRE) to execute. Use this variable to specify the location of the JRE which should be used.
GIPPTOOLS_OPTS
You can use this environment variable for additional fine-tuning of the Java runtime environment. This is typically used to set the Java heap size available to GIPPtool programs.
It is usually not necessary to define any of those variables as suitable values should be selected automatically. However, if the automatic detection built into the start script fails or you need to choose between different GIPPtool or Java runtime releases installed on your computer, these environment variables might become quite helpful to troubleshoot the situation.
Occasionally, the program will produce user feedback. In general, user messages are classified as INFO, WARNING or ERROR. The INFO messages are only displayed when the --verbose command line option is used. They usually report on the progress of the program run.
More important are WARNING messages. In general, they warn about (possible) problems that may influence the output. Although the program will continue with execution, you certainly should check the results carefully. You might not have gotten what you (thought you) asked for. Finally, ERROR messages inform about problems that cannot be resolved automatically. Program execution usually stops and the user must fix the problem first.
Use the following exit codes when calling the GIPPtool utility from scripts or other programs to see if it finished successfully. Any non-zero code indicates an ERROR.
Success.
Command line syntax or usage error.
Input data error.
Input file did not exist or could not be opened.
Error in internal program logic.
I/O error.
Other, unspecified errors.
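When calling the utilities from another program, checking for a non-zero exit code is sufficient to detect an error. A minimal sketch using Python's subprocess module (the wrapper function run_tool is hypothetical, and the cube2segy call is left commented out so the sketch runs without a GIPPtools installation):

```python
import subprocess
import sys

def run_tool(command):
    """Run a command and return True if it finished with exit code 0."""
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"{command[0]} failed with exit code {result.returncode}",
              file=sys.stderr)
    return result.returncode == 0

# Example call, using only options described in this manual page:
# run_tool(["cube2segy", "--shot-gather", "--project=example.project",
#           "./cube-data/"])
```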
To prepare seismic sections from Cube data, you can use:
cube2segy --shot-gather --project=example.project ./cube-data/
This will produce a shot gather for every source point defined in the example.project file. The Cube data will be read from the cube-data directory and the seismic sections will be written to the current working directory.
Prepare seismic sections for the shots with ffid 2001 and 2002 only.
cube2segy --shot-gather=2001,2002 --project=example.project ./cube-data/
Apply a reduction velocity of 6.5 km/s and shift all traces by half a second towards earlier times.
cube2segy --shot-gather --project=example.project --reduction-velocity=6500 --trace-offset=-0.5 ./cube-data/
Create seismic sections in the new (XDR) Seismic Unix format.
cube2segy --shot-gather --project=example.project --segy-format=suxdr ./cube-data/
$GIPPTOOLS_HOME/bin/cube2segy
The cube2segy "program". Usually just a symbolic link pointing to the standard GIPPtools start script.
$GIPPTOOLS_HOME/bin/mseed2segy
The mseed2segy "program". Usually just a symbolic link pointing to the standard GIPPtools start script.
$GIPPTOOLS_HOME/bin/gipptools
The GIPPtools start script. Almost all utilities of the GIPPtools package are started from this shell script.