# Minimal executable example — selectRecords() works entirely in memory
library(gmsp)
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following object is masked from 'package:base':
#>
#> %notin%
master <- data.table(
RecordID = c("aabbccdd00112233", "aabbccdd00112233", "eeff00112233aabb"),
OwnerID = c("NGAW", "NGAW", "CESMD"),
EventID = c("20100227T063452Z", "20100227T063452Z", "20110311T054624Z"),
StationID = c("ANTU", "ANTU", "MYG004"),
DIR = c("H1", "H2", "H1"),
EventMagnitude = c(8.8, 8.8, 9.1),
Repi = c(90, 90, 140)
)
sel <- selectRecords(master[EventMagnitude > 8 & DIR == "H1"])
print(sel)
#> RecordID OwnerID EventID StationID
#> <char> <char> <char> <char>
#> 1: eeff00112233aabb CESMD 20110311T054624Z MYG004
#> 2: aabbccdd00112233 NGAW 20100227T063452Z ANTU

gmsp ships an optional layer for managing a local
strong-motion record archive. It is separate from the
signal-processing core (AT2TS, TS2IMF,
TSL2PS, getIntensity) — you can use the core
without ever touching the indexing layer.
The indexing layer assumes records on disk in a fixed directory
structure. The base paths are yours to choose; functions that touch disk
take explicit path, path.records, or
path.index arguments.
<recordsDir>/                              ← you choose this
  <OwnerID>/                               e.g. "NGAW", "CESMD", "ESM"
    <EventID>/                             e.g. "20060803T030800Z"
      <StationID>/                         e.g. "NTYB"
        raw.owner/                         provider files as downloaded
          record.json                      owner-supplied metadata
          <component-files>                .AT2 / .v2 / .ac / .tr / ...
        raw/                               gmsp output of extractRecord()
          AT.<RecordID>.csv                WIDE: provider OCID columns (scaled to mm)
          AT.<RecordID>.json               DIR / OCID / NP / PGA / dt / Fs / Units
<indexDir>/                                ← you choose this
  RawFileTable.<OwnerID>.csv               provider file inventory
  RawRecordTable.<OwnerID>.csv             one row per RecordID
  RawIntensityTable.<OwnerID>.csv          per (RecordID, DIR), 20 IM scalars
  EventTable.<OwnerID>.csv                 event metadata
  StationTable.<OwnerID>.csv               station metadata
<selectionDir>/                            ← you choose this
  <name>.csv                               writeSelection() output
  <name>.json                              sidecar with audit metadata
| OwnerID | Format | Parser | Quantity | Notes |
|---|---|---|---|---|
| NGAW | AT2 | readAT2() | AT | PEER NGA-West2 (4-line header, NPTS/DT) |
| CESMD | V2 / V2c | readV2() | AT | multi-channel V2 or single-channel V2c |
| NWZ | V2A | readV2A() | AT | NWZ-flavoured V2 |
| GSC | TR (A/B/C/Z) | readTR() | AT | Geological Survey of Canada |
| IGP | ACA / LIS | readAC() | AT | Instituto Geofísico del Perú |
| UCR | ACB | readAC() | AT | Universidad de Costa Rica |
| Generic | two-col | readTwoCol() | AT | (t, s) ASCII columns; used by CAL, CENA, etc. |
| ISEE | ISEE | readISEE() | VT | Micromate / ISEE blasting seismograph (mm/s velocity, MicL dropped) |
Each parser returns a LONG data.table(t, OCID, s) for
one component file. parseRecord() is the dispatcher that
consults .OWNER_FORMAT and calls the right parser for the
owner.
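To make the LONG shape concrete, the sketch below builds a small table with the same three columns and reshapes it to one column per OCID using data.table::dcast(). It does not call any gmsp parser: the sample values and the OCID labels "HNE" and "HNN" are invented for illustration, and the two components are stacked in one table, as the output for a whole record would be.

```r
library(data.table)

# Hypothetical LONG table in the (t, OCID, s) shape; all values are made up.
long <- data.table(
  t    = rep(c(0.00, 0.01, 0.02), times = 2),
  OCID = rep(c("HNE", "HNN"), each = 3),
  s    = c(0.1, 0.3, -0.2, 0.0, -0.1, 0.4)
)

# Reshape to WIDE: one row per time step, one column per provider OCID.
wide <- dcast(long, t ~ OCID, value.var = "s")
print(wide)
```

Keeping the parser output LONG means every provider format can share one schema regardless of how many channels a file carries; the WIDE form is only needed once all components are known.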
parseRecord()      ── reads raw.owner/* via the owner's parser
      │               returns LONG (t, OCID, s) for all components
      ▼
mapComponents()    ── derives DIR labels H1 / H2 / UP from provider OCIDs
      │               H1/H2 are derived processing directions
      │               `extractRecord()` uses rotate = FALSE
      │               Returns NULL for arrays or 2-comp records
      ▼
alignComponents()  ── pads (or truncates) to equal NP across components
      │
      ▼
extractRecord()    ── scales to canonical mm via .parseUnits + .getSF
                      writes raw/<KIND>.<RecordID>.csv + <KIND>.<RecordID>.json
                      CSV columns remain provider OCID values; the JSON
                      sidecar stores the DIR -> OCID mapping.
                      KIND ∈ {AT, VT, DT} -- derived from the Units
                      suffix by .parseKind(), or forced by the
                      `kind = "VT"` argument (e.g. for blasting
                      records whose Units may be missing).
                      Sidecar peak field is named accordingly:
                      PGA (KIND=AT) / PGV (KIND=VT) / PGD (KIND=DT).
                      RecordID = first 16 hex chars of md5(CSV).
extractRecord() is the orchestrator; parsers and
mapComponents() are public so they can be reused or
audited. Public calls use parseRecord(.x, path) and
extractRecord(.x, path), where .x is the
one-record master subset and path is the records root.
After extractRecord() has produced raw/ outputs for some records, the
indexing functions scan the records tree and emit per-owner CSVs to
<indexDir>/:

- buildRawFileTable(): provider-file inventory (one row per
  ComponentID × FileID); reads raw.owner/record.json or
  raw.owner.tar.gz (post-archive safe).
- buildRawRecordTable(): one row per RecordID (NP = max(post-align),
  pad = max NP − min NP, Fs).
- buildRawIntensityTable(): calls getRawIntensities() per station;
  emits three rows per record (one per DIR), each carrying the 20
  AT-derivable scalars from getIntensity().

The provider-flatfile + USGS catalog join (buildEventTable()) is under
development and ships in inst/dev/; it is not yet part of the exported
API.
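The NP and pad columns of the record table can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not gmsp code: it reads NP as the longest component length and pad as the spread between the longest and shortest component, and it assumes short components are zero-padded at the tail. The component values are made up.

```r
# Hypothetical component series of unequal length (made-up values).
comps <- list(
  H1 = c(0.1, 0.3, -0.2, 0.0, 0.1),
  H2 = c(0.0, -0.1, 0.4),
  UP = c(0.2, 0.1, 0.0, -0.3, 0.2)
)

np_raw <- lengths(comps)
NP  <- max(np_raw)                  # common length after alignment
pad <- max(np_raw) - min(np_raw)    # spread between longest and shortest

# Assumption: short components are zero-padded at the tail.
aligned <- lapply(comps, function(x) c(x, numeric(NP - length(x))))
```

A non-zero pad is a cheap flag for records whose components were recorded or trimmed to different lengths.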
buildMaster() joins, per owner:

- RawRecordTable.<O>.csv (record list),
- EventTable.<O>.csv (event scalars, merged via fcoalesce with source
  precedence *.owner > *.USGS > *.ISC),
- StationTable.<O>.csv (station scalars including Vs30),

and emits a data.table keyed at (RecordID, DIR). It adds:

- Repi: epicentral distance (haversine, km),
- Rhyp: hypocentral distance,
  \(\sqrt{\mathrm{Repi}^2 + \mathrm{EventDepth}^2}\) (km).

After buildMaster() you can filter the master and pass the subset to
selectRecords() to produce a
(RecordID, OwnerID, EventID, StationID) selection, which is the input
contract for the readTS() family (readAT() / readVT() / readDT() are
KIND-specific wrappers around readTS(.x, path, kind = ...)) and for
writeSelection() (persists the selection to disk for orchestration).
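The two derived pieces that buildMaster() is described as adding, the precedence merge and the distances, can be sketched outside the package. The code below is an illustration only: the column names Magnitude.owner, Magnitude.USGS, and Magnitude.ISC, the helper haversine_km(), the spherical Earth radius of 6371 km, and the 30 km depth are all assumptions, and the internals of buildMaster() may differ.

```r
library(data.table)

# Hypothetical event table with one magnitude column per source.
ev <- data.table(
  EventID         = c("E1", "E2", "E3"),
  Magnitude.owner = c(7.1, NA, NA),
  Magnitude.USGS  = c(7.0, 6.4, NA),
  Magnitude.ISC   = c(7.2, 6.5, 5.9)
)

# fcoalesce() keeps the first non-missing value, left to right, so the
# argument order encodes the precedence owner > USGS > ISC.
ev[, EventMagnitude := fcoalesce(Magnitude.owner, Magnitude.USGS, Magnitude.ISC)]

# Great-circle distance on a sphere (haversine formula), in km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  rad  <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

Repi <- haversine_km(0, 0, 0, 1)   # one degree of longitude on the equator
Rhyp <- sqrt(Repi^2 + 30^2)        # assumed event depth of 30 km
```

Because Rhyp is built from Repi and a depth, Rhyp can never be smaller than Repi when both come from the same inputs; a master row where it is smaller points to inconsistent metadata.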
The natural composition for acceleration records is:
M <- buildMaster(path = "<your index path>")
Selection <- selectRecords(M[EventMagnitude > 7 & Repi < 100 & DIR == "H1"])
TS <- readAT(.x = Selection, path = "<your records path>")
ATS <- TS[, AT2TS(.SD, units.source = "mm", Fmax = 25),
by = .(RecordID, OwnerID, EventID, StationID)]

The output of readAT() is a wide table keyed by
(RecordID, OwnerID, EventID, StationID, t) with one column
per provider OCID. AT2TS() consumes it per
record. The shape is identical for readVT() and
readDT(); pair them with VT2TS() /
DT2TS(). Blasting records (e.g. ISEE) typically flow
through readVT() + VT2TS().
- auditSite(M): flags rows with missing or out-of-range StationVs30.
- auditDistances(M): flags lat/lon NA or out-of-range, negative depths,
  large Repi, geometric impossibility (Rhyp < Repi).
- auditParsers(.x = M, owner = "NGAW", path = ...): dry-runs
  parseRecord() per (EventID, StationID) of one owner and reports
  OK / FAIL / WARN with reason.

archiveRawOwner(path) compresses raw.owner/
to raw.owner.tar.gz after extraction has succeeded,
verifies the archive is readable, and only then unlinks the
original.
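The compress, verify, then delete order matters because it leaves the original in place whenever verification fails. A minimal base-R sketch of that order is shown below; archive_dir() is a hypothetical helper written for this illustration and is not the gmsp implementation of archiveRawOwner(). It uses R's internal tar so that no external tar program is needed.

```r
# Hypothetical helper illustrating compress -> verify -> unlink.
archive_dir <- function(dir) {
  stopifnot(dir.exists(dir))
  parent  <- dirname(normalizePath(dir))
  name    <- basename(dir)
  tarfile <- file.path(parent, paste0(name, ".tar.gz"))

  # Work from the parent directory so the archive stores relative paths.
  old <- setwd(parent)
  on.exit(setwd(old), add = TRUE)
  utils::tar(tarfile, files = name, compression = "gzip", tar = "internal")

  # Verify: the archive must list every file that is still on disk.
  listed <- tryCatch(
    utils::untar(tarfile, list = TRUE, tar = "internal"),
    error = function(e) character()
  )
  expected <- file.path(name, list.files(name, recursive = TRUE))
  if (!all(expected %in% listed)) {
    stop("archive verification failed; original directory kept")
  }

  # Only now remove the original directory.
  unlink(name, recursive = TRUE)
  tarfile
}

# Demo on a throwaway directory under tempdir().
root <- file.path(tempdir(), "station_demo")
dir.create(file.path(root, "raw.owner"), recursive = TRUE)
writeLines('{"demo": true}', file.path(root, "raw.owner", "record.json"))
tarfile <- archive_dir(file.path(root, "raw.owner"))
```

The check here only compares file listings; a stricter variant could extract to a temporary directory and compare checksums before unlinking.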
raw.owner/ is the user’s responsibility.
Examples under examples/maintenance/ in the source
repository show a pattern for ingestion (USGS catalog matching, staging
/ promote / rollback, etc.).

RecordID is a 16-character hex hash
(openssl::md5 of the WIDE CSV body, truncated). It is
stable across re-extraction of the same record.
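The idea of a truncated content hash can be shown with base R alone. The sketch below uses tools::md5sum() on a temporary file as a stand-in for openssl::md5; record_id_sketch() is a hypothetical name, and exactly which bytes gmsp hashes is not specified here, so the resulting value is not expected to match a real RecordID.

```r
# Hypothetical helper: first 16 hex characters of the MD5 of a file's bytes.
record_id_sketch <- function(csv_path) {
  substr(unname(tools::md5sum(csv_path)), 1, 16)
}

# Made-up WIDE CSV body written to a temporary file.
csv <- tempfile(fileext = ".csv")
writeLines(c("t,HNE,HNN", "0,0.1,0", "0.01,0.3,-0.1"), csv)

id1 <- record_id_sketch(csv)
id2 <- record_id_sketch(csv)   # same bytes, so the same identifier
```

Hashing the content rather than the file name is what keeps the identifier stable: as long as re-extraction writes the same bytes, the same 16 characters come out.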