Indexer
Also see this page describing Indexer Processing
The Indexer maintains an index of received products. It uses this
index to associate related products into events, based on eventid or
time, latitude, and longitude. When multiple sources submit information
for the same event, the indexer determines which source is considered
preferred for that type of information.
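The association rule described above can be sketched as follows. This is only an illustrative sketch, not the Indexer's implementation: the `Summary` class, the `associates` function, and both thresholds are hypothetical names and values chosen for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, chosen only for illustration.
MAX_TIME_DIFF_SECONDS = 16.0
MAX_DISTANCE_KM = 100.0


@dataclass
class Summary:
    """Minimal stand-in for the location information of a product or event."""
    eventid: Optional[str]
    time: Optional[float]       # seconds since the epoch
    latitude: Optional[float]   # decimal degrees
    longitude: Optional[float]  # decimal degrees


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def associates(product: Summary, event: Summary) -> bool:
    """Associate by matching eventid, else by closeness in time and space."""
    if product.eventid is not None and product.eventid == event.eventid:
        return True
    values = (product.time, product.latitude, product.longitude,
              event.time, event.latitude, event.longitude)
    if any(v is None for v in values):
        return False  # not enough information to compare locations
    close_in_time = abs(product.time - event.time) <= MAX_TIME_DIFF_SECONDS
    close_in_space = distance_km(product.latitude, product.longitude,
                                 event.latitude, event.longitude) <= MAX_DISTANCE_KM
    return close_in_time and close_in_space
```

A product without an eventid still associates when its time and location fall within both thresholds of an existing event; a product lacking time, latitude, or longitude can only associate by eventid.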
The Index
The index is typically a database, although it is not required to
be. The default implementation uses JDBC, and should be able to maintain
an index in any JDBC compliant database.
Archive Policies
Archive policies define rules for when the indexer should remove
information from its index.
Search
Search is enabled by default. The indexer listens on a socket so that
external users can search and retrieve information from the index.
See the command line client's --search option, or the
SearchSocket API class.
Searches and results use an XML format. See
etc/schema/indexer.xsd for details.
Indexer Events
When a product arrives and is added to the index, the indexer
keeps track of the changes it makes. Each Indexer Event is a group of
one or more changes that were made in response to one product arriving.
This tracking is performed through an onEventTrigger database trigger. For
a technical description of this trigger and instructions for implementing it
on the MySQL database, see Configuring the Product
Index to Use MySQL
Change Types
- EVENT_ADDED
- An event was added to the index. This occurs when a product
arrives that cannot associate to an existing event, but has enough
information (time, latitude, longitude) to create a new event.
- EVENT_SPLIT
- An event that was part of another event in the index is now
considered a separate event. This usually occurs when a network
updates its location far enough away from the "parent" event.
There may be several EVENT_SPLIT changes, but there will always also
be an EVENT_UPDATED for the event that the split events were split
from.
- EVENT_UPDATED
- An event that already existed in the index was updated. This
occurs when a product arrives and associates to an existing event. This
does not necessarily mean the preferred event properties (eventid,
time, latitude, longitude, magnitude, depth) have changed, only that
information associated to this event is different than before.
- EVENT_DELETED
- An event that already existed in the index was deleted. This
effectively means the event did not occur. This occurs when a product
arrives, associates to an existing event, and because of the new
information the event no longer has a time, latitude, or longitude.
- EVENT_MERGED
- An event that already existed in the index merged with
another event. This means this event still occurred, but is now part of
another event (and is not preferred).
There may be several EVENT_MERGED changes, but there will always also
be an EVENT_UPDATED for the event that the merged events were merged
into.
- EVENT_ARCHIVED
- An event was removed from the index due to a configured
archive policy. The event still occurred, but is no longer being tracked
by this indexer.
- PRODUCT_ADDED
- A product arrived, was unable to associate to an event, and
did not have enough information (time, latitude, longitude) to create a
new event.
- PRODUCT_UPDATED
- An unassociated product was updated. If an update causes the
product to associate, there will be an EVENT_UPDATED change instead of
PRODUCT_UPDATED.
- PRODUCT_DELETED
- An unassociated product was deleted.
- PRODUCT_ARCHIVED
- An unassociated product was removed from the index due to a
configured archive policy.
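The change types listed above, and the stated rule that EVENT_SPLIT and EVENT_MERGED changes are always accompanied by an EVENT_UPDATED, can be sketched as follows. The `ChangeType` enum and the `is_consistent` helper are hypothetical names for illustration and do not reproduce the Indexer's Java API.

```python
from enum import Enum, auto


class ChangeType(Enum):
    """The ten change types an Indexer Event may contain."""
    EVENT_ADDED = auto()
    EVENT_SPLIT = auto()
    EVENT_UPDATED = auto()
    EVENT_DELETED = auto()
    EVENT_MERGED = auto()
    EVENT_ARCHIVED = auto()
    PRODUCT_ADDED = auto()
    PRODUCT_UPDATED = auto()
    PRODUCT_DELETED = auto()
    PRODUCT_ARCHIVED = auto()


def is_consistent(changes):
    """Check that any split or merge comes with an EVENT_UPDATED change."""
    kinds = set(changes)
    needs_update = kinds & {ChangeType.EVENT_SPLIT, ChangeType.EVENT_MERGED}
    return not needs_update or ChangeType.EVENT_UPDATED in kinds
```

For example, a group containing one EVENT_MERGED and one EVENT_UPDATED passes the check, while a group containing only EVENT_MERGED does not.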
Example Indexer Configuration File
In this example, an indexer is configured to:
- Download "origin" and "shakemap-input" type products
- Call a listener named "shakemap_listener" whenever an event's
preferred magnitude, latitude, longitude, depth, or time changes, which
triggers the executable script "/home/shake/bin/ProductClient/trigger_pdl".
- Automatically clean up events after 30 days, unassociated products
after one week, and old versions of products after one hour
; note this configuration does not include senders,
; which would be required for sending products.
receivers = receiver_pdl
listeners = indexer
enableTracker = false
; receive from production hubs
[receiver_pdl]
type = gov.usgs.earthquake.distribution.EIDSNotificationReceiver
storageDirectory = data/receiver_storage
indexFile = data/receiver_index.db
serverHost = prod01-pdl01.cr.usgs.gov
serverPort = 39977
alternateServers = prod02-pdl01.cr.usgs.gov:39977
cleanupInterval = 900000
storageage = 900000
; indexer is only listener
; currently it only receives origin messages
[indexer]
type = gov.usgs.earthquake.indexer.Indexer
listenerIndexFile = data/indexer_listener_index.db
storageDirectory = data/indexer_product_storage
indexfile = data/indexer_product_index.db
includeTypes = origin, associate, disassociate, trump, trump-origin
listeners = indexerlistener_example
archivePolicy = policyOldEvents, policyOldProducts, policyOldProductVersions
[policyOldEvents]
; remove events after one month
type = gov.usgs.earthquake.indexer.ArchivePolicy
maxAge = 2592000000
[policyOldProducts]
; remove unassociated products after one week
type = gov.usgs.earthquake.indexer.ProductArchivePolicy
maxAge = 604800000
onlyUnassociated = true
[policyOldProductVersions]
; remove old versions of products after one hour
type = gov.usgs.earthquake.indexer.ProductArchivePolicy
maxAge = 3600000
onlySuperseded = true
; whenever the indexer makes a change, it calls this listener
; currently it only receives changes triggered by origin products
[indexerlistener_example]
type = gov.usgs.earthquake.indexer.ExternalIndexerListener
storageDirectory = data/indexerlistener_storage
command = echo
processPreferredOnly = true
includeTypes = origin
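The age and interval values in this configuration are expressed in milliseconds. As a quick check, the following sketch converts the values used above into readable durations; the dictionary and its labels are written for this example only.

```python
from datetime import timedelta

# Millisecond values copied from the example configuration above.
VALUES_MS = {
    "receiver_pdl cleanupInterval": 900000,
    "receiver_pdl storageage": 900000,
    "policyOldEvents maxAge": 2592000000,
    "policyOldProducts maxAge": 604800000,
    "policyOldProductVersions maxAge": 3600000,
}


def to_duration(milliseconds):
    """Convert a millisecond count into a timedelta."""
    return timedelta(milliseconds=milliseconds)


if __name__ == "__main__":
    for label, ms in VALUES_MS.items():
        print(f"{label}: {to_duration(ms)}")
```

The three maxAge values correspond to 30 days, 7 days, and 1 hour respectively, and the receiver values to 15 minutes.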
Indexer Summarization
As an aid to indexing, the Indexer maintains summaries of products,
associating them to seismic events using time, latitude and
longitude. Using these three attributes, the Indexer assigns an eventID
to the summaries, so that multiple products can be efficiently cross-referenced
to a single event.
As part of the summarization process, the Indexer extracts a specific subset
of properties from various products, so that important key aspects of an event
are visible without having to interrogate the details of multiple products.
Summarized Properties
The following properties are extracted from products and are associated with
summarizations of events:
- region
- The name of a particular geographic region. Initially the Indexer makes an
attempt at obtaining the region directly from the origin or
geoserve products. Failing that, it derives the region using the event's
latitude and longitude. This derivation is performed by the feplus feature
of the Indexer, where individual regions are defined by latitude/longitude
within the etc/config/regions.xml file.
- maxmmi
- The maximum shaking intensity for the event. Although this value is found
in the shakemap product, maxmmi is obtained directly from the losspager
product. If it is not available from losspager, maxmmi is obtained from the
dyfi product.
- alertlevel
- A categorized fatality or economic loss level, obtained from the
losspager product:
- Green
- 0 fatalities OR less than 1 million U.S. dollars economic loss.
- Yellow
- 1-99 fatalities OR less than 100 million U.S. dollars economic loss.
- Orange
- 100-999 fatalities OR less than 1 billion U.S. dollars economic loss.
- Red
- 1000+ fatalities OR greater than 1 billion U.S. dollars economic loss.
- review_status
- Whether this event has been reviewed by a human, obtained from the
origin product.
- event_type
- The type of event, such as earthquake or landslide, obtained
from the origin product.
- azimuthal_gap
- Azimuthal Gap is obtained from the origin product.
- magnitude
- Magnitude is obtained from the origin product.
- num_Resp
- The number of individuals completing the DYFI web dialogue for this event,
obtained from the nresponses attribute of the event_data.xml file included in
the dyfi product.
- tsunamiFlag
- A [“true”|“false”] Boolean string indicating if
the tsunami flag should be triggered automatically, obtained from the
geoserve product.
- utcOffset
- Number of minutes between the epicenter timezone and UTC, obtained
from the geoserve product.
- significance
- An integer value indicating the significance of an event, calculated
from properties of the origin, losspager and dyfi
products.
Significance is calculated from the following multi-step formula:
- magnitude_significance
- = (100/6.5) * magnitude^2
- pager_significance
- = 2000 if red, 1000 if orange, 500 if yellow
- dyfi_significance
- = MIN(num_Resp, 1000) * maxmmi * 0.10
- significance
- = MAX(magnitude_significance, pager_significance) + dyfi_significance
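The multi-step significance formula above can be worked through in a short sketch. The function name and parameters are hypothetical, and two details are assumptions not stated above: an alert level other than red, orange, or yellow (or a missing one) contributes 0, and the result is rounded to the nearest integer.

```python
# Pager contribution per alert level, as listed in the formula above.
PAGER_SIGNIFICANCE = {"red": 2000, "orange": 1000, "yellow": 500}


def significance(magnitude, alertlevel=None, num_resp=0, maxmmi=0.0):
    """Combine magnitude, pager, and dyfi terms into one integer score."""
    magnitude_significance = (100 / 6.5) * magnitude ** 2
    # Assumption: green or missing alert levels contribute nothing.
    pager_significance = PAGER_SIGNIFICANCE.get((alertlevel or "").lower(), 0)
    # The number of DYFI responses is capped at 1000.
    dyfi_significance = min(num_resp, 1000) * maxmmi * 0.10
    total = max(magnitude_significance, pager_significance) + dyfi_significance
    # Assumption: round to the nearest integer.
    return round(total)
```

For a magnitude 6.5 event with no other information, the magnitude term is (100/6.5) * 6.5^2 = 650. With an orange alert, 2500 DYFI responses, and a maxmmi of 7.0, the pager term (1000) exceeds the magnitude term and the DYFI term adds 1000 * 7.0 * 0.10 = 700, for a total of 1700.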
Product Summarized Preferred Weight
Within each type of product, the summary with the largest preferred weight is
considered preferred. This calculated weight is the sum of four components:
- DEFAULT_PREFERRED_WEIGHT = 1
- All product summaries have a preferred weight of at least 1.
- SAME_SOURCE_WEIGHT = 5
- Weight added when product source is same as event source.
- AUTHORITATIVE_WEIGHT = 100
- Weight added when product author is in the product's authoritative
region.
- AUTHORITATIVE_EVENT_WEIGHT = 50
- Weight added when product refers to an authoritative event.
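As a worked example of the sum described above, the following sketch adds the four components and picks the preferred summary. The constants use the values listed above; the function names and the boolean parameters are hypothetical.

```python
DEFAULT_PREFERRED_WEIGHT = 1
SAME_SOURCE_WEIGHT = 5
AUTHORITATIVE_WEIGHT = 100
AUTHORITATIVE_EVENT_WEIGHT = 50


def preferred_weight(same_source, authoritative_region, authoritative_event):
    """Sum the weight components that apply to one product summary."""
    weight = DEFAULT_PREFERRED_WEIGHT
    if same_source:
        weight += SAME_SOURCE_WEIGHT
    if authoritative_region:
        weight += AUTHORITATIVE_WEIGHT
    if authoritative_event:
        weight += AUTHORITATIVE_EVENT_WEIGHT
    return weight


def preferred(summaries):
    """Return the name of the summary with the largest preferred weight.

    `summaries` maps a name to a (same_source, authoritative_region,
    authoritative_event) tuple of booleans.
    """
    return max(summaries, key=lambda name: preferred_weight(*summaries[name]))
```

A summary from the event's own source that is not authoritative weighs 1 + 5 = 6, while a summary from a different source in its authoritative region weighs 1 + 100 = 101 and would be preferred.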
Indexer Components
Indexer SQL Dependencies
The Indexer is dependent on two SQL components: the feplus system and
OnEventUpdate stored procedures:
- mysql_feplus
- Found in the schema/mysql_feplus directory, feplus implements
region-identifying functionality based on latitude and longitude. It uses
the definitions in the etc/config/regions.xml file to associate a
region-name with a particular latitude/longitude location of an event or
product. The OnEventUpdate stored procedures use this functionality for
origin and geoserve products, which ultimately determine
properties such as event significance.
- onEventTrigger Stored Procedures
- Found in the schema/productIndexOnEventUpdateMysql.sql file,
these procedures summarize products and events for efficient retrieval.
The trigger is invoked when the Indexer's Java classes use
time/latitude/longitude information in products to create or modify events.
Some Major Java Components
- JDBCProductIndex
- This class implements the ProductIndex interface to maintain events,
product summaries, event summaries and properties. It contains and executes
the SQL manipulations of the database.
- Indexer
- This key class uses JDBCProductIndex to maintain the database. It also
adds and removes listeners, receives products, and sends notifications. It
extends the DefaultNotificationListener class.
Indexer Modules
Specific products sometimes have special needs for indexing; the three existing
product types of this nature are the shakemap, dyfi, and
moment-tensor products. This special indexing is configured in
config.ini, as documented in the
Indexer Components section of the
configuration documentation and illustrated below.
The following code snippet from config.ini shows the minimum entries
necessary for requesting special indexing for the shakemap and dyfi products:
[indexer]
modules = indexer_module_shakemap, indexer_module_dyfi
[indexer_module_shakemap]
type = gov.usgs.earthquake.shakemap.ShakeMapIndexerModule
[indexer_module_dyfi]
type = gov.usgs.earthquake.dyfi.DYFIIndexerModule
[indexer_module_momenttensor]
type = gov.usgs.earthquake.momenttensor.MTIndexerModule
- The modules = line creates labels for the shakemap and
dyfi definitions that follow.
- The [indexer_module_shakemap], [indexer_module_dyfi],
and [indexer_module_momenttensor] lines mark the start of those
definitions.
- The three type = lines specify the Java classes that will handle
the special indexing for those three product types.
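To illustrate how the labels on the modules = line refer to the sections that follow, the sketch below reads the snippet with Python's standard configparser and looks up the type of each listed label. This only demonstrates the label-to-section relationship; it is not how the Indexer itself loads config.ini, and `resolve_modules` is a hypothetical helper.

```python
import configparser

# The snippet from config.ini shown above.
SNIPPET = """
[indexer]
modules = indexer_module_shakemap, indexer_module_dyfi
[indexer_module_shakemap]
type = gov.usgs.earthquake.shakemap.ShakeMapIndexerModule
[indexer_module_dyfi]
type = gov.usgs.earthquake.dyfi.DYFIIndexerModule
[indexer_module_momenttensor]
type = gov.usgs.earthquake.momenttensor.MTIndexerModule
"""


def resolve_modules(text):
    """Map each label on the modules line to the type in its section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser["indexer"]["modules"]
    labels = [label.strip() for label in raw.split(",") if label.strip()]
    return {label: parser[label]["type"] for label in labels}


if __name__ == "__main__":
    for label, class_name in resolve_modules(SNIPPET).items():
        print(f"{label} -> {class_name}")
```

With the snippet as written, only the two labels named on the modules = line are resolved; the [indexer_module_momenttensor] section is defined but not referenced by that line.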
As has been noted elsewhere in this documentation, the custom programming of
these special indexing classes requires coordination between the product producer
and the PDL web team at jmfee@usgs.gov.
- gov.usgs.earthquake.shakemap.ShakeMapIndexerModule
- This class handles the special indexing of shakemap products.
- gov.usgs.earthquake.dyfi.DYFIIndexerModule
- This class handles the special indexing of dyfi products.
- gov.usgs.earthquake.momenttensor.MTIndexerModule
- This class adjusts the weight of moment tensor products.