Kaggle Competition: LANL Earthquake Prediction
1 Problem
Forecasting earthquakes is one of the most important problems in Earth science because of their devastating consequences. Current scientific studies related to earthquake forecasting focus on three key points: when the event will occur, where it will occur, and how large it will be.
In this competition, you will address when the earthquake will take place. Specifically, you’ll predict the time remaining before laboratory earthquakes occur from real-time seismic data.
If this challenge is solved and the physics are ultimately shown to scale from the laboratory to the field, researchers will have the potential to improve earthquake hazard assessments that could save lives and billions of dollars in infrastructure.
This challenge is hosted by Los Alamos National Laboratory, which enhances national security by ensuring the safety of the U.S. nuclear stockpile, developing technologies to reduce threats from weapons of mass destruction, and solving problems related to energy, environment, infrastructure, health, and global security concerns.
2 Data
The goal of this competition is to use seismic signals to predict the timing of laboratory earthquakes. The data comes from a well-known experimental set-up used to study earthquake physics. The acoustic_data input signal is used to predict the time remaining before the next laboratory earthquake (time_to_failure).
The training data is a single, continuous segment of experimental data. The test data consists of a folder containing many small segments. The data within each test file is continuous, but the test files do not represent a continuous segment of the experiment; thus, the predictions cannot be assumed to follow the same regular pattern seen in the training file.
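Because the training file is one continuous run while the test files are short, unrelated segments, a common first step is to cut the training run into windows that resemble test segments. The following is a minimal sketch of that idea, not the competition's prescribed method. The function name, the `window_len` and `step` parameters, and the toy values are all assumptions made for illustration; the segment length is left as a parameter because it is not stated here. Each window is labeled with the `time_to_failure` of its last row, mirroring how the test predictions are defined.

```python
from typing import Iterator, Sequence, Tuple


def iter_training_windows(
    acoustic_data: Sequence[int],
    time_to_failure: Sequence[float],
    window_len: int,
    step: int,
) -> Iterator[Tuple[Sequence[int], float]]:
    """Yield (signal_window, target) pairs cut from the continuous training run.

    The target of each window is the time_to_failure on the window's last row.
    window_len and step are hypothetical parameters, not values from the source.
    """
    if len(acoustic_data) != len(time_to_failure):
        raise ValueError("both columns must have the same number of rows")
    if window_len <= 0 or step <= 0:
        raise ValueError("window_len and step must be positive")

    last_start = len(acoustic_data) - window_len
    for start in range(0, last_start + 1, step):
        end = start + window_len
        yield acoustic_data[start:end], time_to_failure[end - 1]


# Made-up toy values standing in for the two columns of train.csv.
signal = [3, -1, 4, 1, -5, 9, 2, -6]
ttf = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
windows = list(iter_training_windows(signal, ttf, window_len=4, step=2))
```

Setting `step` equal to `window_len` gives non-overlapping windows; a smaller `step` gives more (overlapping, hence correlated) training examples.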
For each seg_id in the test folder, you should predict a single time_to_failure corresponding to the time between the last row of the segment and the next laboratory earthquake.
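The one-prediction-per-segment rule can be sketched as a loop over the test folder that writes one row per segment. This is only an illustration under stated assumptions: that each test file is a CSV named after its `seg_id` with an `acoustic_data` column, and that the submission file has the columns `seg_id` and `time_to_failure`. The helper names and the constant baseline are hypothetical placeholders for a real model.

```python
import csv
from pathlib import Path
from typing import Callable, List


def read_signal(segment_path: Path) -> List[int]:
    """Read the acoustic_data column of one test segment (assumed CSV layout)."""
    with segment_path.open(newline="") as f:
        return [int(row["acoustic_data"]) for row in csv.DictReader(f)]


def write_submission(
    test_dir: Path,
    out_path: Path,
    predict: Callable[[List[int]], float],
) -> int:
    """Write one (seg_id, time_to_failure) row per segment file in test_dir.

    Returns the number of segments written. The seg_id is taken from the
    file name without its extension, which is an assumption about naming.
    """
    segment_paths = sorted(test_dir.glob("*.csv"))
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["seg_id", "time_to_failure"])
        for path in segment_paths:
            writer.writerow([path.stem, predict(read_signal(path))])
    return len(segment_paths)


def constant_baseline(signal: List[int]) -> float:
    """Hypothetical placeholder model: ignores the signal entirely."""
    return 5.0
```

A call such as `write_submission(Path("test"), Path("submission.csv"), constant_baseline)` would then produce a file in the sample submission's shape, with `constant_baseline` swapped for a trained predictor.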
File descriptions
train.csv - A single, continuous training segment of experimental data.
test - A folder containing many small segments of test data.
sample_submission.csv - A sample submission file in the correct format.
Data fields
acoustic_data - the seismic signal [int16]
time_to_failure - the time (in seconds) until the next laboratory earthquake [float64]
seg_id - the test segment ids for which predictions should be made (one prediction per segment)
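The field types listed above can be passed directly to the CSV reader so that the signal is stored as int16 rather than a wider default type. The sketch below uses pandas and assumes that train.csv has a header row with exactly these two column names; `load_train` is a hypothetical helper, and the toy rows are made-up values used only to show the resulting dtypes.

```python
import io

import numpy as np
import pandas as pd

# Column types as listed under "Data fields".
TRAIN_DTYPES = {"acoustic_data": np.int16, "time_to_failure": np.float64}


def load_train(source, nrows=None):
    """Load training data with explicit dtypes.

    source may be a file path or a file-like object; nrows optionally
    limits how many rows are read, which helps when exploring a large file.
    """
    return pd.read_csv(source, dtype=TRAIN_DTYPES, nrows=nrows)


# Made-up rows standing in for the real file.
toy_csv = io.StringIO(
    "acoustic_data,time_to_failure\n"
    "12,1.4691\n"
    "6,1.4691\n"
    "8,1.4690\n"
)
train = load_train(toy_csv)
```

With a real file, the call would look like `load_train("train.csv", nrows=1_000_000)` to inspect only the first part of the run.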
3 Existing Solutions
3.1 Basic Feature Benchmark
- Basic Feature Benchmark