Data Scientist

Data Analytics Lifecycle

2017-09-09  张荣恩Sophia

Value

Time-focused

Easy transition of the project

Repeatable and validatable

Lifecycle

1. Discovery

Learn

the business domain to determine the general problem type, and learn from relevant past experience

Assess

the available technology, raw data, the right people, and the time scope

Formulate

initial hypotheses

2. Data Preparation

Prepare the workspace (the analytic sandbox)

Perform ELT (Extract, Load, Transform)

Understand the data: compare what you need vs. what you have

Clean & Normalize data

Use descriptive statistics and visualization to get an overview of data quality
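The cleaning, normalizing, and descriptive-statistics steps above can be sketched in a few lines. This is only a minimal illustration using the Python standard library; the column `raw_ages`, the helper names, and the choice of min-max scaling are assumptions made for the example, not something the lifecycle prescribes.

```python
import statistics

def clean(values):
    """Drop missing entries (None) and convert the rest to floats."""
    return [float(v) for v in values if v is not None]

def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def describe(values):
    """Basic descriptive statistics for a quick data-quality overview."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }

# Hypothetical raw column with gaps, as it might arrive in the sandbox.
raw_ages = [23, None, 35, 41, None, 29]

ages = clean(raw_ages)
missing_ratio = 1 - len(ages) / len(raw_ages)  # share of rows that were missing
summary = describe(ages)
scaled = min_max_normalize(ages)

print(f"missing: {missing_ratio:.0%}")
print(summary)
print(scaled)
```

The missing-value ratio and the summary dictionary give a first look at "what you have", which can then be compared against what the analysis needs.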

3. Model Planning

Select methods based on data volume and structure, hypotheses, and business objectives

Determine the workflow of candidate tests

Identify modeling assumptions

Explore the dataset and select significant variables via a dimension reduction method
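As one possible illustration of variable selection, the sketch below keeps only the variables whose Pearson correlation with the target passes a threshold. Strictly speaking this is a simple correlation filter (feature selection), swapped in here because it is short; it is not a full dimension reduction method such as PCA. The feature names, the data, and the 0.5 threshold are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no linear relationship to measure.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_variables(features, target, threshold=0.5):
    """Keep variables whose absolute correlation with the target meets the threshold."""
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    ]

# Hypothetical candidate variables and target.
features = {
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "store_id": [7.0, 7.0, 7.0, 7.0, 7.0],   # constant, carries no signal
    "noise": [1.0, -1.0, 0.0, 1.0, -1.0],    # weakly related to the target
}
sales = [2.1, 3.9, 6.2, 8.0, 9.8]

selected = select_variables(features, sales)
print(selected)
```

A filter like this only captures linear, one-variable-at-a-time relationships, so it is best treated as an exploration aid during planning rather than a final choice of variables.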

4. Model Building

Split the available data into training data and test data

Get the best environment to run the model (fast hardware, parallel processing)
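The training/test split mentioned above could look roughly like this. It is a minimal sketch with the standard library; the function name `train_test_split`, the 30% test ratio, and the fixed seed are assumptions chosen for the example.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: ten row identifiers.
rows = list(range(10))
train, test = train_test_split(rows, test_ratio=0.3)

print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which fits the "repeatable" goal of the lifecycle; the model is fitted on `train` only and evaluated on `test`.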

5. Communicate Results

Interpret the results to identify key findings

Quantify the business value according to the customers

6. Operationalize

Run a pilot to assess the benefits

Deliver and execute the final result in operation

Define a process to improve the model as needed

Key roles for an Analytic Project

Business User

Project Sponsor

Project Manager

Business Intelligence Analyst: usually comes from the customer company and brings domain expertise, with a deep understanding of the data, APIs

Data Engineer: has deep technical skills, such as writing SQL queries and extracting data for analysis

DBA: configures the database environment to support analytic needs

Data Scientist: conducts data modeling and valid analysis to meet the overall analytic objectives

 

Deliverables to meet stakeholders' needs

Presentation for Sponsors:

Big-picture takeaways and key messages that aid decision-making

Clean, easy-to-understand visualizations

Presentation for Analysts

Business process changes

Reporting changes

More technical graphs (ROC curves, density plots, histograms)

Code and Specification for technical people
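For the ROC curves listed among the technical graphs for analysts, the points of the curve and the area under it can be computed directly from model scores. The sketch below is one way to do it in plain Python; the labels and scores are made-up illustration data, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve.

    labels: 1 for positive, 0 for negative; scores: higher means more likely positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC needs at least one positive and one negative label")

    # Lower the decision threshold one distinct score at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        is_last = i == len(ranked) - 1
        # Emit a point only after all items sharing this score are counted.
        if is_last or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical test-set labels and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(round(area, 3))
```

The list of points can be handed to any plotting tool to draw the curve, while the single AUC number is easier to quote in a summary for analysts.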

Case Study
