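Learning Resources for Core Big Data Engineer Skills (with Links)
This article lists excellent learning resources for each of the core skills a big data engineer needs.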
1
Introduction to data engineering
A Beginner's Guide to Data Engineering (Part I):
https://medium.com/@rchang/a-beginners-guide-to-data-engineering-part-i-4227c5c457d7
A Beginner's Guide to Data Engineering (Part II):
https://medium.com/@rchang/a-beginners-guide-to-data-engineering-part-ii-47c4e7cbda71
A Beginner's Guide to Data Engineering (Part III):
https://medium.com/@rchang/a-beginners-guide-to-data-engineering-the-series-finale-2cc92ff14b0
2
Basic language requirement: Python
A Complete Tutorial to Learn Data Science with Python from Scratch:
https://www.analyticsvidhya.com/blog/2016/01/complete-tutorial-learn-data-science-python-scratch-2/
Introduction to Data Science Using Python:
https://trainings.analyticsvidhya.com/courses/coursev1:AnalyticsVidhya+DS101+2018T2/about
Learn Python on Codecademy:
https://www.codecademy.com/learn/learn-python
Think Python by Allen Downey:
http://www.greenteapress.com/thinkpython/thinkpython.pdf
Non-Programmer's Tutorial for Python 3:
https://upload.wikimedia.org/wikipedia/commons/1/1d/Non-Programmer%27s_Tutorial_for_Python_3.pdf
3
Operating system knowledge
Linux Server Management and Security:
https://www.coursera.org/learn/linux-server-management-security
CS401: Operating Systems:
https://learn.saylor.org/course/cs401
The Raspberry Pi Platform and Python Programming for the Raspberry Pi:
https://www.coursera.org/learn/raspberry-pi-platform
4
Database knowledge: SQL and NoSQL
Learn SQL for free:
https://www.codecademy.com/learn/learn-sql
A cheat sheet for quickly looking up SQL commands:
https://github.com/enochtangg/quick-SQL-cheatsheet
MySQL tutorial:
http://www.mysqltutorial.org/
Learn Microsoft SQL Server:
https://www.tutorialspoint.com/ms_sql_server/
PostgreSQL tutorial:
http://www.postgresqltutorial.com/
Oracle Live SQL:
https://livesql.oracle.com/apex/f?p=590:1000
NoSQL databases
MongoDB courses from MongoDB itself:
https://university.mongodb.com/courses/catalog
Introduction to MongoDB:
https://www.coursera.org/learn/introduction-mongodb
Learn Cassandra:
https://www.tutorialspoint.com/cassandra/index.htm
Redis Enterprise:
https://university.redislabs.com/
Google Bigtable:
https://www.coursera.org/learn/gcp-fundamentals
Couchbase:
http://training.couchbase.com/store
5
Hadoop, Hive, Pig, Spark, Kafka...
Hadoop fundamentals:
https://cognitiveclass.ai/learn/hadoop/
Hadoop Starter Kit:
https://www.udemy.com/hadoopstarterkit/
Hortonworks tutorials:
https://hortonworks.com/tutorials/
Introduction to MapReduce:
https://www.analyticsvidhya.com/blog/2014/05/introduction-mapreduce/
Hadoop Beyond Traditional MapReduce, Simplified:
https://www.analyticsvidhya.com/blog/2014/11/hadoop-mapreduce/
Hadoop Explained:
https://www.packtpub.com/packt/free-ebook/hadoop-explained
Hadoop: What You Need to Know:
https://www.oreilly.com/data/free/hadoop-what-you-need-to-know.csp?intcmp=il-data-free-lp-lgen_free_reports_page
Data-Intensive Text Processing with MapReduce:
https://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf
Hadoop LinkedIn group:
https://www.linkedin.com/groups/988957/profile
A comprehensive guide to Apache Spark, RDDs, and DataFrames (using PySpark):
https://www.analyticsvidhya.com/blog/2016/09/comprehensive-introduction-to-apache-spark-rdds-dataframes-using-pyspark/
A step-by-step guide for beginners learning SparkR:
https://www.analyticsvidhya.com/blog/2016/06/learning-path-step-step-guide-beginners-learn-sparkr/
Spark fundamentals:
https://cognitiveclass.ai/courses/what-is-spark/
Introduction to Apache Spark and AWS:
https://www.coursera.org/learn/bigdata-cluster-apache-spark-and-aws
Comprehensive courses covering Hadoop, Spark, Hive, and Spark SQL
Big Data Essentials: HDFS, MapReduce, and Spark RDD:
https://www.coursera.org/learn/big-data-essentials
Big Data Analysis: Hive, Spark SQL, DataFrames, and GraphFrames:
https://www.coursera.org/learn/big-data-analysis
Big Data Applications: Real-Time Streaming:
https://www.coursera.org/learn/real-time-streaming-big-data
Simplifying data pipelines with Apache Kafka:
https://cognitiveclass.ai/courses/simplifyingdatapipelines/
Official Kafka documentation:
https://kafka.apache.org/intro
Putting the power of Kafka into the hands of data scientists:
https://multithreaded.stitchfix.com/blog/2018/09/05/datahighway/
6
Basic machine learning knowledge
A beginner's guide to machine learning basics:
https://www.analyticsvidhya.com/blog/2015/06/machine-learning-basics/
Essentials of machine learning algorithms:
https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/
Must-read books on machine learning and artificial intelligence for beginners:
https://www.analyticsvidhya.com/blog/2018/10/read-books-for-beginners-machine-learning-artificial-intelligence/
24 ultimate data science projects to boost your knowledge and skills:
https://www.analyticsvidhya.com/blog/2018/05/24-ultimate-data-science-projects-to-boost-your-knowledge-and-skills/
Related full article: https://www.toutiao.com/i6650363054452113927/