【Tableau】Data Visualization and
About this Specialization and Course
Welcome to the Course!
What You Will Learn
- Data analysis and modeling
- Data communication and storytelling
Learning Path
- Define Context -> Ask questions and make hypotheses
- Data Analysis -> Data Visualization
- Write your Data Story -> Data Visualization
- Tell your Data Story -> Data Communication and persuasion
By the End of the Course You Will Be Able To
- Craft right questions to ensure your analysis projects succeed
- Leverage questions to design and implement logical structured analysis plans
- Create important graphs in Tableau
- Transform data and make dashboards in Tableau
- Tell data stories
- Design effective slide presentations to showcase your data story
- Deliver compelling business presentations
Introduction 1
Tips for becoming a Data Analyst
- Ask Questions, Nourish Curiosity and Embrace the Unknown.
  - a. Like learning
  - b. Be self-motivated to try new things
  - c. Easily adapt to new environments
- Start Thinking about Everything You See as a Dependent and Independent Variable.
  - Numbers can be divided into dependent and independent variables.
  - The dependent variable is the measure you are most interested in understanding. It depends on, or changes in response to, other factors (the independent variables).
- Start Exploring the Advantages of Continuous vs. Discrete Variables.
  - Discrete: easier to understand, often less precise (bar graphs)
  - Continuous: harder to interpret by eye, often more detailed information (line graphs)
- Listen and Contribute (data analysis projects are almost always collaborations!).
- Train Your Skepticism Muscle.
  When someone is extremely confident, stay skeptical; things may be more complicated and messier than people think.
- Seek Details.
- Cherish Precision.
- Best Practices Do NOT Equal Common Practices.
- Expectations Matter!
  If the analysis results do not match expectations, then no matter how well the analysis was done, it will not help.
- Put Yourself in Other People's Shoes.
  You need to understand other people's perspectives.
Asking the Right Questions
When I recruit for Business Intelligence/Business Analysis roles, it's important that the students have the following coursework/knowledge...
Top 3 Responses:
- Communication Skills
- SQL and Query Skills
- Basic Analytics
Ask a lot of the right questions. Be a sponge and soak up as much information as you possibly can in the time your project allows. To put yourself in others' shoes, keep asking open-ended questions.
Asking more questions reduces the need to have all the answers. -- Donald Peterson
Asking Questions Before You Have Data
SMART Objectives
Questions:
- What problems is this business having that you hope to solve by developing this project?
- Can you tell me more about how this problem is affecting the business?
- What is your ideal outcome of this project?
Remember to really listen to the answers of stakeholders because this is your chance to figure out what their needs are.
- Specific
- Measurable
- Attainable
- Relevant
- Time-bound
Specific and Measurable
How should my business metric (Dependent Variable) change if my recommendations are put into action, and by how much? -> Make it very clear what you are dealing with.
What gets measured gets improved. -- Peter Drucker
An example:
Before version: Increase the number of returning visitors [1] on a month-by-month basis [2] by 15% compared to the same month last year [3].
[1]: Dependent variable (rows)
[2], [3]: Columns needed (date, and returning visitors during the same time last year)
After version: The goal of the project is, within two months, to analyze archived click-stream data to determine the website changes that will most efficiently increase revenues by 15% compared to the same month last year.
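The metric in the "Before version" goal is a percent change per month against the same month last year. A minimal Python sketch of that arithmetic follows. The visitor counts, the function names, and the small float tolerance are all assumptions made up for illustration, not course data.

```python
# Hypothetical monthly counts of returning visitors (illustrative numbers only).
last_year = {"Jan": 1000, "Feb": 1200, "Mar": 900}
this_year = {"Jan": 1150, "Feb": 1260, "Mar": 1080}

TARGET = 0.15  # the 15% increase named in the SMART goal


def yoy_change(current: float, previous: float) -> float:
    """Fractional change versus the same month last year."""
    return (current - previous) / previous


def months_meeting_target(this_year, last_year, target=TARGET):
    """Return {month: (change, met_target)} for every month present in both years."""
    result = {}
    for month, previous in last_year.items():
        if month in this_year:
            change = yoy_change(this_year[month], previous)
            # Small tolerance avoids float round-off right at the threshold.
            result[month] = (change, change >= target - 1e-9)
    return result


report = months_meeting_target(this_year, last_year)
for month, (change, met) in report.items():
    print(f"{month}: {change:+.1%} {'meets' if met else 'misses'} the target")
```

Writing the goal this way makes the "Measurable" part concrete: the dependent variable is the count per month, and the two columns needed are the date and last year's count for the same month.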
Final steps
- Make sure to show your first draft of the SMART goals to your stakeholders.
- Go back and forth with them until everybody signs off on the goals.
- Depending on the context, you can supplement the goals with a document specifying things like other important deadlines, who will assess the project, and what types of business process changes are or are not up for grabs.
Your Stakeholders
Listening to Stakeholders during Elicitation
Elicitation: The process of drawing out or bringing forth.
Elicit information from groups, plan and conduct group elicitation sessions with working groups to assess alternatives, uncertainties and value and risk preferences. -- Job description of SAS Data Scientist
What do you do during Elicitation? -- Ask questions.
Think about elicitation sessions as trying to achieve three main goals:
1. Identify your key stakeholders.
See "Five Questions to Identify Key Stakeholders".
2. Identify Independent Variables to Test
- Include in-person meetings with as many of your potential stakeholders as you can, especially with people who you know are going to be highly impacted by your project.
- Good questions:
  - What has been tried before?
  - How did it turn out?
  - What do you think might solve this problem?
- The answers to these questions will give you an idea of what factors you should operationalize as independent variables in your analyses.
  They will also help you learn about potential constraints on the business process changes you will ultimately suggest at the end of the project.
- Learn as much as you can about what types of changes are easy or hard to make, given your company's business culture or data architecture.
3. Determine whether stakeholders agree about the problem to be solved
To summarize, part of your job as a business analyst, and certainly part of what will make you most successful, will be keeping thoughtful tabs on who participates in the context in which you are trying to solve your business problem.
Throughout your project, keep talking to people, keep asking questions, keep listening and keep thinking about how you can turn what people say into variables you can test. No matter how long you've been in a specific industry, your stakeholders are the best domain experts in the specific problem you are trying to solve.
Asking them questions and listening to their answers will be the best way you can ultimately take advantage of their business knowledge in your data analysis.
Stakeholder Expectations Matter
Four levels of analytics:
1. Descriptive analytics: What's happening
2. Diagnostic analytics: Why things are happening (Finding root causes)
3. Predictive analytics: What is going to happen (Forecasting)
4. Prescriptive analytics: Giving recommendations
Pay attention to the analytic tools and results they are willing to work with.
Structured Pyramid Analysis Plans
Using SPAPs to Structure Your Thinking
Having a Structured Data Analysis Plan Makes Sure You:
- Stay on task
- Identify gaps in your thinking
- Explain your thinking to others
- Report your progress
- Can work as a team
SPAP Structure
SPAP: Structured Pyramid Analysis Plan
Using SPAP to Create Insights
Visualization strategy
- Make one or two charts (restricting yourself to bar charts, scatter plots, and line charts during this phase) to assess every single category in layers two and three of your SPAP. Hold off on making visualizations for deeper layers for now: if you don't see an obvious effect at these top layers, you're less likely to see one at more detailed layers.
- Briefly describe what your charts would look like next to each variable or in the bottom layer of your pyramid so that they're easy to keep track of.
- One by one, work through each of the categories in layers two and three to see if any of the charts unearth obvious patterns.
  For each graph you make, ask yourself: do any patterns stand out or catch my eye?
  - If not, highlight or mark the category with a color that means "not likely to lead to an insight", and don't go any further down the pyramid tracks beneath that category.
  - If it looks like there may be some kind of relationship between your SMART metric and your independent variable, mark it with a color or symbol that represents how likely you think it is to lead to an insight.
  - If there is definitely a relationship between your SMART metric and your independent variable, mark it with a symbol that means "come back to me" (work your way down all of the layers of that part of the pyramid).
- Make graphs for each layer of subcategories until you think you have a good, strong hypothesis about what's going on and why your SMART metric is being impacted by these variables.
- As you go through the layers of your pyramid, make sure to incorporate what you learn into your plan. Add new hypotheses or cross out ones you know are no longer relevant. Look at what patterns emerge, and refine and streamline your hypotheses. (For example, if you found two possible categories of factors that likely have very strong relationships with your SMART metric, it will likely be important to examine how these factors interact, and you'll likely want to add that to your analysis plan.)
- Remember to continue to get feedback throughout the process as well, to make sure what you are working on makes sense and is useful.
- Eventually, by the end of this process, you will whittle your way down to the factors that seem to have the strongest relationship with your SMART metric.
- By the time you have worked through the entire SPAP:
  - You will either have a strong hypothesis about what business changes could be implemented to achieve your SMART goal, or you will have a much clearer idea of what other resources you need to finish the project.
  - Further, just by virtue of going through the process, you will have a way to document what you've done for your team and stakeholders, and a mechanism for splitting work up with other members of your team if there's more work to do.
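The marking and pruning routine described above can be sketched as a small tree structure in Python. This is only one way to keep track of an SPAP. The mark labels, the `Node` class, and the example categories are hypothetical and are not taken from the course.

```python
from dataclasses import dataclass, field

# Marks used while reviewing charts (the labels are an assumption, not course terminology).
UNLIKELY, MAYBE, FOLLOW_UP, UNREVIEWED = "unlikely", "maybe", "follow up", "unreviewed"


@dataclass
class Node:
    """One category in the pyramid, with its planned chart and review mark."""
    name: str
    chart: str = ""
    mark: str = UNREVIEWED
    children: list = field(default_factory=list)


def next_to_explore(node):
    """List categories still worth charting, skipping everything under an 'unlikely' mark."""
    if node.mark == UNLIKELY:
        return []  # do not go further down this track
    todo = [node.name] if node.mark == UNREVIEWED else []
    for child in node.children:
        todo.extend(next_to_explore(child))
    return todo


# Hypothetical pyramid for a returning-visitors SMART metric.
plan = Node("Returning visitors", mark=FOLLOW_UP, children=[
    Node("Traffic source", "bar chart by source", FOLLOW_UP, [
        Node("Email campaigns", "line chart over time"),
        Node("Paid search", "line chart over time"),
    ]),
    Node("Device type", "bar chart by device", UNLIKELY, [
        Node("Mobile OS version", "bar chart by version"),
    ]),
])

print(next_to_explore(plan))
```

Because the "Device type" branch is marked unlikely, nothing beneath it is listed, which mirrors the rule of not going further down a track that showed no pattern.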
Data Visualization with Tableau
Introduction 2
Use Data Visualization to Drive Your Analysis
Why Tableau?
Using Tableau to Determine How Much You Can Make as a Data Analyst
Meet Your Salary Data
Columns:
- Offered Yearly Wage
- Prevailing Yearly Wage
- Decision Date
- Company
- City
- Nationality
- Visa Type
- Job Title Sub-category
Just for 2015 Data:
- Years required Experience
- Education Level Required
- Date Application Received
Meet Your Dognition Data
Our Analysis Plan
Let's Get Started!
Salaries of Data-Related Jobs: Your First Graph
- Load the data
- Correctly classify the columns as Dimensions and Measures
- In this video:
  - Dependent variable: Paid Wage Per Year -> Drag to Rows shelf
  - Independent variable: Job Title Subgroup -> Drag to Columns shelf
- Capsules (pills):
  - Green ones: Measures
  - Blue ones: Dimensions
Formatting and Exporting Your First Graph
Digging Deeper Using the Rows and Columns Shelves
Population Standard Deviation:
- Use when the data represents the general population very well.
- Use when you do not care whether the data relates to the general population.
- You just want to describe the specific data set and do not care how it relates to other data sets.
Standard Deviation (sample):
- Use when you interpret the data as if it represents the general population, that is, as a sample drawn from it.
- The sample is probably biased in some way or may not be a perfect representation of the general population.
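The difference between the two measures is only the divisor: n for the population version and n - 1 for the sample version. A minimal Python sketch using the standard library shows this. The wage values are made up for illustration.

```python
import statistics

# Hypothetical yearly wages (illustrative values only).
wages = [60000, 65000, 70000, 75000, 80000]

# Population standard deviation: divide by n. Describes exactly this data set.
population_sd = statistics.pstdev(wages)

# Sample standard deviation: divide by n - 1. Treats the data as a sample
# drawn from a larger population, so it comes out slightly larger.
sample_sd = statistics.stdev(wages)

print(f"population: {population_sd:.2f}, sample: {sample_sd:.2f}")
```

With only a few rows the gap between the two is noticeable. With thousands of rows it becomes negligible.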
Working with the Marks Card
Understanding the Marks Card
Outliers, Filtering and Groups
Removing Outliers using Scatter Chart
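- Standard deviation
- Analysis > Aggregate Measures
- Filters (set specific values)
- Select some points > Group (group the outliers by case number, put the case number group on the Filters shelf, and select "Other")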
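The same idea, flagging points far from the mean, grouping them by case number, and keeping only the "Other" rows, can be sketched outside Tableau. The rows and the cutoff of 2 standard deviations below are assumptions chosen for illustration. The notes do not state a specific cutoff.

```python
import statistics

# Hypothetical (case_number, yearly_wage) rows; the last value is an obvious outlier.
rows = [(1, 60000), (2, 62000), (3, 58000), (4, 61000), (5, 59000),
        (6, 63000), (7, 57000), (8, 60500), (9, 59500), (10, 900000)]

K = 2  # assumed cutoff: flag points more than K standard deviations from the mean

wages = [wage for _, wage in rows]
mean = statistics.mean(wages)
sd = statistics.stdev(wages)

# "Group" the outlier case numbers, then keep everything else ("Other").
outlier_cases = {case for case, wage in rows if abs(wage - mean) > K * sd}
kept = [(case, wage) for case, wage in rows if case not in outlier_cases]

print("outlier cases:", sorted(outlier_cases))
print("rows kept:", len(kept))
```

Filtering by case number rather than by wage value keeps the exclusion explicit: anyone reading the filter can see exactly which records were dropped.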
Analyzing Data-related Salaries in Different States Using Filtering and Groups
- Create groups in State (combine each state's full name and its abbreviation into one group)
Line Graphs and Box Plots
When to Use Line Graphs
Dates as Hierarchical Dimensions or Measures
- Hierarchy: Year > Quarter > Month
- Create a new hierarchy: drag one dimension onto another dimension
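Drilling down a date hierarchy amounts to grouping the same rows by a progressively finer key. A minimal Python sketch of that idea follows. The rows and the helper names `level_key` and `average_by` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (decision_date, yearly_wage) rows.
rows = [
    (date(2015, 1, 15), 70000),
    (date(2015, 2, 3), 80000),
    (date(2015, 5, 20), 90000),
    (date(2016, 1, 9), 100000),
]


def level_key(d, level):
    """Truncate a date to the chosen level of the Year > Quarter > Month hierarchy."""
    quarter = (d.month - 1) // 3 + 1
    if level == "year":
        return (d.year,)
    if level == "quarter":
        return (d.year, quarter)
    if level == "month":
        return (d.year, quarter, d.month)
    raise ValueError(f"unknown level: {level}")


def average_by(rows, level):
    """Average wage per group at the given hierarchy level."""
    groups = defaultdict(list)
    for d, wage in rows:
        groups[level_key(d, level)].append(wage)
    return {key: sum(v) / len(v) for key, v in sorted(groups.items())}


print(average_by(rows, "year"))
print(average_by(rows, "quarter"))
```

Each deeper level keeps the parent parts in its key, so quarter 1 of 2015 and quarter 1 of 2016 stay separate groups.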
Analyzing Data-related Salaries over Time Using Date Hierarchies
Analyzing Data-related Salaries over Time Using Trend Lines
Analyzing Data-related Salaries over Time Using Box Plots
Dynamic Data Manipulation and Presentation in Tableau
Introduction 3
Customizing and Sharing New Data in Tableau
Row-level Calculations
Tableau Calculation types
How to Write Calculations
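- Join two tables
- In the calculation editor, variables (fields) are shown in orange, and different types of functions have different colors (light blue: aggregate calculations)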
Calculations that Make Filtering More Efficient
- Row-level calculations: produce a separate value for every single row of your data (like making a new column in Excel).
- Calculations can be used as filters (IF and CASE ... WHEN ... END statements)
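A row-level calculation used as a filter can be sketched in Python as a function applied to each row, followed by a selection on its result. The rows, field names, and labels below are assumptions for illustration. They loosely mirror the offered and prevailing wage columns in the salary data.

```python
# Hypothetical rows; field names loosely mirror the salary data columns.
rows = [
    {"company": "A", "offered_wage": 70000, "prevailing_wage": 65000},
    {"company": "B", "offered_wage": 60000, "prevailing_wage": 72000},
    {"company": "C", "offered_wage": 80000, "prevailing_wage": 80000},
]


def wage_status(row):
    """Row-level calculation: one value per row, like a new spreadsheet column."""
    if row["offered_wage"] < row["prevailing_wage"]:
        return "below prevailing"
    return "at or above prevailing"


# Add the calculated column to every row.
for row in rows:
    row["wage_status"] = wage_status(row)

# Use the calculated column as a filter.
underpaying = [row["company"] for row in rows if row["wage_status"] == "below prevailing"]
print(underpaying)
```

Computing the label once per row and filtering on it keeps the condition in one place, which is the efficiency point behind using a calculation as a filter.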
Identifying Companies that Pay Less than the Prevailing Wage
Blending and Aggregation-level Calculations
Blending Price Parity Data with Our Salary Data
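- When a large table is blended with a small table, the small table is the primary table and the large table is the secondary table.
- Characteristics:
  - The result contains nothing beyond what is in the small table; values that appear in the small table but not in the large table are shown as null.
  - Fields from the large table that is blended with the small table become measures and cannot be changed back into dimensions, because dimensions are what define the level of detail created when the tables are blended.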
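Assuming blending behaves as these notes describe, it resembles aggregating the secondary table to the linking field and then left-joining it onto the primary table. The Python sketch below illustrates only that behavior, with made-up tables. It is not a description of how Tableau implements blending.

```python
from collections import defaultdict

# Hypothetical primary (small) table: one row per state.
primary = [{"state": "CA", "price_parity": 112.0},
           {"state": "TX", "price_parity": 97.0},
           {"state": "WY", "price_parity": 96.0}]

# Hypothetical secondary (large) table: many rows per state.
secondary = [{"state": "CA", "wage": 100000}, {"state": "CA", "wage": 120000},
             {"state": "TX", "wage": 90000}, {"state": "NY", "wage": 110000}]

# Step 1: aggregate the secondary table to the linking field (state).
wages_by_state = defaultdict(list)
for row in secondary:
    wages_by_state[row["state"]].append(row["wage"])
avg_wage = {state: sum(w) / len(w) for state, w in wages_by_state.items()}

# Step 2: attach the aggregate to each primary row (a left join from the primary).
blended = [{**row, "avg_wage": avg_wage.get(row["state"])} for row in primary]

for row in blended:
    print(row)
```

In this sketch NY never appears because it is not in the primary table, and WY gets `None` because it has no matching secondary rows, matching the two characteristics listed above.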
Adjusting Data-Related Salaries for Cost of Living
Table Calculations and Parameters
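Calculating Which States Have the Top Adjusted Salaries within Job Subcategories
- Table calculations (when there are multiple pills on Columns, sorting gives odd results; to solve this, we use a calculation to tell Tableau what is happening in the view)
- Steps: drag the data onto Detail, create a table calculation, choose Rank, do not select any of the hard-coded sort options (Never), go straight to Advanced, choose the category you want to rank by, place the table calculation after a broader dimension pill (so that the broader dimension is the grouping basis), and convert it to a dimension (if it was a measure)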
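What the rank table calculation computes is a ranking that restarts inside each job subcategory. A minimal Python sketch of that partitioned rank follows. The rows and the `rank_within` helper are hypothetical, and ties simply receive consecutive positions here.

```python
from collections import defaultdict

# Hypothetical (job_subcategory, state, adjusted_salary) rows.
rows = [
    ("Data Analyst", "CA", 82000), ("Data Analyst", "TX", 88000),
    ("Data Analyst", "WA", 85000), ("Data Scientist", "CA", 120000),
    ("Data Scientist", "TX", 115000), ("Data Scientist", "WA", 125000),
]


def rank_within(rows):
    """Rank states by salary (1 = highest), restarting for each job subcategory."""
    by_category = defaultdict(list)
    for category, state, salary in rows:
        by_category[category].append((state, salary))

    ranked = []
    for category, members in by_category.items():
        # Ties are not handled specially: tied rows get consecutive positions.
        ordered = sorted(members, key=lambda m: m[1], reverse=True)
        for position, (state, salary) in enumerate(ordered, start=1):
            ranked.append((category, state, salary, position))
    return ranked


for row in rank_within(rows):
    print(row)
```

The grouping step plays the role of the broader dimension pill: it decides where the ranking restarts.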
Using Parameters to Define Top States
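- Create Parameter
- Create a calculation: rank <= Top X (logical type, Boolean)
- Drag the calculation to the Marks card to edit the table calculation, then put it on Filters and select True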
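The parameter-driven filter boils down to a Boolean test, `rank <= Top X`, where Top X can be changed without editing the calculation. A short Python sketch follows, assuming the ranks have already been computed within each subcategory. The rows and the `top_states` helper are hypothetical.

```python
# Hypothetical (job_subcategory, state, adjusted_salary, rank) rows,
# where rank was already computed within each subcategory.
ranked = [
    ("Data Analyst", "TX", 88000, 1), ("Data Analyst", "WA", 85000, 2),
    ("Data Analyst", "CA", 82000, 3), ("Data Scientist", "WA", 125000, 1),
    ("Data Scientist", "CA", 120000, 2), ("Data Scientist", "TX", 115000, 3),
]


def top_states(ranked, top_x):
    """Keep rows where the Boolean calculation `rank <= top_x` is true."""
    if top_x < 1:
        raise ValueError("top_x must be at least 1")
    return [row for row in ranked if row[3] <= top_x]


# Changing the parameter changes the filter without touching the calculation.
print(len(top_states(ranked, 1)))
print(len(top_states(ranked, 2)))
```

Passing `top_x` as an argument is the analogue of the parameter control: the viewer picks the value and the same filter logic is reused.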
Dashboards and Story Points
Calculating Which Companies Have the Top Adjusted Salaries within Job Subcategories
Designing a Dashboard to Determine Where you Should Apply for a Data-Related Job
- Creating actions
- A filter on one chart can be applied to multiple sheets in the dashboard