011 Big Data and Cloud Computing: A Comprehensive Guide

2019-07-27  胡巴Lei特

1. Objective

This tutorial on Big Data and cloud computing covers what cloud storage is, Big Data in the cloud, the characteristics of cloud computing, cloud computing services and cloud hosting, cloud data storage and deployment models, cloud computing companies and cloud service providers, cloud infrastructure, and the advantages of and issues with cloud computing.

Big Data and cloud computing

2. Introduction to Big Data and Cloud Computing

Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network, typically the Internet. It relies on virtualization.
It provides resources on demand, whether storage, compute, or something else, and follows a pay-per-use model: you pay only for the amount of resources you use.
For example, if you want to give a client a demo on a cluster of more than 100 machines and you do not have that many machines available, you can rent them from a cloud provider for the duration of the demo instead of buying them.
The cloud plays an important role in the Big Data world by providing horizontally scalable, optimized infrastructure that supports practical implementations of Big Data.
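
The pay-per-use idea can be made concrete with a small calculation. The sketch below assumes a flat hourly rate per machine; the function name, the rate, and the demo duration are hypothetical values chosen for illustration, not real provider pricing.

```python
def pay_per_use_cost(machines: int, hours: float, rate_per_machine_hour: float) -> float:
    """Cost of renting `machines` for `hours` at a flat hourly rate per machine."""
    if machines < 0 or hours < 0 or rate_per_machine_hour < 0:
        raise ValueError("inputs must be non-negative")
    return machines * hours * rate_per_machine_hour


# Hypothetical numbers: a 100-machine demo cluster rented for 3 hours
# at 0.10 currency units per machine-hour.
demo_cost = pay_per_use_cost(machines=100, hours=3, rate_per_machine_hour=0.10)
print(f"Demo cluster cost: {demo_cost:.2f}")
```

Once the demo ends and the machines are released, the charges stop, which is the main difference from buying 100 machines outright.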

3. Cloud Computing and Big Data

In cloud computing, data is stored in data centers and delivered to end users from there. Automatic backup and recovery of data are also provided for business continuity, and all of these resources are available in the cloud. Users do not know the exact physical location of the resources provided to them. All you need is a client device, such as a desktop, laptop, or phone, and an Internet connection.
There are multiple ways to access the cloud:

  1. Applications or software as a service (SaaS), e.g. Salesforce.com, Dropbox, Google Drive.
  2. Platform as a service (PaaS)
  3. Infrastructure as a service (IaaS)

4. Features of Cloud Computing

Let us look at a few features of cloud computing:

a. Scalability

Scalability is provided through distributed computing.

b. Elasticity

Customers use and pay for only as much resource as they need. In cloud computing, elasticity is defined as the degree to which a system is able to adapt to workload changes in an autonomic manner, so that at any time the available resources match the current demand as closely as possible.
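
One way to picture elasticity is as a rule that recomputes how many instances are needed whenever demand changes. The following is a minimal sketch under stated assumptions: demand is measured in requests per second, every instance has the same fixed capacity, and the minimum and maximum instance counts are hypothetical limits. Real autoscalers use richer signals and policies.

```python
import math


def instances_needed(demand_rps: float, capacity_per_instance_rps: float,
                     min_instances: int = 1, max_instances: int = 20) -> int:
    """Smallest instance count whose total capacity covers demand, clamped to the limits."""
    if capacity_per_instance_rps <= 0:
        raise ValueError("capacity_per_instance_rps must be positive")
    wanted = math.ceil(demand_rps / capacity_per_instance_rps)
    return max(min_instances, min(max_instances, wanted))


# Hypothetical workload over time, in requests per second.
for demand in [50, 480, 1200, 90]:
    print(demand, "->", instances_needed(demand, capacity_per_instance_rps=100))
```

The instance count rises with the load and falls back when the load drops, so the resources in use track the current demand.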

c. Resource Pooling

The same resources can be used by multiple organizations. The provider's computing resources are pooled to serve multiple consumers through a multi-tenant model, with different resources dynamically assigned and reassigned according to consumer demand.

d. Self-Service

Customers are given an easy-to-use interface through which they can choose the services they want. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed, without requiring human interaction with the provider.

e. Low Costs

You are charged only for the computing resources you use, and you need not buy expensive infrastructure. Pricing on a utility computing basis is usage-based, and fewer IT skills are required for implementation.

f. Fault Tolerance

The system can recover when a part of the cloud system fails to respond.
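
A simple form of fault tolerance is failover: if one node does not respond, the request is retried on another. The sketch below illustrates only that idea. The node functions are hypothetical stand-ins for real services, and production systems add timeouts, health checks, and backoff.

```python
from typing import Callable, Sequence


def call_with_failover(replicas: Sequence[Callable[[], str]]) -> str:
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def unresponsive_node() -> str:
    # Hypothetical node that simulates a failure to respond.
    raise TimeoutError("node did not respond")


def healthy_node() -> str:
    # Hypothetical backup node that answers normally.
    return "ok from backup node"


print(call_with_failover([unresponsive_node, healthy_node]))
```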

5. Cloud Deployment Models

There are mainly two types of cloud deployment models:

6. Cloud Delivery Models

Cloud services are categorized as follows:

  1. Infrastructure as a service (IaaS): The complete infrastructure is provided to you. Maintenance tasks are handled by the cloud provider, and you use the infrastructure as your requirements dictate. It can be used in both public and private clouds.
    Examples of IaaS are virtual machines, load balancers, and network-attached storage.
  2. Platform as a service (PaaS): Here we get object storage, queuing, databases, runtimes, and so on directly from the cloud provider. It is our responsibility to configure and use them: the provider gives us the resources, but connecting to our database and similar activities are up to us. Examples of PaaS are Windows Azure and Google App Engine (GAE).
  3. Applications or software as a service (SaaS), e.g. Salesforce.com, Dropbox, Google Drive: Here we have no responsibility for setup. We simply use an application that is running in the cloud, and all infrastructure setup is the responsibility of the service provider. For SaaS to work, the infrastructure (IaaS) and the platform (PaaS) must be in place.
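
The split of responsibilities across the three delivery models can be summarized as a small lookup table. This is a simplified sketch of the description above. The three layer names are illustrative assumptions rather than an official taxonomy, and real providers divide responsibilities in finer detail.

```python
from typing import Dict, List

# Layers from the bottom of the stack to the top (illustrative names).
LAYERS: List[str] = ["hardware", "platform", "application"]

# Who looks after each layer under each delivery model (simplified).
RESPONSIBILITY: Dict[str, Dict[str, str]] = {
    "IaaS": {"hardware": "provider", "platform": "customer", "application": "customer"},
    "PaaS": {"hardware": "provider", "platform": "provider", "application": "customer"},
    "SaaS": {"hardware": "provider", "platform": "provider", "application": "provider"},
}


def customer_managed(model: str) -> List[str]:
    """Return the layers the customer is responsible for under the given model."""
    return [layer for layer in LAYERS if RESPONSIBILITY[model][layer] == "customer"]


for model in RESPONSIBILITY:
    print(model, "customer manages:", customer_managed(model))
```

Reading the table from IaaS to SaaS, the customer's share shrinks at each step until the provider manages everything.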

7. Cloud for Big Data

Below are some examples of how cloud services are used for Big Data:
**IaaS in a public cloud:** Using a cloud provider's infrastructure for Big Data services gives access to almost limitless storage and compute power. Enterprise customers can use IaaS to create cost-effective and easily scalable IT solutions in which the cloud provider bears the complexity and expense of managing the underlying hardware. If the scale of a business customer's operations fluctuates, or the business is looking to expand, it can tap into cloud resources as and when it needs them rather than purchasing, installing, and integrating hardware itself.
**PaaS in a private cloud:** PaaS vendors are beginning to incorporate Big Data technologies such as Hadoop and MapReduce into their PaaS offerings, which eliminates the need to deal with the complexities of managing individual software and hardware elements. For example, web developers can use separate PaaS environments at every stage of developing, testing, and ultimately hosting their websites. Businesses that develop their own internal software can also use PaaS, particularly to create distinct, ring-fenced development and testing environments.
**SaaS in a hybrid cloud:** Many organizations feel the need to analyse the voice of the customer, especially on social media. SaaS vendors provide the platform for the analysis as well as the social media data. Office software is the best example of businesses using SaaS. Tasks related to accounting, sales, invoicing, and planning can all be performed through SaaS. A business may use one piece of software that performs all of these tasks or several that each perform a different task. The software is subscribed to over the Internet and then accessed online from any computer in the office with a username and password. If needed, the business can switch to software that meets its requirements better. Everyone who needs access to a particular piece of software can be set up as a user, whether that is one or two people or every employee in a corporation that employs hundreds.

8. Providers in the Big Data Cloud Market

Cloud computing companies come in all shapes and sizes. All large software vendors either already have offerings in the cloud space or are in the process of launching one. In addition, many startups have interesting products in the cloud space. Here is a list of major cloud computing vendors. A few of the cloud providers are Google, Citrix, Netmagic, Red Hat, and Rackspace. Amazon Web Services (AWS) is the leading cloud provider, and Microsoft provides cloud services under the name Azure.
Infrastructure as a Service cloud computing companies:

Platform as a Service cloud computing companies

Software as a Service companies

9. Issues in Using Cloud Services

Some important issues with cloud services are listed below:

a. Data Security

Organizations must make sure that their agreement with the cloud service provider guarantees data security. Handing over private data to others worries some people. Corporate executives might hesitate to take advantage of a cloud computing system because they cannot keep their company's information under lock and key.

b. Performance

The parameters of cloud performance must be specified in the agreement and quantified wherever possible. Exceptions must be clearly noted. The Service-Level Agreement (SLA) should clearly state all the terms and conditions between the service user and the service provider to ensure proper performance.
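
Quantifying a performance parameter often means turning an availability percentage into a concrete downtime allowance. The sketch below shows that arithmetic. The 99.9% target and the 30-day period are hypothetical example values, and real SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Maximum downtime, in minutes, permitted by an availability target over a period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


def meets_sla(observed_downtime_minutes: float, availability_percent: float,
              period_days: int = 30) -> bool:
    """True if the observed downtime stays within the allowance."""
    return observed_downtime_minutes <= allowed_downtime_minutes(availability_percent, period_days)


# Hypothetical target: 99.9% availability over a 30-day month.
print(f"Allowed downtime: {allowed_downtime_minutes(99.9):.1f} minutes")
print("30 minutes of downtime within SLA:", meets_sla(30, 99.9))
print("60 minutes of downtime within SLA:", meets_sla(60, 99.9))
```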

c. Compliance

Cloud services must be compatible with the compliance needs of the business. Some companies are also concerned about regulatory issues. Market observers say that around 50 percent of people worry about being tied to a single cloud storage provider.

d. Legal Issues

Organizations must make sure that the location of the cloud's physical resources does not raise any legal issues. The cloud presents a number of legal challenges around the privacy of data stored in multiple locations, which also increases the risk of confidentiality and privacy breaches.

e. Costs

Organizations should be aware of all the costs involved in using the cloud and should use its services in a controlled manner, because under the pay-per-use model costs grow with usage.
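
Using services in a controlled manner usually starts with tracking spend against a budget. The following is a minimal sketch, assuming daily charges are available as a list of numbers. The function name, the charges, and the budget are hypothetical.

```python
from typing import Iterable, Optional


def first_day_over_budget(daily_costs: Iterable[float], budget: float) -> Optional[int]:
    """Return the 1-based day on which cumulative spend first exceeds the budget, else None."""
    total = 0.0
    for day, cost in enumerate(daily_costs, start=1):
        total += cost
        if total > budget:
            return day
    return None


# Hypothetical daily charges checked against a budget of 60.
print(first_day_over_budget([12.0, 15.5, 40.0, 9.0], budget=60.0))
```

A check like this could run daily so that a spike in usage is noticed before the bill arrives.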

https://data-flair.training/blogs/big-data-and-cloud-computing-comprehensive-guide
