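I recently read an English document about Hadoop; it is actually an excerpt from a book. I thought it was good: it lays out the basic responsibilities of a Hadoop administrator. If you work with Hadoop day to day, take a look and ask yourself whether you already have all the knowledge and skills it describes.
Translation:
Responsibilities of a Hadoop administrator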
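As interest in deriving insight from big data grows, organizations are actively planning and building their big data teams. To start working with their data, they need a good, solid infrastructure.
Once that infrastructure is in place, they need controls and policies for maintaining, managing, and troubleshooting the cluster.
Market demand for Hadoop administrators keeps growing, because their work (setting up and maintaining clusters) is what makes data analysis truly possible.
A Hadoop administrator needs strong system operations skills across networking, operating systems, and storage, along with solid knowledge of computer hardware and how it operates in a complex network environment.
Apache Hadoop runs mainly on Linux, so Linux skills such as monitoring, troubleshooting, configuration, and security management are a must.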
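Setting up nodes for a cluster involves a lot of repetitive work, so a Hadoop administrator should bring servers up quickly and efficiently with configuration management tools such as Puppet, Chef, and CFEngine.
Beyond these tools, the administrator also needs good capacity planning skills to design and plan clusters.
Several nodes in a cluster need their data duplicated. For example, the namenode daemon's fsimage file can be configured to be written to two different disks on the same node, or to a disk on a different node.
So the Hadoop administrator needs to understand NFS mount points and how to set them up within the cluster. The administrator may also be asked to configure RAID for the disks on specific nodes.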
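As a rough illustration of the fsimage redundancy described above, the sketch below renders the hdfs-site.xml property that lists the namenode's metadata directories. It assumes a Hadoop 2.x or later cluster, where the property is named dfs.namenode.name.dir; the two directory paths (one local disk, one NFS mount) are hypothetical.

```python
import xml.etree.ElementTree as ET

def name_dir_property(directories):
    """Build the <property> element that lists redundant fsimage directories."""
    if len(directories) < 2:
        raise ValueError("list at least two directories for redundancy")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.namenode.name.dir"
    # The namenode writes its metadata to every directory in this
    # comma-separated list.
    ET.SubElement(prop, "value").text = ",".join(directories)
    return prop

# Hypothetical paths: one local disk and one NFS mount point.
prop = name_dir_property(["/data/1/dfs/nn", "/mnt/nfs/dfs/nn"])
print(ET.tostring(prop, encoding="unicode"))
```

The printed element would be pasted inside the configuration element of hdfs-site.xml; generating it from a list makes it harder to forget the second directory.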
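Because all Hadoop services and daemons are built on Java, basic knowledge of the JVM (Java Virtual Machine) and the ability to understand Java exceptions are very useful.
This knowledge helps the administrator identify problems quickly.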
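One way to put that knowledge to work is to tally which Java exceptions appear in a log. The following is a minimal sketch, not a full log parser: the regular expression is a simplification, and the sample lines are made up for illustration rather than taken from real Hadoop output.

```python
import re
from collections import Counter

# Matches fully qualified Java class names ending in Exception or Error,
# for example java.io.IOException.
EXCEPTION_RE = re.compile(r"\b(?:[a-z_][\w$]*\.)+[A-Z][\w$]*(?:Exception|Error)\b")

def count_exceptions(log_text):
    """Count how often each exception class name occurs in the text."""
    return Counter(EXCEPTION_RE.findall(log_text))

# Made-up sample lines, for illustration only.
SAMPLE_LOG = """\
2024-01-01 00:00:01 ERROR example: java.io.IOException: write failed
2024-01-01 00:00:02 WARN example: retrying after java.io.IOException
2024-01-01 00:00:03 ERROR example: java.lang.OutOfMemoryError: Java heap space
"""

for name, count in count_exceptions(SAMPLE_LOG).most_common():
    print(count, name)
```

Sorting by frequency gives a quick first hint about where to look, for example I/O failures versus heap exhaustion.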
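A Hadoop administrator should have the skills to benchmark the cluster and test its performance under high-traffic scenarios.
Clusters run continuously and process large amounts of data, so they are prone to failures. To monitor cluster health, the administrator should deploy monitoring tools such as Nagios and Ganglia,
and configure alerts and monitors for the critical nodes so that problems can be foreseen before they occur.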
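Alerts of this kind are often implemented as small check scripts. The sketch below follows the Nagios plugin convention of exit code 0 for OK, 1 for WARNING, and 2 for CRITICAL, and applies it to disk usage. The 80% and 90% thresholds and the function names are assumptions chosen for illustration.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes

def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a status code (thresholds are assumptions)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK

def check_disk(path):
    """Print a one-line status for the filesystem at path and return its code."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    status = classify(used_percent)
    label = ("OK", "WARNING", "CRITICAL")[status]
    print(f"DISK {label} - {path} is {used_percent:.1f}% full")
    return status

print(classify(50.0), classify(85.0), classify(95.0))
```

A real plugin would finish with sys.exit(check_disk(path)) so that the monitoring server can read the status from the exit code. Warning before the critical threshold is what lets the administrator act before a disk actually fills up.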
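Good knowledge of a scripting language such as Python, Ruby, or Shell greatly helps a Hadoop administrator.
Administrators are often asked to set up scheduled staging of files from an external source into HDFS. Scripting skills let them fulfil these requests by writing scripts and automating them.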
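A staging job of this sort might look like the sketch below, which uploads every file in a local directory with the hdfs dfs -put command. The function names and the example paths are hypothetical, and the script assumes the hdfs client is installed and on the PATH.

```python
import subprocess
from pathlib import Path

def put_command(local_file, hdfs_dir):
    """Return the argument list for copying one local file into an HDFS directory."""
    # -f overwrites the destination if the file already exists there.
    return ["hdfs", "dfs", "-put", "-f", str(local_file), hdfs_dir]

def stage_directory(local_dir, hdfs_dir, run=subprocess.run):
    """Upload every regular file in local_dir; return the names that were staged."""
    staged = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file():
            # check=True raises CalledProcessError if the upload fails.
            run(put_command(path, hdfs_dir), check=True)
            staged.append(path.name)
    return staged

# Hypothetical paths, shown without actually running the upload.
print(put_command("/var/spool/incoming/events.csv", "/staging/incoming"))
```

Calling stage_directory from a scheduler such as cron turns this into the recurring staging described above. The run parameter exists so the function can be exercised without a cluster by passing in a stand-in.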
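Above all, a Hadoop administrator should have a very good understanding of the Apache Hadoop architecture and its inner workings.
The following are some of the key Hadoop operations a Hadoop administrator must know: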
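Planning the cluster: estimating the amount of data the cluster will serve and deciding the number of nodes accordingly.
Installing and upgrading Apache Hadoop on the cluster.
Configuring and tuning Hadoop through its various configuration files.
Understanding all the Hadoop daemons, along with their roles and responsibilities in the cluster.
Knowing how to read and interpret Hadoop logs.
Adding and removing nodes in the cluster.
Rebalancing nodes in the cluster.
Enabling security with an authentication and authorization system such as Kerberos.
Almost all organizations follow a policy of backing up their data, and performing those backups is the Hadoop administrator's responsibility.
So a Hadoop administrator should be well versed in server backup and recovery operations.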
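The first item in the list, sizing the cluster from the expected data volume, can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: the 25% headroom figure and the example numbers are assumptions, and the replication factor of 3 is the HDFS default.

```python
import math

def nodes_needed(raw_data_tb, disk_per_node_tb, replication=3, headroom=0.25):
    """Estimate the datanode count for a given raw data volume.

    replication: HDFS block replication factor (3 is the HDFS default).
    headroom: fraction of each node's disk kept free for temporary data
              and growth (an assumed value, not a Hadoop setting).
    """
    stored_tb = raw_data_tb * replication
    usable_per_node_tb = disk_per_node_tb * (1 - headroom)
    return math.ceil(stored_tb / usable_per_node_tb)

# 100 TB of raw data on nodes with 24 TB of disk each.
print(nodes_needed(100, 24))  # prints 17
```

Here 100 TB becomes 300 TB after replication, each node offers 18 TB of usable space, and 300 / 18 rounds up to 17 nodes. A real plan would also account for data growth, compression, and compute requirements.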
Original text:
Responsibilities of a Hadoop administrator
With the increase in the interest to derive insight on their big data,
organizations are now planning and building their big data teams aggressively.
To start working on their data, they need to have a good solid infrastructure.
Once they have this setup, they need several controls and system policies in place to maintain, manage, and troubleshoot their cluster.
There is an ever-increasing demand for Hadoop Administrators in the market
as their function (setting up and maintaining Hadoop clusters) is what makes analysis really possible.
The Hadoop administrator needs to be very good at system operations, networking, operating systems, and storage.
They need to have a strong knowledge of computer hardware and their operations, in a complex network.
Apache Hadoop, mainly, runs on Linux. So having good Linux skills such as monitoring, troubleshooting, configuration, and security is a must.
Setting up nodes for clusters involves a lot of repetitive tasks
and the Hadoop administrator should use quicker and efficient ways to bring up these servers using configuration management tools
such as Puppet, Chef, and CFEngine.
Apart from these tools, the administrator should also have good capacity planning skills to design and plan clusters.
There are several nodes in a cluster that would need duplication of data,
for example, the fsimage file of the namenode daemon can be configured to write to two different disks on the same node
or on a disk on a different node.
An understanding of NFS mount points and how to set it up within a cluster is required.
The administrator may also be asked to set up RAID for disks on specific nodes.
As all Hadoop services/daemons are built on Java,
a basic knowledge of the JVM along with the ability to understand Java exceptions would be very useful.
This helps administrators identify issues quickly.
The Hadoop administrator should possess the skills to benchmark the cluster to test performance under high traffic scenarios.
Clusters are prone to failures as they are up all the time and are processing large amounts of data regularly.
To monitor the health of the cluster, the administrator should deploy monitoring tools such as Nagios and Ganglia
and should configure alerts and monitors for critical nodes of the cluster to foresee issues before they occur.
Knowledge of a good scripting language such as Python, Ruby, or Shell would greatly help the function of an administrator.
Often, administrators are asked to set up some kind of a scheduled file staging from an external source to HDFS.
The scripting skills help them execute these requests by building scripts and automating them.
Above all, the Hadoop administrator should have a very good understanding of the Apache Hadoop architecture and its inner workings.
The following are some of the key Hadoop-related operations that the Hadoop administrator should know:
Planning the cluster, deciding on the number of nodes based on the estimated amount of data the cluster is going to serve.
Installing and upgrading Apache Hadoop on a cluster.
Configuring and tuning Hadoop using the various configuration files available within Hadoop.
An understanding of all the Hadoop daemons along with their roles and responsibilities in the cluster.
The administrator should know how to read and interpret Hadoop logs.
Adding and removing nodes in the cluster.
Rebalancing nodes in the cluster.
Employ security using an authentication and authorization system such as Kerberos.
Almost all organizations follow the policy of backing up their data
and it is the responsibility of the administrator to perform this activity.
So, an administrator should be well versed with backups and recovery operations of servers.