Region Server Startup Flow


Overview


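This article walks through the region server startup flow that runs when the Hyperbase component starts, with log output from the important steps included along the way.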

Key Steps


1. Print the startup command line and JVM information

2020-06-22 20:13:01,974 INFO org.apache.hadoop.hbase.util.VersionInfo: HBase 1.3.1-transwarp-6.0.2
2020-06-22 20:13:01,975 INFO org.apache.hadoop.hbase.util.VersionInfo: Source code repository git://es541-jrq3t/home/jenkins/workspace/3-hyperbase-1.3.1-postcommit-build-push/hbase-community revision=f9e53b95883ad83247e33cdd5d839799866a45ff
2020-06-22 20:13:01,975 INFO org.apache.hadoop.hbase.util.VersionInfo: Compiled by jenkins on Fri Mar 8 11:31:31 CST 2019
2020-06-22 20:13:01,975 INFO org.apache.hadoop.hbase.util.VersionInfo: From source with checksum ecb77e98ecd8da2d0a54076e7541939a
2020-06-22 20:13:02,224 INFO org.apache.hadoop.hbase.util.ServerCommandLine: env:PATH=/usr/java/jdk1.8.0_25/bin:/usr/java/jdk1.7.0_71//bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/guardian-utils/bin:/usr/lib/zookeeper/bin:/usr/lib/transwarp/scripts:/usr/lib/hbase/bin:/usr/lib/transwarp/scripts
2020-06-22 20:13:02,224 INFO org.apache.hadoop.hbase.util.ServerCommandLine: env:HBASE_THRIFT_SERVER_MEMORY=24000m
...
...
2020-06-22 20:13:02,227 INFO org.apache.hadoop.hbase.util.ServerCommandLine: env:HOME=/root
2020-06-22 20:13:02,228 INFO org.apache.hadoop.hbase.util.ServerCommandLine: vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Oracle Corporation, vmVersion=25.25-b02
2020-06-22 20:13:02,228 INFO org.apache.hadoop.hbase.util.ServerCommandLine: vmInputArguments=[-agentpath:/usr/lib/hadoop/lib/native/libagent.so, -agentpath:/usr/lib/hbase/lib/native/libjvm_agent.so, -XX:OnOutOfMemoryError=kill -9 %p, -Dproc_regionserver, -verbose:gc, -Xms1024m, -Xmx24000m, -XX:MaxPermSize=512m, -Dsun.net.inetaddr.ttl=60, -XX:+HeapDumpOnOutOfMemoryError, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=80, -XX:+CMSClassUnloadingEnabled, -XX:+ExplicitGCInvokesConcurrent, -XX:+UseCMSCompactAtFullCollection, -XX:CMSFullGCsBeforeCompaction=0, -XX:+UseParNewGC, -XX:NewRatio=3, -XX:NewSize=512m, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintGCTimeStamps, -Dhbase.log.dir=/var/log/hyperbase1, -Dhbase.log.file=hbase-hbase-regionserver-bpnode1.log, -Dhbase.home.dir=/usr/lib/hbase/, -Dhbase.id.str=hbase, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/usr/lib/hadoop/lib/native:/usr/lib/hbase/bin/../lib/native/Linux-amd64-64, -Dhbase.security.logger=INFO,RFAS, -Djava.net.preferIPv4Stack=true, -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.ssl=false, -Dcom.sun.management.jmxremote.password.file=/etc/hyperbase1/conf/jmxremote.passwd, -Dcom.sun.management.jmxremote.access.file=/etc/hyperbase1/conf/jmxremote.access, -Dcom.sun.management.jmxremote.port=10102, -Xms12000m, -Xmx24000m, -verbose:gc, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintGCTimeStamps]
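
The JVM details in the last two log lines can be obtained from the standard `java.lang.management.RuntimeMXBean`. The following is a minimal sketch of that idea, not the Hyperbase source; the class name `JvmInfoSketch` and the helper `lastValue` are hypothetical. The helper is included because the logged argument list contains `-Xms` twice (`-Xms1024m` and later `-Xms12000m`), and HotSpot generally honors the last occurrence of a repeated flag.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

final class JvmInfoSketch {
    // Builds a summary in the same shape as the vmName/vmVendor/vmVersion log line.
    static String describeVm() {
        RuntimeMXBean bean = ManagementFactory.getRuntimeMXBean();
        return "vmName=" + bean.getVmName()
            + ", vmVendor=" + bean.getVmVendor()
            + ", vmVersion=" + bean.getVmVersion();
    }

    // Returns the value of the LAST argument starting with the given prefix
    // (for example "-Xms"), or null if no argument matches.
    static String lastValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    public static void main(String[] args) {
        RuntimeMXBean bean = ManagementFactory.getRuntimeMXBean();
        System.out.println(describeVm());
        System.out.println("vmInputArguments=" + bean.getInputArguments());
        System.out.println("last -Xms value: " + lastValue(bean.getInputArguments(), "-Xms"));
    }
}
```

When reading a startup log like the one above, checking the last occurrence of each heap flag is a quick way to confirm which value the process is actually expected to use.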

2. Initialize HRegionServer

2.1 Read the configuration and initialize the RPC service; the scheduler is ultimately created by SimpleRpcSchedulerFactory

2020-06-22 20:13:02,382 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: Using hostname: bpnode1
2020-06-22 20:13:02,449 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: regionserver/bpnode1/11.8.8.184:60020 server-side HConnection retries=350
2020-06-22 20:13:02,595 INFO org.apache.hadoop.hbase.ipc.SimpleRpcScheduler: Using fifo as user call queue, count=10
2020-06-22 20:13:02,611 INFO org.apache.hadoop.hbase.ipc.RpcServer: regionserver/bpnode1/11.8.8.184:60020: started 10 reader(s) listening on port=60020

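2.2 File system setup: set fs.defaultFS, configure HBase checksum verification, obtain the root directory, and create the file system table descriptors

2.3 Initialize the ZooKeeper watcher and start BaseCoordinatedStateManager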

2020-06-22 20:13:03,957 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=regionserver:60020 connecting to ZooKeeper ensemble=bpnode7:2181,bpnode8:2181,bpnode3:2181
2020-06-22 20:13:03,964 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-transwarp--1, built on 10/17/2017 07:28 GMT
2020-06-22 20:13:03,964 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=bpnode1
...
2020-06-22 20:13:04,050 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to bpnode7/11.8.8.190:2181, initiating session
2020-06-22 20:13:04,056 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server bpnode7/11.8.8.190:2181, sessionid = 0xb72dbc8308f05b9, negotiated timeout = 180000
...
2020-06-22 20:13:04,091 DEBUG org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hyperbase1/tokenauth/keymaster already exists
2020-06-22 20:13:04,092 INFO org.apache.hadoop.hbase.zookeeper.ZKLeaderManager: Found existing leader with ID: bpnode1,60020,1592827982743

2.4 Start the RPC service

2020-06-22 20:13:04,121 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: starting
2020-06-22 20:13:04,122 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=60020: starting
2020-06-22 20:13:04,122 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.default.handler=0,queue=0,port=60020
2020-06-22 20:13:04,122 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.default.handler=1,queue=1,port=60020
....
2020-06-22 20:13:04,154 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.priority.handler=0,queue=0,port=60020
2020-06-22 20:13:04,154 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.priority.handler=1,queue=1,port=60020
2020-06-22 20:13:04,154 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.priority.handler=2,queue=0,port=60020
2020-06-22 20:13:04,155 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.priority.handler=3,queue=1,port=60020
...
2020-06-22 20:13:04,177 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.replication.handler=0,queue=0,port=60020
2020-06-22 20:13:04,177 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.replication.handler=1,queue=0,port=60020
2020-06-22 20:13:04,177 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.replication.handler=2,queue=0,port=60020
2020-06-22 20:13:04,177 DEBUG org.apache.hadoop.hbase.ipc.RpcExecutor: Started RpcServer.FifoWFPBQ.replication.handler=3,queue=0,port=60020

2.5 Start the web UI

2020-06-22 20:13:04,267 INFO org.apache.hadoop.hbase.http.HttpRequestLog: Http request log for http.requests.regionserver is not defined
2020-06-22 20:13:04,281 INFO org.apache.hadoop.hbase.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2020-06-22 20:13:04,281 INFO org.apache.hadoop.hbase.http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
2020-06-22 20:13:04,283 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context regionserver
2020-06-22 20:13:04,283 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2020-06-22 20:13:04,283 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2020-06-22 20:13:04,297 INFO org.apache.hadoop.hbase.http.HttpServer: Jetty bound to port 60030
2020-06-22 20:13:04,297 INFO org.mortbay.log: jetty-6.1.26
2020-06-22 20:13:04,685 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:60030

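2.6 Create the ChoreService and start compactedFileDischarger

A chore is created for compactedFileDischarger. By default it runs every 2 minutes and removes compacted files.

// Create the ChoreService that runs this server's periodic tasks.
this.choreService = new ChoreService(getServerName().toString(), true);
// Create the chore that removes compacted files, then schedule it.
this.compactedFileDischarger =
    new CompactedHFilesDischarger(cleanerInterval, (Stoppable)this, (RegionServerServices)this);
choreService.scheduleChore(compactedFileDischarger);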
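
The chore mechanism is essentially a task that runs at a fixed period. As an illustration only, the sketch below reproduces that pattern with the JDK's `ScheduledExecutorService` rather than with HBase's `ChoreService`; the class `ChoreSketch`, the method `runPeriodically`, and the millisecond values are hypothetical and chosen so the example finishes quickly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ChoreSketch {
    // Runs `task` every `periodMillis` for roughly `runForMillis`, then stops
    // the scheduler and returns how many times the task completed.
    static int runPeriodically(Runnable task, long periodMillis, long runForMillis) {
        AtomicInteger runs = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            task.run();
            runs.incrementAndGet();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(runForMillis);
            scheduler.shutdownNow();
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop scheduling.
            Thread.currentThread().interrupt();
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        // Placeholder task standing in for "remove compacted files".
        int runs = runPeriodically(
            () -> System.out.println("discharging compacted files (placeholder)"), 50, 220);
        System.out.println("runs=" + runs);
    }
}
```

In the real region server the period would be the 2-minute default mentioned above rather than the short interval used here.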

3. Run the RegionServer

3.1 Check the license

2020-06-22 20:13:04,722 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=bpnode7:2291,bpnode8:2291,bpnode3:2291 sessionTimeout=30000 watcher=io.transwarp.msl.host.CLSZnodeWatcher@51588ae
2020-06-22 20:13:04,723 INFO org.apache.zookeeper.ClientCnxn: Expect server principal: zookeeper/bpnode7
2020-06-22 20:13:04,723 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server bpnode7/11.8.8.190:2291. Will not attempt to authenticate using SASL (Force non secure zookeeper client.)
2020-06-22 20:13:04,723 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to bpnode7/11.8.8.190:2291, initiating session
2020-06-22 20:13:04,725 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server bpnode7/11.8.8.190:2291, sessionid = 0x872dbba8786021a, negotiated timeout = 30000
2020-06-22 20:13:04,764 INFO org.apache.hadoop.util.StringUtils: Get st from ZK: 1556124641470
2020-06-22 20:13:04,764 INFO org.apache.hadoop.util.StringUtils: Read license successfully.
2020-06-22 20:13:04,766 INFO org.apache.zookeeper.ZooKeeper: Session: 0x872dbba8786021a closed
2020-06-22 20:13:04,766 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down

3.2 Initialization before registering with HMaster

3.2.1 Set up the cluster connection
2020-06-22 20:13:04,832 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x27f979e5 connecting to ZooKeeper ensemble=bpnode7:2181,bpnode8:2181,bpnode3:2181
2020-06-22 20:13:04,832 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=bpnode7:2181,bpnode8:2181,bpnode3:2181 sessionTimeout=180000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@56849740
2020-06-22 20:13:04,833 INFO org.apache.zookeeper.ClientCnxn: Expect server principal: zookeeper/bpnode7
2020-06-22 20:13:04,833 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server bpnode7/11.8.8.190:2181. Will not attempt to authenticate using SASL (unknown error)
2020-06-22 20:13:04,833 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to bpnode7/11.8.8.190:2181, initiating session
2020-06-22 20:13:04,835 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server bpnode7/11.8.8.190:2181, sessionid = 0xb72dbc8308f05ba, negotiated timeout = 180000
2020-06-22 20:13:04,885 DEBUG org.apache.hadoop.hbase.ipc.AbstractRpcClient: Codec=org.apache.hadoop.hbase.codec.KeyValueCodec@4a92f27a, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=10000, readTO=20000, writeTO=60000, minIdleTimeBeforeClose=120000, maxRetries=0, fallbackAllowed=false, bind address=null
3.2.2 Read the cluster ID from /hyperbase1/hbaseid
2020-06-22 20:13:04,892 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: ClusterId : 27bfaac6-1bd8-4b5a-bae8-d1aafb6c15c5
3.2.3 Load the procedures
2020-06-22 20:13:04,897 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure flush-table-proc is initializing
2020-06-22 20:13:04,901 DEBUG org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hyperbase1/flush-table-proc/acquired already exists
2020-06-22 20:13:04,905 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure flush-table-proc is initialized
2020-06-22 20:13:04,905 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure online-snapshot is initializing
2020-06-22 20:13:04,906 DEBUG org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hyperbase1/online-snapshot/acquired already exists
2020-06-22 20:13:04,907 DEBUG  org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure online-snapshot is initialized
3.2.4 Initialize several periodic threads, including:

memflusher

CompactSplitThread

CompactionChecker, PeriodicMemstoreFlusher, PeriodicFullGcTrigger, Leases, MovedRegionsCleaner

createCleanupScheduledChore, StorefileRefresherChore

2020-06-22 20:13:04,911 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: globalMemStoreLimit=11.4 G, globalMemStoreLimitLowMark=10.9 G, maxHeap=22.9 G
2020-06-22 20:13:04,925 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: CompactionChecker runs every 10sec
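
The MemStoreFlusher line can be checked with simple arithmetic. The sketch below assumes the global limit is a fraction of the maximum heap and the low mark is a fraction of that limit; the fractions 0.5 and 0.95 are inferred from the logged values (11.4 G / 22.9 G and 10.9 G / 11.4 G), not read from this cluster's configuration. In HBase 1.x these fractions are, to my understanding, controlled by `hbase.regionserver.global.memstore.size` and `hbase.regionserver.global.memstore.size.lower.limit`, which should be verified against the Hyperbase release in use. The class and method names are hypothetical.

```java
final class MemStoreLimitSketch {
    // Upper bound on total MemStore usage: a fraction of the maximum heap.
    static double globalLimitGb(double maxHeapGb, double memstoreFraction) {
        return maxHeapGb * memstoreFraction;
    }

    // Low-water mark: a fraction of the global limit.
    static double lowMarkGb(double globalLimitGb, double lowerLimitFraction) {
        return globalLimitGb * lowerLimitFraction;
    }

    public static void main(String[] args) {
        double maxHeapGb = 22.9;                        // maxHeap from the log line
        double limit = globalLimitGb(maxHeapGb, 0.5);   // 0.5 is inferred, not confirmed
        double lowMark = lowMarkGb(limit, 0.95);        // 0.95 is inferred, not confirmed
        System.out.println("globalMemStoreLimit (G) ~ " + limit);
        System.out.println("globalMemStoreLimitLowMark (G) ~ " + lowMark);
    }
}
```

The computed values land close to the logged 11.4 G and 10.9 G; small differences are expected because the log rounds its figures.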
3.2.5 Establish the RPC connection to the master

2020-06-22 20:13:04,928 DEBUG org.apache.hadoop.hbase.ipc.AbstractRpcClient: Codec=org.apache.hadoop.hbase.codec.KeyValueCodec@d19c55a, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=10000, readTO=20000, writeTO=60000, minIdleTimeBeforeClose=120000, maxRetries=0, fallbackAllowed=false, bind address=bpnode1/11.8.8.184:0

3.3 Install the ShutdownHook, then initialize and load the coprocessors

2020-06-22 20:13:04,931 DEBUG org.apache.hadoop.hbase.regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver/bpnode1/11.8.8.184:60020
2020-06-22 20:13:04,947 INFO org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost: System coprocessor loading is enabled
2020-06-22 20:13:04,947 INFO org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost: Table coprocessor loading is enabled

3.4 Register with HMaster by calling reportForDuty; if HMaster does not respond, sleep for a while and retry repeatedly

2020-06-22 20:13:04,951 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=bpnode7,60000,1592827361936 with port=60020, startcode=1592827982743
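
The register-and-retry behavior in this step can be sketched as a loop that keeps calling until it gets a non-null answer, sleeping between attempts. This is an illustration of the pattern only: `ReportForDutySketch`, `retryUntilAnswered`, and the `maxAttempts` bound are hypothetical additions so that the example terminates, and the simulated master is a plain lambda rather than a real RPC.

```java
import java.util.function.Supplier;

final class ReportForDutySketch {
    // Calls `attempt` until it returns a non-null response or `maxAttempts`
    // is reached, sleeping `sleepMillis` after each failed attempt.
    // Returns null if every attempt fails or the thread is interrupted.
    static <T> T retryUntilAnswered(Supplier<T> attempt, long sleepMillis, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            T response = attempt.get();
            if (response != null) {
                return response;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated master: stays silent for the first two calls, then answers.
        String response = retryUntilAnswered(
            () -> ++calls[0] < 3 ? null : "startup response (simulated)", 100, 10);
        System.out.println("attempts=" + calls[0] + ", response=" + response);
    }
}
```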

3.5 After a successful response is received, start the services

Get the configuration sent by HMaster:

2020-06-22 20:13:05,137 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://nameservice1/hyperbase1
2020-06-22 20:13:05,137 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.defaultFS=hdfs://nameservice1
2020-06-22 20:13:05,137 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.master.info.port=60010

Configure the WAL and replication

2020-06-22 20:13:05,245 DEBUG org.apache.hadoop.hbase.replication.regionserver.Replication: ReplicationStatisticsThread 300
2020-06-22 20:13:05,256 INFO org.apache.hadoop.hbase.wal.WALFactory: Instantiating WALProvider of type class org.apache.hadoop.hbase.wal.DefaultWALProvider

Start the long-running threads: Executor Service, HeapMemoryTuner, RegionServerFlushTableProcedureManager, RegionServerSnapshotManager, and so on

2020-06-22 20:13:05,293 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_REGION-bpnode1:60020, corePoolSize=40, maxPoolSize=40
2020-06-22 20:13:05,293 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_META-bpnode1:60020, corePoolSize=1, maxPoolSize=1
2020-06-22 20:13:05,293 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_PRIORITY_REGION-bpnode1:60020, corePoolSize=3, maxPoolSize=3
2020-06-22 20:13:05,293 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_REGION-bpnode1:60020, corePoolSize=40, maxPoolSize=40
2020-06-22 20:13:05,294 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_META-bpnode1:60020, corePoolSize=1, maxPoolSize=1
2020-06-22 20:13:05,294 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_LOG_REPLAY_OPS-bpnode1:60020, corePoolSize=2, maxPoolSize=2
2020-06-22 20:13:05,294 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_COMPACTED_FILES_DISCHARGER-bpnode1:60020, corePoolSize=10, maxPoolSize=10
2020-06-22 20:13:05,295 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_REGION_REPLICA_FLUSH_OPS-bpnode1:60020, corePoolSize=40, maxPoolSize=40
2020-06-22 20:13:05,301 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Current list of replicators: [bpnode1,60020,1592827982743, bpnode2,60020,1592827894533, bpnode8,60020,1592827968314, bpnode3,60020,1592827927496] other RSs: [bpnode1,60020,1592827982743, bpnode2,60020,1592827894533, bpnode8,60020,1592827968314, bpnode3,60020,1592827927496]
2020-06-22 20:13:05,356 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker bpnode1,60020,1592827982743 starting
2020-06-22 20:13:05,357 INFO org.apache.hadoop.hbase.regionserver.HeapMemoryManager: Starting HeapMemoryTuner chore.
2020-06-22 20:13:05,360 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Serving as bpnode1,60020,1592827982743, RpcServer on bpnode1/11.8.8.184:60020, sessionid=0xb72dbc8308f05b9

At this point the Region Server is considered online.

Start the related procedures and the Quota Manager
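
Because every log line starts with a timestamp, the time taken to reach the online state can be measured directly from the log. The sketch below parses the leading `yyyy-MM-dd HH:mm:ss,SSS` timestamp of two lines and returns the difference in milliseconds; the class `StartupTimingSketch` and its methods are hypothetical helpers, not part of HBase.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class StartupTimingSketch {
    // Timestamp layout used at the start of each log line, e.g. 2020-06-22 20:13:01,974
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    // Parses the first 23 characters of a log line as its timestamp.
    static LocalDateTime timestampOf(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 23), LOG_TS);
    }

    // Milliseconds elapsed between the timestamps of two log lines.
    static long elapsedMillis(String firstLine, String lastLine) {
        return Duration.between(timestampOf(firstLine), timestampOf(lastLine)).toMillis();
    }

    public static void main(String[] args) {
        String first = "2020-06-22 20:13:01,974 INFO org.apache.hadoop.hbase.util.VersionInfo: "
            + "HBase 1.3.1-transwarp-6.0.2";
        String online = "2020-06-22 20:13:05,360 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: "
            + "Serving as bpnode1,60020,1592827982743, RpcServer on bpnode1/11.8.8.184:60020, "
            + "sessionid=0xb72dbc8308f05b9";
        System.out.println("startup took " + elapsedMillis(first, online) + " ms");
    }
}
```

The same helper can be pointed at any two lines to see which startup step accounts for most of the delay.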

2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure flush-table-proc is starting
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager: Start region server flush procedure manager bpnode1,60020,1592827982743
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Starting procedure member 'bpnode1,60020,1592827982743'
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Checking for aborted procedures on node: '/hyperbase1/flush-table-proc/abort'
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Looking for new procedures under znode:'/hyperbase1/flush-table-proc/acquired'
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure flush-table-proc is started
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure online-snapshot is starting
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager: Start Snapshot Manager bpnode1,60020,1592827982743
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Starting procedure member 'bpnode1,60020,1592827982743'
2020-06-22 20:13:05,360 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Checking for aborted procedures on node: '/hyperbase1/online-snapshot/abort'
2020-06-22 20:13:05,361 DEBUG org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Looking for new procedures under znode:'/hyperbase1/online-snapshot/acquired'
2020-06-22 20:13:05,361 DEBUG org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost: Procedure online-snapshot is started
2020-06-22 20:13:05,368 INFO org.apache.hadoop.hbase.quotas.RegionServerQuotaManager: Quota support disabled

3.6 Startup is complete; the main thread periodically reports the RegionServer's load information

tryRegionServerReport(lastMsg, now)
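
The periodic report is driven by a simple time check: a report is sent only when at least one message interval has passed since the previous one. The sketch below simulates that check over a list of clock readings instead of calling the real `tryRegionServerReport`; `ReportLoopSketch`, `countReports`, and the 3000 ms interval are assumptions for illustration (in HBase the interval is, to my understanding, set by `hbase.regionserver.msginterval`).

```java
final class ReportLoopSketch {
    // Walks through a sequence of clock readings (milliseconds) and counts how
    // many load reports would be sent for the given message interval.
    static int countReports(long[] clockMillis, long msgIntervalMillis) {
        long lastMsg = 0;
        int reports = 0;
        for (long now : clockMillis) {
            if (now - lastMsg >= msgIntervalMillis) {
                reports++;      // stands in for tryRegionServerReport(lastMsg, now)
                lastMsg = now;  // remember when the last report was sent
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        long[] ticks = {1000, 2000, 3000, 4000, 6000, 7000};
        System.out.println("reports sent: " + countReports(ticks, 3000));
    }
}
```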
