insight主机N_A分析思路.docx
+新建/编岗一样发W全扉空间公约GoMenDBFAQINSlGHT问题FAQInSight外场St见问BFAQ由三M0313000145婕.酶由SK三½03130229IViK于2024-10-142029:14Q90日4QD定位思路:(各组件日志问题可查看对应大甥8姐%insightAgent问财S)可随便挑一台机器查看详情-住触(据是否正意.若正言说明中间大数据组件一系列流程没有问题切第谢拟下步褰中的第三步-消费kafka,若无法正常消费就看前面的insightAgent、filebeat,kafka问题.否贝!三看后面Iogstash,di,es、insightSerVer以及具体采集字段的问场数据链路大概是:is写采集任务到zk,iagzk定时洞采集脚本写日志到Logcollect.fW6采集内容送到kafka.d谕i费清洗后送到eis从es窗嘘展示i泻蜗件信息.踞集喇zk.4监听z析书点内容、创建采集任匆,恻解频率创理定时翳,定时我.分为残:,跖地执行采集脚轲¾1果没有大做蛔哂则去捏这个)、历执行为精甸,通过momoa发,送给Oma执行dbtool原车,返回给ia,ia发送到kafka如果个别服务器11a,先查对应服务器的iaiflfb.如为IKSSna,先查。玄curl-uelasticInSight202LeS-XPOST-HContent-TypezappIicationZjsonhttp7/$esJp:92M/insight-hostdynamic/_search?prettyd'sort”:createtime”:消费kafka数据"ord"":"(tesc"HJjSiZeFregrep-w'createtimeip',命令说明1检有异常机SSinsightAgent采集日志时间是否更新Lsu-insight2 .检却aJOfbM进程是否存在PS-ef|grepagent-confymlps>ef|grepfilebeat(613L脂不ffifflfb)3 .观察采集日志是否更新:cdSHOME/insightAgentIlrtagent/host_shellit-loghoststatk.logIlTtagenthstshellit-logbostdynamiGog4 .检查匕和fb日志是否有报Ig:cdSHOMEZinsightAgentviewagenti11sight.logviewfilebeatfiebeatJog如果进程不存在或有异常.D以三启对应进建重启命令:agentdbmoni-Stopragentdbmoni-startfilebeatdbmoni-stopilebeatdbmoni-start如果hoststatic.log、卜。式(1丫恒1匕109内容不更新,SJginsightagent.»K:Il-rtSHOME/insightAgent/agent/hostsbell/rtJogIl-rtSHOMEZimightAgentZagent店行中IwM缸I2查看kafka集群是否正常su-insight;雌密Z燔枕如牛source-binalarmsender.sh;modifyPasswordzookeeper;#登录Zksh-/bigdata/zookeeper/bin/zkCli.sh-serverIocalhost32181睡入Z垢输入期addauthdigestinsightlnsightZk-2021#检查kafka集群数量是否正确Isbrokersids完之后记得加密回去modifyEncodedPasswordzookeeper;其他手动解击方式为:修改$HOME/bigdata/ZookeePer/conf/zk_cIientjaSS.conf中的专文为明文(lnsightZk-2021),原密文不可注释掉保留,直接行奂若kafka节J节点号(与$HOMEbigdatakafkaCOnfig/server.PrOPertieS中broker.id5应)去相应机器查看kafka进程及日志($HOME/bigdata/kafka/logs/server.log)是否正常。SS命令:kafkadbmoni-stop;kafkadbmoni-start拍照:Isbrokersids若kafka中无法消费出数据(执行命令后一直卡着)r可查看filebeat进程、日志(JHOMEinsightAgentfiebeat*fiebeatlog)、配置文件(JHOMEinsightAgentfilebeat*fiebeatyml)是否正常重启命令:filebeatdbmoni-StopjfiIebeatdbmoni-stop拍照:左边命令的执行结果PS:Q:报错kafka-console-consumer.sh语法错误vimkafka-console-consumer.sh,删除下图中圈住的部分,保存退出后重新消费。su-insight;source-binalarmsender.sh;modifyPasswordkafka;#进入$HOMEbigdatakafkabin目录,替换相应ip、port,手动消费kafka数据(后面可加Igepip查询具体某个机器的捌g是否正常诩):cd-bigdatakafkabin./kafka-console-consumer.sh-bootstrap-server$ip:9092topichoststaticconsumerconfig.configconsumenproperties.Aafka-console-consumer.sh-bootstrap-server$ip:9092topichostdynamic-consumer.config./config/consumenproperties(如果去除了kafka鉴权,可不修改为明文密码,消费命令为./kafkaconsoleconsumer;sh-bootstrapserver10.229.31227:9092-topichostdynamic./config/consumenproperties)查完之后记得加密回去modifyEncodedPasswordkafka;其他手动解密方式为:修改$HOME/bigdata/kafka/config/kafka_clientJassxonf中的密文为明文(KafkaCIient-202亍,原专文不可注释掉保留,直接替换三ie湫态1 .查询es集群节点状态:curluserelastic:!nsight2021_es-XGEThttp:/$ip:9200/_cat/nodes?v'2 .查询es健康状态:curluserHaStiClnSight202IeS-XGEThttp:/$ip:9200/_cluster/health?prettyB3 .部蜘应ip、潴口查询es数据:curl-H'Content-Type:applicationjson'-userelastic:!nsight2021_es-XPOST,10229.31.224:9200/insight-hoststatic/_search?pretty'-d'"size':1query'r'boo:"must":"term":Bhost_ipB:'value*:10.229.31.224'11sort':ncreate_time.keywordn:"orderB:"descB'curl-H'Content-Type:application/json"userelastic:lnsight2021_es-XPOST(10.229.31.224:9200/insight-hostdynamic/_search?pretty'-d''size":1query"'boo:'must":"term"fip''value"10.229.31.224","sort":"createtime.keyword-order'/desc-'1 .若es节点在®®机§§es进S日志($HOME/bigdata/elasticsearch/logs/es-insight.log),三J三命令:esdbmoni-stop;esdbmoni-start2 .若esWS为red,SiS异常索引CUrl-USerelastic:lnsight2021_es'http:/$ip:9200/_cat/indices'grepred,并将mi除ClllI-USerelastic:InSight202LeS-XDELETEhttp:$ip:9200/异常亲引;3 .若es查询数据显示时间不为最新,或者没有数据,或者报错没有索引,查看Iogstash进程及日志($HoME/bigdata/IOgStaSh/logs/logstash-plainOg)咖ataintegration进程及日志($HOME/bigdata/dataintegration/logs/dataintegration.log)日志常见异inElaStiCSearChClusterBlockExceptionblockedby:FORBIDDEN/12/indexread-only/allowdelete(api),表明磁盘空间超过阈值导致es被设置为只读,需要清理磁盘并口中一台机器执行CUrl-USerelastic:InSight202LeS-XPUT-H'Content-Type:applicationjson'http:/$ip:9200/_all/_settings-d'index.blocks.read_only_allow_deleteSiSRDB表SRDBSi旬goldendbjnsighthOStinfo表若goldendbinsighthOStinfO表中问题,SSWinsightServer($HOME/insightServer/insight-0.0.1-SNAPSHOT/logs/insightlogj关键字getHostlnfoFromDiThenllpdateRdb)Odataintegration($HOME/bigdata/dataintegration/logs/dataintegration.log)itOg与页面的关系:SPS三-节点-14g:processdynamic.log资源管理-主机列表-性能:hostdynamic.bgitOg与页面的关系:祖PSa-节点-性能:processdynamicjog资«3-主tfUJ表-性眼KostdynamiclogQ:A:机三i咪自host_info虱表中数据通过定时任务从insightJnStaILinf。以及OmmfiS09gdb_devicJinf0、gdb_CityinStalLbaSiCS同步Q:主机列表无i三常查司A:查看insightserve旧志报连接ZkW.连接被对方重设”.看看Zk日志报"maxis.ins