
MongoDB: A Severe mongo Failure in Production, and How It Was Diagnosed and Fixed


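Common causes of this kind of error include:

* The system disk is full, so mongo cannot write data to the file system.

* The system crashed (or rebooted) unexpectedly, corrupting mongo's data files.

* The mongo service was force-stopped improperly, for example with kill -1 or kill -9.

* The mongo data files were tampered with.

There are many other possible causes as well.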
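The first and third causes can be checked before restarting anything. The following is a minimal sketch, not a definitive tool: the function name `check_dbpath` is made up for illustration, and the default `/data/db` is simply the dbpath that appears in the startup log in this post. It reports the free space on the file system holding the data directory and whether `mongod.lock` is non-empty, which is the condition mongod reports as an unclean shutdown.

```shell
#!/bin/sh
# Hypothetical helper (name and output format are illustrative only).
# Usage: check_dbpath [dbpath]    (defaults to /data/db, as in this post)
check_dbpath() {
    dbpath="${1:-/data/db}"

    # df -P prints POSIX-format output: one header line, then one line per
    # file system; column 4 is the available space in 1K blocks.
    avail_kb=$(df -P "$dbpath" | awk 'NR==2 {print $4}')
    echo "available_kb=$avail_kb"

    # -s is true only if the file exists and has a size greater than zero.
    if [ -s "$dbpath/mongod.lock" ]; then
        echo "lock=non-empty"
    else
        echo "lock=empty-or-missing"
    fi
}
```

To avoid the forced-stop case in the first place, the MongoDB manual lists `kill <pid>` (SIGTERM), `kill -2 <pid>` (SIGINT), `mongod --shutdown` on Linux, and `db.shutdownServer()` as clean ways to stop the server, and warns against `kill -9`.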

mongod fails to start:

# mongod --port 27019

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] MongoDB starting : pid=253 port=27019 dbpath=/data/db 64-bit host=da6056799173

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] db version v3.2.8

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] git version: ed70e33130c977bda0024c125b56d159573dbaf0

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] allocator: tcmalloc

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] modules: none

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] build environment:

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten]     distmod: rhel70

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten]     distarch: x86_64

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten]     target_arch: x86_64

2017-05-23T09:12:43.023+0800 I CONTROL  [initandlisten] options: { net: { port: 27019 }, security: { authorization: "enabled" } }

2017-05-23T09:12:43.045+0800 I -        [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.

2017-05-23T09:12:43.045+0800 W -        [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.

2017-05-23T09:12:43.045+0800 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.

2017-05-23T09:12:43.045+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

2017-05-23T09:12:45.028+0800 E STORAGE  [initandlisten] WiredTiger (0) [1495501965:28163][253:0x7fdf1a440dc0], file:index-59-8962705381866868852.wt, WT_CURSOR.insert: read checksum error for 12288B block at offset 124456960: calculated block checksum of 2057573707 doesn't match expected checksum of 1789932834

2017-05-23T09:12:45.028+0800 E STORAGE  [initandlisten] WiredTiger (0) [1495501965:28237][253:0x7fdf1a440dc0], file:index-59-8962705381866868852.wt, WT_CURSOR.insert: index-59-8962705381866868852.wt: encountered an illegal file format or internal value

2017-05-23T09:12:45.028+0800 E STORAGE  [initandlisten] WiredTiger (-31804) [1495501965:28249][253:0x7fdf1a440dc0], file:index-59-8962705381866868852.wt, WT_CURSOR.insert: the process must exit and restart: WT_PANIC: WiredTiger library panic

2017-05-23T09:12:45.028+0800 I -        [initandlisten] Fatal Assertion 28558

2017-05-23T09:12:45.028+0800 I -        [initandlisten]

***aborting after fassert() failure

2017-05-23T09:12:45.044+0800 F -        [initandlisten] Got signal: 6 (Aborted).

 0x131cfa2 0x131c0f9 0x131c902 0x7fdf190ad100 0x7fdf18d115f7 0x7fdf18d12ce8 0x12a68a2 0x10a0a53 0x1a7e43c 0x1a7e8fd 0x1a7ece4 0x19b2f97 0x19cfcaa 0x19d5280 0x19f5727 0x19c36cf 0x1a11c9e 0x1a8d673 0x1a308d2 0x1a8de67 0x1a07e57 0x1a00f4a 0x1088aff 0x1084dc3 0xfadd78 0x9b4a2d 0x96e17d 0x7fdf18cfdb15 0x9b1169

----- BEGIN BACKTRACE -----

{"backtrace":[{"b":"400000","o":"F1CFA2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F1C0F9"},{"b":"400000","o":"F1C902"},{"b":"7FDF1909E000","o":"F100"},{"b":"7FDF18CDC000","o":"355F7","s":"gsignal"},{"b":"7FDF18CDC000","o":"36CE8","s":"abort"},{"b":"400000","o":"EA68A2","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"CA0A53"},{"b":"400000","o":"167E43C","s":"__wt_eventv"},{"b":"400000","o":"167E8FD","s":"__wt_err"},{"b":"400000","o":"167ECE4","s":"__wt_panic"},{"b":"400000","o":"15B2F97","s":"__wt_bm_read"},{"b":"400000","o":"15CFCAA","s":"__wt_bt_read"},{"b":"400000","o":"15D5280","s":"__wt_page_in_func"},{"b":"400000","o":"15F5727","s":"__wt_row_search"},{"b":"400000","o":"15C36CF","s":"__wt_btcur_insert"},{"b":"400000","o":"1611C9E"},{"b":"400000","o":"168D673"},{"b":"400000","o":"16308D2","s":"__wt_log_scan"},{"b":"400000","o":"168DE67","s":"__wt_txn_recover"},{"b":"400000","o":"1607E57","s":"__wt_connection_workers"},{"b":"400000","o":"1600F4A","s":"wiredtiger_open"},{"b":"400000","o":"C88AFF","s":"_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb"},{"b":"400000","o":"C84DC3"},{"b":"400000","o":"BADD78","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"400000","o":"5B4A2D","s":"_ZN5mongo13initAndListenEi"},{"b":"400000","o":"56E17D","s":"main"},{"b":"7FDF18CDC000","o":"21B15","s":"__libc_start_main"},{"b":"400000","o":"5B1169"}],"processInfo":{"mongodbVersion" : "3.2.8", "gitVersion" :"ed70e33130c977bda0024c125b56d159573dbaf0","compiledModules" : [], "uname" : { "sysname" :"Linux", "release" : "3.10.0-514.2.2.el7.x86_64","version" : "#1 SMP Tue Dec 6 23:06:41 UTC 2016","machine" : "x86_64" }, "somap" : [ {"elfType" : 2, "b" : "400000","buildId" : "5F3E5C743BB6FE5AA37A0C943A2741BC6F69AA7E" }, {"b" : "7FFD5F7AD000", "elfType" : 3,"buildId" : "183CE4B56A9471419F233CCEF078E0504837ABF5" }, {"b" : "7FDF19FC6000", "path" : "/lib64/libssl.so.10","elfType" : 3, "buildId" :"478D01A08B923A251D755BB421F3EBAF9F2982C1" }, { "b" 
:"7FDF19BDE000", "path" :"/lib64/libcrypto.so.10", "elfType" : 3, "buildId": "42AAFD25E9B5F4CE2EFE6309491445B1A92A575D" }, { "b" :"7FDF199D6000", "path" : "/lib64/librt.so.1","elfType" : 3, "buildId" :"CB0D2C9F29DBD13C47E7D2EEFB94B35835698CCA" }, { "b" :"7FDF197D2000", "path" : "/lib64/libdl.so.2","elfType" : 3, "buildId" :"091060A163E7EDA25572F3B1BAF2E8F80209C00E" }, { "b" :"7FDF194D0000", "path" : "/lib64/libm.so.6","elfType" : 3, "buildId" :"F9DF294FB70243549DCB643F1322BB20E70E9FE8" }, { "b" :"7FDF192BA000", "path" : "/lib64/libgcc_s.so.1","elfType" : 3, "buildId" :"6AA1DCC4DE7F1836344949857FC2017278631FFD" }, { "b" :"7FDF1909E000", "path" : "/lib64/libpthread.so.0","elfType" : 3, "buildId" :"723F0AC75EF88E778940AE8A8BC30141D85B116A" }, { "b" :"7FDF18CDC000", "path" : "/lib64/libc.so.6","elfType" : 3, "buildId" :"088D48A9AB5A512D9F75BA3D66B6CF77EB6588F9" }, { "b" :"7FDF1A233000", "path" : "/lib64/ld-linux-x86-64.so.2","elfType" : 3, "buildId" :"09E1BB4D034C7263810A41100647068858A7ECB6" }, { "b" :"7FDF18A90000", "path" :"/lib64/libgssapi_krb5.so.2", "elfType" : 3,"buildId" : "D46A230FFF4A7B808B3CFC213D31FCAC542FB504" }, {"b" : "7FDF187AB000", "path" :"/lib64/libkrb5.so.3", "elfType" : 3, "buildId" :"6D6136A0E795420B05854DEF13A10C226FE9CCB2" }, { "b" :"7FDF185A7000", "path" : "/lib64/libcom_err.so.2","elfType" : 3, "buildId" :"3A1166709F88740C49E060731832E3FAD2DFB66B" }, { "b" :"7FDF18375000", "path" :"/lib64/libk5crypto.so.3", "elfType" : 3,"buildId" : "AA97A848DD7C9E57B06EC913E10D420AEBBCE027" }, {"b" : "7FDF1815F000", "path" : "/lib64/libz.so.1","elfType" : 3, "buildId" :"1982C8CDAE90F898D1AD26DC07E807333B4789D0" }, { "b" :"7FDF17F50000", "path" :"/lib64/libkrb5support.so.0", "elfType" : 3,"buildId" : "AEF6C3D3C5152F339942041519A106FC055DAF71" }, {"b" : "7FDF17D4C000", "path" : "/lib64/libkeyutils.so.1","elfType" : 3, "buildId" :"2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" :"7FDF17B32000", "path" : "/lib64/libresolv.so.2","elfType" : 3, "buildId" 
:"D02DC134F38F06F3885231FD2486D5EF4796E5F9" }, { "b" :"7FDF1790D000", "path" :"/lib64/libselinux.so.1", "elfType" : 3,"buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, {"b" : "7FDF176AC000", "path" :"/lib64/libpcre.so.1", "elfType" : 3, "buildId" :"AE64AA461A26E01F60408013D361749D56DD0AE1" }, { "b" :"7FDF17487000", "path" : "/lib64/liblzma.so.5","elfType" : 3, "buildId" :"98131C9354279ABD39FD80D4BE5B3EC5678BD9E0" } ] }}

 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x131cfa2]

 mongod(+0xF1C0F9) [0x131c0f9]

 mongod(+0xF1C902) [0x131c902]

 libpthread.so.0(+0xF100) [0x7fdf190ad100]

 libc.so.6(gsignal+0x37) [0x7fdf18d115f7]

 libc.so.6(abort+0x148) [0x7fdf18d12ce8]

 mongod(_ZN5mongo13fassertFailedEi+0x82) [0x12a68a2]

 mongod(+0xCA0A53) [0x10a0a53]

 mongod(__wt_eventv+0x42C) [0x1a7e43c]

 mongod(__wt_err+0x8D) [0x1a7e8fd]

 mongod(__wt_panic+0x24) [0x1a7ece4]

 mongod(__wt_bm_read+0x77) [0x19b2f97]

 mongod(__wt_bt_read+0x1EA) [0x19cfcaa]

 mongod(__wt_page_in_func+0x180) [0x19d5280]

 mongod(__wt_row_search+0x677) [0x19f5727]

 mongod(__wt_btcur_insert+0x45F) [0x19c36cf]

 mongod(+0x1611C9E) [0x1a11c9e]

 mongod(+0x168D673) [0x1a8d673]

 mongod(__wt_log_scan+0x9D2) [0x1a308d2]

 mongod(__wt_txn_recover+0x4B7) [0x1a8de67]

 mongod(__wt_connection_workers+0x37) [0x1a07e57]

 mongod(wiredtiger_open+0x155A) [0x1a00f4a]

 mongod(_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb+0x77F) [0x1088aff]

 mongod(+0xC84DC3) [0x1084dc3]

 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x598) [0xfadd78]

 mongod(_ZN5mongo13initAndListenEi+0x3DD) [0x9b4a2d]

 mongod(main+0x15D) [0x96e17d]

 libc.so.6(__libc_start_main+0xF5) [0x7fdf18cfdb15]

 mongod(+0x5B1169) [0x9b1169]

----- END BACKTRACE  -----

Aborted (core dumped)
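In a dump this long, the useful line is the WiredTiger checksum error, because it names the damaged file. As a rough sketch (the function name `corrupt_wt_files` is invented here, and the pattern only assumes the `file:<name>.wt,` shape seen in the log above), the affected files can be pulled out of a saved log like this:

```shell
#!/bin/sh
# Hypothetical helper: reads mongod log text on stdin and prints each
# distinct .wt file named in a WiredTiger "read checksum error" line.
corrupt_wt_files() {
    grep 'read checksum error' \
        | sed -n 's/.*file:\([^,]*\.wt\),.*/\1/p' \
        | sort -u
}

# Example (the log path is an assumption, not taken from this post):
#   corrupt_wt_files < /var/log/mongodb/mongod.log
```

For the log above this yields `index-59-8962705381866868852.wt`. The `index-` prefix suggests that the damaged block belongs to an index file rather than to a collection file, since MongoDB names its WiredTiger files `collection-*.wt` and `index-*.wt`.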

The fix commonly recommended online, adding the --repair option, did not work either:

# mongod --repair

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] MongoDB starting : pid=322 port=27017 dbpath=/data/db 64-bit host=da6056799173

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] db version v3.2.8

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] git version: ed70e33130c977bda0024c125b56d159573dbaf0

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] allocator: tcmalloc

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] modules: none

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] build environment:

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten]     distmod: rhel70

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten]     distarch: x86_64

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten]     target_arch: x86_64

2017-05-23T09:31:20.854+0800 I CONTROL  [initandlisten] options: { repair: true }

2017-05-23T09:31:20.876+0800 I -        [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.

2017-05-23T09:31:20.876+0800 I STORAGE  [initandlisten] Detected WT journal files.  Running recovery from last checkpoint.

2017-05-23T09:31:20.876+0800 I STORAGE  [initandlisten] journal to nojournal transition config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

2017-05-23T09:31:20.982+0800 E STORAGE  [initandlisten] WiredTiger (-31803) [1495503080:982911][322:0x7f1a1ba1fdc0], txn-recover: Recovery failed: WT_NOTFOUND: item not found

2017-05-23T09:31:20.983+0800 I -        [initandlisten] Assertion: 28718:-31803: WT_NOTFOUND: item not found

2017-05-23T09:31:20.984+0800 I STORAGE  [initandlisten] exception in initAndListen: 28718 -31803: WT_NOTFOUND: item not found, terminating

2017-05-23T09:31:20.984+0800 I CONTROL  [initandlisten] dbexit:  rc: 100

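Solution: move the WiredTiger.turtle, WiredTiger.lock, and mongod.lock files out of the data directory, backing them up to a location outside the path given by mongod's dbpath.

After that, the mongo service started normally.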
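The move described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `backup_wt_metadata` and the backup path in the usage comment are hypothetical, and the script assumes mongod is not running when it is called. It moves the three files rather than deleting them, so they can be put back if the restart does not go as hoped, and it refuses a backup directory that lies inside the dbpath.

```shell
#!/bin/sh
# Hypothetical helper: move the three files named in this post out of the
# mongod data directory into a backup directory outside it.
# Usage: backup_wt_metadata <dbpath> <backup_dir>
backup_wt_metadata() {
    [ $# -eq 2 ] || { echo "usage: backup_wt_metadata <dbpath> <backup_dir>" >&2; return 2; }
    dbpath="$1"
    backup_dir="$2"

    # The backup location must not be the dbpath itself or a directory below it.
    case "$backup_dir/" in
        "$dbpath"/*) echo "backup dir must be outside $dbpath" >&2; return 1 ;;
    esac

    mkdir -p "$backup_dir" || return 1

    for f in WiredTiger.turtle WiredTiger.lock mongod.lock; do
        if [ -e "$dbpath/$f" ]; then
            mv "$dbpath/$f" "$backup_dir/" || return 1
            echo "moved $f"
        fi
    done
}

# Example (the backup path is an assumption, not from the original post):
#   backup_wt_metadata /data/db /root/mongo-meta-backup
#   mongod --port 27019
```

Copying the entire dbpath elsewhere before touching any of these files is a cheap extra precaution, since WiredTiger.turtle holds metadata that WiredTiger uses to locate its last checkpoint.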