Hi everyone, when my PC woke from standby just now, I saw this warning for the first time:
Hard disk health warning: The hard disk health status has changed. This could mean that hard drive failure is imminent. ...
I have two drives and would now like to understand what caused the warning and what it might mean. I would appreciate your assessment. The NVMe drive is about 1.25 years old; the SATA SSD is about 1-2 years older.
~$ inxi -Fzxxx
System:
  Kernel: 5.19.0-32-generic x86_64 bits: 64 compiler: N/A
  Desktop: GNOME 42.5 tk: GTK 3.24.33 wm: gnome-shell dm: GDM3 42.0
  Distro: Ubuntu 22.04.2 LTS (Jammy Jellyfish)
Machine:
  Type: Desktop Mobo: ASRock model: H470 Steel Legend serial: <superuser required>
  UEFI: American Megatrends v: P1.60 date: 03/09/2021
CPU:
  Info: 6-core model: Intel Core i5-10400 bits: 64 type: MT MCP smt: enabled arch: Comet Lake rev: 5
  cache: L1: 384 KiB L2: 1.5 MiB L3: 12 MiB
  Speed (MHz): avg: 2025 high: 2900 min/max: 800/4300
  cores: 1: 2900 2: 800 3: 800 4: 800 5: 2900 6: 800 7: 2900 8: 2900 9: 2900 10: 800 11: 2900 12: 2900
  bogomips: 69597
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: Intel CometLake-S GT2 [UHD Graphics 630] vendor: ASRock driver: i915 v: kernel
  ports: active: DP-1 empty: HDMI-A-1,HDMI-A-2 bus-ID: 00:02.0 chip-ID: 8086:9bc5 class-ID: 0300
  Display: wayland server: X.org v: 1.21.1.3 with: Xwayland v: 22.1.1 compositor: gnome-shell
  driver: X: loaded: modesetting unloaded: fbdev,vesa gpu: i915 display-ID: 0
  Monitor-1: DP-1 model: AOC U2777B serial: <filter> res: 3840x2160 dpi: 163
  size: 597x336mm (23.5x13.2") diag: 685mm (27") modes: max: 3840x2160 min: 720x400
  OpenGL: renderer: Mesa Intel UHD Graphics 630 (CML GT2) v: 4.6 Mesa 22.2.5 direct render: Yes
Audio:
  Device-1: Intel Comet Lake PCH cAVS vendor: ASRock driver: snd_hda_intel v: kernel
  bus-ID: 00:1f.3 chip-ID: 8086:06c8 class-ID: 0403
  Sound Server-1: ALSA v: k5.19.0-32-generic running: yes
  Sound Server-2: PulseAudio v: 15.99.1 running: yes
  Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
  Device-1: Realtek RTL8125 2.5GbE vendor: ASRock driver: r8169 v: kernel
  pcie: speed: 5 GT/s lanes: 1 port: 4000 bus-ID: 03:00.0 chip-ID: 10ec:8125 class-ID: 0200
  IF: enp3s0 state: up speed: 100 Mbps duplex: full mac: <filter>
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169 v: kernel
  pcie: speed: 2.5 GT/s lanes: 1 port: 3000 bus-ID: 04:00.0 chip-ID: 10ec:8161 class-ID: 0200
  IF: enp4s0 state: down mac: <filter>
  IF-ID-1: virbr0 state: down mac: <filter>
Drives:
  Local Storage: total: 1.14 TiB used: 760.91 GiB (65.3%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 250GB size: 232.89 GiB
  speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter> rev: 2B2QEXM7 temp: 43.9 C scheme: GPT
  ID-2: /dev/sda vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB
  speed: 6.0 Gb/s type: SSD serial: <filter> rev: 032 scheme: MBR
Partition:
  ID-1: / size: 54.7 GiB used: 46.74 GiB (85.4%) fs: ext4 dev: /dev/nvme0n1p3
  ID-2: /boot/efi size: 92.5 MiB used: 6 MiB (6.5%) fs: vfat dev: /dev/nvme0n1p1
  ID-3: /home size: 907.84 GiB used: 714.17 GiB (78.7%) fs: ext4 dev: /dev/sda1
Swap:
  ID-1: swap-1 type: partition size: 15.26 GiB used: 436 KiB (0.0%) priority: -2 dev: /dev/nvme0n1p4
Sensors:
  System Temperatures: cpu: 38.0 C pch: 50.0 C mobo: N/A
  Fan Speeds (RPM): N/A
Info:
  Processes: 352 Uptime: 1d 14h 6m wakeups: 147 Memory: 15.28 GiB used: 7.44 GiB (48.7%)
  Init: systemd v: 249 runlevel: 5 Compilers: gcc: 11.3.0 alt: 11
  Packages: 2439 apt: 2420 flatpak: 19 Shell: Bash v: 5.1.16 running-in: gnome-terminal inxi: 3.3.13
Following → Festplattenstatus, I ran the commands below and then waited about 45 minutes before printing the results:
~$ sudo smartctl -t long /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-32-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

NVMe device successfully opened

Use 'smartctl -a' (or '-x') to print SMART (and more) information
~$ sudo smartctl -a /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-32-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO Plus 250GB
Serial Number:                      S4EUNJ0N317013L
Firmware Version:                   2B2QEXM7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 250.059.350.016 [250 GB]
Unallocated NVM Capacity:           0
Controller ID:                      4
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250.059.350.016 [250 GB]
Namespace 1 Utilization:            91.437.809.664 [91,4 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5301409b99
Local Time is:                      Sat Feb 18 23:01:25 2023 CET
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x03):         S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.80W       -        -    0  0  0  0        0       0
 1 +     6.00W       -        -    1  1  1  1        0       0
 2 +     3.40W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3      210    1200
 4 -   0.0100W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        44 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    3.028.943 [1,55 TB]
Data Units Written:                 5.122.674 [2,62 TB]
Host Read Commands:                 51.842.768
Host Write Commands:                140.883.191
Controller Busy Time:               1.477
Power Cycles:                       4.539
Power On Hours:                     463
Unsafe Shutdowns:                   27
Media and Data Integrity Errors:    0
Error Information Log Entries:      3.163
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               44 Celsius
Temperature Sensor 2:               48 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0       3163     0  0x5014  0x4004      -            0     0     -
~$ sudo smartctl -t long /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-32-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 30 minutes for test to complete.
Test will complete after Sat Feb 18 22:35:14 2023 CET
Use smartctl -X to abort test.
~$ sudo smartctl -a /dev/sda
[sudo] Passwort für $USER:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-32-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron Client SSDs
Device Model:     CT1000MX500SSD1
Serial Number:    2007E28C17EB
LU WWN Device Id: 5 00a075 1e28c17eb
Firmware Version: M3CR032
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Feb 18 22:50:43 2023 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
  5 Reallocate_NAND_Blk_Cnt 0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1313
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       5276
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Ave_Block-Erase_Count   0x0032   099   099   000    Old_age   Always       -       20
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       36
180 Unused_Reserve_NAND_Blk 0x0033   000   000   000    Pre-fail  Always       -       26
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
184 Error_Correction_Count  0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   070   037   000    Old_age   Always       -       30 (Min/Max 0/63)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_ECC_Cnt 0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
202 Percent_Lifetime_Remain 0x0030   099   099   001    Old_age   Offline      -       1
206 Write_Error_Rate        0x000e   100   100   000    Old_age   Always       -       0
210 Success_RAIN_Recov_Cnt  0x0032   100   100   000    Old_age   Always       -       0
246 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       9169348528
247 Host_Program_Page_Count 0x0032   100   100   000    Old_age   Always       -       85871848
248 FTL_Program_Page_Count  0x0032   100   100   000    Old_age   Always       -       232450271

SMART Error Log Version: 1
Warning: ATA error count 0 inconsistent with error log pointer 3

ATA Error Count: 0
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error -2 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  00 ec 00 00 00 00 00

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 00 00      00:00:00.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:00.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:00.000  IDENTIFY DEVICE
  ec 00 00 00 00 00 00 00      00:00:00.000  IDENTIFY DEVICE
  c8 00 00 00 00 00 00 00      00:00:00.000  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1313         -
# 2  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
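
To put the raw write counters from the two smartctl reports into perspective, here is a minimal Python sketch that converts them to terabytes. It assumes that one NVMe "data unit" is 1000 blocks of 512 bytes (as the NVMe specification defines it) and that Total_LBAs_Written on the Crucial drive counts 512-byte logical sectors, matching the "512 bytes logical" line in the report. The function names are made up for illustration; this is only a rough cross-check, not a health verdict.

```python
# Rough conversion of SMART write counters to decimal terabytes.
# Assumptions (not confirmed by the drive vendors for these exact models):
#   - NVMe "Data Units" are 1000 blocks of 512 bytes each
#   - SATA attribute 246 (Total_LBAs_Written) counts 512-byte logical sectors

NVME_DATA_UNIT_BYTES = 1000 * 512
SATA_LOGICAL_SECTOR_BYTES = 512
BYTES_PER_TB = 10**12  # decimal terabyte, as printed by smartctl


def nvme_units_to_tb(units: int) -> float:
    """Convert NVMe data units (read or written) to terabytes."""
    return units * NVME_DATA_UNIT_BYTES / BYTES_PER_TB


def lbas_to_tb(lbas: int, sector_bytes: int = SATA_LOGICAL_SECTOR_BYTES) -> float:
    """Convert a count of written logical sectors to terabytes."""
    return lbas * sector_bytes / BYTES_PER_TB


# Values copied from the smartctl output in this post.
print(f"NVMe written: {nvme_units_to_tb(5_122_674):.2f} TB")
print(f"NVMe read:    {nvme_units_to_tb(3_028_943):.2f} TB")
print(f"SATA written: {lbas_to_tb(9_169_348_528):.2f} TB")
```

The two NVMe figures come out at the same 2,62 TB and 1,55 TB that smartctl prints in brackets, which suggests the unit assumption holds for this drive. Under the stated sector-size assumption, the Crucial drive has taken about 4.69 TB of host writes.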