|
|
@ -29,8 +29,27 @@ smartmontools build with: C++98, GCC 5.4.0 20160609 |
|
|
|
man:smartd.conf(5) |
|
|
|
~~~ |
|
|
|
|
|
|
|
## Basic usage |
|
|
|
|
|
|
|
## Usage |
|
|
|
A few basic command examples: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl --scan |
|
|
|
# smartctl -a /dev/sda |
|
|
|
# smartctl -a /dev/sda | grep Power_On_Hours |
|
|
|
# smartctl -a /dev/sda | grep Power_Cycle_Count |
|
|
|
# smartctl -a /dev/sda -d megaraid,0 |
|
|
|
# smartctl -i /dev/sg0 |
|
|
|
~~~ |
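The two `grep` examples above print the whole attribute row. As a minimal sketch, assuming an ATA disk whose attribute table has the usual ten columns (ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw value), the raw value alone can be extracted with `awk`. The function name `smart_attr_raw` is hypothetical, not part of smartmontools.

```shell
#!/bin/sh
# Print the RAW_VALUE column of one ATA SMART attribute.
# Reads `smartctl -A` (or `-a`) output on stdin; $1 is the attribute name.
# Exits 1 if the attribute is not present in the input.
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10; found = 1 }
                      END { exit found ? 0 : 1 }'
}

# Usage (needs root and an ATA disk; /dev/sda is only an example):
#   smartctl -A /dev/sda | smart_attr_raw Power_On_Hours
```

Only the first token of the raw value is printed, so a row whose raw value carries extra text after the number is reduced to that number.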
|
|
|
|
|
|
|
|
|
|
|
## smartctl |
|
|
|
|
|
|
|
You can make sure all SMART features (SMART itself, automatic offline testing, and attribute autosave) are enabled on a disk with: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -s on -o on -S on /dev/sda |
|
|
|
~~~ |
|
|
|
|
|
|
|
### Listing disks |
|
|
|
|
|
|
@ -60,7 +79,7 @@ On a machine with hardware RAID: |
|
|
|
/dev/bus/0 -d megaraid,6 # /dev/bus/0 [megaraid_disk_06], SCSI device |
|
|
|
~~~ |
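Each line printed by `smartctl --scan` is a device path and a `-d` type followed by a `#` comment. A minimal sketch, assuming that layout, strips the comment so the remainder can be passed back to `smartctl`. The function name `scan_targets` is hypothetical.

```shell
#!/bin/sh
# Strip the trailing "# ..." comment from each `smartctl --scan` line,
# leaving "<device> -d <type>". Lines that end up empty are dropped.
scan_targets() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d'
}

# Usage (needs root):
#   smartctl --scan | scan_targets | while read -r target; do
#       # $target is left unquoted on purpose so that it splits
#       # into the device path and the "-d type" arguments.
#       smartctl -H $target
#   done
```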
|
|
|
|
|
|
|
### Viewing information about a disk |
|
|
|
### Viewing a disk's information |
|
|
|
|
|
|
|
The `-i` option displays information about a disk: |
|
|
|
|
|
|
@ -85,6 +104,33 @@ SMART support is: Available - device has SMART capability. |
|
|
|
SMART support is: Enabled |
|
|
|
~~~ |
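The last lines of the `-i` output say whether SMART is available and enabled. As a sketch, assuming the exact `SMART support is: Enabled` wording shown above, a script can test for it. The function name `smart_is_enabled` is hypothetical, and some device types may not print this line at all, in which case the function reports failure.

```shell
#!/bin/sh
# Exit 0 if `smartctl -i` output on stdin reports SMART as enabled,
# non-zero otherwise (including when the line is absent).
smart_is_enabled() {
    grep -Eq '^SMART support is:[[:space:]]+Enabled'
}

# Usage (needs root; /dev/sda is only an example):
#   smartctl -i /dev/sda | smart_is_enabled || echo "SMART is not enabled"
```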
|
|
|
|
|
|
|
The `-l error` option displays any errors logged by a disk: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -l error /dev/sda |
|
|
|
|
|
|
|
=== START OF SMART DATA SECTION === |
|
|
|
Error Information (NVMe Log 0x01, max 64 entries) |
|
|
|
Num ErrCount SQId CmdId Status PELoc LBA NSID VS |
|
|
|
0 120 0 0x0008 0x4004 - 0 0 - |
|
|
|
1 119 0 0x0018 0x4004 0x02c 0 0 - |
|
|
|
2 118 0 0x0017 0x4004 0x02c 0 0 - |
|
|
|
3 117 0 0x0008 0x4004 - 0 0 - |
|
|
|
4 116 0 0x0018 0x4004 0x02c 0 0 - |
|
|
|
5 115 0 0x0017 0x4004 0x02c 0 0 - |
|
|
|
6 114 0 0x0008 0x4004 - 0 0 - |
|
|
|
7 113 0 0x0018 0x4004 0x02c 0 0 - |
|
|
|
8 112 0 0x0017 0x4004 0x02c 0 0 - |
|
|
|
9 111 0 0x0008 0x4004 - 0 0 - |
|
|
|
10 110 0 0x0008 0x4004 - 0 0 - |
|
|
|
11 109 0 0x0008 0x4004 0x02c 0 0 - |
|
|
|
12 108 0 0x0008 0x4004 0x02c 0 0 - |
|
|
|
13 107 0 0x0018 0x4004 0x02c 0 0 - |
|
|
|
14 106 0 0x0017 0x4004 0x02c 0 0 - |
|
|
|
15 105 0 0x0008 0x4004 0x02c 0 0 - |
|
|
|
... (48 entries not shown) |
|
|
|
~~~ |
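A long NVMe error log is easier to read once grouped. The following is a minimal sketch, assuming the column layout shown above (entry number first, `Status` in the fifth column), that counts entries per status code. The function name `nvme_error_status_counts` is hypothetical.

```shell
#!/bin/sh
# Count NVMe error log entries per Status code.
# Reads `smartctl -l error` output on stdin; only rows that start with an
# entry number and carry a hexadecimal Status in column 5 are counted.
nvme_error_status_counts() {
    awk '$1 ~ /^[0-9]+$/ && $5 ~ /^0x/ { n[$5]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage (needs root; /dev/sda is the device used in the example above):
#   smartctl -l error /dev/sda | nvme_error_status_counts
```

Only the entries that `smartctl` actually prints are counted; the rows hidden behind "entries not shown" are not.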
|
|
|
|
|
|
|
The `-a` option displays all SMART information: |
|
|
|
|
|
|
|
~~~ |
|
|
@ -125,37 +171,91 @@ Id Fmt Data Metadt Rel_Perf |
|
|
|
=== START OF SMART DATA SECTION === |
|
|
|
SMART overall-health self-assessment test result: PASSED |
|
|
|
|
|
|
|
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff) |
|
|
|
Critical Warning: 0x00 |
|
|
|
Temperature: 32 Celsius |
|
|
|
Available Spare: 100% |
|
|
|
Available Spare Threshold: 10% |
|
|
|
Percentage Used: 0% |
|
|
|
Data Units Read: 122 540 [62,7 GB] |
|
|
|
Data Units Written: 1 927 650 [986 GB] |
|
|
|
Host Read Commands: 1 767 402 |
|
|
|
Host Write Commands: 31 997 703 |
|
|
|
Controller Busy Time: 47 |
|
|
|
Power Cycles: 371 |
|
|
|
Power On Hours: 748 |
|
|
|
Unsafe Shutdowns: 53 |
|
|
|
Media and Data Integrity Errors: 0 |
|
|
|
Error Information Log Entries: 120 |
|
|
|
Warning Comp. Temperature Time: 0 |
|
|
|
Critical Comp. Temperature Time: 0 |
|
|
|
Temperature Sensor 1: 32 Celsius |
|
|
|
Temperature Sensor 2: 34 Celsius |
|
|
|
|
|
|
|
Error Information (NVMe Log 0x01, max 64 entries) |
|
|
|
Num ErrCount SQId CmdId Status PELoc LBA NSID VS |
|
|
|
0 120 0 0x0008 0x4004 - 0 0 - |
|
|
|
1 119 0 0x0018 0x4004 0x02c 0 0 - |
|
|
|
[...] |
|
|
|
General SMART Values: |
|
|
|
Offline data collection status: (0x00) Offline data collection activity |
|
|
|
was never started. |
|
|
|
Auto Offline Data Collection: Disabled. |
|
|
|
Self-test execution status: ( 23) The self-test routine was aborted by |
|
|
|
the host. |
|
|
|
Total time to complete Offline |
|
|
|
data collection: ( 1) seconds. |
|
|
|
Offline data collection |
|
|
|
capabilities: (0x75) SMART execute Offline immediate. |
|
|
|
No Auto Offline data collection support. |
|
|
|
Abort Offline collection upon new |
|
|
|
command. |
|
|
|
No Offline surface scan supported. |
|
|
|
Self-test supported. |
|
|
|
Conveyance Self-test supported. |
|
|
|
Selective Self-test supported. |
|
|
|
SMART capabilities: (0x0003) Saves SMART data before entering |
|
|
|
power-saving mode. |
|
|
|
Supports SMART auto save timer. |
|
|
|
Error logging capability: (0x01) Error logging supported. |
|
|
|
General Purpose Logging supported. |
|
|
|
Short self-test routine |
|
|
|
recommended polling time: ( 1) minutes. |
|
|
|
Extended self-test routine |
|
|
|
recommended polling time: ( 1) minutes. |
|
|
|
Conveyance self-test routine |
|
|
|
recommended polling time: ( 1) minutes. |
|
|
|
SCT capabilities: (0x003d) SCT Status supported. |
|
|
|
SCT Error Recovery Control supported. |
|
|
|
SCT Feature Control supported. |
|
|
|
SCT Data Table supported. |
|
|
|
|
|
|
|
SMART Attributes Data Structure revision number: 5 |
|
|
|
Vendor Specific SMART Attributes with Thresholds: |
|
|
|
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE |
|
|
|
3 Spin_Up_Time 0x0020 100 100 000 Old_age Offline - 0 |
|
|
|
4 Start_Stop_Count 0x0030 100 100 000 Old_age Offline - 0 |
|
|
|
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 0 |
|
|
|
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 49872 |
|
|
|
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 15 |
|
|
|
170 Unknown_Attribute 0x0033 100 100 010 Pre-fail Always - 0 |
|
|
|
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0 |
|
|
|
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0 |
|
|
|
183 Runtime_Bad_Block 0x0030 100 100 000 Old_age Offline - 0 |
|
|
|
184 End-to-End_Error 0x0032 100 100 090 Old_age Always - 0 |
|
|
|
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0 |
|
|
|
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 13 |
|
|
|
199 UDMA_CRC_Error_Count 0x0030 100 100 000 Old_age Offline - 5 |
|
|
|
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 575610 |
|
|
|
226 Load-in_Time 0x0032 100 100 000 Old_age Always - 18829 |
|
|
|
227 Torq-amp_Count 0x0032 100 100 000 Old_age Always - 0 |
|
|
|
228 Power-off_Retract_Count 0x0032 100 100 000 Old_age Always - 2992332 |
|
|
|
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0 |
|
|
|
233 Media_Wearout_Indicator 0x0032 082 082 000 Old_age Always - 0 |
|
|
|
241 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 575610 |
|
|
|
242 Total_LBAs_Read 0x0032 100 100 000 Old_age Always - 581199 |
|
|
|
|
|
|
|
SMART Error Log Version: 1 |
|
|
|
No Errors Logged |
|
|
|
|
|
|
|
SMART Self-test log structure revision number 1 |
|
|
|
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error |
|
|
|
# 1 Short captive Completed without error 10% 49872 - |
|
|
|
# 2 Extended offline Completed without error 00% 49872 - |
|
|
|
# 3 Reserved (0x20) Completed without error 00% 49872 - |
|
|
|
# 4 Reserved (0x20) Completed without error 10% 14 - |
|
|
|
# 5 Reserved (0x20) Completed without error 10% 4 - |
|
|
|
# 6 Reserved (0x20) Completed without error 10% 4 - |
|
|
|
# 7 Vendor (0x58) Completed without error 10% 4 - |
|
|
|
|
|
|
|
Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run |
|
|
|
SMART Selective self-test log data structure revision number 0 |
|
|
|
Note: revision number not 1 implies that no selective self-test has ever been run |
|
|
|
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS |
|
|
|
1 0 0 Not_testing |
|
|
|
2 0 0 Not_testing |
|
|
|
3 0 0 Not_testing |
|
|
|
4 0 0 Not_testing |
|
|
|
5 0 0 Not_testing |
|
|
|
Selective self-test flags (0x0): |
|
|
|
After scanning selected spans, do NOT read-scan remainder of disk. |
|
|
|
If Selective self-test is pending on power-up, resume after 0 minute delay. |
|
|
|
~~~ |
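The full `-a` report is long, and a monitoring script usually needs only a couple of fields from it. As a sketch under the assumption that the report uses the wording shown above, the two hypothetical functions below check the overall health verdict and, for NVMe devices, compare `Available Spare` with its threshold.

```shell
#!/bin/sh
# Exit 0 if the report on stdin contains a PASSED overall-health verdict.
smart_health_passed() {
    grep -q 'self-assessment test result: PASSED'
}

# Exit 0 if "Available Spare" is strictly above "Available Spare Threshold".
# Exits 1 if either field is missing (for example on a non-NVMe disk).
nvme_spare_ok() {
    awk -F: '
        /^Available Spare:/           { gsub(/[^0-9]/, "", $2); spare = $2 + 0; have_s = 1 }
        /^Available Spare Threshold:/ { gsub(/[^0-9]/, "", $2); thresh = $2 + 0; have_t = 1 }
        END { exit (have_s && have_t && spare > thresh) ? 0 : 1 }'
}

# Usage (needs root; /dev/nvme0 is an assumed device name):
#   report=$(smartctl -a /dev/nvme0)
#   printf '%s\n' "$report" | smart_health_passed || echo "health check failed"
#   printf '%s\n' "$report" | nvme_spare_ok       || echo "spare at or below threshold"
```

The report is stored in a variable first because each function consumes its standard input.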
|
|
|
|
|
|
|
|
|
|
|
### Hardware RAID |
|
|
|
|
|
|
|
If your device is not a physical disk but a hardware RAID volume, you must specify the controller type and the number of the physical disk you want: |
|
|
|
|
|
|
|
~~~ |
|
|
@ -185,8 +285,6 @@ In some cases, the RAID controller offers a way to see the dis |
|
|
|
# modprobe sg |
|
|
|
|
|
|
|
# smartctl -i /dev/sg0 |
|
|
|
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-92-generic] (local build) |
|
|
|
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org |
|
|
|
|
|
|
|
=== START OF INFORMATION SECTION === |
|
|
|
Model Family: Toshiba 3.5" MG03ACAxxx(Y) Enterprise HDD |
|
|
@ -207,13 +305,89 @@ SMART support is: Available - device has SMART capability. |
|
|
|
SMART support is: Enabled |
|
|
|
~~~ |
|
|
|
|
|
|
|
### Testing a disk |
|
|
|
|
|
|
|
You can run a short self-test on a disk: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -t short /dev/sda |
|
|
|
|
|
|
|
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === |
|
|
|
Sending command: "Execute SMART Short self-test routine immediately in off-line mode". |
|
|
|
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful. |
|
|
|
Testing has begun. |
|
|
|
Please wait 1 minutes for test to complete. |
|
|
|
Test will complete after Thu Dec 7 02:51:10 2017 |
|
|
|
~~~ |
|
|
|
|
|
|
|
You can view the test results with: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -l selftest /dev/sda |
|
|
|
|
|
|
|
=== START OF READ SMART DATA SECTION === |
|
|
|
SMART Self-test log structure revision number 1 |
|
|
|
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error |
|
|
|
# 1 Extended offline Completed without error 00% 49872 - |
|
|
|
# 2 Reserved (0x20) Completed without error 00% 49872 - |
|
|
|
# 3 Reserved (0x20) Completed without error 10% 14 - |
|
|
|
# 4 Reserved (0x20) Completed without error 10% 4 - |
|
|
|
# 5 Reserved (0x20) Completed without error 10% 4 - |
|
|
|
# 6 Vendor (0x58) Completed without error 10% 4 - |
|
|
|
~~~ |
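To check the self-test log from a script instead of reading it by eye, the following minimal sketch prints every entry whose status is not `Completed without error`. It assumes the ATA log layout shown above, where each entry starts with `#` and its number. The function name `selftest_failures` is hypothetical.

```shell
#!/bin/sh
# Print self-test log entries whose status is not "Completed without error".
# Reads `smartctl -l selftest` output on stdin.
# Exit 0 if every entry passed, 1 if at least one entry was printed.
selftest_failures() {
    awk '/^# *[0-9]+ / && !/Completed without error/ { print; bad = 1 }
         END { exit bad ? 1 : 0 }'
}

# Usage (needs root; /dev/sda is only an example):
#   smartctl -l selftest /dev/sda | selftest_failures || echo "check the entries above"
```

Anything other than a clean completion is reported, so an aborted test or one still in progress also shows up.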
|
|
|
|
|
|
|
You can also run a long (extended) test: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -t long /dev/sda |
|
|
|
~~~ |
|
|
|
|
|
|
|
To abort the test in progress: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -X /dev/sda |
|
|
|
|
|
|
|
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === |
|
|
|
Sending command: "Abort SMART off-line mode self-test routine". |
|
|
|
Self-testing aborted! |
|
|
|
~~~ |
|
|
|
|
|
|
|
|
|
|
|
## smartd |
|
|
|
|
|
|
|
Enable **smartd** by listing the relevant devices in `/etc/default/smartmontools`: |
|
|
|
|
|
|
|
~~~ |
|
|
|
enable_smart="/dev/sda /dev/sdb" |
|
|
|
start_smartd=yes |
|
|
|
smartd_opts="--interval=1800" |
|
|
|
~~~ |
|
|
|
|
|
|
|
You can then set the email address that receives alerts in `/etc/smartd.conf`: |
|
|
|
|
|
|
|
~~~ |
|
|
|
DEVICESCAN -d removable -n standby -m monitoring@example.com -M exec /usr/share/smartmontools/smartd-runner |
|
|
|
~~~ |
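After editing the file, it can help to confirm which addresses will actually receive alerts. A minimal sketch, assuming each directive line uses the `-m ADDRESS` form shown above, lists them. The function name `smartd_mail_targets` is hypothetical.

```shell
#!/bin/sh
# Print the word following each -m directive in smartd.conf-style input,
# skipping comment lines. "-M" (capital) is a different directive and is
# not matched.
smartd_mail_targets() {
    awk '/^[ \t]*#/ { next }
         { for (i = 1; i < NF; i++) if ($i == "-m") print $(i + 1) }'
}

# Usage:
#   smartd_mail_targets < /etc/smartd.conf
```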
|
|
|
|
|
|
|
## FAQ |
|
|
|
|
|
|
|
See <https://www.smartmontools.org/wiki/FAQ> |
|
|
|
|
|
|
|
smartctl -s on /dev/hda %enable |
|
|
|
smartctl -a /dev/hda %info |
|
|
|
smartctl -t long /dev/hda |
|
|
|
smartctl -l error /dev/hda |
|
|
|
|
|
|
### Device does not support SMART |
|
|
|
|
|
|
|
Some disks do not support SMART. Example: |
|
|
|
|
|
|
|
~~~ |
|
|
|
# smartctl -a /dev/sda |
|
|
|
|
|
|
|
Device: ATA Maxtor 7Y250M0 Version: YAR5 |
|
|
|
Serial number: XXXXXX |
|
|
|
Device type: disk |
|
|
|
Local Time is: Thu Dec 7 01:59:43 2017 CET |
|
|
|
Device does not support SMART |
|
|
|
|
|
|
|
Error Counter logging not supported |
|
|
|
|
|
|
|
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on'] |
|
|
|
Device does not support Self Test logging |
|
|
|
~~~ |