[Adminsysters] SMART error meeting

Donna donna at genderchangers.org
Sun Oct 2 09:02:36 CEST 2022


Hi,

On the server adele we have a RAID1 setup (mirror) with 2 disks of 1TB 
each, see:

mdadm --detail /dev/md/0
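
For reference, a rough sketch of how that output could be boiled down to a quick health check. It assumes the usual `State :` and `Failed Devices :` lines that `mdadm --detail` prints. The helper name `raid_summary` is made up for this example and is not part of mdadm.

```shell
# Hypothetical helper: summarise `mdadm --detail` output read from stdin.
# Prints "state=<state> failed=<n>" and returns 1 if the array looks unhealthy.
raid_summary() {
    awk -F' : ' '
        /^ *State :/          { state = $2; sub(/[ \t]+$/, "", state) }
        /^ *Failed Devices :/ { failed = $2 + 0 }
        END {
            printf "state=%s failed=%d\n", state, failed
            # Unhealthy if any member failed or the state mentions "degraded".
            bad = (failed > 0 || state ~ /degraded/)
            exit bad
        }'
}

# Usage on adele (needs root):
#   mdadm --detail /dev/md/0 | raid_summary
```

A non-zero return would mean one half of the mirror needs attention before anything is tried on the bad sectors.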

I started looking around with smartctl but need to be off again so will 
continue later.
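
A small sketch of the smartctl steps that come up further down the thread (start a long self-test, then read the result), plus a helper to pull one raw value out of the attribute table. The helper name `smart_raw_value` is invented here. The column layout is the standard `smartctl -A` table, and the attribute name is an assumption: names differ between drive models, so check the table for this SSD first.

```shell
# Hypothetical helper: print the RAW_VALUE (10th column) of one attribute
# from `smartctl -A` output read from stdin.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage on adele (needs root), with /dev/sdb as named in the smartd mail:
#   smartctl -t long /dev/sdb        # start the long self-test
#   smartctl -l selftest /dev/sdb    # read the self-test log once it is done
#   smartctl -A /dev/sdb | smart_raw_value Offline_Uncorrectable
```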

Donna

On 02/10/2022 08:20, Donna wrote:
> Hi,
> 
> My memory is really hopeless but I will log in and look around (with 
> mdadm) etc. BRB
> 
> xx, Donna
> 
> On 01/10/2022 18:19, mara wrote:
>> Do @donna or @gaba, who built and installed the server, remember 
>> how the RAID is set up? We might need to make use of the RAID to 
>> back up the whole disk, and then try to fix these sectors or 
>> deactivate them.
>>
>> I made a date poll here for a possible session next week:
>> https://transitional.anarchaserver.org/date/studs.php?poll=P3IRURqWSpAkMLYk 
>>
>>
>> Expecting to take a few sessions, here is the relevant ticket with the 
>> manuals mentioned in Micha's email:
>> https://git.systerserver.net/systerserver/notes/-/issues/52
>>
>> m
>>
>> On 10/1/22 10:43, bolwerK wrote:
>>>
>>> Hi, we are forwarding Micha's answer to help assess our SMART error.
>>> She is willing to do a session next week. Who wants to join us and 
>>> stay in the loop for making an appointment?
>>> Maybe we can have a look together before we ask her advice.
>>> xm
>>>
>>>
>>> hey!
>>>
>>> sorry for being so late with the response... my first week back in 
>>> nyc has been crazy busy with work and seeing people but things are 
>>> finally calming down.
>>>
>>> SO! UNCORRECTABLE SECTORS! This is generally a really bad sign and 
>>> shows that the hard drive is starting to degrade. If you ARE NOT 
>>> using some sort of RAID system where your drives are redundant, I 
>>> would IMMEDIATELY back up the data on the drive and start finding a 
>>> replacement. It is only a matter of time before that drive dies.
>>>
>>> If you ARE using RAID and would rather keep that drive alive as long 
>>> as possible, you can isolate the blocks and attempt to reset them. 
>>> This will trigger the firmware on the hard drive to either a) repair 
>>> the blocks or b) choose to not use them for future storage. 
>>> Smartmon's documentation has a good tutorial on how to do this... I 
>>> recommend going through the whole thing and BACKING UP your drive 
>>> before attempting the fix: 
>>> https://www.smartmontools.org/wiki/BadBlockHowto
>>>
>>> Before any of this though, it may be good to run a smartmon long test 
>>> to verify the results of the short test. You can do this by running 
>>> `sudo smartctl -t long /dev/sdb`. You can get the results with `sudo 
>>> smartctl -a /dev/sdb`, noting that a long test can take a couple 
>>> hours to complete. This is a good page for basic smartmon usage: 
>>> https://www.thomas-krenn.com/en/wiki/SMART_tests_with_smartctl
>>>
>>> Also, if you are hungry for more drives, I have an extra Western 
>>> Digital RED 8TB (one of these: 
>>> https://aphnetworks.com/reviews/western-digital-red-wd80efzx-8tb). I 
>>> can ship it to you when I get back to france on the 18th.
>>>
>>> Anyway, I'd be happy to do a little debugging session together this 
>>> coming week if you'd like!
>>>
>>> Micha (she/her)
>>> -----------------------------
>>> http://micha.codes/ <http://probablemodels.com/>
>>> http://github.com/mynameisfiber/
>>>
>>>
>>> On Wed, Sep 21, 2022 at 7:00 AM bolwerK <info at ooooo.be> wrote:
>>>
>>>
>>>     hi, this is the mail we get from the backup server at S14.
>>>     Can we have a look together next week? Maybe I can invite the
>>>     others of systerserver.
>>>     It shouldn't take long, and it is easier than trying to grasp manuals.
>>>     xm
>>>
>>>     -------- Forwarded Message --------
>>>     Subject:     SMART error (OfflineUncorrectableSector) detected on host: adele
>>>     Date:     Wed, 21 Sep 2022 04:19:52 +0200 (CEST)
>>>     From:     root <root at adele.mur.at>
>>>     To:     root at adele.mur.at
>>>
>>>
>>>
>>>     This message was generated by the smartd daemon running on:
>>>
>>>     host name: adele
>>>     DNS domain: mur.at
>>>
>>>     The following warning/error was logged by the smartd daemon:
>>>
>>>     Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors
>>>
>>>     Device info:
>>>     HP SSD S700 Pro 1TB, S/N:HBSA49011400028, WWN:a-000000-000000000,
>>>     FW:R0201B, 1.02 TB
>>>
>>>     For details see host's SYSLOG.
>>>
>>>     You can also use the smartctl utility for further investigation.
>>>     The original message about this issue was sent at Sun Jun 5 01:11:45
>>>     2022 CEST
>>>     Another message will be sent in 24 hours if the problem persists.
>>>
>>>
>>> _______________________________________________
>>> Adminsysters mailing list
>>> Adminsysters at lists.genderchangers.org
>>> https://lists.genderchangers.org/mailman/listinfo/adminsysters
>>

