042stab127.2 problem

  • 042stab127.2 problem

    Applied kernelcare to two OpenVZ nodes yesterday.

    042stab127.2

    One of them seems OK, but the other one had problems overnight: really high swap and load even though there was still plenty of unused memory. Individual containers had all their vswap used (not all containers use vswap). I rebooted the containers that have vswap, and that seemed to correct the swap/load. It's too much of a coincidence that this happened the evening I updated the kernel; nothing else was done, and this server had been running continuously for months.

    The other node is still OK. Hopefully restarting the containers corrected it and I won't have to roll back the kernel.

    Both nodes are Supermicro with Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz and 16GB RAM

    Both running kcare 042stab127.2

    Base kernel is different though

    The one still working is 2.6.32-042stab113.21
    The one that had a problem is 2.6.32-042stab108.8
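
    A quick way to see which containers have exhausted their vswap allocation is to filter the bean counters. This is only a sketch: the `vzlist` field names `swappages`/`swappages.l` are an assumption (check `vzlist -L` on your node), and the sample data below is made up.

```shell
#!/bin/sh
# Sample lines in the shape of `vzlist -H -o ctid,swappages,swappages.l`
# (held vs. limit, in pages). Hypothetical data for illustration.
sample='101 131072 131072
102 0 131072
103 131072 131072'

# Flag containers whose vswap usage has reached its limit.
printf '%s\n' "$sample" \
  | awk '$3 > 0 && $2 >= $3 { print "CT " $1 " vswap exhausted" }'
```

    On a real node you would drop the sample data and pipe `vzlist` straight into the `awk` filter.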

  • #2
    BTW, they were both running the same kcare kernel before the update: the last one released before this Meltdown/Spectre one.

  • #3
    So am I the only one having this problem? It is definitely something related to KernelCare. It happened right after a heavy process ran that clears out all the cache and causes high load for a few minutes. Right after that, instead of subsiding, the load skyrocketed: the load average was 200 when it is normally below 4, and the CPU wait state was hopping between 50% and 95%, which means it's waiting for I/O.

    I disabled the heavy cron job for now, but I suspect the problem is still there and could be triggered again when it runs.
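
    The I/O-wait figure can be sampled directly from /proc/stat rather than eyeballed from top. A minimal sketch (on Linux 2.6+, column 6 of the `cpu` line is iowait jiffies):

```shell
#!/bin/sh
# Print the system-wide I/O-wait percentage over a one-second window.
# /proc/stat "cpu" line fields: user nice system idle iowait irq softirq ...
read_cpu() {
  awk '/^cpu /{ t = 0; for (i = 2; i <= NF; i++) t += $i; print $6, t }' /proc/stat
}

set -- $(read_cpu); io1=$1; tot1=$2
sleep 1
set -- $(read_cpu); io2=$1; tot2=$2

awk -v i1="$io1" -v i2="$io2" -v t1="$tot1" -v t2="$tot2" \
  'BEGIN { if (t2 > t1) printf "iowait: %.1f%%\n", 100 * (i2 - i1) / (t2 - t1) }'
```

    Sustained values near the 50-95% seen here point at processes stuck waiting on disk (or on swap-in), which fits the vswap symptom.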

  • #4
    I should point out that I have updated other servers with the native 042stab127.2 kernel, not KernelCare, and haven't seen this problem.

  • #5
    Fred, please create a ticket at https://cloudlinux.zendesk.com (KernelCare department) so our support team can help you with the issue.

  • #6
    Was there any outcome to this? Was it KernelCare? I've been scared to yum update kernelcare for a while now...

  • #7
    Ticket #26416 has been created to investigate the issue.
    BTW, yum update kernelcare shouldn't bring in any problems with the binary patches that KC applies.
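
    One way to confirm what an update actually changed is to compare the booted kernel with KernelCare's effective (patched) version. A sketch, assuming the documented `kcarectl --uname` flag is available on the node:

```shell
#!/bin/sh
# Show the booted kernel next to the KernelCare effective version.
# Falls back to the booted kernel when kcarectl is not installed.
booted=$(uname -r)
if command -v kcarectl >/dev/null 2>&1; then
  effective=$(kcarectl --uname)
else
  effective=$booted
fi
echo "booted:    $booted"
echo "effective: $effective"
```

    If `effective` already reads 042stab127.2, updating the kernelcare package itself changes the tooling, not the applied patch level.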

  • #8
    I'm stupid, I thought you needed the latest KC to upgrade to stab127.2... Seems I've had this kernel version for ages... as (_8(|) says... doh

  • #9
    So far, the high load problem has not recurred since the first time, when I had to restart the containers that use vswap.

  • #10
    Glad to hear that.

    Feel free to let me know if you have any further questions or concerns.

  • #11
    Well, it just happened again. I rebooted the two containers that seemed to be causing the high load and it went away again. I am going to have to upgrade this node to the real kernel and see if that takes care of it.
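
    Finding which containers are driving the load can be scripted rather than guessed. A sketch using made-up data in the shape of `vzlist -H -o ctid,laverage` (the `laverage` field name is an assumption for this OpenVZ version):

```shell
#!/bin/sh
# Hypothetical `vzlist -H -o ctid,laverage` output: CTID, then 1/5/15-min load.
sample='101 0.10/0.12/0.09
102 8.54/7.90/6.11
103 0.02/0.01/0.00'

# Sort by 1-minute load, highest first; sort -n reads the leading number
# of the laverage field. Prints the busiest container.
printf '%s\n' "$sample" | sort -k2 -rn | head -n 1
```

    On a live node, replace the sample with the real `vzlist` call and drop `head` to see the full ranking.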

  • #12
    Hello.

    We would like to check this problem.
    Please submit a ticket to https://cloudlinux.zendesk.com, and our techs will check the issue in place.

  • #13
    > Hello.
    >
    > We would like to check this problem.
    > Please submit a ticket to https://cloudlinux.zendesk.com, our techs will check the issue in place.

    You guys already did and said you couldn't find anything. It seems to be hard to reproduce. I see there was one race condition fixed in the latest OVZ kernel, so maybe that took care of it. It started happening as soon as I updated KernelCare, and I haven't seen this on other nodes not running KernelCare but running the exact same real kernel. So I think the best thing is to run the real kernel in this case.

  • #14
    I am sorry, but it's hard to help you without a clear understanding of what's going on on that server.
    We really would like to check this problem in place. Please submit a ticket to https://cloudlinux.zendesk.com and our techs will investigate it further.
