Looks like it works.

Edit still see some performance issues. Needs more troubleshooting

Update: Registrations re-opened We encountered a bug where people could not log in, see https://github.com/LemmyNet/lemmy/issues/3422#issuecomment-1616112264 . As a workaround we opened registrations.

Thanks

First of all, I would like to thank the Lemmy.world team and the 2 admins of other servers @stanford@discuss.as200950.com and @sunaurus@lemm.ee for their help! We did some thorough troubleshooting to get this working!

The upgrade

The upgrade itself isn’t too hard. Create a backup, and then change the image names in the docker-compose.yml and restart.

But, like the first 2 tries, after a few minutes the site started getting slow until it stopped responding. Then the troubleshooting started.

The solutions

What I had noticed previously, is that the lemmy container could reach around 1500% CPU usage, above that the site got slow. Which is weird, because the server has 64 threads, so 6400% should be the max. So we tried what @sunaurus@lemm.ee had suggested before: we created extra lemmy containers to spread the load. (And extra lemmy-ui containers). And used nginx to load balance between them.

Et voilà. That seems to work.

Also, as suggested by him, we start the lemmy containers with the scheduler disabled, and have 1 extra lemmy running with the scheduler enabled, unused for other stuff.

There will be room for improvement, and probably new bugs, but we’re very happy lemmy.world is now at 0.18.1-rc. This fixes a lot of bugs.

  • darkbaron202@lemmy.world
    link
    fedilink
    arrow-up
    11
    ·
    11 months ago

    thank you for letting us know behind-the-scene stuffs.

    me myself is a sysadmin and really like story of successfully scaling up servers. very satisfy to hear.

    once again, thank you!