OpenProject 503 Service Temporarily Unavailable after UCS update

tafkaz · September 17, 2018, 5:58pm

Hi,
after updating UCS to the newest version, OpenProject is no longer reachable.
Maybe you have an idea what is missing:

univention-app shell openproject
/var/log/apache2/error.log
[Mon Sep 17 19:53:27 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Sep 17 19:53:27 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)
[Mon Sep 17 19:53:29 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Sep 17 19:53:29 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)

 netstat -tlpen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode       PID/Program name
tcp        0      0 0.0.0.0:544             0.0.0.0:*               LISTEN      0          44221       1467/inetd
tcp        0      0 127.0.0.1:587           0.0.0.0:*               LISTEN      0          44693       -
tcp        0      0 127.0.0.1:11211         0.0.0.0:*               LISTEN      65534      42639       -
tcp        0      0 127.0.0.1:465           0.0.0.0:*               LISTEN      0          44688       -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      0          42610       1429/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      0          44683       -
tcp6       0      0 :::80                   :::*                    LISTEN      0          45347       1751/apache2
tcp6       0      0 :::22                   :::*                    LISTEN      0          42612       1429/sshd

/var/log/openproject/web-1.log
chroot: invalid group `Domain'
chroot: invalid group `Users'
chroot: invalid group `Domain'
chroot: invalid group `Users'
chroot: invalid group `Domain'
chroot: invalid group `Users'
chroot: invalid group `Domain'
chroot: invalid group `Users'
chroot: invalid group `Domain'
chroot: invalid group `Users'

Let me know if you need more info.
thanx
Sascha

Moritz_Bunkus · September 18, 2018, 7:39am

Hey,

does the OpenProject docker container exist? Is it running (docker ps --all)?

Is the app still installed (univention-app info and ucr search --brief appcenter/apps/openproject/)?

Kind regards
mosu

tafkaz · September 18, 2018, 7:53am

Hey Moritz,
always the pleasure

docker ps --all
CONTAINER ID        IMAGE                                                  COMMAND             CREATED             STATUS              PORTS                   NAMES
224d93632a93        docker.software-univention.de/ucs-appbox-amd64:4.1-4   "/sbin/init"        12 days ago         Up 12 hours         0.0.0.0:40001->80/tcp   compassionate_shaw

univention-app info
UCS: 4.3-2 errata234
Installed: mailserver=12.0 open-xchange-guard=2.10.0-ucs1 open-xchange-text=7.10.0-ucs1 oxseforucs=7.10.0-ucs1 4.1/openproject=7.3.1
Upgradable:

ucr search --brief appcenter/apps/openproject/
appcenter/apps/openproject/container: 224d93632a93183f21a83282cb68e44a94ca10c0df96417463602795125a024a
appcenter/apps/openproject/hostdn: cn=openp-89848117,cn=memberserver,cn=computers,dc=comceptplus,dc=com
appcenter/apps/openproject/image: docker.software-univention.de/ucs-appbox-amd64:4.1-4
appcenter/apps/openproject/ip: 172.17.0.1
appcenter/apps/openproject/ports/80: 40001
appcenter/apps/openproject/status: installed
appcenter/apps/openproject/ucs: 4.1
appcenter/apps/openproject/version: 7.3.1

seems good to me…
best
Sascha

Moritz_Bunkus · September 18, 2018, 8:02am

Hey,

your log shows that Apache tries to connect to port 6000, whereas OpenProject should be running on port 40001. Then again, your netstat output doesn’t show a service listening on either port.

Is your netstat output complete? Let’s try this command instead: lsof -PniTCP:40001 -sTCP:LISTEN
It looks like your Apache configuration files are not up to date. Please run the following:
```
univention-check-templates
grep -r /openproject/ /etc/apache2/
```

My guess is that you modified the templates for the Apache configuration files and you didn’t migrate those changes to the new versions of the templates created during the upgrade.
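To compare the port Apache proxies to with the port that is actually listening, it can help to pull the backend ports out of the ProxyPass lines. The following is a minimal sketch, not part of UCS or OpenProject: the function name proxy_backend_ports is hypothetical, and it assumes the input looks like the grep output shown in this thread (a ProxyPass directive followed by a path and a backend URL with an explicit port).

```shell
#!/bin/sh
# Hypothetical helper: read Apache config lines (or "grep -r" output) on
# stdin and print each distinct backend port used by a ProxyPass directive.
# ProxyPassReverse lines are ignored because the field must equal "ProxyPass".
proxy_backend_ports() {
    awk '{
        for (i = 1; i + 2 <= NF; i++) {
            if ($i == "ProxyPass") {
                url = $(i + 2)                 # e.g. http://127.0.0.1:40001/openproject/
                if (match(url, /:[0-9]+/))     # first ":<digits>" is the port
                    print substr(url, RSTART + 1, RLENGTH - 1)
            }
        }
    }' | sort -u
}
```

Usage, assuming the same paths as above: `grep -r ProxyPass /etc/apache2/ | proxy_backend_ports`. Running it once on the host and once inside the container shows which port each Apache instance expects its backend on.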

m.

tafkaz · September 18, 2018, 8:09am

ok, here you go

lsof -PniTCP:40001 -sTCP:LISTEN
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
docker-pr 3941 root    4u  IPv6  34845      0t0  TCP *:40001 (LISTEN)

univention-check-templates seems ok, no output at all.

grep -r /openproject/ /etc/apache2/
/etc/apache2/sites-available/default-ssl.conf:  ProxyPass /openproject/ http://127.0.0.1:40001/openproject/ retry=0
/etc/apache2/sites-available/default-ssl.conf:  ProxyPassReverse /openproject/ http://127.0.0.1:40001/openproject/
/etc/apache2/sites-available/000-default.conf:  ProxyPass /openproject/ http://127.0.0.1:40001/openproject/ retry=0
/etc/apache2/sites-available/000-default.conf:  ProxyPassReverse /openproject/ http://127.0.0.1:40001/openproject/

BUT:
inside the docker container (univention-app shell openproject)

root@openp-89848117:/# grep -r /openproject/ /etc/apache2/
/etc/apache2/sites-available/openproject.conf:Include /etc/openproject/addons/apache2/includes/server/*.conf
/etc/apache2/sites-available/openproject.conf:  DocumentRoot /opt/openproject/public
/etc/apache2/sites-available/openproject.conf:  Include /etc/openproject/addons/apache2/includes/vhost/*.conf
/etc/apache2/sites-available/openproject.conf:  ProxyPass /openproject/ http://127.0.0.1:6000/openproject/ retry=0
/etc/apache2/sites-available/openproject.conf:  ProxyPassReverse /openproject/ http://127.0.0.1:6000/openproject/

Moritz_Bunkus · September 18, 2018, 8:15am

Have you tried restarting the Docker container?

tafkaz · September 18, 2018, 8:29am

sure…didn’t help unfortunately.

Moritz_Bunkus · September 18, 2018, 8:35am

Alright. Let’s dig into that chroot issue a bit further. Maybe that’s the reason the unicorn process isn’t running.

In the Docker container, what’s the output of:

id openproject
getent group 'Domain Users'
cat /etc/openproject/conf.d/server

tafkaz · September 18, 2018, 8:56am

getent group 'Domain Users'
Domain Users:*:5001:user1,user2,user3,[...],user25, user26

Obviously the real system has real user names; I just changed them for privacy reasons.

cat /etc/openproject/conf.d/server
export SERVER_HOSTNAME="localhost"
export SERVER_PROTOCOL="http"
export SERVER_USER="www-data"
export SERVER_GROUP="www-data"
export SERVER_PATH_PREFIX="/openproject/"

Moritz_Bunkus · September 18, 2018, 8:56am

Hey,

and what about id openproject?

m.

tafkaz · September 18, 2018, 8:59am

id openproject
uid=106(openproject) gid=112(openproject) groups=112(openproject),33(www-data),5001(Domain Users)

Moritz_Bunkus · September 18, 2018, 9:07am

Ah. Your openproject user is a member of the Domain Users group, which isn’t the default (compared to one of my test systems). It looks like one of the start scripts cannot deal with group names with spaces in them.
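A quick way to spot such groups is to filter the output of id for group names that contain a space. This is only a sketch: the function name groups_with_spaces is hypothetical, and it assumes the id output format shown in this thread (a groups= field with comma-separated entries of the form number(name)).

```shell
#!/bin/sh
# Hypothetical helper: read the output of "id <user>" on stdin and print
# every group name that contains a space, one per line.
groups_with_spaces() {
    sed -n 's/.*groups=//p' |                      # keep only the groups list
        tr ',' '\n' |                              # one "gid(name)" entry per line
        sed -n 's/^[0-9]*(\(.* .*\))$/\1/p'        # print names containing a space
}
```

Usage inside the container: `id openproject | groups_with_spaces`. No output means the user is in no group whose name contains a space.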

My guess is the following:

You created a user called openproject in your UCS. That user’s primary group is Domain Users, or he’s a member of the Domain Users LDAP group.
Inside the Docker container there’s a local user also called openproject (as well as a local group called openproject).
The Docker container’s user & group management is LDAP-enabled, meaning the local and LDAP user & group information will be merged.
During the start of the unicorn process the group information is used, probably for changing ownership or ensuring access to certain files and folders. That’s most likely implemented as a shell script, or as a script that executes certain commands via the shell, and in those commands user & group names aren’t escaped properly — leading to errors that there’s no group named Domain and no group named Users.
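The suspected mechanism can be reproduced in isolation. The sketch below is not the actual OpenProject start script (which was not inspected in this thread); it only illustrates how an unquoted shell variable holding "Domain Users" is split into the two words "Domain" and "Users", which would match the two separate "invalid group" errors in the log. The function name count_args is made up for the illustration.

```shell
#!/bin/sh
# Print how many separate arguments were received.
count_args() {
    echo "$#"
}

group='Domain Users'   # a group name containing a space, as in this thread

# Unquoted: the shell splits the value on whitespace into two words.
unquoted=$(count_args $group)

# Quoted: the value is passed as a single argument.
quoted=$(count_args "$group")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the default field separator this prints `unquoted: 2 argument(s), quoted: 1 argument(s)`. A script that quotes its variables would hand the full group name to the command it calls.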

If that’s correct and you don’t need that LDAP user openproject, simply delete it and restart the Docker container.

m.

tafkaz · September 18, 2018, 9:23am

Hi,
well, yes, there is a user openproject in LDAP.

Sure it’s safe to delete? Nobody created this one manually, I was told…
But I just looked it up in our test environment, and it seems you’re right: we don’t have any openproject user here.

Moritz_Bunkus · September 18, 2018, 9:30am

Maybe it’s a remnant from an installation of a previous OpenProject version where adding such a user was still part of the installation process. I’m pretty sure it isn’t needed for the Dockerized OpenProject app. Therefore it should be safe to remove.

m.

tafkaz · September 18, 2018, 9:32am

Hey Moritz,
again…you’re the man!
The thing is now working again!

Cheers and thank you very much
Sascha

Moritz_Bunkus · September 18, 2018, 9:35am

Glad to hear it! And you’re welcome.

m.

tafkaz · October 1, 2018, 4:21pm

Hi,

I don’t know what exactly triggered it, but we have that error again now.
OpenProject is not reachable because the openproject-web-1 service is not running.
The user “openproject” that kept the service from starting last time no longer exists, so I don’t really know what I should do.
Here’s what i was able to see inside the openproject docker:

root@openp-89848117:/# service openproject status
openproject-web-1 is not running

tail -f /var/log/apache2/error.log
[Mon Oct 01 18:09:57 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Oct 01 18:09:57 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)
[Mon Oct 01 18:09:57 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Oct 01 18:09:57 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)
[Mon Oct 01 18:09:57 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Oct 01 18:09:57 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)
[Mon Oct 01 18:09:57 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Oct 01 18:09:57 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)
[Mon Oct 01 18:10:48 2018] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:6000 (127.0.0.1) failed
[Mon Oct 01 18:10:48 2018] [error] ap_proxy_connect_backend disabling worker for (127.0.0.1)

 tail -f /var/log/openproject/worker-1.log
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/lib/bundler/cli.rb:20:in `dispatch'
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/lib/bundler/vendor/thor/lib/thor/base.rb:444:in `start'
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/lib/bundler/cli.rb:10:in `start'
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/exe/bundle:35:in `block in <top (required)>'
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/lib/bundler/friendly_errors.rb:121:in `with_friendly_errors'
/opt/openproject/vendor/bundle/ruby/2.4.0/gems/bundler-1.15.1/exe/bundle:27:in `<top (required)>'
/opt/openproject/bin/bundle:3:in `load'
/opt/openproject/bin/bundle:3:in `<main>'
Tasks: TOP => jobs:work => jobs:environment_options => environment:full
(See full trace by running task with --trace)

 tail -f /var/log/openproject/web-1.log
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/rack-2.0.3/lib/rack/builder.rb:55:in `initialize'
  config.ru:1:in `new'
  config.ru:1:in `<main>'
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/unicorn-5.3.0/lib/unicorn.rb:56:in `eval'
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/unicorn-5.3.0/lib/unicorn.rb:56:in `block in builder'
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/unicorn-5.3.0/lib/unicorn/http_server.rb:796:in `build_app!'
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/unicorn-5.3.0/lib/unicorn/http_server.rb:139:in `start'
  /opt/openproject/vendor/bundle/ruby/2.4.0/gems/unicorn-5.3.0/bin/unicorn:126:in `<top (required)>'
  /opt/openproject/vendor/bundle/ruby/2.4.0/bin/unicorn:22:in `load'
  /opt/openproject/vendor/bundle/ruby/2.4.0/bin/unicorn:22:in `<top (required)>'

bit lost here…
thanks again
Sascha

tafkaz · October 1, 2018, 4:37pm

Update:
After restarting the complete UCS system, OpenProject is up and running again.
Note that I did try to restart the OpenProject services inside the Docker container before, but they refused to start.

Moritz_Bunkus · October 4, 2018, 9:47am

Hey,

when you post log files, please post the full error messages. Simply calling tail on the log files doesn’t suffice as stack traces are often very long. For most programming languages the important part can usually be found at the start of the stack trace, not at the end (with the notable exception of Python — but this is Ruby, not Python).
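One way to capture the start of the most recent stack trace instead of its tail is to print the log from the last line matching a marker onward. This is a rough sketch under assumptions: the function name from_last_match is hypothetical, and the marker has to be chosen per log. For a failing rake task the text "rake aborted!" is a plausible marker, but whether it appears in these particular logs was not verified in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print FILE starting at the last line that matches
# PATTERN (a basic regular expression). Returns 1 if nothing matches.
from_last_match() {
    flm_pattern=$1
    flm_file=$2
    # Line number of the last matching line, empty if there is none.
    flm_start=$(grep -n -e "$flm_pattern" -- "$flm_file" | tail -n 1 | cut -d: -f1)
    [ -n "$flm_start" ] || return 1
    # "tail -n +N" prints the file from line N to the end.
    tail -n "+$flm_start" -- "$flm_file"
}
```

Usage, with the marker as an assumption: `from_last_match 'rake aborted!' /var/log/openproject/worker-1.log`. The output then begins with the error message and is followed by the complete trace.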

m.

tafkaz · October 4, 2018, 10:10am

You’re right of course…
I suggest we wait until this happens again (hopefully never); I will then post more complete logs.
thanks
Sascha