Web console not responding after failed letsencrypt upgrade

OK… a very odd issue… I just upgraded my servers tonight. After a reboot, another app update was found: letsencrypt. I started the upgrade in the web console, and at 26% there was an error fetching data and the web console stopped responding. I waited over an hour for any change and nothing… I tried to open a new tab to the server and the connections are all refused… I can ssh to the server, and all other services appear to be working, as I can browse the LDAP structure with Apache Directory Studio.

I’m not sure what I can check… I have attempted to restart the service with the cli:

root@ucs1:~# /etc/init.d/univention-management-console-web-server stop
[ ok ] Stopping univention-management-console-web-server (via systemctl): univention-management-console-web-server.service.
root@ucs1:~# /etc/init.d/univention-management-console-web-server start
[ ok ] Starting univention-management-console-web-server (via systemctl): univention-management-console-web-server.service.
root@ucs1:~#


syslog shows no errors and a proper restart:
Jan 21 00:16:33 ucs1 univention-management-console-web-server[12493]: Stopping Univention Management Console Web Server: univention-management-console-web-server.
Jan 21 00:16:33 ucs1 systemd[1]: Stopped LSB: Univention Management Console Web Server.
Jan 21 00:16:33 ucs1 systemd[1]: Starting LSB: Univention Management Console Web Server...
Jan 21 00:16:35 ucs1 univention-management-console-web-server[12504]: Starting Univention Management Console Web Server: univention-management-console-web-server.
Jan 21 00:16:35 ucs1 systemd[1]: Started LSB: Univention Management Console Web Server.

with no change… no errors, but no change.

Is there a logfile I can provide for troubleshooting? I can’t seem to find anything wrong.

Hey,

“connection refused” sounds more like Apache isn’t running.

Apart from that I recommend you try running the upgrade again from the command line (univention-upgrade) and let it continue in case it was left incomplete before. Afterwards verify that apache2.service is running. When in doubt, reboot.

m.

Hey Moritz, I did all those things already, and rebooted for good measure. Below is what I see.

Last login: Mon Jan 21 11:03:28 2019 from 10.20.30.20
root@ucs1:~# univention-upgrade

Starting univention-upgrade. Current UCS version is 4.3-3 errata407

Checking for local repository:                          none
Checking for package updates:                           none
Checking for app updates:                               none
Checking for release updates:                           none
root@ucs1:~#

aha!! - we seem to be missing the SSL cert somehow…

root@ucs1:~# service apache2 status
● apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2019-01-20 23:47:16 PST; 11h ago
  Process: 1465 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)
      CPU: 655ms

Jan 20 23:47:10 ucs1 systemd[1]: Starting The Apache HTTP Server...
Jan 20 23:47:16 ucs1 apachectl[1465]: AH00526: Syntax error on line 30 of /etc/apache2/sites-enabled/univention-letsencrypt.c
Jan 20 23:47:16 ucs1 apachectl[1465]: SSLCertificateFile: file '/etc/univention/letsencrypt/signed_chain.crt' does not exist
Jan 20 23:47:16 ucs1 apachectl[1465]: Action 'start' failed.
Jan 20 23:47:16 ucs1 apachectl[1465]: The Apache error log may have more information.
Jan 20 23:47:16 ucs1 systemd[1]: apache2.service: Control process exited, code=exited status=1
Jan 20 23:47:16 ucs1 systemd[1]: Failed to start The Apache HTTP Server.
Jan 20 23:47:16 ucs1 systemd[1]: apache2.service: Unit entered failed state.
Jan 20 23:47:16 ucs1 systemd[1]: apache2.service: Failed with result 'exit-code'.

So… I guess the question now is: can I uninstall letsencrypt from the CLI with “univention-app remove letsencrypt” and then try a fresh install? That should take care of that portion of the Apache config, correct?
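The failed start above comes down to a single SSLCertificateFile path that does not exist. As a rough sketch (the function name `missing_cert_files` is made up for this example and is not part of UCS or Apache), something like the following could list every certificate or key file that a site config references but that is missing on disk. It assumes one directive per line and paths without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print each file named by an SSLCertificateFile,
# SSLCertificateKeyFile or SSLCertificateChainFile directive in the given
# config that does not exist on disk. Commented-out directives are ignored.
missing_cert_files() {
    conf=$1
    grep -Ei '^[[:space:]]*SSLCertificate(Key|Chain)?File[[:space:]]' "$conf" |
    awk '{ print $2 }' |
    tr -d '"' |
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Possible use on the server (as root), checking every enabled site:
#   for f in /etc/apache2/sites-enabled/*; do missing_cert_files "$f"; done
```

If this prints nothing, the certificate paths are at least present, and the problem would have to be somewhere else in the config.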

Hey,

you’re kind of in a catch-22 situation: the web server needs the certificate to be present in order to be able to start, but in order to re-create the certificate the web server must be running.

So what I’d try first instead of reinstalling is getting the web server up and running and updating the certificates:

  1. Edit /etc/apache2/sites-available/univention-letsencrypt.conf and comment out all VirtualHost entries completely.
  2. Try starting Apache: systemctl restart apache2 followed by systemctl status apache2
  3. If it’s running, try updating the certificates: /usr/share/univention-letsencrypt/refresh-cert-cron (post the output if this doesn’t work or doesn’t create the certificates)
  4. Re-create the configuration file modified in step 1: ucr commit /etc/apache2/sites-available/univention-letsencrypt.conf
  5. Restart Apache once more like in step 2

If all of those steps succeed, your setup should be OK again.
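The steps above could be scripted roughly as follows. This is only a sketch: the function names `comment_out_vhosts` and `recover_apache` are invented for the example, while the commands and paths inside them are the ones quoted in this thread. The check for the certificate file before step 4 is an extra guard that is not part of the original steps; it is there because re-creating the config without a certificate would presumably break Apache again.

```shell
#!/bin/sh
# Paths as quoted in this thread.
CONF=/etc/apache2/sites-available/univention-letsencrypt.conf
CERT=/etc/univention/letsencrypt/signed_chain.crt

# Step 1: prefix every line from <VirtualHost ...> through </VirtualHost>
# with '#'. The original file is kept next to it with a .bak suffix.
comment_out_vhosts() {
    sed -i.bak '/<VirtualHost/,/<\/VirtualHost>/ s/^/#/' "$1"
}

# Steps 1-5 in order. Nothing runs until this function is called (as root).
recover_apache() {
    comment_out_vhosts "$CONF" || return 1
    systemctl restart apache2 || return 1                  # step 2
    systemctl status apache2 --no-pager
    /usr/share/univention-letsencrypt/refresh-cert-cron    # step 3
    # Extra guard (an assumption of this sketch, not an original step):
    # stop here if no certificate was written.
    if [ ! -s "$CERT" ]; then
        echo "No certificate at $CERT; leaving VirtualHost entries commented out." >&2
        return 1
    fi
    ucr commit "$CONF"                                     # step 4
    systemctl restart apache2                              # step 5
}
```

With the guard in place, a failed certificate refresh leaves Apache running on the commented-out config instead of regenerating a config that points at a missing file.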

m.


Excellent, Thank you for your help… I will be trying this today and will post back no matter how it ends up.

root@ucs1:/etc/univention/letsencrypt#  systemctl restart apache2 && systemctl status apache2
● apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-01-22 08:37:23 PST; 7s ago
  Process: 994 ExecStart=/usr/sbin/apachectl start (code=exited, status=0/SUCCESS)
 Main PID: 999 (apache2)
    Tasks: 6 (limit: 4915)
   Memory: 31.2M
      CPU: 965ms
   CGroup: /system.slice/apache2.service
           ├─ 999 /usr/sbin/apache2 -k start
           ├─1000 /usr/sbin/apache2 -k start
           ├─1001 /usr/sbin/apache2 -k start
           ├─1002 /usr/sbin/apache2 -k start
           ├─1003 /usr/sbin/apache2 -k start
           └─1004 /usr/sbin/apache2 -k start

Jan 22 08:37:21 ucs1 systemd[1]: Starting The Apache HTTP Server...
Jan 22 08:37:23 ucs1 systemd[1]: Started The Apache HTTP Server.
root@ucs1:/etc/univention/letsencrypt#
root@ucs1:/etc/univention/letsencrypt# /usr/share/univention-letsencrypt/refresh-cert-cron
Tue Jan 22 08:39:34 PST 2019
Refreshing certificate for following domains:
ucs1.sgvfr.com
Parsing account key...
Parsing CSR...
Found domains: ucs1.sgvfr.com
Getting directory...
Directory found!
Registering account...
Already registered!
Creating new order...
Order created!
Verifying ucs1.sgvfr.com...
Traceback (most recent call last):
  File "/usr/share/univention-letsencrypt/acme_tiny.py", line 197, in <module>
    main(sys.argv[1:])
  File "/usr/share/univention-letsencrypt/acme_tiny.py", line 193, in main
    signed_crt = get_crt(args.account_key, args.csr, args.acme_dir, log=LOGGER, CA=args.ca, disable_check=args.disable_check, directory_url=args.directory_url, contact=args.contact)
  File "/usr/share/univention-letsencrypt/acme_tiny.py", line 149, in get_crt
    raise ValueError("Challenge did not pass for {0}: {1}".format(domain, authorization))
ValueError: Challenge did not pass for ucs1.sgvfr.com: {u'status': u'invalid', u'challenges': [{u'status': u'invalid', u'url': u'https://acme-staging-v02.api.letsencrypt.org/acme/challenge/ClloXkJ30bTVisxrS4Icom6Gjrl0qP6hSrO2T-pqYm0/226353534', u'token': u'nOQhCf_9Le09EuL_CkoTwF68JKyKCJUOrqOKSJk1I_8', u'type': u'dns-01'}, {u'status': u'invalid', u'url': u'https://acme-staging-v02.api.letsencrypt.org/acme/challenge/ClloXkJ30bTVisxrS4Icom6Gjrl0qP6hSrO2T-pqYm0/226353535', u'token': u'DYvIADo7nLHMyqkulOn0VpBX1rBSpP6ckVmP-5Bk2aE', u'type': u'tls-alpn-01'}, {u'status': u'invalid', u'validationRecord': [{u'url': u'http://ucs1.sgvfr.com/.well-known/acme-challenge/ZmkbLKoFNSMYdL9Bwb5bUKr32BERv9VBGqIzthmOH_0', u'hostname': u'ucs1.sgvfr.com', u'port': u'80'}], u'url': u'https://acme-staging-v02.api.letsencrypt.org/acme/challenge/ClloXkJ30bTVisxrS4Icom6Gjrl0qP6hSrO2T-pqYm0/226353536', u'token': u'ZmkbLKoFNSMYdL9Bwb5bUKr32BERv9VBGqIzthmOH_0', u'error': {u'status': 400, u'type': u'urn:ietf:params:acme:error:dns', u'detail': u'DNS problem: NXDOMAIN looking up A for ucs1.sgvfr.com'}, u'type': u'http-01'}], u'identifier': {u'type': u'dns', u'value': u'ucs1.sgvfr.com'}, u'expires': u'2019-01-29T16:39:52Z'}
Setting letsencrypt/status
root@ucs1:/etc/univention/letsencrypt#
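The actual reason for the failure is buried near the end of that long ValueError line, in the 'detail' field. A small filter like the one below might make it easier to spot. The function name `acme_error_details` is hypothetical, and the pattern assumes the Python 2 style `u'...'` output shown above, so it would miss messages that are quoted differently.

```shell
#!/bin/sh
# Hypothetical helper: read refresh-cert-cron output on stdin and print only
# the text of each u'detail': u'...' field.
acme_error_details() {
    grep -o "u'detail': u'[^']*'" | sed "s/^u'detail': u'//; s/'\$//"
}

# Possible use on the server (the traceback goes to stderr, hence 2>&1):
#   /usr/share/univention-letsencrypt/refresh-cert-cron 2>&1 | acme_error_details
```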

Well… since this is behind my firewall… all challenges will fail since I don’t have it exposed to the internet… I could add this to my reverse proxy and firewall the crap out of it, I guess. I’m going to take a wild guess that this will fail a restart…

root@ucs1:/etc/univention/letsencrypt# ucr commit /etc/apache2/sites-available/univention-letsencrypt.conf
File: /etc/apache2/sites-available/univention-letsencrypt.conf
root@ucs1:/etc/univention/letsencrypt#
root@ucs1:/etc/univention/letsencrypt# systemctl restart apache2 && systemctl status apache2
Job for apache2.service failed because the control process exited with error code.
See "systemctl status apache2.service" and "journalctl -xe" for details.
root@ucs1:/etc/univention/letsencrypt#  systemctl status apache2
● apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2019-01-22 08:43:39 PST; 17s ago
  Process: 1389 ExecStop=/usr/sbin/apachectl stop (code=exited, status=1/FAILURE)
  Process: 1398 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)
 Main PID: 999 (code=exited, status=0/SUCCESS)
      CPU: 179ms

Jan 22 08:43:39 ucs1 systemd[1]: Starting The Apache HTTP Server...
Jan 22 08:43:39 ucs1 apachectl[1398]: AH00526: Syntax error on line 30 of /etc/apache2/sites-enabled/univention-letsencrypt.c
Jan 22 08:43:39 ucs1 apachectl[1398]: SSLCertificateFile: file '/etc/univention/letsencrypt/signed_chain.crt' does not exist
Jan 22 08:43:39 ucs1 apachectl[1398]: Action 'start' failed.
Jan 22 08:43:39 ucs1 apachectl[1398]: The Apache error log may have more information.
Jan 22 08:43:39 ucs1 systemd[1]: apache2.service: Control process exited, code=exited status=1
Jan 22 08:43:39 ucs1 systemd[1]: Failed to start The Apache HTTP Server.
Jan 22 08:43:39 ucs1 systemd[1]: apache2.service: Unit entered failed state.
Jan 22 08:43:39 ucs1 systemd[1]: apache2.service: Failed with result 'exit-code'.
lines 1-17/17 (END)

I will take a look at setting up an RP entry on my firewall… I really don’t like exposing these servers to the internet… they are specifically only accessible via local network and VPN…

Perhaps I should just remove it… I was going to test it out on one of my UCS servers, but after this it might just be more of a pain… I can always secure the RP and keep internal non-SSL.

I’ll post again after trying a few things.

Hey,

yeah, well, if you want to use Let’s Encrypt, you’ll have to make your local web server reachable from the Let’s Encrypt servers (meaning from the internet). Until you decide to really use it, you should indeed remove the app again. You can do that from the command line, too: univention-app remove letsencrypt

After the removal the file /etc/apache2/sites-available/univention-letsencrypt.conf and the symlink in …/sites-enabled/… might still exist; in that case you should remove them manually. Then start Apache again.
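The manual cleanup described above could look something like this. It is a sketch, not an official UCS procedure: the function name `remove_letsencrypt_leftovers` is made up, and since the exact name of the symlink is not spelled out in this thread, the sketch matches anything in sites-enabled that starts with `univention-letsencrypt`. The Apache directory is a parameter so the function can be tried on a scratch copy first.

```shell
#!/bin/sh
# Hypothetical helper: remove the leftover Let's Encrypt site config and any
# matching entry in sites-enabled, if they still exist.
remove_letsencrypt_leftovers() {
    apache_dir=${1:-/etc/apache2}
    for f in "$apache_dir"/sites-enabled/univention-letsencrypt* \
             "$apache_dir/sites-available/univention-letsencrypt.conf"; do
        # -L also catches a dangling symlink, which -e alone would miss.
        if [ -e "$f" ] || [ -L "$f" ]; then
            rm -f -- "$f" && echo "removed $f"
        fi
    done
}

# Possible use on the server (as root):
#   univention-app remove letsencrypt
#   remove_letsencrypt_leftovers
#   systemctl start apache2
```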

m.

I did end up removing it. I was testing it as an alternative to a separate web server and would have been using a reverse proxy in front of the UCS. It caught me off guard since I’ve never had an issue with upgrading before with LE installed… It seemed odd, and I wanted to be safe rather than sorry by asking here, since this is a production server.

Will be rebuilding my sandbox VM network for any future testing lol.

Thanks for your help!

You’re quite welcome.

The Let’s Encrypt app is safe to have installed as long as it hasn’t been configured for a domain yet. As soon as you configure the app & enable “use certificate for Apache”, the corresponding VirtualHost entry will be added to the server configuration, and that’s when things can go wrong if certificate retrieval has never worked.
