How to limit the amount of memory httpd is using on CentOS 7 with Cgroups

CentOS 7, introduced something called CGroups, or control groups which has been in the stable kernel since about 2006. The systemD unit is made of several parts, systemD unit resource controllers can be used to limit memory usage of a service, such as httpd control group in systemD.

# Set Memory Limits for a systemD unit
systemctl set-property httpd MemoryLimit=500MB

# Get Limits for a systemD Unit
systemctl show -p CPUShares 
systemctl show -p MemoryLimit

Please note that OS level support is not generally provided with managed infrastructure service level, however I wanted to help where I could hear, and it shouldn’t be that difficult because the new stuff introduced in SystemD and CGroups is much more powerful and convenient than using ulimit or similar.

Creating a proper Method of Retrieving, Sorting, and Parsing Rackspace CDN Access Logs

So, this has been rather a bane on the life which is lived as Adam Bull. Basically, a large customer of ours had 50+ CDN’s, and literally hundreds of gigabytes of Log Files. They were all in Rackspace Cloud Files, and the big question was ‘how do I know how busy my CDN is?’.

screen-shot-2016-11-07-at-12-41-30-pm

This is a remarkably good question, because actually, not many tools are provided here, and the customer will, much like on many other CDN services, have to download those logs, and then process them. But that is actually not easier either, and I spent a good few weeks (albeit when I had time), trying to figure out the best way to do this. I dabbled with using tree to display the most commonly used logs, I played with piwik, awstats, and many others such as goaccess, all to no avail, and even used a sophisticated AWK script from our good friends in Operations. No luck, nothing, do not pass go, or collect $200. So, I was actually forced to write something to try and achieve this, from start to finish. There are 3 problems.

1) how to easily obtain .CDN_ACCESS_LOGS from Rackspace Cloud Files to Cloud Server (or remote).
2) how to easily process these logs, in which format.
3) how to easily present these logs, using which application.

The first challenge was actually retrieving the files.

swiftly --verbose --eventlet --concurrency=100 get .CDN_ACCESS_LOGS --all-objects -o ./

Naturally to perform this step above, you will need a working, and setup swiftly environment. If you don’t know what swiftly, is or understand how to set up a swiftly envrionment, please see this article I wrote on the subject of deleting all files with swiftly (The howto explains the environment setup first! Just don’t follow the article to the end, and continue from here, once you’ve setup and installed swiftly)

Fore more info see:
https://community.rackspace.com/products/f/25/t/7190

Processing the Rackspace CDN Logs that we’ve downloaded, and organising them for further log processing
This required a lot more effort, and thought

The below script sits in the same folder as all of the containers

# ls -al 
total 196
drwxrwxr-x 36 root root  4096 Nov  7 12:33 .
drwxr-xr-x  6 root root  4096 Nov  7 12:06 ..
# used by my script
-rw-rw-r--  1 root root  1128 Nov  7 12:06 alldirs.txt

# CDN Log File containers as we downloaded them from swiftly Rackspace Cloud Files (.CDN_ACCESS_LOGS)
drwxrwxr-x  3 root root  4096 Oct 19 11:22 dev.demo.video.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:22 europe.assets.lon.tv
drwxrwxr-x  5 root root  4096 Oct 19 11:22 files.lon.cdn.lon.com
drwxrwxr-x  3 root root  4096 Oct 19 11:23 files.blah.cdn..com
drwxrwxr-x  5 root root  4096 Oct 19 11:24 files.demo.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:25 files.invesco.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:25 files.test.cdn..com
-rw-r--r--  1 root root   561 Nov  7 12:02 generate-report.sh
-rwxr-xr-x  1 root root  1414 Nov  7 12:15 logparser.sh

# Used by my script
drwxr-xr-x  2 root root  4096 Nov  7 12:06 parsed
drwxr-xr-x  2 root root  4096 Nov  7 12:33 parsed-combined
#!/bin/bash

# Author : Adam Bull
# Title: Rackspace CDN Log Parser
# Date: November 7th 2016

echo "Deleting previous jobs"
rm -rf parsed;
rm -rf parsed-combined

ls -ld */ | awk '{print $9}' | grep -v parsed > alldirs.txt


# Create Location for Combined File Listing for CDN LOGS
mkdir parsed

# Create Location for combined CDN or ACCESS LOGS
mkdir parsed-combined

# This just builds a list of the CDN Access Logs
echo "Building list of Downloaded .CDN_ACCESS_LOG Files"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
echo $folder
        echo "$m" | xargs -i find ./{} -type f -print > "parsed/$folder.log"
done < alldirs.txt

# This part cats the files and uses xargs to produce all the Log oiutput, before cut processing and redirecting to parsed-combined/$folder
echo "Combining .CDN_ACCESS_LOG Files for bulk processing and converting into NCSA format"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
cat "parsed/$folder.log" | xargs -i zcat {} | cut -d' ' -f1-10  > "parsed-combined/$folder"
done < alldirs.txt


# This part processes the Log files with Goaccess, generating HTML reports
echo "Generating Goaccess HTML Logs"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
goaccess -f "parsed-combined/$folder" -a -o "/var/www/html/$folder.html"
done < alldirs.txt

How to easily present these logs

I kind of deceived you with the last step. Actually, because I have already done it, with the above script. Though, you will naturally need to have an httpd installed, and a documentroot in /var/www/html, so make sure you install apache2:

yum install httpd awstats

De de de de de de da! da da!

screen-shot-2016-11-07-at-12-41-30-pm

Some little caveats:

Generating a master index.html file of all the sites


[root@cdn-log-parser-mother html]# pwd
/var/www/html
[root@cdn-log-parser-mother html]# ls -al | awk '{print $9}' | xargs -i echo " {}
" > index.html

I will expand the script to generate this automatically soon, but for now, leaving like this due to time constraints.

HTTPS to HTTP redirect for Rackspace Cloud Load Balancer

Hello,

So a customer reached out to us today asking about how to configure HTTPS to HTTP redirects. This is actually really easy to do.

Think of it like this;

When you enable HTTP and HTTPS (allow insecure and secure traffic), the protocol is always HTTP hitting the server.

When you ‘only allow secure traffic’ it will always be HTTPS. Think of it like, if the load balancer has certificates on it and supports both protocols (i.e. terminates SSL at the load balancer), then the requests hitting the server will always be HTTP.

This is why the X_forwarded_Proto becomes important in your server being able to determine what traffic hitting it coming from load balancer originated from HTTPS protocol, and which originated from HTTP protocol. This is what allows you to effectively do the redirection on the cloud-server side.

I hope this helps!

So the rewrite rule on the server, using x_forwarded_proto to detect the protocol instead of the usual ‘https’ directive can be (or rather will need to be replaced) with a rule that uses the header instead of the regular incoming protocol to determine redirect.

    RewriteEngine On
    RewriteCond %{HTTP:http_x_forwarded_proto} !=http
    RewriteCond %{REQUEST_URI} !^/some/path.*
    RewriteRule (.*) http://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Quite simple, but took me the better part of half an hour to get my head properly around this 😀

You might be interested in achieving this on Windows systems with iis or similar, so check this link out which explains how to do that step by step in windows;

https://support.rackspace.com/how-to/configuring-load-balanced-sites-with-ssl-offloading-using-iis/

Troubleshooting Rackspace CDN not serving files

A customer came to us with an issue with their CDN which was strange and odd. I wanted to document this so that it is understood why this happened.

The customer is using two TLS origins, and HTTP2. Why that is a problem will become evidently clear. This is a general method of troubleshooting in terms of replicating behaviour of the CDN and origin with host headers. This can be applied no matter what the problem, to understand the HTTP code given by the origin, which at least half of the time turns out to be the cause. The origin being the cloud-server your CDN is backed by.

Question
Hi,
We are currently experiencing some issues with the Cloud CDN. We are using this for our CSS and images and now everything is getting a HTTP/503 SERVICE UNAVAILABLE. If you want to test, you may test this url:
https://cdn.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css

This is supposed to deliver this file:
https://origin.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css

Is something mis-configured or are there some issues on the appliance?

Answer

First we confirm the origin is UP

# curl -I https://originserver.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css
HTTP/1.1 200 OK
Date: Tue, 11 Oct 2016 08:45:42 GMT
Server: Apache
Last-Modified: Tue, 11 Oct 2016 06:57:53 GMT
ETag: "ed26-53e91653c61d0"
Accept-Ranges: bytes
Content-Length: 60710
Vary: Accept-Encoding
Cache-Control: max-age=31536000, public
Expires: Wed, 11 Oct 2017 08:45:42 GMT
Access-Control-Allow-Origin: *
X-Frame-Options: SAMEORIGIN
Content-Type: text/css

The origin is the cloud-server where the CDN pulls from. As we can see the site is up. So what is causing the issue? The way CDN works is it provides a host header for the domain, so the site has to have a host for both domains. The reason is that the CDN uses CNAME hostnames to identify which CDN is which. I.e. which path like /media/ directs to which static origins subdomain.

The best way to look further at the situation now is to check the origin (the subdomain that you’ve associated with your CDN subdomain that raxspace gives you, when sending the host header for the CDN url we get:

root@myweb:~# curl -I https://origin.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css -H 'host: cdn.cusomerdomain.no'
HTTP/1.1 421 Misdirected Request
Date: Tue, 11 Oct 2016 08:17:38 GMT
Server: Apache
Content-Type: text/html; charset=iso-8859-1

As we can see we get this odd HTTP 421 misdirected request.

# curl -I https://origin.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css -H 'host: mycdnname1.scdn4.secure.raxcdn.com'
HTTP/1.1 421 Misdirected Request
Date: Tue, 11 Oct 2016 08:18:06 GMT
Server: Apache
Content-Type: text/html; charset=iso-8859-1

~# curl -I https://origin.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css -H 'host: cdncustomercname.cusomerdomain.com'
HTTP/1.1 421 Misdirected Request
Date: Tue, 11 Oct 2016 08:17:45 GMT
Server: Apache
Content-Type: text/html; charset=iso-8859-1

https://httpd.apache.org/docs/2.4/mod/mod_http2.html

Looking at the definition for HTTP 2, this issue was caused by different TLS configurations for your domains and mod http2 trying to reuse the same connection, which will not work if the TLS configurations are not the same on the origin cloud-server side.

You just need to disable HTTP2, or configure the TLS configurations to be the same on the apache2 side. I hope that this clarifies and makes sense to you, of course if you have additional questions, comments or concerns please don't hesitate to reach out to us, we are here to help!

As you can see the importance of debugging CDN by sending host header to the origin that the CDN uses, to replicate the issue the customer was experiencing, which was essentially, the CDN edgenodes (the machines around the world that pull from the origin for content distribution really worldwide), weren't able to retrieve the files from the origin with the host header domain that is defined in the control panel.

This customer needed to in this case adjust their Apache2 configuration. The problem was likely caused by updating Apache2 or similar.

Adding AuthType Basic HTTP Authentication for WordPress wp-admin , phpMyAdmin, etc

Hey folks, so this comes up really often with Rackspace customers, how to add AuthType Basic HTTP authentication. This is akin to an additional user/pass dialog box which appears when navigating to the url. It is really easy to create a new password:

htpasswd -c /etc/httpd/.htpasswd-private username

Please note .htpasswd-private can be anything, however it’s better to make it a hidden file by prefixing with a . dot, also important to ensure that the password file isn’t in a web documentroot, the file should have user and group root root, but that should be done automatically when you create the file with root user.

This is what I added to phpmyadmin configuration file in /etc/httpd/conf.d/phpMyAdmin.conf

 
<Location "/phpMyAdmin/">
AuthType Basic
AuthName "Private Server"
AuthUserFile /etc/httpd/.htpasswd-private
Require valid-user
</Location>

You could do the same for WordPress;

<Location "/wp-admin/">
AuthType Basic
AuthName "Private Server"
AuthUserFile /etc/httpd/.htpasswd-private
Require valid-user
</Location>

Please note that I am using Location, but you can apply the same thing within a directive instead.

Securing your WordPress with chmod 644 and chmod 755 the easy (but pro) way

Let’s say we have a document root like:

It’s interesting to note the instructions for this will vary from environment to environment, it depends on which user is looking after apache2, etc.

/var/www/mysite.com/htdocs

Make all files read/write and owned by www-data apache2 user only

root@meine:/var/www/mysite.com/htdocs# find . -type f -exec chown apache2:apache2 {} \; 
root@meine:/var/www/mysite.com/htdocs# find . -type f -exec chmod 644 {} \;

Make all folders accessible Read + Execute, but no write permissions

root@meine:/var/www/mysite.com/htdocs# find . -type d -exec chmod 755 {} \;
root@meine:/var/www/mysite.com/htdocs# find . -type d -exec chown apache2:apache2 {} \;

PLEASE NOTE THIS BREAKS YOUR WORDPRESS ABILITY TO AUTO-UPDATE ITSELF. BUT IT IS MORE SECURE 😀

Note debian users, may need to use www-data:www-data instead.

Checking requests to apache2 webserver during downtime

A customer of ours was having some serious disruptions to his webserver, with 15 minute outages happening here and there. He said he couldn’t see an increase in traffic and therefore didn’t understand why it reached maxclients. Here was a quick way to prove whether traffic really increased or not by directly grepping the access logs for the time and day in question and using wc -l to count them, and a for loop to step thru the minutes of the hour in between the events.

Proud of this simple one.. much simpler than a lot of other scripts that do the same thing I’ve seen out there!

root@anonymousbox:/var/log/apache2# for i in `seq 01 60`;  do  printf "total visits: 13:$i\n\n"; grep "12/Jul/2016:13:$i" access.log | wc -l; done

total visits: 13:1

305
total visits: 13:2

474
total visits: 13:3

421
total visits: 13:4

411
total visits: 13:5

733
total visits: 13:6

0
total visits: 13:7

0
total visits: 13:8

0
total visits: 13:9

0
total visits: 13:10

30
total visits: 13:11

36
total visits: 13:12

30
total visits: 13:13

29
total visits: 13:14

28
total visits: 13:15

26
total visits: 13:16

26
total visits: 13:17

32
total visits: 13:18

37
total visits: 13:19

31
total visits: 13:20

42
total visits: 13:21

47
total visits: 13:22

65
total visits: 13:23

51
total visits: 13:24

57
total visits: 13:25

38
total visits: 13:26

40
total visits: 13:27

51
total visits: 13:28

51
total visits: 13:29

32
total visits: 13:30

56
total visits: 13:31

37
total visits: 13:32

36
total visits: 13:33

32
total visits: 13:34

36
total visits: 13:35

36
total visits: 13:36

39
total visits: 13:37

70
total visits: 13:38

52
total visits: 13:39

27
total visits: 13:40

38
total visits: 13:41

46
total visits: 13:42

46
total visits: 13:43

47
total visits: 13:44

39
total visits: 13:45

36
total visits: 13:46

39
total visits: 13:47

49
total visits: 13:48

41
total visits: 13:49

30
total visits: 13:50

57
total visits: 13:51

68
total visits: 13:52

99
total visits: 13:53

52
total visits: 13:54

92
total visits: 13:55

66
total visits: 13:56

75
total visits: 13:57

70
total visits: 13:58

87
total visits: 13:59

67
total visits: 13:60

root@anonymousbox:/var/log/apache2# for i in `seq 01 60`; do printf “total visits: 12:$i\n\n”; grep “12/Jul/2016:12:$i” access.log | wc -l; done
total visits: 12:1

169
total visits: 12:2

248
total visits: 12:3

298
total visits: 12:4

200
total visits: 12:5

341
total visits: 12:6

0
total visits: 12:7

0
total visits: 12:8

0
total visits: 12:9

0
total visits: 12:10

13
total visits: 12:11

11
total visits: 12:12

30
total visits: 12:13

11
total visits: 12:14

11
total visits: 12:15

13
total visits: 12:16

16
total visits: 12:17

28
total visits: 12:18

26
total visits: 12:19

10
total visits: 12:20

19
total visits: 12:21

35
total visits: 12:22

12
total visits: 12:23

19
total visits: 12:24

28
total visits: 12:25

25
total visits: 12:26

30
total visits: 12:27

43
total visits: 12:28

13
total visits: 12:29

24
total visits: 12:30

39
total visits: 12:31

35
total visits: 12:32

25
total visits: 12:33

22
total visits: 12:34

33
total visits: 12:35

21
total visits: 12:36

31
total visits: 12:37

31
total visits: 12:38

22
total visits: 12:39

39
total visits: 12:40

11
total visits: 12:41

18
total visits: 12:42

11
total visits: 12:43

28
total visits: 12:44

19
total visits: 12:45

27
total visits: 12:46

18
total visits: 12:47

17
total visits: 12:48

22
total visits: 12:49

29
total visits: 12:50

22
total visits: 12:51

31
total visits: 12:52

44
total visits: 12:53

38
total visits: 12:54

38
total visits: 12:55

41
total visits: 12:56

38
total visits: 12:57

32
total visits: 12:58

26
total visits: 12:59

31
total visits: 12:60

Tuning Apache2 for high Traffic Spikes

So this came up recently where a customer was asking if we could tune their apache2 for higher traffic. The best way to do this is to benchmark the site to double the traffic expected, this should be a good measure of whether the site is going to hold up..

# Use Apachebench to test the local requests
ab -n 1000000 -c 1000 http://localhost:80/__*index.html

Benchmarking localhost (be patient)
Completed 100000 requests
Completed 200000 requests
Completed 300000 requests
Completed 400000 requests
Completed 500000 requests
Completed 600000 requests
Completed 700000 requests
Completed 800000 requests
Completed 900000 requests
Completed 1000000 requests
Finished 1000000 requests

Server Software:        Apache/2.2.15
Server Hostname:        localhost
Server Port:            80

Document Path:          /__*index.html
Document Length:        5758 bytes

Concurrency Level:      1000
Time taken for tests:   377.636 seconds
Complete requests:      1000000
Failed requests:        115
   (Connect: 0, Receive: 0, Length: 115, Exceptions: 0)
Write errors:           0
Total transferred:      6028336810 bytes
HTML transferred:       5757366620 bytes
Requests per second:    2648.05 [#/sec] (mean)
Time per request:       377.636 [ms] (mean)
Time per request:       0.378 [ms] (mean, across all concurrent requests)
Transfer rate:          15589.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   52  243.0     22   15036
Processing:     0  282 1898.4     27   81404
Waiting:        0  270 1780.1     24   81400
Total:          6  334 1923.7     50   82432

Percentage of the requests served within a certain time (ms)
  50%     50
  66%     57
  75%     63
  80%     67
  90%     84
  95%   1036
  98%   4773
  99%   7991
 100%  82432 (longest request)



# During the benchmark test you may wish to use sar to indicate general load and io
stdbuf -o0 paste <(sar -q 10 100) <(sar 10 100) | awk '{printf "%8s %2s %7s %7s %7s %8s %9s %8s %8s\n", $1,$2,$3,$4,$5,$11,$13,$14,$NF}'

# Make any relevant adjustments to httpd.conf threads

# diff /etc/httpd/conf/httpd.conf /home/backup/etc/httpd/conf/httpd.conf
103,108c103,108
< StartServers       2000
< MinSpareServers    500
< MaxSpareServers   900
< ServerLimit      2990
< MaxClients       2990
< MaxRequestsPerChild  20000
---
> StartServers       8
> MinSpareServers    5
> MaxSpareServers   20
> ServerLimit      256
> MaxClients       256
> MaxRequestsPerChild  4000
-----------------------------------

In this case we increased the number of startservers and minspareservers. Thanks to Jacob for this.

Whitelisting IP’s in modsecurity 1 and modsecurity 2

Hey folks, so I have noticed that in the new modsecurity CRS version 2, that ‘chained’ rules are supported. This means that whitelisting IP’s has been altered slightly.

Previously whitelisting in modsecurity v2 ip whitelisting was simpler like:
SecRule REMOTE_ADDR “^11.22.33.44” phase:1,nolog,allow,ctl:ruleEngine=off

Now in modsecurity v2 the whitelist configuration must look something like

SecRule REMOTE_ADDR "^11\.22\.33\.44$" phase:1,log,allow,ctl:ruleEngine=Off,id:999945

Now it’s kind of weird, but I hear that chains are much more secure so in that regard maybe v2 has something awesome to offer. Just was head scratching on this one for a good 20 minutes!

You might be wondering why you are receiving an error like ‘configtest failed’ when restarting apache2 using modsecurity. This is probably the fix for v2 you need.

Analyzing Error with Apache2

This customer is receiving an [[an error occurred while processing this directive]] error. This concerns SSI and required more analysis, but this is how I went about it.

I have taken a look at your server and I can see your running apache2 / version 2.2.16

$ curl -I somewebsite.com
HTTP/1.1 301 Moved Permanently
Date: Tue, 02 Feb 2016 17:08:44 GMT
Server: Apache/2.2.16 (Debian)
Location: somewebsite.com
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-8859-1

Although I can’t be certain it looks like there is an issue with the configuration of the apache2, potentially with the moved permanently redirect, but I can’t say for certain from the error message that is given in the html page.

The web page is certainly responding now that the port is state OPEN/ACTIVE though, so we are making progress.

$ curl somewebsite.com


301 Moved Permanently

Moved Permanently

The document has moved here.


Apache/2.2.16 (Debian) Server at somewebsite.com Port 80

It looks like something happens after the redirect/moved permanently.

I would recommend visiting the site and tail’ing the access.log and error.log of your server. this is usually /var/log/apache2/access.log /var/log/apache2/error.log and /var/log/httpd/access.log and /var/log/httpd/error.log, depending on the distribution and configuration this can sometimes be different.

Run something like this:

tail -f /var/log/apache2/error.log

and then load the website up in your browser and observe the errors seen by apache2 on the server, it’s most likely the output will be far more verbose on the backend logs than in the front end error given on the web page.

For more information on SSI (server side includes), please see the apache foundation here (I haven’t discovered exactly where the code is wrong yet and am waiting for customer to give me their virtualhosts configuration/ httpd.conf)

https://httpd.apache.org/docs/2.2/howto/ssi.html
is point, if you require additional assistance we’d benefit from seeing your virtualhosts configuration from within httpd.conf.