Ghetto but simple Log Parser for testing website performance

So… I got fedup with constantly writing my own stuff for basic things. I’m going to turn this into something more spectacular that accepts commandline input, and also, allows you to define which days, and months, ranges, and stuff like that.

It’s a no-frills-ghetto log parser.

#!/bin/bash

echo "Total HITS: MARCH"
grep "/Mar/2017" /var/log/httpd/somewebsite.com-access_log | wc -l;

for i in 0{1..9} {10..24};

do echo "      > 9th March 2017, hits this $i hour";
grep "09/Mar/2017:$i" /var/log/httpd/somesite.com-access_log | wc -l;

        # break down the minutes in a nested visual way thats AWsome
for j in 0{1..9} {10..60};
do echo "                  >>hits at $i:$j";
grep "09/Mar/2017:$i:$j" /var/log/httpd/somesite.com-access_log | wc -l;
done

done

It’s not perfect, it’s just a proof of concept, really.

Migrating a Plesk site after moving keeps going to default plesk page

So today a customer had this really weird issue where we could see that the website domain that had been moved from one server to a new plesk server, wasn’t correctly loading. It actually turned out to be simple, and when trying to access a file on the domain like I would get the phpinfo.php file.

curl http://www.customerswebsite.com/info.php 

This suggested to me the website documentroot was working, and the only thing missing was probably the index. This is what it actually did turn out to me.

I wanted to test though that info.php really was in this documentroot, and not some other virtualhost documentroot, so I moved the info.php file to randomnumbers12313.php and the page still loaded, this confirms by adding that file on the filesystem that all is well, and that I found correct site, important when troubleshooting vast configurations.

I also found a really handy one liner for troubleshooting which file it comes out, this might not be great on a really busy server, but you could still grep for your IP address as well.

Visit the broken/affected website we will troubleshoot

curl -I somecustomerswebsite.com

Give all visitors to all apache websites occurring now whilst we visit it ourselves for testing

tail -f /var/log/httpd/*.log 

This will show us which virtualhost and/or path is being accessed, from where.

Give only visitors to all apache websites occurring on a given IP

tail -f /var/log/httpd/*.log  | grep 4.2.2.4

Where 4.2.2.4 is your IP address your using to visit the site. If you don’t know what your Ip is type icanhazip into google, or ‘what is my ip’, job done.

Fixing the Plesk website without a directory index

[root@mehcakes-App1 conf]# plesk bin domain --update somecustomerswebsite.com -nginx-serve-php true -apache-directory-index index.php

Simple enough… but could be a pain if you don’t know what your looking for.

PHP5 newrelic agent not collecting data

So today I had a newrelic customer who was having issues after installing the newrelic php plugin. He couldn’t understand why it wasn’t collecting data. For it to collecting data you need to make sure newrelic-daemon process is running by using ps auxfwww | grep newrelic-daemon.

We check the process of the daemon is running

[root@rtd-production-1 ~]# ps -ef | grep newrelic-daemon
root     26007 18914  0 09:59 pts/0    00:00:00 grep newrelic-daemon

We check the status of the daemon process

[root@rtd-production-1 ~]# service newrelic-daemon status
newrelic-daemon is stopped...

Copy basic NewRelic configuration template to correct location

[root@rtd-production-1 ~]# cp /etc/newrelic/newrelic.cfg.template /etc/newrelic/newrelic.cfg

Start the daemon

[root@rtd-production-1 ~]# service newrelic-daemon start
Starting newrelic-daemon:                                  [  OK  ]

QID 150004 : Path-Based Vulnerability

A customer of ours had an issue with some paths like theirwebsite.com/images returning a 200 OK, and although the page was completely blank, and exposed no information it was detected as a positive indicator of exposed data, because of the 200 OK.

more detail: https://community.qualys.com/thread/16746-qid-150004-path-based-vulnerability

Actually in this case it was a ‘whitescreen’, or just a blank index page, to prevent the Options +indexes in the apache httpd configuration showing the images path. You probably don’t want this and can just set your Option indexes.

Change from:

Options +Indexes
# in older versions it may be defined as
Options Indexes

Change to:

Options -Indexes

This explicitly forbids, but older versions of apache2 might need this written as:

Options Indexes

To prevent an attack on .htaccess you could also add this to httpd.conf to ensure the httpd.conf is enforced and takes precedence over any hacker or user that adds indexing incorrectly/mistakenly/wrongly;

<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>

Simple enough.

/etc/apache2/conf.d/security – Ubuntu 12.04.1 – Default File exploitable

In Ubuntu 12.04.1 there were some rather naughty security updates in specific, /etc/apache2/conf.d/security file has important lines commented out:

#<Directory />
#        AllowOverride None
#        Order Deny,Allow
#        Deny from all
#</Directory>

These above lines set the policy for the /var/www/ directory to forbid all access, then being commented out means that the policy is not forbidding access by default.

This is not good. In our customers case, they also had A listen 443 directive in their ports.conf, however they hadn’t added any default virtualhosts using port 443 SSL. This actually means that the /var/www directory becomes the ‘/’ for default HTTPS negotiation to the site. NOT GOOD since if directory listing is also available it will list the contents of /var/www as well, as exposing files that can be directly accessed, the directory listing option will make it possible to see all files listed, instead of just opening up the files in /var/www/ for access via http://somehost.com/somefileinvarwww.sql naturally its much harder if the attacker has to guess the files, but still, not good!

NOT GOOD AT ALL. If the customer has a /var/www/vhosts/sites and is using /var/www for their database dumps or other files it means those files could be retrieved.

The fix is simple, remove these lines from /etc/apache2/ports.conf,

Change from

Listen 443
NameVirtualHost *:443 

Change to

#Listen 443
#NameVirtualHost *:443 

Also make sure that the secure file (/etc/apache2/conf.d/secure) doesn’t have these lines commented as Ubuntu 12.04.1 and 12.10 may suffer; this is actually the main issue

Change from:

#<Directory />
# AllowOverride None
# Order Deny,Allow
# Deny from all
#</Directory>

Change to:

<Directory />
 AllowOverride None
 Order Deny,Allow
 Deny from all
</Directory>

Restart your apache2

# Most other OS
service apache2 restart
/etc/init.d/apache2 restart

# CentOS 7/RHEL7
systemctl restart apache2

This exploitable/vulnerable configuration was changed in later updates to the apache configurations in Ubuntu, however it appears for some people there are packages being held back for a couple of reasons. First, it appears that this server was initially deployed on Ubuntu 12.10, which is a short-term release that reached end of life May 16, 2014. As the dist-upgrade path for Ubuntu 12.10 would be to the LTS Trusty Tahr release, which reaches end of life this May.

I suspect that a significant contributor to the issue was that the releases were unsupported by the vendor at the time of being affected. The customer also used the vulnerable ports.conf file it appears with a deployment of chef.

For more information see:

http://mixeduperic.com/downloads/org-files/ubuntu/etcapache2confdsecurity-ubuntu-12041-default-file.html

Creating a New user with permissions for MySQL

As it says on the tin

Create the user

mysql> create user mysqlusergoeshere@localhost identified by 'somepasswordcangoheremakeitsecure';
Query OK, 0 rows affected (0.03 sec)

Create the Permissions for the user

mysql> GRANT ALL PRIVILEGES ON database_name.* to mysqlusergoeshere@localhost;
Query OK, 0 rows affected (0.01 sec)

Simples.

Site keeps on going down because of spiders

So a Rackspace customer was consistently having an issue with their site going down, even after the number of workers were increased. It looked like in this customers case they were being hit really hard by yahoo slurp, google bot, a href bot, and many many others.

So I checked the hour the customer was affected, and found that over that hour just yahoo slurp and google bot accounted for 415 of the requests. This made up like 25% of all the requests to the site so it was certainly a possibility the max workers were being reached due to spikes in traffic from bots, in parallel with potential spikes in usual visitors.

[root@www logs]#  grep '01/Mar/2017:10:' access_log | egrep -i 'www.google.com/bot.html|http://help.yahoo.com/help/us/ysearch/slurp' |  wc -l
415

It wasn’t a complete theory, but was the best with all the available information I had, since everything else had been checked. The only thing that remains is the number of retransmits for that machine. All in all it was a victory, and this was so awesome, I’m now thinking of making a tool that will do this in more automated way.

I don’t know if this is the best way to find google bot and yahoo bot spiders, but it seems like a good method to start.

Count number of IP’s over a given time period in Apache Log

So, a customer had an outage, and wasn’t sure what caused it. It looked like some IP’s were hammering the site, so I wrote this quite one liner just to sort the IP’s numerically, so that uniq -c can count the duplicate requests, this way we can count exactly how many times a given IP makes a request in any given minute or hour:

Any given minute

# grep '24/Feb/2017:10:03' /var/www/html/website.com/access.log | awk '{print $1}' | sort -k2nr | uniq -c

Any given hour

# grep '24/Feb/2017:10:' /var/www/html/website.com/access.log | awk '{print $1}' | sort -k2nr | uniq -c

Any Given day

# grep '24/Feb/2017:' /var/www/html/website.com/access.log | awk '{print $1}' | sort -k2nr | uniq -c

Any Given Month

# grep '/Feb/2017:' /var/www/html/website.com/access.log | awk '{print $1}' | sort -k2nr | uniq -c

Any Given Year

# grep '/2017:' /var/www/html/website.com/access.log | awk '{print $1}' | sort -k2nr | uniq -c

Any given year might cause dupes though, and I’m sure there is a better way of doing that which is more specific

How To Enable GZIP Compression/mod_deflate in Apache

This is something quite easy to do.

Check that the deflate module is installed

# apachectl -M | grep deflate
 deflate_module (shared)
Syntax OK

Check the content encoding is enabled by using curl on the site

# curl -I -H 'Accept-Encoding: gzip,deflate' http://somewebsite.com/
HTTP/1.1 200 OK
Date: Fri, 23 Jan 2017 11:02:16 GMT
Server: Apache
X-Pingback: http://somewebsite.com/somesite.php
X-Powered-By: rabbits
Connection: close
Content-Type: text/html; charset=UTF-8

Create a file called /etc/httpd/conf.d/deflate.conf

<ifmodule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript application/javascript
    DeflateCompressionLevel 8
</ifmodule>

Restart apache2

/etc/init.d/httpd reload
Reloading httpd:

retest the gzip content-encoding directive with curl using Accept-Encoding: gzip directive.


# curl -I -H 'Accept-Encoding: gzip,deflate' http://somesite.com/
HTTP/1.1 200 OK
Date: Fri, 25 Sep 2015 00:04:26 GMT
Server: Apache
X-Pingback: http://somesite.com/somesite.php
X-Powered-By: muppets
Vary: Accept-Encoding
Content-Encoding: gzip    <---- this line indicates that compression is active
Content-Length: 20
Connection: close
Content-Type: text/html; charset=UTF-8

For this, thanks to odin.com