SEMalt is Skewing Your Stats

There is a Ukrainian bot out there that is crawling millions of websites and distorting their stats, and to some degree it is affecting every single site I’ve looked at this past week.

Go to Google Analytics, look at your referrers, and I’ll bet you a beer that SEMalt and crawler.semalt.com are both listed with dozens of visits over this past month. In some cases we’re seeing a history with them dating back to January 2 of this year.

What is SEMalt Up To?

According to one of their employees: 

[Image: What is SEMalt?]

Then he answered again, and fed me this load of BS: 

[Image: This is a lie]

An accident? That’s a lie.

In a page on their website it says this:

Semalt crawler bots visit website and gather statistical data for our service simulating real user behavior: unique IP, browser, display resolution etc. This information is used exclusively within the Semalt.com project and isn’t revealed to a third party.

On their “about” page (their menu is in their footer) they claim to offer various tools, like keyword ranking, brand monitoring, reports, competitor explorer, website analyzer and a report system.

Presumably, these crawls are feeding their “competitor explorer” with info they then provide to their paying subscribers, but I don’t know that to be true.

Here on SEMpdx, this is what the referrals looked like for March, when they visited nearly every single day for a total of 94 visits.

[Image: SEMalt referrals for March]

Does it Matter?

If you’re a medium-sized website you probably didn’t notice, but if you’re a local business that only gets a few hundred visitors a month, you may just find that they are your number one referrer, and that’s severely distorting your stats.

They appeared to be friendly enough when I first Tweeted at them the other day, but I’ve done some digging now, and I distrust them…

Why do I distrust them?

  • They visit sites from no consistent IP address or IP range
  • They are stealing your bandwidth
  • They are using your server resources
  • They are skewing your stats
  • They do not follow robots.txt

Depending on the size of your site, they may be drastically skewing your overall statistics, from your overall visitor count to your conversion percentages and bounce rates.

For example, for one new local client with only about 300 visitors last month, SEMalt accounted for over 70 visits, which is more than 20% of all their traffic!
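To put numbers on that kind of distortion, here is a minimal Python sketch. It uses the rounded figures from the client example (about 300 visits, about 70 of them from SEMalt), so treat the result as approximate. The function names are my own, and the bounce-rate helper rests on an assumption I have not verified: that every bot visit is counted as a single-page bounce.

```python
def referrer_share(total_visits, bot_visits):
    """Return (bot share of all visits, visit count with the bot removed)."""
    if total_visits <= 0 or not 0 <= bot_visits <= total_visits:
        raise ValueError("expected 0 <= bot_visits <= total_visits, total_visits > 0")
    return bot_visits / total_visits, total_visits - bot_visits


def corrected_bounce_rate(total_visits, bot_visits, reported_bounce_rate):
    """Estimate the bounce rate for real visitors only.

    Assumption (not confirmed): every bot visit was recorded as a bounce.
    """
    real_visits = total_visits - bot_visits
    if real_visits <= 0:
        raise ValueError("no real visits left after removing the bot")
    reported_bounces = reported_bounce_rate * total_visits
    real_bounces = max(reported_bounces - bot_visits, 0)
    return real_bounces / real_visits


# Rounded figures from the client example above.
share, real_visits = referrer_share(300, 70)
print(f"Bot share of traffic: {share:.1%}, real visits: {real_visits}")
```

With those rounded inputs the bot's share works out to roughly 23%, which lines up with the "more than 20%" figure. You could feed `corrected_bounce_rate` your own reported bounce rate to see how far the real number may sit from what your analytics shows.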

A Special Message

SEMalt has managed to anger one site owner so much that he added a special message just for them in his website header:

[Image: Special message to SEMalt]

How can SEMalt be stopped?

They put up a page where you can supposedly list your domain for removal, but again, I don’t think I trust them. Here’s a link to their “removal” tool.

Since I discovered that blocking them via robots.txt didn’t work, and found that blocking their IP wasn’t possible, I began looking for the best way to edit our .htaccess file. I had to try a couple of options before I found something that would work on the SEMpdx server.

Rather than provide you with .htaccess code here, which may or may not work for you in your hosting environment, I’ll refer you to a very useful post where a lot of folks are discussing the SEMalt situation and sharing several options for .htaccess editing.
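If you would rather see the idea than a server config, the logic behind referrer blocking is simple: look at the Referer header on each request and refuse anything coming from semalt.com or one of its subdomains, such as crawler.semalt.com. Below is a rough Python sketch of that check, wrapped in a small WSGI middleware. This is not what we run on the SEMpdx server, the names `is_blocked_referrer` and `block_referrer_spam` are made up for illustration, and it only helps if the bot really sends that Referer header to your application.

```python
from urllib.parse import urlparse

# Domains to refuse; subdomains (e.g. crawler.semalt.com) are matched too.
BLOCKED_DOMAINS = ("semalt.com",)


def is_blocked_referrer(referer, blocked=BLOCKED_DOMAINS):
    """Return True if the Referer header points at a blocked domain."""
    if not referer:
        return False
    # urlparse only extracts a hostname when the value starts with a
    # scheme or "//", so prefix bare hosts like "crawler.semalt.com".
    if not referer.lower().startswith(("http://", "https://", "//")):
        referer = "//" + referer
    host = (urlparse(referer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)


def block_referrer_spam(app):
    """Hypothetical WSGI middleware: answer 403 to blocked referrers."""
    def wrapper(environ, start_response):
        if is_blocked_referrer(environ.get("HTTP_REFERER")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return wrapper
```

Matching on the hostname rather than searching the whole header for "semalt" avoids blocking an innocent visitor whose referring URL just happens to contain that word in its path or query string.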

If they would simply obey a site’s robots.txt, I think a lot of people would not worry about it, and might even try their service. Until they do, though, we’re aggressively blocking them.
