One of my students has been studying the relationships among the various pieces of "news spam" malware, and has found some interesting patterns that link the spam campaigns together through demonstrable relationships between the spam messages.
Tonight I decided to look at the relationships using the "open SQL query" interface to our UAB Spam Data Mine. The advanced data clustering algorithms do some incredible things, but tonight I just wanted to see what IP addresses had sent us spam email for the "CNN.com Daily Top 10" campaign, and then ask, "So what other spam do we have in the Data Mine that comes from those IP addresses?"
The query is actually very simple for this type of question:
select a.message_id, a.subject, a.sender_ip, b.machine, b.path
  from spam a, spam_link b
 where (a.message_id = b.message_id)
   and a.sender_ip in
       (select sender_ip from spam
         where subject like '%CNN.com Daily Top 10%')
 order by a.sender_ip, a.subject;
In plain English: find all the IP addresses that sent us spam where the subject includes the string "CNN.com Daily Top 10". Then list every message sent by those same IP addresses, showing the subject and the URLs (machine + path) from those messages, ordered by IP address and then by subject.
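For anyone who wants to try the same idea without access to the Data Mine, here is a minimal sketch that runs the query above against a tiny in-memory SQLite database. The table and column names come from the query itself; everything else (the column types, the sample rows, the example.* hostnames, the documentation-range IP addresses, and the helper names build_toy_db and related_spam) is made up for illustration and says nothing about how the real Data Mine is built.

```python
import sqlite3

# The query from the post, unchanged apart from layout.
QUERY = """
select a.message_id, a.subject, a.sender_ip, b.machine, b.path
  from spam a, spam_link b
 where (a.message_id = b.message_id)
   and a.sender_ip in
       (select sender_ip from spam
         where subject like '%CNN.com Daily Top 10%')
 order by a.sender_ip, a.subject;
"""


def build_toy_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumptions; only the names appear in the post.
    conn.executescript("""
        create table spam (message_id integer primary key,
                           subject text, sender_ip text);
        create table spam_link (message_id integer,
                                machine text, path text);
    """)
    conn.executemany("insert into spam values (?, ?, ?)", [
        (1, "CNN.com Daily Top 10", "192.0.2.10"),  # CNN campaign sender
        (2, "Cheap watches", "192.0.2.10"),         # same IP, other spam
        (3, "Cheap pills", "198.51.100.7"),         # unrelated IP
    ])
    conn.executemany("insert into spam_link values (?, ?, ?)", [
        (1, "news.example", "/topnews.html"),
        (2, "watches.example", "/buy.html"),
        (3, "pills.example", "/order.html"),
    ])
    return conn


def related_spam(conn):
    """Return every message (with its links) sent by a CNN-campaign IP."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in related_spam(build_toy_db()):
        print(row)
```

In the toy data, the "Cheap watches" message is pulled in only because it shares a sender IP with a CNN message, which is exactly the pivot the query is meant to perform. The message from the unrelated IP is left out.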
We had emails in the CNN group from 4,875 unique IP addresses. Those IP addresses sent us a total of 11,809 emails. By month:
10 emails in November
102 emails in December
51 emails in January
191 emails in February
162 emails in March
213 emails in April
363 emails in May
403 emails in June
2,892 emails in July
7,421 emails in August
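Counts like these can be produced with a few aggregate queries. The sketch below is one way to do it, not necessarily how the numbers above were generated. It assumes a hypothetical "received" column holding an ISO date string, which does not appear in the query shown earlier, and it uses SQLite's strftime function, so other databases would need their own date function. The sample rows, dates, and the helper name campaign_summary are invented for illustration.

```python
import sqlite3

# Subquery reused from the post: IPs that sent the CNN campaign.
CNN_IPS = ("select sender_ip from spam "
           "where subject like '%CNN.com Daily Top 10%'")


def campaign_summary(conn):
    """Return (unique CNN IPs, total emails from them, per-month counts)."""
    unique_ips = conn.execute(
        f"select count(distinct sender_ip) from ({CNN_IPS})"
    ).fetchone()[0]
    total = conn.execute(
        f"select count(*) from spam where sender_ip in ({CNN_IPS})"
    ).fetchone()[0]
    # 'received' is a hypothetical column; strftime is SQLite-specific.
    by_month = conn.execute(f"""
        select strftime('%Y-%m', received) as month, count(*)
          from spam
         where sender_ip in ({CNN_IPS})
         group by month
         order by month
    """).fetchall()
    return unique_ips, total, by_month


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("create table spam (message_id integer primary key, "
                 "subject text, sender_ip text, received text)")
    # Hypothetical rows; the dates are placeholders, not real data.
    conn.executemany("insert into spam values (?, ?, ?, ?)", [
        (1, "CNN.com Daily Top 10", "192.0.2.10", "2008-08-05"),
        (2, "Cheap watches", "192.0.2.10", "2008-07-20"),
        (3, "CNN.com Daily Top 10", "192.0.2.11", "2008-08-06"),
        (4, "Cheap pills", "198.51.100.7", "2008-08-01"),
    ])
    print(campaign_summary(conn))
```

Grouping on a year-and-month key rather than the month alone keeps the same month in different years from being merged into one bucket.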
Browsing the subjects, it was clear that most of the emails before very late June were an assortment of pills, watches, and enlargement promises. A clear "news trend" started at the very end of June.
Restricting the results to July and August, these IP addresses spammed the following paths:
(several variations of previous)
(many crazy long paths all on "livefilestore.com")
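A path list like this one can be pulled by adding a date window and a group-by to the original join. The sketch below shows the idea on toy data. As before, the "received" date column is a hypothetical addition to the spam table, the sample rows, dates, and hostnames are invented, and paths_in_window is just an illustrative helper name.

```python
import sqlite3

# Same join and IP subquery as the post's query, plus a date window.
PATH_QUERY = """
select b.path, count(*) as messages
  from spam a, spam_link b
 where a.message_id = b.message_id
   and a.sender_ip in
       (select sender_ip from spam
         where subject like '%CNN.com Daily Top 10%')
   and a.received >= ? and a.received < ?
 group by b.path
 order by messages desc, b.path
"""


def paths_in_window(conn, start, end):
    """Paths spammed by CNN-campaign IPs with start <= received < end."""
    return conn.execute(PATH_QUERY, (start, end)).fetchall()


def build_toy_db():
    """Hypothetical sample rows; 'received' is an assumed ISO date column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table spam (message_id integer primary key,
                           subject text, sender_ip text, received text);
        create table spam_link (message_id integer,
                                machine text, path text);
    """)
    conn.executemany("insert into spam values (?, ?, ?, ?)", [
        (1, "CNN.com Daily Top 10", "192.0.2.10", "2008-08-05"),
        (2, "Breaking news", "192.0.2.10", "2008-07-22"),
        (3, "Breaking news again", "192.0.2.10", "2008-07-23"),
        (4, "Cheap watches", "192.0.2.10", "2008-03-02"),  # outside window
        (5, "Cheap pills", "198.51.100.7", "2008-07-15"),  # unrelated IP
    ])
    conn.executemany("insert into spam_link values (?, ?, ?)", [
        (1, "a.example", "/topnews.html"),
        (2, "b.example", "/viewmovie.html"),
        (3, "c.example", "/viewmovie.html"),
        (4, "w.example", "/buy.html"),
        (5, "p.example", "/order.html"),
    ])
    return conn


if __name__ == "__main__":
    conn = build_toy_db()
    for path, n in paths_in_window(conn, "2008-07-01", "2008-09-01"):
        print(n, path)
```

Comparing ISO date strings works here because they sort in date order as plain text. A real schema with a native date or timestamp type would use that type's own comparison instead.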
So, EVERY MAJOR "news spam" campaign we received in July can also be found by looking at emails which came from the same IP addresses as the CNN.com Daily Top 10 emails. We wrote about several of these back in July, for example:
r.html ==> Nuwar Looks for News Readers - July 7
viewmovie.html ==> News Headlines Still Out of Control - July 22
topnews.html ==> Top News in Spam = Old News - July 26
I've placed the list of IP addresses used in this spam in a text file on my UAB website:
The list of all 2,255 URLs which were spammed in those emails is also available on my UAB website:
If you have a similar list, I'd love to compare notes!
Director of Research in Computer Forensics
The University of Alabama at Birmingham