Identifying anonymous email senders
In an attempt to help police with investigations, researchers at a
Canadian university have developed a technique for unveiling the
identities of anonymous email senders.
Fingerprint-based identification is the oldest biometric technique to have been used successfully in conventional crime investigation. The unique patterns of a fingerprint can help investigators infer the identities of suspects.
However, circumstances have changed since the emergence and rapid proliferation of cybercrime. Unlike conventional crimes, there are no fingerprints to be found in cybercrime.
Cybercriminals often use anonymously sent email to commit crimes, but new digital forensic methods may help identify senders so that they can be prosecuted in court. A team of researchers from Concordia University in Montreal, Canada, has developed an effective new technique to determine the authorship of anonymous emails.
Tests showed that their method has a high level of accuracy and, unlike many other methods of ascertaining authorship, can provide evidence that is presentable in courts of law.
"In the past few years, we've seen an alarming increase in the number of cybercrimes involving anonymous emails," explained study co-author Benjamin Fung, a professor of information systems engineering at the university and an expert in data mining (extracting useful, previously unknown knowledge from a large volume of raw data).
These emails can transmit threats or child pornography, be used for cyberbullying, facilitate communications between criminals or carry viruses.
He explained that an email contains two parts: a header and a body. An anonymous email is one without the header information and without a name or signature at the end.
Mr Fung said that while police can often use the IP address to locate the house or business where an email originated, they may find many people at that address, and until now it has been impossible to say which resident or worker was responsible. Investigators need a reliable, effective way to determine which of several suspects has written the emails under investigation.
Mr Fung and his colleagues developed a novel method of authorship attribution to meet this need, based on techniques used in speech recognition and data mining. Their approach relies on the identification of frequent patterns: unique combinations of features that recur in a suspect's emails.
Aspects of the writing looked at include:
Vocabulary richness
Length of sentence
Use of function words
Layout of paragraphs
Key words
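As a rough illustration of how some of these aspects could be turned into numbers, here is a minimal Python sketch. It is not the researchers' code: the function name, the tiny function-word list and the specific measures (for example, vocabulary richness taken as the ratio of distinct words to total words) are assumptions made for the example, and key words are not covered.

```python
import re

# Assumption: a tiny illustrative list; real studies use far longer function-word lists.
FUNCTION_WORDS = {"the", "of", "and", "to", "in", "a", "that", "is", "it", "for"}

def stylometric_features(text):
    """Return simple writing-style measurements for one email body (hypothetical helper)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Sentences are split on runs of ., ! or ?; paragraphs on blank lines.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    n_words = len(words)
    return {
        # Distinct words divided by total words (type-token ratio).
        "vocabulary_richness": len(set(words)) / n_words if n_words else 0.0,
        # Average number of words per sentence.
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        # Share of words that are function words.
        "function_word_ratio": sum(w in FUNCTION_WORDS for w in words) / n_words if n_words else 0.0,
        # Crude stand-in for paragraph layout.
        "paragraph_count": len(paragraphs),
    }

features = stylometric_features("The cat sat. The cat ran.\n\nIt is late.")
```

Measurements like these would then have to be grouped into discrete categories (for instance "short sentences" versus "long sentences") before they could be combined into patterns.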
To determine whether a suspect has authored the target email, they first identify the patterns found in emails written by the subject. Then, they filter out any of these patterns which are also found in the emails of other suspects.
The remaining frequent patterns are unique to the author of the emails being analysed. They constitute the suspect's "write-print", a distinctive identifier like a fingerprint.
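The two steps described above (find each suspect's frequent patterns, then discard those shared with other suspects) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Concordia team's implementation: each email is assumed to be already reduced to a set of style labels such as "all_lowercase", the function names and the 50 per cent support threshold are hypothetical, and patterns are limited to at most two features to keep the example small.

```python
from itertools import combinations

def frequent_patterns(emails, min_support=0.5, max_size=2):
    """Return feature combinations found in at least min_support of the emails."""
    patterns = set()
    items = sorted(set().union(*emails)) if emails else []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Support: fraction of emails that contain every feature in the candidate.
            support = sum(candidate <= email for email in emails) / len(emails)
            if support >= min_support:
                patterns.add(candidate)
    return patterns

def write_prints(emails_by_suspect, min_support=0.5, max_size=2):
    """Keep, for each suspect, only the frequent patterns no other suspect shares."""
    frequent = {s: frequent_patterns(e, min_support, max_size)
                for s, e in emails_by_suspect.items()}
    prints = {}
    for suspect, patterns in frequent.items():
        others = set().union(*(p for s, p in frequent.items() if s != suspect))
        prints[suspect] = patterns - others
    return prints

def most_likely_author(anonymous_email, prints):
    """Score each suspect by how many write-print patterns the email contains."""
    scores = {s: sum(p <= anonymous_email for p in ps) for s, ps in prints.items()}
    return max(scores, key=scores.get), scores

# Made-up example data: each email is a set of observed style labels.
emails_by_suspect = {
    "alice": [{"all_lowercase", "short_sentences"},
              {"all_lowercase", "short_sentences", "typos"},
              {"all_lowercase"}],
    "bob": [{"typos", "long_sentences", "all_lowercase"},
            {"typos", "long_sentences"},
            {"typos", "all_lowercase"}],
}
prints = write_prints(emails_by_suspect)
author, scores = most_likely_author({"all_lowercase", "short_sentences"}, prints)
```

In this toy data, writing in lower case is frequent for both suspects, so it is filtered out and does not count as evidence for either; only the combinations that distinguish one suspect from the other remain in each write-print. Because every remaining pattern can be listed and counted, an investigator could point to exactly which features supported an attribution.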
"Let's say the anonymous email contains typos or grammatical mistakes, or is written entirely in lower case letters," said Mr Fung. "We use those special characteristics to create a write-print. Using this method, we can even determine with a high degree of accuracy who wrote a given email and infer the gender, nationality and education level of the author."
To test the accuracy of their technique, Mr Fung and his colleagues examined the Enron Email Dataset, a collection which contains over 200,000 real-life emails from 158 employees of the Enron Corporation. Using a sample of ten emails written by each of ten subjects (100 emails in all), they were able to identify authorship with an accuracy of 80 per cent to 90 per cent.
"Our technique was designed to provide credible evidence that can be presented in a court of law," said Mr Fung. "For evidence to be admissible, investigators need to explain how they have reached their conclusions. Our method allows them to do this."
The a