S-Box Hashing for Text Mining

Seker, Sadi Evren; Mert, Cihan

dc.contributor.author	Seker, Sadi Evren
dc.contributor.author	Mert, Cihan
dc.date.accessioned	2013-12-19T14:38:44Z
dc.date.accessioned	2015-11-19T12:50:33Z
dc.date.available	2013-12-19T14:38:44Z
dc.date.available	2015-11-19T12:50:33Z
dc.date.issued	2013-12-19
dc.identifier.uri	http://dspace.epoka.edu.al/handle/1/851
dc.description.abstract	One of the crucial steps in text mining studies is feature hashing. Most text mining studies start with a text data source and apply a feature extraction method to the text. The feature extraction method should be chosen carefully because it usually affects the results and performance directly. Another well-known approach is to use any feature extraction method together with feature hashing. In this way, feature extraction can be executed without worrying about performance, and feature hashing reduces the size of the extracted feature vector. Today, the hashing algorithms most widely used in text mining are modern hashing algorithms such as MD5 or SHA1, which are built over substitution-permutation networks (SPN) or Feistel networks. A common property of most modern hashing algorithms is their implicitly implemented s-boxes. One drawback of modern hashing algorithms for this use is that they are designed to be collision free. The permutation step is usually implemented for this purpose, and it completely obfuscates the correlation between the input text and the output bits. This study focuses on possible implementations of s-boxes for feature hashing. The purpose of feature hashing in this study is to reduce the feature vector while keeping the correlation between the input text and the output bits.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	paper_15;
dc.subject	Data Mining	en_US
dc.subject	Feature Extraction	en_US
dc.subject	Feature Hashing	en_US
dc.subject	KNN	en_US
dc.subject	Hashing	en_US
dc.subject	Text Mining	en_US
dc.title	S-Box Hashing for Text Mining	en_US
dc.type	Book chapter	en_US
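
The feature hashing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' s-box method, which this record does not specify. It only shows the conventional approach the abstract refers to: each extracted token is hashed with MD5 and folded into a fixed-length count vector. The function name hash_features, the bucket count of 16, and the sample sentence are assumptions made for illustration.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Fold a list of tokens into a fixed-length count vector.

    Hypothetical helper: MD5 is used only because the abstract names it
    as a commonly used hash; n_buckets=16 is an arbitrary choice.
    """
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Use the first 4 bytes of the digest as an integer bucket index.
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vector[index] += 1
    return vector


# Whitespace tokenization stands in for any feature extraction method.
tokens = "text mining starts with a text data source".split()
vector = hash_features(tokens, n_buckets=16)
print(vector)
```

The output vector always has n_buckets entries regardless of vocabulary size, which is the size reduction the abstract mentions. Because MD5 is designed to scatter similar inputs, near-identical tokens such as "mine" and "mining" generally land in unrelated buckets, which is the loss of correlation the study aims to avoid.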

Files in this item

Name: paper_15.pdf

Size: 65.15Kb

Format: PDF

This item appears in the following Collection(s)

ISCIM 2013
2nd International Symposium on Computing in Informatics and Mathematics
