S-Box Hashing for Text Mining

dc.contributor.author Seker, Sadi Evren
dc.contributor.author Mert, Cihan
dc.date.accessioned 2013-12-19T14:38:44Z
dc.date.accessioned 2015-11-19T12:50:33Z
dc.date.available 2013-12-19T14:38:44Z
dc.date.available 2015-11-19T12:50:33Z
dc.date.issued 2013-12-19
dc.identifier.uri http://dspace.epoka.edu.al/handle/1/851
dc.description.abstract One of the crucial points in text mining studies is the feature hashing step. Most text mining studies start with a text data source and apply a feature extraction method to the text. The feature extraction method usually has to be chosen carefully, because it directly affects both the results and the performance. Another well-known approach is to use any feature extraction method together with feature hashing. In this way, feature extraction can be executed without worrying about performance, and feature hashing reduces the size of the extracted feature vector. Today, the hashing algorithms most widely used in text mining are modern algorithms such as MD5 or SHA1, which are built over substitution-permutation networks (SPN) or Feistel networks. The common property of most modern hashing algorithms is their implicitly implemented s-boxes. One drawback of modern hashing algorithms for text mining is that they are designed to avoid collisions. The permutation step is usually implemented for this purpose, and it completely obfuscates the correlation between the input text and the output bits. This study focuses on possible implementations of s-boxes for feature hashing. The purpose of feature hashing in this study is to reduce the feature vector while keeping the correlation between the input text and the output bits. en_US
dc.language.iso en en_US
dc.relation.ispartofseries paper_15;
dc.subject Data Mining en_US
dc.subject Feature Extraction en_US
dc.subject Feature Hashing en_US
dc.subject KNN en_US
dc.subject Hashing en_US
dc.subject Text Mining en_US
dc.title S-Box Hashing for Text Mining en_US
dc.type Book chapter en_US
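
The conventional feature hashing step that the abstract describes can be illustrated with a minimal sketch. It is not the authors' s-box method, which this record does not specify. It assumes a whitespace tokenizer, MD5 as the hash function, and a hypothetical helper name `hash_features` with a hypothetical `n_buckets` parameter.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map tokens to a fixed-length count vector (the "hashing trick").

    hash_features and n_buckets are illustrative names, not taken from the paper.
    """
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Interpret the first 4 bytes of the digest as an unsigned integer
        # and fold it into the bucket range.
        index = int.from_bytes(digest[:4], "big") % n_buckets
        vector[index] += 1
    return vector


# Assumption: simple whitespace tokenization stands in for feature extraction.
tokens = "text mining starts with feature extraction".split()
vec = hash_features(tokens, n_buckets=8)
print(vec)
```

Whatever the vocabulary size, the output has `n_buckets` entries, which is the size reduction the abstract refers to. Because MD5 is designed so that similar inputs produce unrelated digests, near-identical tokens such as "mine" and "mines" will generally land in unrelated buckets, which is the loss of input/output correlation the abstract names as a drawback.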

This item appears in the following Collection(s)

  • ISCIM 2013
    2nd International Symposium on Computing in Informatics and Mathematics
