Title: A Comparative Analysis of Machine Learning Models for Domain Adaptation in Multiclass Sentiment Classification
Cover Date: 2025-04-01
Cover Display Date: April 2025
DOI: 10.37936/ecti-cit.2025192.258824
Description: This study presents a comparative evaluation of machine learning models for domain adaptation in multiclass sentiment classification. While sentiment analysis aims to categorize opinions as positive, neutral, or negative, adapting models across domains remains a significant challenge due to differences in vocabulary, writing style, and sentiment expression. Models trained on a specific domain often fail to generalize effectively to others. To address this problem, we evaluate how well six models (logistic regression, support vector machine (SVM) with a linear kernel, random forest, convolutional neural network (CNN), long short-term memory (LSTM), and BERT) perform on sentiment data from the books, beauty & personal care, and automotive categories. The evaluation uses Amazon review data and measures performance via accuracy, F1 score, and Area Under the ROC Curve (AUC). Results indicate that BERT consistently outperforms all other models due to its attention-based transformer architecture, which captures nuanced contextual information across diverse domains. CNN and LSTM models also perform well, particularly in domain-specific settings, with CNN excelling in extracting local features and LSTM in modeling sequential relationships. Traditional models, such as logistic regression and SVM, show limitations in generalizability, while random forest demonstrates stable yet moderate performance. These findings highlight the strengths and trade-offs of each approach for effective cross-domain sentiment classification.
Citations: 0
Aggregation Type: Journal
-------------------
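Note: the accuracy and F1 metrics named in the abstract above can be illustrated with a small sketch for three sentiment classes. The macro averaging, the label names, and the toy predictions are assumptions for illustration only; the abstract does not say how F1 was averaged. AUC is left out because it needs predicted scores rather than hard labels.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]  # assumed label names


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels, invented for illustration (not data from the study).
y_true = ["positive", "neutral", "negative", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "negative", "positive", "neutral", "neutral"]

print(round(accuracy(y_true, y_pred), 3))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally, which is one common choice when neutral reviews are much rarer than positive ones; a weighted average would be an equally plausible reading of the abstract.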


Title: A web pornography patrol system by content based analysis: In particular text and image
Cover Date: 2008-12-01
Cover Display Date: 2008
DOI: 10.1109/ICSMC.2008.4811326
Description: Children's exposure to pornographic web sites on the internet raises safety concerns. To protect children from these inappropriate materials, an effective web filtering system is essential. Content-based web filtering is one of the important techniques to handle and filter inappropriate information on the web. In this paper, we examine a content-based analysis technique to filter pornographic web sites. Our system consists of two primary content-based filtering techniques: text analysis and image analysis. For text analysis, the Support Vector Machine (SVM) algorithm and an N-gram model based on Bayes' theorem are applied and tested to filter pornographic text for both Thai and English language web sites. Meanwhile, we build and examine an image filtering system with a hierarchical image filtering method. It consists of two main processes: a normalized R/G ratio, which uses the pixel ratios of the red and green color channels, and a human composition matrix (HCM) based on skin detection. The empirical results show that our text and image analysis methods are effective for pornographic web filtering. Finally, we have incorporated a pornographic web filter based on content-based analysis into our Anti X system. © 2008 IEEE.
Citations: 25
Aggregation Type: Conference Proceeding
-------------------


Title: A web pornography patrol system based on hierarchical image filtering techniques
Cover Date: 2006-12-01
Cover Display Date: 2006
DOI: 10.2991/jcis.2006.268
Description: Due to the flood of pornographic web sites on the internet, content-based web filtering has become an important technique to detect and filter inappropriate information on the web. This is because pornographic web sites contain many sexually oriented texts, images, and other information that can be helpful to filter them. In this paper, we build and examine a system to filter web pornography based on image content. Our system consists of three main processes: (i) normalized R/G ratio, (ii) histogram, and (iii) human composition matrix (HCM) based on skin detection. The first process uses the pixel ratios (red and green color channels) for image filtering. The second process, histogram analysis, estimates the frequency of intensities in an image. If an image falls within the range of the training set results, it is likely to be a pornographic image. The last process is HCM based on human skin detection. The experimental results show effective accuracy in testing, suggesting that our hierarchical image filtering techniques can achieve substantial improvements.
Citations: 2
Aggregation Type: Conference Proceeding
-------------------
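Note: the first filtering stage described in the abstract above, the R/G pixel ratio, might be sketched roughly as follows. Dividing each channel by R+G+B before taking the ratio leaves R/G unchanged, so the sketch computes R/G directly. The ratio bounds `low` and `high`, the `min_fraction` cutoff, and all function names are hypothetical; the abstract does not give the paper's actual thresholds or decision rule.

```python
def rg_ratio(pixel):
    """Red-to-green ratio of one (R, G, B) pixel; infinite when green is zero."""
    r, g, _b = pixel
    return r / g if g else float("inf")


def skin_like_fraction(pixels, low=1.1, high=3.0):
    """Share of pixels whose R/G ratio lies in [low, high].

    The bounds are hypothetical placeholders, not values from the paper.
    """
    if not pixels:
        return 0.0
    hits = sum(low <= rg_ratio(p) <= high for p in pixels)
    return hits / len(pixels)


def passes_first_stage(pixels, min_fraction=0.3):
    """Send an image to the later stages if enough pixels look skin-like.

    min_fraction is also a hypothetical placeholder.
    """
    return skin_like_fraction(pixels) >= min_fraction


# Four toy pixels, invented for illustration.
sample = [(200, 150, 120), (220, 160, 130), (30, 90, 200), (10, 10, 10)]
print(skin_like_fraction(sample))
print(passes_first_stage(sample))
```

In a hierarchical design like the one the abstract describes, a cheap test of this kind would presumably run first so that the histogram and HCM stages only see images that survive it.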


Title: Content-based text classifiers for pornographic web filtering
Cover Date: 2006-01-01
Cover Display Date: 2006
DOI: 10.1109/ICSMC.2006.384926
Description: Due to the flood of pornographic web sites on the internet, effective web filtering systems are essential. Web filtering based on content has become one of the important techniques to handle and filter inappropriate information on the web. We examine two machine learning algorithms (Support Vector Machines and Naïve Bayes) for pornographic web filtering based on text content. We focus initially on Thai-language and English-language web sites. In this paper, we aim to investigate whether machine learning algorithms are suitable for web site classification. The empirical results show that the classifier based on Support Vector Machines is more effective for pornographic web filtering than the Naïve Bayes classifier, especially in reducing the over-blocking problem. ©2006 IEEE.
Citations: 17
Aggregation Type: Conference Proceeding
-------------------


Title: An effective pornographic WEB filtering system using a probabilistic classifier
Cover Date: 2005-01-01
Cover Display Date: 2005
DOI: 10.1142/9789812701534_0136
Description: Due to the flood of pornographic web sites on the internet, an effective web filtering system is essential. Web filtering has become one of the important techniques to handle and filter inappropriate information on the web. In this paper, we introduce a web filtering system based on content. The system uses a probabilistic text classifier to filter pornographic information on the WWW. We focus initially only on Thai and English language web sites. The first process is to parse the web site collection to extract unique words and remove stop-words. Afterwards, these features are transformed into a structured "bag of words". The next process is calculating the probabilities of each category in the naïve Bayes classifier (as a pornographic web filter). Finally, we implemented and tested our techniques. Evaluated with the F-measure, our system shows high accuracy. This demonstrates that naïve Bayes can be effective for web filtering based on text content. © 2005 World Scientific Publishing Co. Pte. Ltd.
Citations: 0
Aggregation Type: Conference Proceeding
-------------------
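Note: the pipeline outlined in the abstract above (stop-word removal, bag of words, per-category probabilities in a naïve Bayes classifier) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' system: the stop-word list, the whitespace tokenizer, the add-one (Laplace) smoothing, the category names, and the toy documents are all hypothetical.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "and", "of", "to", "is"}  # tiny hypothetical list


def bag_of_words(text):
    """Lower-case, split on whitespace, drop stop-words, count the rest."""
    return Counter(w for w in text.lower().split() if w not in STOP_WORDS)


def train(docs):
    """docs is a list of (text, label). Returns priors, per-class counts, vocabulary."""
    class_docs, word_counts, vocab = Counter(), {}, set()
    for text, label in docs:
        class_docs[label] += 1
        bag = bag_of_words(text)
        word_counts.setdefault(label, Counter()).update(bag)
        vocab.update(bag)
    total = sum(class_docs.values())
    priors = {label: n / total for label, n in class_docs.items()}
    return priors, word_counts, vocab


def classify(text, priors, word_counts, vocab):
    """Return the label with the highest log posterior (add-one smoothing assumed)."""
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        counts = word_counts[label]
        total = sum(counts.values())
        score = math.log(prior)
        for word, n in bag_of_words(text).items():
            if word not in vocab:
                continue  # words never seen in training are ignored
            score += n * math.log((counts[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training pages, invented for illustration.
docs = [
    ("adult explicit content", "blocked"),
    ("explicit adult videos", "blocked"),
    ("school homework help", "allowed"),
    ("homework and science news", "allowed"),
]
model = train(docs)
print(classify("explicit videos", *model))
print(classify("the science homework", *model))
```

Working in log space avoids numeric underflow on long pages, and the smoothing keeps a single unseen word from zeroing out a category. Thai text has no spaces between words, so a real system for the Thai pages mentioned in the abstract would need a word segmenter in place of `split()`.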