Abstract:
The constant influx of news, combined with busy lifestyles, creates a pressing need to classify news according to an individual's requirements. People are generally more interested in what is going on in their immediate surroundings. In this paper, we model this problem by classifying news articles by city and providing the user with a collection of city-specific news. We have developed our own web crawler to extract content from the HTML pages of news articles. Random Forest, Naive Bayes, and SVM classifiers have been employed, and their accuracies are reported. The results show that machine learning techniques can be harnessed to achieve this goal, and they call for further research to improve the efficiency of the solution.