|Sentiment analysis of financial news|
What does it mean to determine the sentiment of a text? There is a lot of research on this subject. Most of it focuses on texts that review something, for example movie reviews.
Some techniques first separate subjective, "emotional" texts from purely factual ones. Sentiment is then detected only for the subjective texts, which contain someone's thoughts and can really carry sentiment.
I am interested in sentiment analysis of business news.
On the web there are a couple of tools that do sentiment analysis of business news. Most of these tools show a list of the latest news colored red or green: red means a "bad" message, green means a "good" one.
I think such a categorization of news into two classes is wrong, because a news message cannot simply be good or bad. Each message always relates to several things. For some of them the message can be "good", and for others "bad".
That is why it is not correct to say "This message has bad news" or "This message has good news".
A simple example of a message: "Today US Dollar grows against Euro". This message is neither good nor bad. It is good for the dollar but bad for the euro.
To make sentiment analysis really useful for analyzing news, we need a list of subjects for each message. One possible name for these subjects is text tags.
For example, the text "Today US Dollar grows against Euro" has two tags: dollar and euro.
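The dollar/euro example can be sketched as a small data structure in which sentiment is attached to each tag rather than to the message as a whole. This is only an illustration: the class name `NewsMessage`, the label strings and the helper `overall_label` are my own assumptions, and the labels are assigned by hand here, not computed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NewsMessage:
    """A news message with a sentiment label per tag (hypothetical structure)."""
    text: str
    # Maps tag -> "positive" or "negative"; filled in by hand for this sketch.
    tag_sentiment: Dict[str, str] = field(default_factory=dict)


def overall_label(msg: NewsMessage) -> Optional[str]:
    """Return a single label only if every tag agrees, otherwise None."""
    labels = set(msg.tag_sentiment.values())
    return labels.pop() if len(labels) == 1 else None


message = NewsMessage(
    text="Today US Dollar grows against Euro",
    tag_sentiment={"dollar": "positive", "euro": "negative"},
)

# The tags disagree, so a single red/green label for the message is undefined.
print(overall_label(message))
```

With this shape, a tool that colors whole messages red or green has nothing sensible to show for the example message, while a per-tag view can show both labels.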
Automatic selection of tags for a text is another interesting text mining problem. I will not deal with it here, because for business news scraped from online resources it is possible to use tags created by someone else. For example, Google News lets you get a list of tags for each message.
Each message in Google News has a list of "Related" subjects, locations, and a list of companies (a list of stock indexes).
So it is not difficult to get a list of tags for business news using the Google News service.
To find the sentiment of a news message means to find the sentiment for each of its tags.
Once we have calculated the sentiment (positive or negative) for all tags, we can say that we have done sentiment analysis of the news message.
So our task now is to calculate the sentiment related to a tag in the text.
The first step is to create a list of synonyms for the tag in the text. The problem is that a tag can usually be written in several different ways in the text. For example, take the tag "Dow Jones Industrial Average". It can also be written as "Dow Jones Index", "Dow Jones", "Dow Jones Average" and in a few other forms.
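One possible way to use such a synonym list is to search the text for every known form of a tag. The sketch below is a minimal illustration under my own assumptions: the synonym list is written by hand from the forms mentioned above, and the names `TAG_SYNONYMS` and `find_tag_mentions` are hypothetical. Longer forms are tried first, so that "Dow Jones Index" is not cut down to "Dow Jones".

```python
import re

# Hand-written synonym list (an assumption for this sketch); the forms come
# from the "Dow Jones Industrial Average" example in the text.
TAG_SYNONYMS = {
    "Dow Jones Industrial Average": [
        "Dow Jones Industrial Average",
        "Dow Jones Index",
        "Dow Jones Average",
        "Dow Jones",
    ],
}


def find_tag_mentions(text, tag, synonyms=TAG_SYNONYMS):
    """Return (position, matched form) for every mention of the tag in text."""
    # Sort longest first so the regex alternation prefers the most specific form.
    forms = sorted(synonyms.get(tag, [tag]), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


sample = "Dow Jones Index fell. Later the Dow Jones recovered."
mentions = find_tag_mentions(sample, "Dow Jones Industrial Average")
print(mentions)
```

The positions returned for each mention could later be used to pick the words around the tag when calculating its sentiment.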
When we have a list of possible forms (synonyms) for each tag, we can run sentiment analysis on them.
The sentiment analysis techniques that I use for tags will be described in the next posts.
|Last Updated on Tuesday, 19 October 2010 14:37|