Difference between revisions of "Data Science"
Adelo Vieira (talk | contribs) (→Methodology) |
Revision as of 17:26, 2 October 2018
Social Media Sentiment Analysis
Sentiment analysis, also known as opinion mining, opinion extraction, sentiment mining or subjectivity analysis, is the process of determining whether a piece of online writing (social media mentions, blog posts, news sites, or any other text) expresses a positive, negative, or neutral attitude.
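To make the three labels concrete, here is a minimal sketch of a word-counting classifier. The word lists and the function name `classify_sentiment` are invented for this illustration. A real project would more likely use a trained machine learning model, so treat this only as a picture of the input and output.

```python
import re

# Tiny hand-made word lists, invented for this illustration only.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "awful", "sad"}


def classify_sentiment(text):
    """Label a text 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A text with no words from either list, or with as many positive as negative hits, is labelled neutral.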
Motivation
Social media has almost become synonymous with «big data» due to the sheer amount of user-generated content.
Mining this rich data can provide unprecedented ways to keep a pulse on opinions, trends, and public sentiment across platforms such as Facebook, Twitter, YouTube, and WeChat.
Social media data will become even more relevant for marketing, branding, and business as a whole.
As you see, this kind of analysis is a tool that will become more and more important in the coming years.
Methodology
- 1. The first part of the project will be mining social media data.
- To start the project, we first need to choose where the data will come from. Many sources suggest Twitter as the classic entry point for practice. Here is a tutorial on mining Twitter data with Python: https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/
- 2. Secondly, we will need to store the data.
- 3. The third part of the project will be the analysis of the data. This is where machine learning will be implemented.
- In this part we first need to decide what we want to analyse. There are many examples:
- Example 1 - Business: companies use opinion mining tools to find out what consumers think of their product, service, brand, marketing campaigns or competitors.
- Example 2 - Politics: sentiment analysis is used to keep track of society's opinions on the government, politicians, statements, and policy changes, or even to predict election results.
- Example 3 - Public actions: opinion analysis is used to analyse online reactions to social and cultural phenomena, for example Pokemon Go, the premiere episode of Game of Thrones, or the Oscars.
- Here is a nice piece of work I found: «This article describes the techniques that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
- In essence, the author analysed Twitter data from the days before the election and produced this map (figure: Brazilian_elections_2014.png).
- But it will be up to us (as a team) to determine what we want to analyse. Actually, we don't need to decide now what we are going to analyse. A large part of the methodology can be done without knowing that. We have enough time to think of a good topic...
https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397
https://elitedatascience.com/machine-learning-projects-for-beginners#social-media
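The methodology does not say how the data will be stored in step 2. As one possible option, the sketch below keeps collected posts in a SQLite database using Python's standard `sqlite3` module. The table name `posts`, its columns, and the helper names are assumptions made for this example, not a decision of the project.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the posts table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id TEXT PRIMARY KEY, author TEXT, created_at TEXT, text TEXT)"
    )
    return conn


def save_posts(conn, posts):
    """Insert posts given as dicts; a post whose id is already stored is skipped."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO posts (id, author, created_at, text) "
            "VALUES (:id, :author, :created_at, :text)",
            posts,
        )


def load_texts(conn):
    """Return the stored post texts, oldest first, ready for the analysis step."""
    rows = conn.execute("SELECT text FROM posts ORDER BY created_at")
    return [row[0] for row in rows]
```

Using the post id as primary key means the collection step can be re-run without storing the same post twice. Passing a file path instead of `":memory:"` keeps the data between runs.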
Remote development
Eclipse - Connect to a remote file system
https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html
Mount a remote filesystem on your local machine
https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse
https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection
https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection
https://askubuntu.com/questions/716612/sshfs-auto-reconnect
root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud
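The two commands above use different options. If the mount is scripted from Python, a small helper can combine them. This is only a sketch: the function names are hypothetical, and the only sshfs options used are the ones already shown above.

```python
import subprocess


def build_sshfs_command(remote, mount_point, reconnect=True, allow_other=False):
    """Build the sshfs argument list without running it."""
    options = []
    if reconnect:
        # Same keep-alive settings as in the command shown above.
        options += ["reconnect", "ServerAliveInterval=5", "ServerAliveCountMax=3"]
    if allow_other:
        options.append("allow_other")
    command = ["sshfs"]
    if options:
        command += ["-o", ",".join(options)]
    return command + [remote, mount_point]


def mount(remote, mount_point, **kwargs):
    """Run sshfs; raises CalledProcessError if the mount fails."""
    subprocess.run(build_sshfs_command(remote, mount_point, **kwargs), check=True)
```

For example, `mount("root@sinfronteras.ws:", "/home/adelo/1-system/3-cloud", allow_other=True)` would run sshfs with both the reconnect and the allow_other options.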
A faster way to mount a remote file system than sshfs:
https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
Anaconda
Anaconda is a free and open source distribution of the Python and R programming languages for data science and machine learning related applications (large-scale data processing, predictive analytics, scientific computing), that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
Installation
https://www.anaconda.com/download/#linux
https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
Jupyter Notebook
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook