Difference between revisions of "Data Science"
Adelo Vieira (talk | contribs)
Adelo Vieira (talk | contribs) (→Social Media Sentiment Analysis)
Revision as of 16:57, 2 October 2018
Social Media Sentiment Analysis
Motivation
Social media has become almost synonymous with «big data» because of the sheer amount of user-generated content.
Mining this rich data can provide unprecedented ways to keep a pulse on opinions, trends, and public sentiment across platforms such as Facebook, Twitter, YouTube, and WeChat.
Social media data will become even more relevant for marketing, branding, and business as a whole.
As you can see, this kind of analysis is a tool that will become more and more important in the coming years.
Methodology
- The first part of the project will be: Mining Social Media data
  - To start the project, we first need to choose where we are going to get the data from. Many sources suggest that Twitter is the classic entry point for practice. Here is a tutorial on mining Twitter data with Python: https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/
- Secondly, we will need to store the data.
- The third part of the project will be the analysis of the data. This is where machine learning will be implemented.
  - In this part, we first need to decide what we want to analyze. There are many examples; here is a nice piece of work I found: «This article describes the techniques that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
  - In essence, the author analyzed Twitter data from the days prior to the election and produced this map: [Figure: Brazilian_elections_2014.png]
https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397
https://elitedatascience.com/machine-learning-projects-for-beginners#social-media
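The three steps of the methodology above (mine, store, analyze) can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions: the posts are hard-coded, hypothetical samples instead of data mined from Twitter, the storage is an in-memory SQLite database, and the analysis is a simple word-list score used as a stand-in for the machine learning model the project would eventually use. The names `SAMPLE_POSTS`, `POSITIVE`, `NEGATIVE`, `store_posts`, `score`, `label` and `analyze` are invented for this example.

```python
import sqlite3

# Step 1 (stand-in): hypothetical sample posts instead of real mined data.
SAMPLE_POSTS = [
    "I love the new phone, the camera is great",
    "Terrible service, I hate waiting on hold",
    "The package arrived on Tuesday",
]

# Tiny illustrative word lists; a real project would train a model instead.
POSITIVE = {"love", "great", "good", "excellent", "happy"}
NEGATIVE = {"hate", "terrible", "bad", "awful", "sad"}


def store_posts(conn, posts):
    """Step 2: store the raw text of each post in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, text TEXT)"
    )
    conn.executemany("INSERT INTO posts (text) VALUES (?)", [(p,) for p in posts])
    conn.commit()


def score(text):
    """Step 3: number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(text):
    """Map the numeric score to a sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


def analyze(conn):
    """Read every stored post back and attach a sentiment label to it."""
    rows = conn.execute("SELECT text FROM posts ORDER BY id").fetchall()
    return [(text, label(text)) for (text,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_posts(conn, SAMPLE_POSTS)
    for text, sentiment in analyze(conn):
        print(f"{sentiment:>8}: {text}")
```

Keeping the storage step separate from the scoring step means the word-list scorer can later be swapped for a trained classifier without touching how the data is collected or stored.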
Remote development
Eclipse - Connect to a remote file system
https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html
Mount a remote filesystem in your local machine
https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse
https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection
https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection
https://askubuntu.com/questions/716612/sshfs-auto-reconnect
root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud
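If the mount needs to be scripted, the sshfs invocations above could be assembled from Python. The sketch below only builds the command; the helper name `build_sshfs_command` and its parameters are assumptions made for this example, while the options themselves (`reconnect`, `ServerAliveInterval`, `ServerAliveCountMax`, `allow_other`) are the ones used in the commands above.

```python
import shlex


def build_sshfs_command(remote, mount_point, reconnect=True,
                        alive_interval=5, alive_count_max=3,
                        allow_other=False):
    """Return the sshfs command as a list of arguments (it is not executed)."""
    options = []
    if reconnect:
        # Reconnect automatically and detect a dead connection quickly.
        options += [
            "reconnect",
            f"ServerAliveInterval={alive_interval}",
            f"ServerAliveCountMax={alive_count_max}",
        ]
    if allow_other:
        # Let other local users access the mounted files.
        options.append("allow_other")

    command = ["sshfs"]
    if options:
        command += ["-o", ",".join(options)]
    command += [remote, mount_point]
    return command


if __name__ == "__main__":
    cmd = build_sshfs_command("root@sinfronteras.ws:",
                              "/home/adelo/1-system/3-cloud")
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(cmd))
```

To actually mount the file system, the returned list can be passed to `subprocess.run(cmd, check=True)`, assuming sshfs is installed and the mount point exists.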
A faster way to mount a remote file system than sshfs:
https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
Anaconda
Anaconda is a free and open source distribution of the Python and R programming languages for data science and machine learning related applications (large-scale data processing, predictive analytics, scientific computing), that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
Installation
https://www.anaconda.com/download/#linux
https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
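After the installation, one quick way to check which Python is actually running is a small script like the one below. It is only a sketch: the function names `is_conda_prefix` and `describe_interpreter` are invented for this example, and the check relies on two conda conventions, namely that a conda environment contains a `conda-meta` directory and that activating an environment sets the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys


def is_conda_prefix(prefix):
    """Return True if `prefix` contains a conda-meta directory."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def describe_interpreter():
    """Collect basic facts about the interpreter running this script."""
    return {
        "python_version": sys.version.split()[0],
        "prefix": sys.prefix,
        "is_conda": is_conda_prefix(sys.prefix),
        # None when no conda environment has been activated in this shell.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `is_conda` is reported as False after installing Anaconda, the shell is probably still picking up the system Python instead of the one from the Anaconda installation.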
Jupyter Notebook
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook