Difference between revisions of "Data Science"
Adelo Vieira (talk | contribs) (→Methodology) |
Revision as of 17:10, 2 October 2018
Social Media Sentiment Analysis
Motivation
Social media has almost become synonymous with «big data» due to the sheer amount of user-generated content.
Mining this rich data can provide unprecedented ways to keep a pulse on opinions, trends, and public sentiment across platforms such as Facebook, Twitter, YouTube, and WeChat.
Social media data will become even more relevant for marketing, branding, and business as a whole.
As you see, this kind of analysis is a tool that will become more and more important in the coming years.
Methodology
- 1. The first part of the project will be: Mining Social Media data
- To start the project, we first need to choose where to get the data from. Many sources suggest that Twitter is the classic entry point for practicing. Here is a tutorial on mining Twitter data with Python: https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/
- 2. Secondly, we will need to store the data.
- 3. The third part of the project will be the analysis of the data. This is where Machine Learning will be implemented.
- In this part, we first need to decide what we want to analyze.
- There are many examples; here is a nice piece of work I found: «This article describes the techniques that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
- In essence, the author analyzed Twitter data from the days prior to the election and produced this map:
- Another example will be:
- But it will be up to us (as a team) to determine what we want to analyze.
https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397
https://elitedatascience.com/machine-learning-projects-for-beginners#social-media
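The storage and analysis steps above could be sketched roughly as follows. This is only a minimal illustration under several assumptions: the tweets are hypothetical hard-coded samples (no Twitter API is called), SQLite is just one possible store, and the "analysis" is a simple word-count heuristic with made-up word lists, not the Machine Learning model the project would eventually use. All function and table names are hypothetical.

```python
import sqlite3

# Hypothetical sample records standing in for tweets collected in step 1.
SAMPLE_TWEETS = [
    (1, "I love this new phone, it is great"),
    (2, "Terrible service, I hate waiting"),
    (3, "The meeting is at noon"),
]

# Tiny illustrative word lists (assumption); a real project would use
# a trained model or a full sentiment lexicon instead.
POSITIVE = {"love", "great", "good", "happy"}
NEGATIVE = {"hate", "terrible", "bad", "sad"}


def store_tweets(conn, tweets):
    """Step 2: store (id, text) pairs in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO tweets (id, text) VALUES (?, ?)", tweets)
    conn.commit()


def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(conn):
    """Step 3: read the stored tweets back and label each one."""
    rows = conn.execute("SELECT id, text FROM tweets ORDER BY id").fetchall()
    return {tweet_id: label(sentiment_score(text)) for tweet_id, text in rows}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    store_tweets(conn, SAMPLE_TWEETS)
    for tweet_id, sentiment in analyze(conn).items():
        print(tweet_id, sentiment)
```

Keeping storage and analysis in separate functions means the word-count heuristic could later be swapped for a trained classifier without touching the storage code.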
Remote development
Eclipse - Connect to a remote file system
https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html
Mount a remote filesystem in your local machine
https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse
https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection
https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection
https://askubuntu.com/questions/716612/sshfs-auto-reconnect
sshfs root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud
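If the mount needs to be scripted, the sshfs commands above could be assembled from Python along these lines. This is a hedged sketch: `build_sshfs_command` is a hypothetical helper, it only builds and prints the command, and the option names are simply the ones already used in the commands above.

```python
import shlex


def build_sshfs_command(remote, mount_point, options=()):
    """Build the sshfs argument list; options are joined into a single -o flag."""
    cmd = ["sshfs"]
    if options:
        cmd += ["-o", ",".join(options)]
    cmd += [remote, mount_point]
    return cmd


if __name__ == "__main__":
    cmd = build_sshfs_command(
        "root@sinfronteras.ws:",
        "/home/adelo/1-system/3-cloud",
        ["reconnect", "ServerAliveInterval=5", "ServerAliveCountMax=3"],
    )
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
    # To actually mount, the list could be passed to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids shell-quoting problems if the mount point ever contains spaces.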
A faster way to mount a remote file system than sshfs:
https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
Anaconda
Anaconda is a free and open-source distribution of the Python and R programming languages for data science and machine learning applications (large-scale data processing, predictive analytics, scientific computing) that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
Installation
https://www.anaconda.com/download/#linux
https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
Jupyter Notebook
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook