Difference between revisions of "Data Science"
Adelo Vieira
Revision as of 16:45, 2 October 2018
Social Media Sentiment Analysis
Motivation
Social media has almost become synonymous with «big data» due to the sheer amount of user-generated content.
Mining this rich data can provide unprecedented ways to keep a pulse on opinions, trends, and public sentiment. Typical sources include Facebook, Twitter, YouTube, and WeChat.
Social media data will become even more relevant for marketing, branding, and business as a whole.
As you see, this kind of analysis is a tool that will become more and more important in the coming years.
- The first part of the project will be: Mining Social Media data
  - To start the project, we first need to choose where to get the data from. Many sources suggest that Twitter is the classic entry point for practice. Here you can see a tutorial about how to mine Twitter data with Python
- Secondly, we will need to store the data.
- The third part of the project will be the analysis of the data. This is where machine learning will be implemented.
  - In this part we first need to decide what we want to analyze. There are many examples; here is a nice piece of work I found: «This article describes the techniques I employed for a proof-of-concept that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
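The three parts listed above (mine, store, analyze) can be sketched end to end in a few lines of Python. This is only a minimal illustration under several assumptions: the posts are hard-coded sample strings rather than data fetched from Twitter, storage is an in-memory SQLite database, and the analysis is a tiny word-list score instead of a trained machine learning model. All names (`SAMPLE_POSTS`, `store_posts`, `sentiment_score`, `analyze`) are hypothetical.

```python
import sqlite3

# Hypothetical sample posts standing in for data mined from a social network.
SAMPLE_POSTS = [
    ("1", "I love this phone, the camera is great"),
    ("2", "Terrible service, I hate waiting"),
    ("3", "The package arrived today"),
]

# Tiny illustrative word lists (an assumption, not a real sentiment lexicon).
POSITIVE = {"love", "great", "good", "happy"}
NEGATIVE = {"hate", "terrible", "bad", "sad"}


def store_posts(conn, posts):
    """Step 2: store the mined posts in a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS posts (id TEXT PRIMARY KEY, text TEXT)")
    conn.executemany("INSERT OR REPLACE INTO posts VALUES (?, ?)", posts)
    conn.commit()


def sentiment_score(text):
    """Count positive words minus negative words in a post."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(conn):
    """Step 3: read the stored posts back and label each one."""
    rows = conn.execute("SELECT id, text FROM posts ORDER BY id")
    return {post_id: label(sentiment_score(text)) for post_id, text in rows}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_posts(conn, SAMPLE_POSTS)
    print(analyze(conn))
```

In a real project the word-list scorer would be replaced by a classifier trained on labeled posts, and `SAMPLE_POSTS` by the output of the mining step.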
https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397
https://elitedatascience.com/machine-learning-projects-for-beginners#social-media
Remote development
Eclipse - Connect to a remote file system
https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html
Mount a remote filesystem in your local machine
https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse
https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection
https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection
https://askubuntu.com/questions/716612/sshfs-auto-reconnect
sshfs root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud
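The `-o` options above are passed as a single comma-separated list. `reconnect` asks sshfs to re-establish a dropped connection, `ServerAliveInterval=5` and `ServerAliveCountMax=3` are SSH keep-alive settings (a probe every 5 seconds, giving up after 3 unanswered probes), and `allow_other` lets users other than the one who mounted access the mount point. As a small sketch, a helper like the hypothetical `build_sshfs_command` below could assemble the command from Python; it only builds the argument list and does not run anything.

```python
import shlex


def build_sshfs_command(remote, mount_point, options=()):
    """Build the argument list for an sshfs mount (hypothetical helper).

    remote      -- e.g. "user@host:" (an empty remote path means the home directory)
    mount_point -- local directory to mount onto
    options     -- iterable of option strings, joined with commas after -o
    """
    cmd = ["sshfs"]
    if options:
        cmd += ["-o", ",".join(options)]
    cmd += [remote, mount_point]
    return cmd


if __name__ == "__main__":
    cmd = build_sshfs_command(
        "root@sinfronteras.ws:",
        "/home/adelo/1-system/3-cloud",
        ["reconnect", "ServerAliveInterval=5", "ServerAliveCountMax=3"],
    )
    # Print the command as a shell-safe string for inspection.
    print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where sshfs is installed.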
A faster way to mount a remote file system than sshfs:
https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
Anaconda
Anaconda is a free and open-source distribution of the Python and R programming languages for data science and machine learning applications (large-scale data processing, predictive analytics, scientific computing). It aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
Installation
https://www.anaconda.com/download/#linux
https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
Jupyter Notebook
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook