Difference between revisions of "Data Science"

From Sinfronteras
(Social Media Sentiment Analysis)
As you can see, this kind of analysis is a tool that will become more and more important in the coming years.

===Methodology===

* The first part of the project will be: '''Mining Social Media data'''
** To start the project, we first need to choose where to get the data from. Many sources recommend Twitter as the classic entry point for practice. Here is a tutorial on '''mining Twitter data with Python''': https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/
* Secondly, we will need to store the data.
* The third part of the project will be the analysis of the data. This is where machine learning will be implemented.
** In this part, we first need to decide what we want to analyze. There are many examples; here is a nice piece of work I found: «This article describes the techniques that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
:: In essence, the author analyzed Twitter data from the days prior to the election and produced this map:
[[File:Brazilian_elections_2014.png|950px|thumb|center|]]

Revision as of 17:57, 2 October 2018

Social Media Sentiment Analysis

Motivation

Social media has almost become synonymous with «big data» due to the sheer amount of user-generated content.

Mining this rich data can provide unprecedented ways to keep a pulse on opinions, trends, and public sentiment across platforms such as Facebook, Twitter, YouTube, and WeChat.

Social media data will become even more relevant for marketing, branding, and business as a whole.

As you can see, this kind of analysis is a tool that will become more and more important in the coming years.

Methodology

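  • The first part of the project will be: Mining Social Media data
      ◦ To start the project, we first need to choose where to get the data from. Many sources recommend Twitter as the classic entry point for practice. Here is a tutorial on mining Twitter data with Python: https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/
  • Secondly, we will need to store the data.
  • The third part of the project will be the analysis of the data. This is where machine learning will be implemented.
      ◦ In this part, we first need to decide what we want to analyze. There are many examples; here is a nice piece of work I found: «This article describes the techniques that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election» : https://www.toptal.com/data-science/social-network-data-mining-for-predictive-analysis
In essence, the author of that article analyzed Twitter data from the days prior to the election and produced this map: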
Brazilian elections 2014.png
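
The three steps above (mine, store, analyze) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the sample posts are hypothetical and stand in for data mined from Twitter, SQLite is just one possible store, and a toy word-count lexicon stands in for the machine learning model that the real project would use. All function and variable names are made up for the example.

```python
import re
import sqlite3

# Toy sentiment lexicon: a hypothetical word list, for illustration only.
POSITIVE = {"good", "great", "love", "happy", "win"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "lose"}

# Step 1 stand-in: hypothetical sample posts instead of a live API call.
SAMPLE_POSTS = [
    ("alice", "I love this candidate, great speech!"),
    ("bob", "Terrible debate, I hate the new policy."),
    ("carol", "The election is next week."),
]


def store_posts(conn, posts):
    """Step 2: persist (author, text) pairs in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, author TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO posts (author, text) VALUES (?, ?)", posts)
    conn.commit()


def sentiment_score(text):
    """Step 3: number of positive words minus number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(conn):
    """Read every stored post back and attach a sentiment label."""
    rows = conn.execute("SELECT author, text FROM posts ORDER BY id")
    return [(author, label(sentiment_score(text))) for author, text in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory database for the demo
    store_posts(connection, SAMPLE_POSTS)
    for author, sentiment in analyze(connection):
        print(author, sentiment)
```

Keeping storage and analysis in separate functions means the lexicon scorer can later be swapped for a trained classifier without touching the mining or storage code.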




https://www.dezyre.com/article/top-10-machine-learning-projects-for-beginners/397

https://elitedatascience.com/machine-learning-projects-for-beginners#social-media

https://en.wikipedia.org/wiki/Sentiment_analysis

https://en.wikipedia.org/wiki/Social_media_mining

Remote development

Eclipse - Connect to a remote file system

https://us.informatiweb.net/tutorials/it/6-web/148--eclipse-connect-to-a-remote-file-system.html

Mount a remote filesystem in your local machine

https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh

https://stackoverflow.com/questions/32747819/remote-java-development-using-intellij-or-eclipse

https://serverfault.com/questions/306796/sshfs-problem-when-losing-connection

https://askubuntu.com/questions/358906/sshfs-messes-up-everything-if-i-lose-connection

https://askubuntu.com/questions/716612/sshfs-auto-reconnect

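# Basic mount: remote home directory -> local mount point
sshfs root@sinfronteras.ws: /home/adelo/1-system/3-cloud
# Reconnect automatically if the SSH connection drops
sshfs -o reconnect,ServerAliveInterval=5,ServerAliveCountMax=3 root@sinfronteras.ws: /home/adelo/1-system/3-cloud
# Let other local users access the mount (non-root users need user_allow_other in /etc/fuse.conf)
sshfs -o allow_other root@sinfronteras.ws: /home/adelo/1-system/3-cloud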
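
If the mount should be re-established from a script (for example before starting a work session), a small Python wrapper around the sshfs command could look like the sketch below. The helper names `build_sshfs_command` and `ensure_mounted` are hypothetical, and the sketch assumes sshfs is installed and key-based SSH login is already set up.

```python
import os
import subprocess


def build_sshfs_command(remote, mount_point, reconnect=True):
    """Return the sshfs argument list for mounting `remote` at `mount_point`."""
    command = ["sshfs"]
    if reconnect:
        # Same options as in the notes: retry the connection, and treat the
        # server as dead after 3 missed keep-alives sent 5 seconds apart.
        command += ["-o", "reconnect,ServerAliveInterval=5,ServerAliveCountMax=3"]
    command += [remote, mount_point]
    return command


def ensure_mounted(remote, mount_point, run=subprocess.run):
    """Mount the remote file system unless the mount point is already mounted.

    Returns True if a mount was attempted, False if nothing had to be done.
    `run` is injectable so the logic can be tried without calling sshfs.
    """
    if os.path.ismount(mount_point):
        return False
    run(build_sshfs_command(remote, mount_point), check=True)
    return True


# Example call (not executed here):
# ensure_mounted("root@sinfronteras.ws:", "/home/adelo/1-system/3-cloud")
```

This sketch does not handle a stale mount left behind by a lost connection; in that case the mount point usually has to be released first (for example with `fusermount -u`) before mounting again.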


A faster way to mount a remote file system than sshfs: https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs

Anaconda

Anaconda is a free and open source distribution of the Python and R programming languages for data science and machine learning related applications (large-scale data processing, predictive analytics, scientific computing), that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
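
Since conda manages the package versions, it can be handy to check from inside Python which interpreter is active and which versions are actually installed. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_versions` and the list of packages queried are assumptions chosen for illustration.

```python
import sys
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # sys.prefix is the directory of the environment the interpreter runs from.
    print("Python", sys.version.split()[0], "at", sys.prefix)
    report = installed_versions(["numpy", "pandas", "scikit-learn"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```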

Installation

https://www.anaconda.com/download/#linux

https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/

https://www.digitalocean.com/community/tutorials/how-to-install-the-anaconda-python-distribution-on-ubuntu-18-04

Jupyter Notebook

https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook

Courses

eu.udacity.com

https://classroom.udacity.com/courses/ud120

www.coursera.org

https://www.coursera.org/learn/machine-learning/home/welcome

Other

https://www.udemy.com/machine-learning-course-with-python/

https://stackoverflow.com/questions/19181999/how-to-create-a-keyboard-shortcut-for-sublimerepl