Difference between revisions of "Python for Data Science"

Latest revision as of 15:47, 11 September 2024

For a standard Python tutorial go to Python

Courses

Udemy - Python for Data Science and Machine Learning Bootcamp

https://www.udemy.com/course/python-for-data-science-and-machine-learning-bootcamp/

Anaconda

Anaconda is a free and open source distribution of the Python and R programming languages for data science and machine learning related applications (large-scale data processing, predictive analytics, scientific computing), that aims to simplify package management and deployment. Package versions are managed by the package management system conda. https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)

In other words, Anaconda can be seen as a package (a distribution) that includes not only Python (or R) but also many libraries used in data science, as well as its own virtual environment system. It's an "all-in-one" install that is extremely popular in data science and machine learning.

Installation

Installation from the official Anaconda Web site: https://docs.anaconda.com/anaconda/install/

Anaconda comes with a few IDEs

Jupyter Lab
Jupyter Notebook
Spyder
Qtconsole
and others

Anaconda Navigator

Anaconda Navigator is a GUI that helps you launch applications and manage the packages in your local Anaconda installation.

You can open the Anaconda Navigator from the Terminal:

anaconda-navigator

Jupyter

Jupyter comes with Anaconda.

It is a development environment (IDE) where we can write code, display images, and write Markdown notes.

It is one of the most popular IDEs in data science for exploring and analyzing data.

Other well-known editors and IDEs for Python include Sublime Text and PyCharm.

There are two interfaces: Jupyter Lab and Jupyter Notebook.

Remote connection

https://jupyter-notebook.readthedocs.io/en/stable/public_server.html

A**1

(base) adelo@vmi346715:~/.jupyter$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
Generating a RSA private key
......................................+++++
....................................+++++
writing new private key to 'mykey.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:IE	
State or Province Name (full name) [Some-State]:Dublin
Locality Name (eg, city) []:Dublin
Organization Name (eg, company) [Internet Widgits Pty Ltd]:.
Organizational Unit Name (eg, section) []:.
Common Name (e.g. server FQDN or YOUR name) []:sinfronteras    
Email Address []:adeloaleman@gmail.com
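The public-server guide linked above configures the classic Notebook through a `jupyter_notebook_config.py` file. The following is a minimal sketch that generates the text of such a file for the certificate and key created by the `openssl` command. The helper name `render_notebook_config`, the port 9999, and the file paths are illustrative assumptions. The `c.NotebookApp.*` option names belong to the classic Notebook server and may differ in newer Jupyter releases.

```python
from pathlib import Path


def render_notebook_config(cert: str, key: str, port: int = 9999) -> str:
    """Return the text of a jupyter_notebook_config.py that enables HTTPS.

    `cert` and `key` are paths to the files produced by `openssl req`.
    The helper itself is hypothetical; only the option names come from
    the classic Notebook configuration.
    """
    lines = [
        "c = get_config()",
        f"c.NotebookApp.certfile = {cert!r}",
        f"c.NotebookApp.keyfile = {key!r}",
        "c.NotebookApp.ip = '*'",            # listen on all interfaces
        "c.NotebookApp.open_browser = False",
        f"c.NotebookApp.port = {port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    jupyter_dir = Path.home() / ".jupyter"
    print(
        render_notebook_config(
            str(jupyter_dir / "mycert.pem"), str(jupyter_dir / "mykey.key")
        )
    )
```

The printed text can be reviewed and then saved as `~/.jupyter/jupyter_notebook_config.py`. Printing it first avoids overwriting an existing configuration by accident.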

Share Jupyter Notebook online

GitHub:

https://docs.github.com/en/github/managing-files-in-a-repository/working-with-jupyter-notebook-files-on-github

Example: https://github.com/adeloaleman/AmazonLaptopsDashboard/blob/master/DataAnalysis/data_analysis2.ipynb

Nbviewer:

https://nbviewer.jupyter.org/

Example: https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/main/tutorial/06%20-%20Linking%20and%20Interactions.ipynb

Customize Jupyter

Themes

https://github.com/dunovank/jupyter-themes

See the theme shown on this page: https://gist.github.com/pierrejoubert73/902cc94d79424356a8d20be2b382e1ab

jt   -t oceans16     -cellw 98%   -lineh 120   -fs 14   -nfs 14   -dfs 14   -ofs 14

https://www.kaggle.com/getting-started/97540

jt   -t monokai      -cellw 98%   -lineh 120   -fs 14   -nfs 14   -dfs 14   -ofs 14   -f fira   -nf ptsans   -N   -kl   -cursw 2   -cursc r   -T

Extensions

This post mentions some nice extensions and configurations: https://towardsdatascience.com/bringing-the-best-out-of-jupyter-notebooks-for-data-science-f0871519ca29

Unofficial Jupyter Notebook Extensions

https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/index.html

This is very important. There are very nice extensions in this package:

toc2: https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/toc2/README.html
Collapsible Headings
... etc

Installation

https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html

I had some issues installing it. The default method indicated in the documentation:

pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user

With the method above I could not install the package correctly. The installation returned no errors, and the extension showed up in Jupyter Notebook, but I could not enable the extensions.

Apparently it is a problem with the installation location. I was using conda, but conda was giving me problems: installing packages took a very long time, and afterwards the package seemed not to be available.

In the following post I found a solution for installing nbextensions using pip: https://github.com/ipython-contrib/jupyter_contrib_nbextensions/issues/1127

pip install --upgrade jupyter_contrib_nbextensions
jupyter contrib nbextension install  --sys-prefix  --symlink

I think I used «--symlink», but I am not completely sure. I also ran the --upgrade, but I believe the --sys-prefix --symlink options are what made the difference.

If the Nbextensions tab is not shown, try reinstalling jupyter_nbextensions_configurator: https://github.com/Jupyter-contrib/jupyter_nbextensions_configurator

pip install jupyter_nbextensions_configurator

or

conda install -c conda-forge jupyter_nbextensions_configurator
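Since the problem described above seemed to come from the installation location, it can help to check which environment actually holds the packages. The following is a minimal sketch, assuming the import names match the pip package names (`jupyter_contrib_nbextensions` and `jupyter_nbextensions_configurator`). The helper `find_packages` is illustrative, not part of either package.

```python
import importlib.util
import sys

# Import names of the two packages discussed above (assumed to match the pip names).
PACKAGES = ("jupyter_contrib_nbextensions", "jupyter_nbextensions_configurator")


def find_packages(names=PACKAGES) -> dict:
    """Map each module name to its install location, or None if it is not importable."""
    found = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        found[name] = spec.origin if spec is not None else None
    return found


if __name__ == "__main__":
    # sys.prefix shows which environment this interpreter belongs to.
    print("Interpreter prefix:", sys.prefix)
    for name, location in find_packages().items():
        print(f"{name}: {location or 'NOT FOUND in this environment'}")
```

Run it with the same Python interpreter that launches Jupyter. If a package is reported as not found, it was probably installed into a different environment.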

Custom JS and Custom CSS files

This is a good post: https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064

Keyboard Shortcut Customization: https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Custom%20Keyboard%20Shortcuts.html

custom.js

/** My settings */

// This is to enable syntax highlighting for SQL code: 
// https://stackoverflow.com/questions/43641362/adding-syntax-highlighting-to-jupyter-notebook-cell-magic
require(['notebook/js/codecell'], function(codecell) {
  codecell.CodeCell.options_default.highlight_modes['magic_text/x-mssql'] = {'reg':[/^%%sql/]} ;
  Jupyter.notebook.events.one('kernel_ready.Kernel', function(){
  Jupyter.notebook.get_cells().map(function(cell){
      if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;
  });
});


// My plain theme
// This is a good post where I took some ideas to write the following function: https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064
function plainTheme() {
    var input_promp_fields = document.getElementsByClassName("prompt_container");
    var text_render_fields = document.getElementsByClassName("text_cell_render");

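    // Declare the style values locally so they do not leak as implicit globals
    var action, input_marginLeft, border_top, prompt_width, padding_top, output_margin;
    if (input_promp_fields[0].style.visibility == "collapse"){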
        action = "visible";
        input_marginLeft = "0px";
        border_top  = "3px";
        prompt_width = "74px";
        padding_top = "0px";
        output_margin = "40px";
    }else{
        action = "collapse";
        input_marginLeft = "74px";
        border_top  = '0px';
        prompt_width = "74px";
        padding_top = "40px";
        output_margin = "40px";
    }

    // To use !important we have to do it this way, using jQuery:
    // https://makitweb.com/how-to-add-important-to-css-property-with-jquery/
    var text_cell_fields = document.getElementsByClassName("text_cell");
    $(text_cell_fields).ready(function(){
        $('.input_prompt').css({
            'cssText': `width: 40px !important; max-width: ${prompt_width} !important; min-width: ${prompt_width} !important;`
        });
    });

    $(document).ready(function(){
        $(".prompt_container").css(
            'visibility', `${action}`
        );
        
        $(".input").css(
            'padding-left', `${input_marginLeft}`
        );
        
        $(".output_subarea").css(
            'margin-left', `${output_margin}`
        );
                    
        $('.cell').css({
            'cssText': `border-top-width: ${border_top} !important; border-bottom-width: ${border_top} !important;`
        });
        
        $(".collapsible_headings_ellipsis").css({
            'cssText': `padding-top:${padding_top} !important; border-top-width: ${border_top} !important; border-bottom-width: ${border_top} !important;`
        });

        $(".text_cell_render").css({
            'cssText': `margin-left: -10px;`
        });
    });            
}

Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Alt-Ctrl-Q', {
    help : '...',
    help_index : 'zz',
    handler : function (event) {
        plainTheme();
    return false;
    }}
);

Jupyter.keyboard_manager.edit_shortcuts.add_shortcut('Alt-Ctrl-Q', {
    help : '...',
    help_index : 'zz',
    handler : function (event) {
        plainTheme();
    return false;
    }}
);


// This could be very useful. It allows adding text automatically into a cell
// https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064/27
Jupyter.keyboard_manager.edit_shortcuts.add_shortcut('Ctrl-Shift-J', {
    help : '...',
    help_index : 'zz',
    handler : function (event) {
        document.body.style.background = 'blue'
        var target = Jupyter.notebook.get_selected_cell()
        var cursor = target.code_mirror.getCursor()
        var before = target.get_pre_cursor()
        var after = target.get_post_cursor()
        target.set_text(before + 'from IPython.display import display, HTML\ndisplay(HTML("<style>.container { width:98% !important;}</style>"))' + after)
        cursor.ch += 20 // where to put your cursor
        target.code_mirror.setCursor(cursor)
        return false;
    }}
);


// To get the real value of a css field: https://stackoverflow.com/questions/26074476/document-body-style-backgroundcolor-doesnt-work-with-external-css-style-sheet
// window.getComputedStyle(document.body).backgroundColor
// window.getComputedStyle(document.getElementsByClassName("input_area")[0]).backgroundColor

custom.css

/*  My settings  */

.container { width:98% !important; }
/* document.getElementById("notebook-container").style.minWidth = "50%"; */
/* document.getElementById("notebook-container").style.maxWidth = "50%"; */

#notebook-container {
 width:98% !important;
}

.CodeMirror-gutters {
 background-color: transparent !important;
 background: transparent !important;
}

.CodeMirror-linenumber {
 margin-left: -20px !important;
}

.output_subarea {
 margin-left: 40px !important;
}

#toc .fa-fw {
 color: blue !important;
}

#toc .highlight_on_scroll {
 margin-left: -4px !important;
 
}

#toc {
 padding-left: 10px !important;
}

/*  I have also changed the color
/*  #a6e22e   by   #388bfd 
 *  in the entire custom.css
 */

/* I have also changed some of the properties of the toc directly above in the code: 

#toc-wrapper {
 z-index: 90;
 position: fixed !important;
 display: flex;
 flex-direction: column;
 overflow: hidden;
 padding: 10px;
 padding-top: 40px !important;
 border-style: solid;
 border-width: thin;
 border-right-width: medium !important;
 background-color: #1e1e1e !important;
}
#toc-wrapper.ui-draggable.ui-resizable.sidebar-wrapper {
 border-color: rgba(93,92,82,.25) !important;
}
#toc a,
#navigate_menu a,
.toc {
 color: #f8f8f0 !important;
 font-size: 16pt !important;
}
#toc li > span:hover {
 background-color: rgba(93,92,82,.25) !important;
}
#toc a:hover,
#navigate_menu a:hover,
.toc {
 color: #DAA520 !important;
 font-size: 16pt !important;
}
#toc-wrapper .toc-item-num {
 color: #388bfd !important;
 font-size: 16pt !important;
}
*/

Configurations from the Jupyter notebook

from IPython.display import display, HTML

display(HTML("<style>.container { width:98% !important;}</style>"))

display(HTML('<style>.prompt.input_prompt{display:none !important;}</style>'))
display(HTML('<style>.prompt.input_prompt{visibility: visible !important;}</style>'))
display(HTML('<style>.prompt.input_prompt{margin-left: 50px}</style>'))
display(HTML('<style>.prompt.input_prompt{visibility: visible !important; width: 0px !important; min-width: 0px !important}</style>'))

display(HTML('<style>.input_area{margin-left: -50px;}</style>'))
display(HTML('<style>.input{margin-left: -20px;}</style>'))

display(HTML('<style>.output_area{margin-left: 55px}</style>'))

# display(HTML('<style>.cell{margin-bottom: -5px !important; margin-top: -5px !important;}</style>'))
# display(HTML('<style>.code_cell{margin-bottom: -5px !important; margin-top: -5px !important;}</style>'))

# display(HTML('<style>.output_wrapper{margin-bottom: 0px !important; margin-top: 0px !important;}</style>'))
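The separate `display(HTML(...))` calls above could also be collected into a single `<style>` block built from a dictionary. This is only a sketch of one possible arrangement. The helper `build_style` and the `RULES` dictionary are hypothetical names, and the selectors are copied from the lines above.

```python
def build_style(rules: dict) -> str:
    """Turn {selector: {property: value}} into a single <style> block."""
    parts = []
    for selector, props in rules.items():
        body = " ".join(f"{prop}: {value};" for prop, value in props.items())
        parts.append(f"{selector} {{ {body} }}")
    return "<style>" + " ".join(parts) + "</style>"


# Selectors and values taken from the notebook configuration above.
RULES = {
    ".container": {"width": "98% !important"},
    ".input_area": {"margin-left": "-50px"},
    ".input": {"margin-left": "-20px"},
    ".output_area": {"margin-left": "55px"},
}

if __name__ == "__main__":
    # In a notebook cell the result would be passed to display(HTML(...)).
    print(build_style(RULES))
```

Keeping the rules in one dictionary makes it easier to comment a rule out or adjust a value without hunting through many similar lines.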

Online Jupyter

There are many sites that let you run your Jupyter Notebook in the cloud: https://www.dataschool.io/cloud-services-for-jupyter-notebook/

I have tried:

https://cocalc.com/app

https://cocalc.com/projects/595bf475-61a7-47fa-af69-ba804c3f23f9/files/?session=default

It looks good, but some options are not free.

https://www.kaggle.com/

https://www.kaggle.com/adeloaleman/kernel1917a91630/edit

It looks good, but I could not find a way to add a TOC.

https://drive.google.com

https://colab.research.google.com

This is the one I am using now.

Some remarks

Executing Terminal Commands in Jupyter Notebooks

https://support.anaconda.com/hc/en-us/articles/360023858254-Executing-Terminal-Commands-in-Jupyter-Notebooks

If we are in the Notebook and want to run a shell command rather than Python code, we prefix it with ! (which passes the line to the system shell) or use a % line magic such as %ls.

Try, for example:

%ls 
!pwd

With !, it's the same as if you opened up a terminal and typed the command without the !
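The `!` and `%` prefixes are IPython syntax and only work inside a notebook or IPython session. In a plain Python script, a comparable result can be obtained with the standard `subprocess` module. The following is a minimal sketch, where the helper name `run_command` is an illustrative assumption.

```python
import subprocess


def run_command(command: str) -> str:
    """Run a shell command and return its standard output without the trailing newline."""
    result = subprocess.run(
        command,
        shell=True,           # hand the string to the system shell, like `!` does
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_command("echo hello"))
```

Because `shell=True` executes the string as typed, it should only be used with commands you write yourself, never with untrusted input.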

Creating Presentations in Jupyter Notebook with RevealJS

Some of the most popular Python Data Science Libraries

NumPy
SciPy
Pandas
Seaborn
scikit-learn
Matplotlib
Plotly
PySpark

Using SQL in Jupyter

Connecting to a database in Jupyter

https://pypi.org/project/ipython-sql/

https://stackoverflow.com/questions/454854/no-module-named-mysqldb

https://stackoverflow.com/questions/5178292/pip-install-mysql-python-fails-with-environmenterror-mysql-config-not-found

https://docs.kyso.io/guides/sql-interface-within-jupyterlab

https://www.datacamp.com/community/tutorials/sql-interface-within-jupyterlab

https://stackoverflow.com/questions/43641362/adding-syntax-highlighting-to-jupyter-notebook-cell-magic

https://www.sqlshack.com/learn-jupyter-notebooks-for-sql-server/

Check the sources above. I think the only things I had to do the last time I installed it were based on the first 3 sources:

pip install ipython-sql

sudo apt install default-libmysqlclient-dev

pip install mysqlclient

sudo apt-get install python3-mysqldb

Then add SQL syntax highlighting to Jupyter as described above in the corresponding source.
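Once the packages are installed, ipython-sql connects through an SQLAlchemy-style URL. The following is a minimal sketch for building such a URL for MySQL with the `mysqlclient` driver. The helper `mysql_url` and the user, password, host and database values are placeholders, not values from these notes.

```python
from urllib.parse import quote_plus


def mysql_url(user: str, password: str, host: str, database: str, port: int = 3306) -> str:
    """Build an SQLAlchemy-style MySQL URL, escaping special characters in the credentials."""
    return (
        f"mysql+mysqldb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    url = mysql_url("ana", "p@ss/word", "localhost", "shop")
    print(url)
    # In a notebook, the URL would then be used with the ipython-sql magics:
    #   %load_ext sql
    #   %sql <the URL printed above>
```

Escaping the credentials matters when a password contains characters such as `@` or `/`, which would otherwise be read as part of the URL structure.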


@@ Line 1: / Line 1: @@
+<br />
 For a standard Python tutorial go to [[Python]]
+<br />
+==Courses==
+*Udemy - Python for Data Science and Machine Learning Bootcamp
+:https://www.udemy.com/course/python-for-data-science-and-machine-learning-bootcamp/
@@ Line 11: / Line 20: @@
 <br />
 ===Installation===
-https://linuxize.com/post/how-to-install-anaconda-on-ubuntu-18-04/
+Installation from the official Anaconda Web site: https://docs.anaconda.com/anaconda/install/
-https://www.digitalocean.com/community/tutorials/how-to-install-the-anaconda-python-distribution-on-ubuntu-18-04
+<br />
-<br />
 ===Anaconda comes with a few IDE===
@@ Line 35: / Line 43: @@
 <br />
 ==Jupyter==
 Jupyter comes with Anaconda.
@@ Line 48: / Line 57: @@
 <br />
-===Online Jupyter===
+===Remote connection===
-There are many sites that provides solutions to run your Jupyter Notebook in the cloud: https://www.dataschool.io/cloud-services-for-jupyter-notebook/
+https://jupyter-notebook.readthedocs.io/en/stable/public_server.html
-I have tried:
-*https://cocalc.com/app
-::https://cocalc.com/projects/595bf475-61a7-47fa-af69-ba804c3f23f9/files/?session=default
+A**1
-::Parece bueno, pero tiene opciones que no son gratis
-*https://www.kaggle.com/
+<syntaxhighlight lang="shell">
+(base) adelo@vmi346715:~/.jupyter$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
+Generating a RSA private key
+......................................+++++
+....................................+++++
+writing new private key to 'mykey.key'
+-----
+You are about to be asked to enter information that will be incorporated
+into your certificate request.
+What you are about to enter is what is called a Distinguished Name or a DN.
+There are quite a few fields but you can leave some blank
+For some fields there will be a default value,
+If you enter '.', the field will be left blank.
+-----
+Country Name (2 letter code) [AU]:IE
+State or Province Name (full name) [Some-State]:Dublin
+Locality Name (eg, city) []:Dublin
+Organization Name (eg, company) [Internet Widgits Pty Ltd]:.
+Organizational Unit Name (eg, section) []:.
+Common Name (e.g. server FQDN or YOUR name) []:sinfronteras
+Email Address []:adeloaleman@gmail.com
+</syntaxhighlight>
-::https://www.kaggle.com/adeloaleman/kernel1917a91630/edit
-::Parece bueno pero no encontré la forma adicionar una TOC
+<br />
+===Share Jupyter Notebook online===
+* '''GitHub:'''
+: https://docs.github.com/en/github/managing-files-in-a-repository/working-with-jupyter-notebook-files-on-github
+: Example: https://github.com/adeloaleman/AmazonLaptopsDashboard/blob/master/DataAnalysis/data_analysis2.ipynb
-*https://drive.google.com
-:*https://colab.research.google.com
+* '''Nbviewer''
-::Es el que estoy utilizando ahora
+: https://nbviewer.jupyter.org/
+: Example: https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/main/tutorial/06%20-%20Linking%20and%20Interactions.ipynb
 <br />
-==Courses==
-*Udemy - Python for Data Science and Machine Learning Bootcamp
-:https://www.udemy.com/course/python-for-data-science-and-machine-learning-bootcamp/
+===Customize Jupyter===
 <br />
-==Most popular Python Data Science Libraries===
+====Themes====
+https://github.com/dunovank/jupyter-themes
-*NumPy
+Ver el tema que muestran en esta página: https://gist.github.com/pierrejoubert73/902cc94d79424356a8d20be2b382e1ab
-*SciPy
-*Pandas
-*Seaborn
-*SciKit'Learn
-*MatplotLib
-*Plotly
-*PySpartk
-<br />
+ jt   -t oceans16     -cellw 98%   -lineh 120   -fs 14   -nfs 14   -dfs 14   -ofs 14
-==NumPy==
-*NumPy (or Numpy) is a Linear Algebra Library for Python, the reason it is so important for Data Science with Python is that almost all of the libraries in the PyData Ecosystem rely on NumPy as one of their main building blocks.
-*Numpy is also incredibly fast, as it has bindings to C libraries. For more info on why you would want to use Arrays instead of lists, check out this great [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).
+https://www.kaggle.com/getting-started/97540
+ jt   -t monokai      -cellw 98%   -lineh 120   -fs 14   -nfs 14   -dfs 14   -ofs 14   -f fira   -nf ptsans   -N   -kl   -cursw 2   -cursc r   -T
 <br />
-===Installation===
-It is highly recommended you install Python using the Anaconda distribution to make sure all underlying dependencies (such as Linear Algebra libraries) all sync up with the use of a conda install.
+====Extensions====
+This post mention so nice extension and configuration that can be done: https://towardsdatascience.com/bringing-the-best-out-of-jupyter-notebooks-for-data-science-f0871519ca29
-If you have Anaconda, install NumPy by:
+<br />
+=====Unofficial Jupyter Notebook Extensions=====
+https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/index.html
- conda install numpy
+<span style="color: green">'''This is very important. There are very nice extensions in this package:'''</span>
-<br />If you are not using Anaconda distribution:
-*
+* toc2: https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/toc2/README.html
+* Collapsible Headings
+* ... etc
- pip install numpy
+<br />
+======Installation======
+https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html
+<span style="color: red">'''I had some issues to install it. La format indicada por defecto:'''</span>
+ pip install jupyter_contrib_nbextensions
+ jupyter contrib nbextension install --user
-Then, to use it:<syntaxhighlight lang="python3">
+<span style="color: red">'''A través de la forma anterior no pude instalar el paquete de forma correcta. La instalación no retornó errorres, y la extensión se mostraba en Jupyter-notebook pero no podía activar "enable" las extensiones.'''</span>
-import numpy as np
-arr = np.arange(0,10)
-</syntaxhighlight>
-===Arrays===
-{| class="wikitable"
-! colspan="2" rowspan="2" |
-! colspan="2" rowspan="2" |Method/Operation
-! rowspan="2" |Description/Comments
-!Example
-|-
-!<syntaxhighlight lang="python3">
-import numpy as np
-</syntaxhighlight>
-|-
-! rowspan="10" |<h5 style="text-align:left">Methods for creating NumPy Arrays</h5>
-|<h5 style="text-align:left">From a Python List</h5>
-| colspan="2" |'''''<code>array()</code>'''''
-|We can create an array by directly converting a list or list of lists.
-|<code>my_list = [1,2,3]</code>
-<code>np.array(my_list)</code>
+<span style="color: red">'''Al parecer es un problema con la ubicación de la instalación. Yo estaba usando conda pero conda está presentando problemas. La instalación de los paquestes demora muchísimo y luego el paquete parece no estar disponible.'''</span>
-<code>my_matrix = [[1,2,3],[4,5,6],[7,8,9]]</code>
-<code>np.array(my_matrix)</code>
+<span style="color: red">'''En el siguiente post encontré una solución para instalar nbextension usando pip:'''</span>
-|-
+https://github.com/ipython-contrib/jupyter_contrib_nbextensions/issues/1127
-| rowspan="9" |<h5 style="text-align:left">From Built-in NumPy Methods</h5>
-| colspan="2" |'''''<code>arange()</code>'''''
-|Return evenly spaced values within a given interval.
-|<code>np.arange(0,10)</code>
-<code>np.arange(0,11,2)</code>
-|-
-| colspan="2" |'''''<code>zeros()</code>'''''
-|Generate arrays of zeros.
-|<code>np.zeros(3)</code>
-<code>np.zeros((5,5))</code>
-|-
-| colspan="2" |'''''<code>ones()</code>'''''
-|Generate arrays of ones.
-|<code>np.ones(3)</code>
-<code>np.ones((3,3))</code>
-|-
-| colspan="2" |'''''<code>linspace()</code>'''''
-|Return evenly spaced numbers over a specified interval.
-|<code>np.linspace(0,10,3)</code>
-<code>np.linspace(0,10,50)</code>
-|-
-| colspan="2" |'''''<code>eye()</code>'''''
-|Creates an identity matrix.
-|<code>np.linspace(0,10,50)</code>
-|-
-| rowspan="4" |'''''<code>random</code>'''''
-|'''''<code>rand()</code>'''''
-|Create an array of the given shape and populate it with random samples from a uniform distribution over <code>[0, 1)</code>.
-|<syntaxhighlight lang="python3">
-np.random.rand(2)
-np.random.rand(5,5)
+ pip install --upgrade jupyter_contrib_nbextensions
+ jupyter contrib nbextension install  --sys-prefix  --symlink
-# Another way to invoke a function:
+<span style="color: red">'''«--symlink» creo que lo usé pero no estoy completamente seguro. También realicé el --upgrade pero creo que la diferencia la hicieron las opciones --sys-prefix  --symlink'''</span>
-from numpy.random import rand
-# Then you can call the function directly
-rand(5,5)
-</syntaxhighlight><br />
-|-
-|'''''<code>randn()</code>'''''
-|Return a sample (or samples) from the "standard normal" distribution. Unlike rand which is uniform.
-|<code>np.random.randn(2)</code>
-<code>np.random.randn(5,5)</code>
-|-
-|'''''<code>randint()</code>'''''
-|Return random integers from <code>low</code> (inclusive) to <code>high</code> (exclusive).
-|<code>np.random.randint(1,100)</code>
-<code>np.random.randint(1,100,10)</code>
-|-
-|'''<code>seed()</code>'''
-|sets the random seed of the NumPy pseudo-random number generator.  It provides an essential input that enables NumPy to generate pseudo-random numbers for random processes. See [[wikipedia:Random_seed|s1]] and [https://www.sharpsightlabs.com/blog/numpy-random-seed/ s2]. for explanation.
-|<code>np.random.seed(101)</code>
-|-
-! rowspan="4" |<h5 style="text-align:left">Others Array Attributes and Methods</h5>
-| rowspan="4" |
-| colspan="2" |''<code>'''reshape()'''</code>''
-|Returns an array containing the same data with a new shape.
-|<code>arr.reshape(5,5)</code>
-|-
-| colspan="2" |'''''<code>max()</code>, <code>min()</code>, <code>argmax()</code>, <code>argmin()</code>'''''
-|Finding max or min values. Or to find their index locations using argmin or argmax.
-|<code>arr.max()</code>
-<code>arr.argmax()</code>
-|-
-| colspan="2" |''<code>'''shape()'''</code>''
-|Shape is an attribute that arrays have (not a method).
-|NO LO ENTENDI.. REVISAR!
-<nowiki>#</nowiki>Length of array
-arr_length = arr2d.shape[1]
+Si no se muestra la '''Nbextensions''' tab (), try to reinstall the https://github.com/Jupyter-contrib/jupyter_nbextensions_configurator
-<br />
-|-
-| colspan="2" |''<code>'''dtype()'''</code>''
-|You can also grab the data type of the object in the array.
-|<code>arr.dtype</code>
-|-
-!<nowiki>-</nowiki>
-!-
-! colspan="2" |-
-!-
-!-
-|-
-! rowspan="8" |<h5 style="text-align:left">Indexing and Selection</h5>
-<div style="text-align:left">
+ pip install jupyter_nbextensions_configurator
-*How to select elements or groups of elements from an array.
+or
-*The general format is '''arr_2d[row][col]''' or '''arr_2d[row,col]'''. I recommend usually using the comma notation for clarity.
+ conda install -c conda-forge jupyter_nbextensions_configurator
-</div>
-|
-| colspan="2" |
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-'''Creating sample array for the following examples:'''
-<div class="mw-collapsible-content">
-<syntaxhighlight lang="python3">
-import numpy as np
-arr = np.arange(0,10)
-# 1D Array:
-arr = np.arange(0,11)
-#Show
-arr
-Output: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
-# 2D Array
-arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))
-#Show
-arr_2d
-Output:
-array([[ 5, 10, 15],
-       [20, 25, 30],
-       [35, 40, 45]])
-</syntaxhighlight>
-</div>
-</div>
-|-
-| rowspan="2" |<h5 style="text-align:left">Bracket Indexing and Selection (Slicing)</h5>
-| colspan="2" |
-|Note: When we create a sub-array slicing an array (slice_of_arr = arr[0:6]), data is not copied, it's a view of the original array! This avoids memory problems! To get a copy, need to use the method '''copy()'''. See important note below.
-|<syntaxhighlight lang="python3">
-#Get a value at an index
-arr[8]
-#Get values in a range
+<br />
-arr[1:5]
-slice_of_arr = arr[0:6]
+====CustomJS and CustonCSS files====
+This is a good post: https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064
-#2D
+Keyboard Shortcut Customization: https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Custom%20Keyboard%20Shortcuts.html
-arr_2d[1]
-arr_2d[1][0]
-arr_2d[1,0] # The same that above
-#Shape (2,2) from top right corner
-arr_2d[:2,1:]
-#Output:
-array([[10, 15],
-       [25, 30]])
-#Shape bottom row
+<br />
-arr_2d[2,:]
+ custom.js
-</syntaxhighlight><br />
+<syntaxhighlight lang="js">
-|-
+/** Mis configuraciones */
-| colspan="2" |
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-'''Fancy Indexing''':
-<div class="mw-collapsible-content">
-Fancy indexing allows you to select entire rows or columns out of order.
-Example:<syntaxhighlight lang="python3">
-# Set up matrix
-arr2d = np.zeros((10,10))
-# Length of array
+// This is to enable syntax highlighting for SQL code:
-arr_length = arr2d.shape[1]
+// https://stackoverflow.com/questions/43641362/adding-syntax-highlighting-to-jupyter-notebook-cell-magic
+require(['notebook/js/codecell'], function(codecell) {
+  codecell.CodeCell.options_default.highlight_modes['magic_text/x-mssql'] = {'reg':[/^%%sql/]} ;
+  Jupyter.notebook.events.one('kernel_ready.Kernel', function(){
+  Jupyter.notebook.get_cells().map(function(cell){
+      if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;
+  });
+});
-# Set up array
-for i in range(arr_length):
-    arr2d[i] = i
-arr2d
-# Output:
-array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
-       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
-       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
-       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
-       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
-       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
-       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
-       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
-       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
-       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])
-# Fancy indexing allows the following
+// My plain theme
-arr2d[[6,4,2,7]]
+// This is a good post where I took some ideas to write the following fuction: https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064
-# Output:
+function plainTheme() {
-array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
+    var input_promp_fields = document.getElementsByClassName("prompt_container");
-       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
+    var text_render_fields = document.getElementsByClassName("text_cell_render");
-       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
-       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
-</syntaxhighlight><br />
-</div>
-</div>
-|-
-| rowspan="2" |<h5 style="text-align:left">Broadcasting</h5>
+    if (input_promp_fields[0].style.visibility == "collapse"){
+        action = "visible";
+        input_marginLeft = "0px";
+        border_top  = "3px";
+        prompt_width = "74px";
+        padding_top = "0px";
+        output_margin = "40px";
+    }else{
+        action = "collapse";
+        input_marginLeft = "74px";
+        border_top  = '0px';
+        prompt_width = "74px";
+        padding_top = "40px";
+        output_margin = "40px";
+    }
-(Setting a value with index range)
+    // Si queremos usar !important debemos hacerlo de esta forma utilizando JQuery:
-| colspan="2" rowspan="2" |
+    // https://makitweb.com/how-to-add-important-to-css-property-with-jquery/
-| rowspan="2" |Setting a value with index range:
+    var text_cell_fields = document.getElementsByClassName("text_cell");
-Numpy arrays differ from a normal Python list because of their ability to broadcast.
+    $(text_cell_fields).ready(function(){
-|arr[0:5]=100<br />'''#'''Show
+        $('.input_prompt').css({
-arr
+            'cssText': `width: 40px !important; max-width: ${prompt_width} !important; min-width: ${prompt_width} !important;`
+        });
+    });
-Output: array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
+    $(document).ready(function(){
-|-
+        $(".prompt_container").css(
-|'''#'''Setting all the values of an Array
+            'visibility', `${action}`
-arr[:]=99
+        );
-|-
-|<h5 style="text-align:left">Get a copy of an Array</h5>
+        $(".input").css(
-| colspan="2" |'''<code>copy''()''</code>'''
+            'padding-left', `${input_marginLeft}`
-|Note: When we create a sub-array slicing an array (slice_of_arr = arr[0:6]), data is not copied, it's a view of the original array! This avoids memory problems! To get a copy, need to use the method '''copy()'''. See important note below.
+        );
-|arr_copy = arr.copy()
-|-
+        $(".output_subarea").css(
-|<h5 style="text-align:left">Important notes on Slices</h5>
+            'margin-left', `${output_margin}`
-| colspan="2" |
+        );
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style=""><syntaxhighlight lang="python3">
-slice_of_arr = arr[0:6]
+        $('.cell').css({
-#Show slice
+            'cssText': `border-top-width: ${border_top} !important; border-bottom-width: ${border_top} !important;`
-slice_of_arr
+        });
-Output: array([0, 1, 2, 3, 4, 5])
+        $(".collapsible_headings_ellipsis").css({
+            'cssText': `padding-top:${padding_top} !important; border-top-width: ${border_top} !important; border-bottom-width: ${border_top} !important;`
+        });
-#Making changes in slice_of_arr
+        $(".text_cell_render").css({
-slice_of_arr[:]=99
+            'cssText': `margin-left: -10px;`
-#Show slice
+        });
-slice_of_arr
+    });
-Output: array([99, 99, 99, 99, 99, 99])
+}
-#Now note the changes also occur in our original array!
+Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Alt-Ctrl-Q', {
-#Show
+    help : '...',
-arr
+    help_index : 'zz',
-Output: array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])
+    handler : function (event) {
+        plainTheme();
+    return false;
+    }}
+);
-#When we create a sub-array slicing an array (slice_of_arr = arr[0:6]), data is not copied, it's a view of the original array! This avoids memory problems!
+Jupyter.keyboard_manager.edit_shortcuts.add_shortcut('Alt-Ctrl-Q', {
+    help : '...',
+    help_index : 'zz',
+    handler : function (event) {
+        plainTheme();
+    return false;
+    }}
+);
-#To get a copy, need to use the method copy()
-</syntaxhighlight>
-</div>
-|-
-|<h5 style="text-align:left">Using brackets for selection based on comparison operators and booleans</h5>
-| colspan="2" |
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style=""><syntaxhighlight lang="python3">
-arr = np.arange(1,11)
-arr > 4
-# Output:
-array([False, False, False, False,  True,  True,  True,  True,  True,
-        True])
-bool_arr = arr>4
+// This could be very useful. It inserts text into a cell automatically
-bool_arr
+// https://forums.fast.ai/t/jupyter-notebook-enhancements-tips-and-tricks/17064/27
-# Output:
+Jupyter.keyboard_manager.edit_shortcuts.add_shortcut('Ctrl-Shift-J', {
-array([False, False, False, False,  True,  True,  True,  True,  True,
+    help : '...',
-         True])
+    help_index : 'zz',
+    handler : function (event) {
+        document.body.style.background = 'blue'
+        var target = Jupyter.notebook.get_selected_cell()
+        var cursor = target.code_mirror.getCursor()
+        var before = target.get_pre_cursor()
+        var after = target.get_post_cursor()
+        target.set_text(before + 'from IPython.core.display import display, HTML; \ndisplay(HTML("<style>.container { width:98% !important;}</style>"))' + after)
+        cursor.ch += 20 // where to put your cursor
+        target.code_mirror.setCursor(cursor)
+         return false;
+    }}
+);
-arr[bool_arr]
-# Output:
-array([ 5,  6,  7,  8,  9, 10])
-arr[arr>2]
+// To get the real value of a css field: https://stackoverflow.com/questions/26074476/document-body-style-backgroundcolor-doesnt-work-with-external-css-style-sheet
-# Output:
+// window.getComputedStyle(document.body).backgroundColor
-array([ 3,  4,  5,  6,  7,  8,  9, 10])
+// window.getComputedStyle(document.getElementsByClassName("input_area")[0]).backgroundColor
-x = 2
-arr[arr>x]
-# Output:
-array([ 3,  4,  5,  6,  7,  8,  9, 10])
 </syntaxhighlight>
-</div>
-|-
-!-
-!-
-! colspan="2" |-
-!-
-!-
-|-
-!<h5 style="text-align:left">Arithmetic operations</h5>
-|
-| colspan="2" |<code>arr + arr</code>
-<code>arr - arr</code>
-<code>arr * arr</code>
-<code>arr/arr</code>
-<code>1/arr</code>
-<code>arr**3</code>
-|Warning on division by zero, but not an error!
-<code>0/0 -> nan</code>
-<code>1/0 -> inf</code>
-|<syntaxhighlight lang="python3">
-import numpy as np
-arr = np.arange(0,10)
-arr + arr
-# Output:
-array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
-arr**3
-# Output:
-array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
-</syntaxhighlight>
-|-
-! rowspan="5" |<h5 style="text-align:left">[https://docs.scipy.org/doc/numpy/reference/ufuncs.html Universal Array Functions]</h5>
-| rowspan="5" |
-| colspan="2" |<code>np.sqrt(arr)</code>
-|Taking Square Roots
-| rowspan="5" |<syntaxhighlight lang="python3">
-np.sin(arr)
-# Output:
-array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
-       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])
-</syntaxhighlight>
-|-
-| colspan="2" |<code>np.exp(arr)</code>
-|Calculating the exponential (e^x)
-|-
-| colspan="2" |<code>np.max(arr)</code>
-same as <code>arr.max()</code>
-|Max
-|-
-| colspan="2" |<code>np.sin(arr)</code>
-|Sin
-|-
-| colspan="2" |<code>np.log(arr)</code>
-|Natural logarithm
-|}
-<br />
-==Pandas==
-You can think of pandas as an extremely powerful version of Excel, with a lot more features. In this section of the course, you should go through the notebooks in this order:
 <br />
-===Series===
+ custom.css
-A Series is very similar to a NumPy array (in fact it is built on top of the NumPy array object). What differentiates the NumPy array from a Series, is that a Series can have axis labels, meaning it can be indexed by a label, instead of just a number location. It also doesn't need to hold numeric data, it can hold any arbitrary Python Object.
+<syntaxhighlight lang="css">
+/*  My settings  */
-{| class="wikitable"
+.container { width:98% !important; }
-! rowspan="2" |
+/* document.getElementById("notebook-container").style.minWidth = "50%"; */
-! rowspan="2" |
+/* document.getElementById("notebook-container").style.maxWidth = "50%"; */
-! rowspan="2" |Method/Operator
-! rowspan="2" |Description/Comments
-!Example
-|-
-!<syntaxhighlight lang="python3">
-import pandas as pd
-</syntaxhighlight>
-|-
-! rowspan="3" |<h4 style="text-align:left">Creating Pandas Series</h4>
+#notebook-container {
+ width:98% !important;
+}
-<div style="text-align:left">
+.CodeMirror-gutters {
-You can convert a <code>list</code>, <code>numpy array</code>, or <code>dictionary</code> to a Series.
+ background-color: transparent !important;
-</div>
+ background: transparent !important;
-|<h5 style="text-align:left">From a List</h5>
+}
-|<code>pd.Series(my_list)</code>
-| colspan="2" rowspan="3" |<syntaxhighlight lang="python3">
-# Creating some test data:
-labels = ['a','b','c']
-my_list = [10,20,30]
-arr = np.array([10,20,30])
-d = {'a':10,'b':20,'c':30}
+.CodeMirror-linenumber {
+ margin-left: -20px !important;
+}
-pd.Series(data=my_list)
+.output_subarea {
-pd.Series(my_list)
+ margin-left: 40px !important;
-pd.Series(arr)
+}
-# Output:
-    10
-    20
-    30
-dtype: int64
-pd.Series(data=my_list,index=labels)
+#toc .fa-fw {
-pd.Series(my_list,labels)
+ color: blue !important;
-pd.Series(arr,labels)
+}
-pd.Series(d)
-# Output:
-a    10
-b    20
-c    30
-dtype: int64
-</syntaxhighlight>
-|-
-|<h5 style="text-align:left">From a NumPy Array</h5>
-|<code>pd.Series(arr)</code>
-|-
-|<h5 style="text-align:left">From a Dictionary</h5>
-|<code>pd.Series(d)</code>
-|-
-!<h4 style="text-align:left">Data in a Series</h4>
-|
+#toc .highlight_on_scroll {
-|
+ margin-left: -4px !important;
-| colspan="2" |A pandas Series can hold a variety of object types. Even functions (although unlikely that you will use this)<syntaxhighlight lang="python3">
-pd.Series(data=labels)
+}
-# Output:
-    a
-    b
-    c
-dtype: object
-# Holding «functions» into a Series
+#toc {
-# Output:
+ padding-left: 10px !important;
-pd.Series([sum,print,len])
+}
-      <built-in function sum>
-      <built-in function print>
-      <built-in function len>
-dtype: object
-</syntaxhighlight>
-|-
-!<h4 style="text-align:left">Index in Series</h4>
-|
-|
-| colspan="2" |The key to using a Series is understanding its index. Pandas makes use of these index names or numbers by allowing for fast look ups of information (works like a hash table or dictionary).<syntaxhighlight lang="python3">
-ser1 = pd.Series([1,2,3,4],index = ['USA', 'Germany','USSR', 'Japan'])
-ser1
-# Output:
-USA        1
-Germany    2
-USSR       3
-Japan      4
-dtype: int64
-ser2 = pd.Series([1,2,5,4],index = ['USA', 'Germany','Italy', 'Japan'])
+/*  I have also replaced the color
+ *  #a6e22e   with   #388bfd
+ *  throughout custom.css
+ */
-ser1['USA']
+/* I have also changed some of the toc properties directly in the code above:
-# Output:
-# Operations are then also done based off of index:
+#toc-wrapper {
-ser1 + ser2
+ z-index: 90;
-# Output:
+ position: fixed !important;
-Germany    4.0
+ display: flex;
-Italy      NaN
+ flex-direction: column;
-Japan      8.0
+ overflow: hidden;
-USA        2.0
+ padding: 10px;
-USSR       NaN
+ padding-top: 40px !important;
-dtype: float64
+ border-style: solid;
+ border-width: thin;
+ border-right-width: medium !important;
+ background-color: #1e1e1e !important;
+}
+#toc-wrapper.ui-draggable.ui-resizable.sidebar-wrapper {
+ border-color: rgba(93,92,82,.25) !important;
+}
+#toc a,
+#navigate_menu a,
+.toc {
+ color: #f8f8f0 !important;
+ font-size: 16pt !important;
+}
+#toc li > span:hover {
+ background-color: rgba(93,92,82,.25) !important;
+}
+#toc a:hover,
+#navigate_menu a:hover,
+.toc {
+ color: #DAA520 !important;
+ font-size: 16pt !important;
+}
+#toc-wrapper .toc-item-num {
+ color: #388bfd !important;
+ font-size: 16pt !important;
+}
+*/
 </syntaxhighlight>
-|}
 <br />
-===DataFrames===
+====Configurations from the Jupyter notebook====
-DataFrames are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index. Let's use pandas to explore this topic!
+<syntaxhighlight lang="python3">
+from IPython.display import display, HTML
-<syntaxhighlight lang="python">
+display(HTML("<style>.container { width:98% !important;}</style>"))
-import pandas as pd
-import numpy as np
-from numpy.random import randn
+display(HTML('<style>.prompt.input_prompt{display:none !important;}</style>'))
-np.random.seed(101)
+display(HTML('<style>.prompt.input_prompt{visibility: visible !important;}</style>'))
+display(HTML('<style>.prompt.input_prompt{margin-left: 50px}</style>'))
+display(HTML('<style>.prompt.input_prompt{visibility: visible !important; width: 0px !important; min-width: 0px !important}</style>'))
-df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())
+display(HTML('<style>.input_area{margin-left: -50px;}</style>'))
+display(HTML('<style>.input{margin-left: -20px;}</style>'))
-df
+display(HTML('<style>.output_area{margin-left: 55px}</style>'))
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
-</syntaxhighlight>
+# display(HTML('<style>.cell{margin-bottom: -5px !important; margin-top: -5px !important;}</style>'))
+# display(HTML('<style>.code_cell{margin-bottom: -5px !important; margin-top: -5px !important;}</style>'))
+# display(HTML('<style>.output_wrapper{margin-bottom: 0px !important; margin-top: 0px !important;}</style>'))
-'''DataFrame Columns are just Series:'''<syntaxhighlight lang="python3">
-type(df['W'])
-# Output:
-pandas.core.series.Series
 </syntaxhighlight>
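The repeated <code>display(HTML(...))</code> calls above can be consolidated by building a single <code>&lt;style&gt;</code> string from a dictionary of rules. The following is only a minimal sketch: the helper name <code>build_style</code> and the <code>notebook_rules</code> dictionary are hypothetical, and the selectors are copied from the block above.

```python
def build_style(rules):
    """Build one <style> element from a dict mapping CSS selectors to property dicts."""
    blocks = []
    for selector, properties in rules.items():
        # Turn {"width": "98%"} into "width: 98%;"
        body = " ".join(f"{name}: {value};" for name, value in properties.items())
        blocks.append(f"{selector} {{ {body} }}")
    return "<style>" + " ".join(blocks) + "</style>"


# Hypothetical rule set reusing two selectors from the configuration above.
notebook_rules = {
    ".container": {"width": "98% !important"},
    ".output_area": {"margin-left": "55px"},
}
style_html = build_style(notebook_rules)
print(style_html)

# In a notebook, the result could then be applied with:
#   from IPython.display import display, HTML
#   display(HTML(style_html))
```

Keeping the rules in one dictionary makes it easier to comment out or adjust a single selector without editing several separate calls.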
-{| class="wikitable"
-!
-!
-!Method/
-Operator
-!Description/Comments
-!Example
-|-
-! rowspan="5" |<h4 style="text-align:left">Selection and Indexing</h4>
-<div style="text-align:left">
+<br />
-Let's learn the various
-methods to grab data
+===Online Jupyter===
+There are many sites that provide solutions for running a Jupyter Notebook in the cloud: https://www.dataschool.io/cloud-services-for-jupyter-notebook/
-from a DataFrame
+I have tried:
-</div>
-|<h5 style="text-align:left">Standard syntax</h5>
+*https://cocalc.com/app
-|<code>'''df[<nowiki>''</nowiki>]'''</code>
-|
-| rowspan="2" |<syntaxhighlight lang="python3">
-# Pass a list of column names:
-df[['W','Z']]
-           W           Z
+::https://cocalc.com/projects/595bf475-61a7-47fa-af69-ba804c3f23f9/files/?session=default
-A   2.706850    0.503826
+::It looks good, but some of its options are not free
-B   0.651118    0.605965
-C  -2.018168   -0.589001
-D   0.188695    0.955057
-E   0.190794    0.683509
-</syntaxhighlight>
-|-
-|<h5 style="text-align:left">SQL syntax</h5>
-(NOT RECOMMENDED!)
-|<code>'''df.W'''</code>
-|
-|-
-|<h5 style="text-align:left">Selecting Rows</h5>
-|'''<code>df.loc[<nowiki>''</nowiki>]</code>'''
-|
-|<syntaxhighlight lang="python3">
-df.loc['A']
-# Output:
-W    2.706850
-X    0.628133
-Y    0.907969
-Z    0.503826
-Name: A, dtype: float64
-# Or select by position instead of label (this returns row 'C'):
-df.iloc[2]
-</syntaxhighlight>
-|-
-|<h5 style="text-align:left">Selecting subset of rows and columns</h5>
-|'''<code>df.loc[<nowiki>''</nowiki>,<nowiki>''</nowiki>]</code>'''
-|
-|<syntaxhighlight lang="python3">
-df.loc['B','Y']
-# Output:
--0.84807698340363147
-df.loc[['A','B'],['W','Y']]
-# Output:
-           W           Y
-A   2.706850    0.907969
-B   0.651118   -0.848077
-</syntaxhighlight>
-|-
-|<h5 style="text-align:left">Conditional Selection</h5>
-|
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-An important feature of pandas is conditional selection using bracket notation, very similar to numpy:
-<div class="mw-collapsible-content">
-<syntaxhighlight lang="python3">
-df
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
-df>0
+*https://www.kaggle.com/
-# Output:
-    W       X       Y       Z
-A   True    True    True    True
-B   True    False   False   True
-C   False   True    True    False
-D   True    False   False   True
-E   True    True    True    True
-df[df>0]
+::https://www.kaggle.com/adeloaleman/kernel1917a91630/edit
-# Output:
+::It looks good, but I could not find a way to add a TOC
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118    NaN         NaN         0.605965
-C   NaN         0.740122    0.528813    NaN
-D   0.188695    NaN         NaN         0.955057
-E   0.190794    1.978757    2.605967    0.683509
-df[df['W']>0]
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
-df[df['W']>0]['Y']
+*https://drive.google.com
-# Output:
-A    0.907969
-B   -0.848077
-D   -0.933237
-E    2.605967
-Name: Y, dtype: float64
-df[df['W']>0][['Y','X']]
+:*https://colab.research.google.com
-# Output:
+::This is the one I am using now
-           Y           X
-A   0.907969    0.628133
-B  -0.848077   -0.319318
-D  -0.933237   -0.758872
-E   2.605967    1.978757
-# For two conditions you can use | and & with parentheses:
-df[(df['W']>0) & (df['Y'] > 1)]
-# Output:
-           W           X           Y           Z
-E   0.190794    1.978757    2.605967    0.683509
-</syntaxhighlight>
-</div>
-</div>
-|-
-!<h4 style="text-align:left">Creating a new column</h4>
-|
-|
-|
-|<syntaxhighlight lang="python3">
-df['new'] = df['W'] + df['Y']
-</syntaxhighlight>
-|-
-!<h4 style="text-align:left">Removing Columns</h4>
-|
-|'''<code>df.drop()</code>'''
-| colspan="2" |
-<div class="mw-collapsible mw-collapsed" style="">
-<syntaxhighlight lang="python3">
-df.drop('new',axis=1)
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
-# Not inplace unless specified!
+<br />
-df
+===Some remarks===
-# Output:
-           W           X           Y           Z         new
-A   2.706850    0.628133    0.907969    0.503826    3.614819
-B   0.651118   -0.319318   -0.848077    0.605965   -0.196959
-C  -2.018168    0.740122    0.528813   -0.589001   -1.489355
-D   0.188695   -0.758872   -0.933237    0.955057   -0.744542
-E   0.190794    1.978757    2.605967    0.683509    2.796762
-df.drop('new',axis=1,inplace=True)
-df
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
+<br />
+====Executing Terminal Commands in Jupyter Notebooks====
+https://support.anaconda.com/hc/en-us/articles/360023858254-Executing-Terminal-Commands-in-Jupyter-Notebooks
-# Can also drop rows this way:
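+To run a shell command from a notebook cell instead of Python code, prefix it with <code>'''!'''</code>. On Unix-like systems a few common commands (such as <code>ls</code> and <code>pwd</code>) are also available as IPython magics, prefixed with <code>'''%'''</code>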
-df.drop('E',axis=0)
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-</syntaxhighlight>
-</div>
-|-
-! rowspan="2" |<h4 style="text-align:left">Resetting the index</h4>
-|<h5 style="text-align:left">Reset to default</h5>
-(0,1...n index)
-|'''<code>df.reset_index()</code>'''
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-<syntaxhighlight lang="python3">
-df
-# Output:
-           W           X           Y           Z
-A   2.706850    0.628133    0.907969    0.503826
-B   0.651118   -0.319318   -0.848077    0.605965
-C  -2.018168    0.740122    0.528813   -0.589001
-D   0.188695   -0.758872   -0.933237    0.955057
-E   0.190794    1.978757    2.605967    0.683509
-df.reset_index()
+Try, for example:
-# Output:
+  %ls
-   index          W           X          Y          Z
+  !pwd
-      A   2.706850    0.628133   0.907969   0.503826
-      B   0.651118   -0.319318  -0.848077   0.605965
-      C  -2.018168    0.740122   0.528813  -0.589001
-      D   0.188695   -0.758872  -0.933237   0.955057
-      E   0.190794    1.978757   2.605967   0.683509
-</syntaxhighlight>
-</div>
-|-
-|<h5 style="text-align:left">Setting index to something else</h5>
-|'''<code>df.set_index(<nowiki>''</nowiki>)</code>'''
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-<syntaxhighlight lang="python3">
-newind = 'CA NY WY OR CO'.split()
-df['States'] = newind
-df
+This is the same as opening a terminal and typing the command without the <code>'''!'''</code>.
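Outside IPython, the <code>!</code> prefix is not valid Python syntax. As a rough sketch, assuming a plain Python script rather than a notebook, the standard-library <code>subprocess</code> module can run a command in a similar way. The helper name <code>run_command</code> is hypothetical, and the Python interpreter itself is used as a portable stand-in for a shell command such as <code>pwd</code>.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments and return its standard output as text."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# Use the current Python interpreter as a portable stand-in for a shell command.
output = run_command([sys.executable, "-c", "print('hello from a subprocess')"])
print(output.strip())
```

Passing the command as a list avoids shell quoting issues; <code>check=True</code> raises an exception if the command exits with a non-zero status.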
-# Output:
-          W            X           Y          Z   States
-A   2.706850    0.628133    0.907969   0.503826       CA
-B   0.651118   -0.319318   -0.848077   0.605965       NY
-C  -2.018168    0.740122    0.528813  -0.589001       WY
-D   0.188695   -0.758872   -0.933237   0.955057       OR
-E   0.190794    1.978757    2.605967   0.683509       CO
-df.set_index('States')
-# Output:
-                W           X           Y          Z
-States
-    CA   2.706850    0.628133    0.907969   0.503826
-    NY   0.651118   -0.319318   -0.848077   0.605965
-    WY  -2.018168    0.740122    0.528813  -0.589001
-    OR   0.188695   -0.758872   -0.933237   0.955057
-    CO   0.190794    1.978757    2.605967   0.683509
-df
+<br />
-# Output:
-          W            X           Y          Z   States
-A   2.706850    0.628133    0.907969   0.503826       CA
-B   0.651118   -0.319318   -0.848077   0.605965       NY
-C  -2.018168    0.740122    0.528813  -0.589001       WY
-D   0.188695   -0.758872   -0.933237   0.955057       OR
-E   0.190794    1.978757    2.605967   0.683509       CO
-# We need to add «inplace=True»:
+===[[HTML presentation with Reveal.js#Creating Presentations in Jupyter Notebook with RevealJS|Creating Presentations in Jupyter Notebook with RevealJS]]===
-df.set_index('States',inplace=True)
-df
-# Output:
-                W           X           Y          Z
-States
-    CA   2.706850    0.628133    0.907969   0.503826
-    NY   0.651118   -0.319318   -0.848077   0.605965
-    WY  -2.018168    0.740122    0.528813  -0.589001
-    OR   0.188695   -0.758872   -0.933237   0.955057
-    CO   0.190794    1.978757    2.605967   0.683509
-</syntaxhighlight>
-</div>
-|-
-! rowspan="2" |<h4 style="text-align:left">Multi-Indexed DataFrame</h4>
-|<h5 style="text-align:left">Creating a Multi-Indexed DataFrame</h5>
-|
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-<syntaxhighlight lang="python3">
-# Index Levels
-outside = ['G1','G1','G1','G2','G2','G2']
-inside = [1,2,3,1,2,3]
-hier_index = list(zip(outside,inside))
-hier_index = pd.MultiIndex.from_tuples(hier_index)
-hier_index
-# Output:
-MultiIndex(levels=[['G1', 'G2'], [1, 2, 3]],
-           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])
-df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])
+<br />
-df
-# Output:
-               A          B
-G1  1   0.153661   0.167638
-  -0.765930   0.962299
-   0.902826  -0.537909
-G2  1  -1.549671   0.435253
-   1.259904  -0.447898
-   0.266207   0.412580
-</syntaxhighlight>
-</div>
-|-
-|<h5 style="text-align:left">Multi-Index and Index Hierarchy</h5>
-|
-| colspan="2" |<div class="mw-collapsible mw-collapsed" style="">
-<syntaxhighlight lang="python3">
-df.loc['G1']
-# Output:
-           A          B
-   0.153661   0.167638
-  -0.765930   0.962299
-   0.902826  -0.537909
-df.loc['G1'].loc[1]
+==Some of the most popular Python Data Science Libraries==
-# Output:
-A    0.153661
-B    0.167638
-Name: 1, dtype: float64
-df.index.names
+*NumPy
-# Output:
+*SciPy
-FrozenList([None, None])
+*Pandas
+*Seaborn
+*Scikit-learn
+*Matplotlib
+*Plotly
+*PySpark
-df.index.names = ['Group','Num']
-df
-# Output:
-                   A          B
-Group Num
-   G1   1   0.153661   0.167638
-  -0.765930   0.962299
-   0.902826  -0.537909
-   G2   1  -1.549671   0.435253
-   1.259904  -0.447898
-   0.266207   0.412580
-df.xs('G1')
+<br />
-# Output:
-            A            B
-Num
-    0.153661     0.167638
-   -0.765930     0.962299
-    0.902826    -0.537909
-df.xs(['G1',1])
-# Output:
-A    0.153661
-B    0.167638
-Name: (G1, 1), dtype: float64
-df.xs(1,level='Num')
-# Output:
-               A          B
-Group
-   G1   0.153661   0.167638
-   G2  -1.549671   0.435253
-</syntaxhighlight>
-</div>
-|}
+==[[NumPy and Pandas]]==
+<br />
+==[[Data Visualization with Python]]==
 <br />
-===Missing Data===
+==[[Natural Language Processing]]==
-Let's show a few convenient methods to deal with Missing Data in pandas.
-* <code>dropna()</code> method allows the user to analyze and drop Rows/Columns with Null values in different ways:
-<blockquote>
-<syntaxhighlight lang="python">
-DataFrameName.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
-</syntaxhighlight>
-</blockquote>
-* <code>fillna()</code> fills null fields with a given value:
+<br />
+==[[Dash - Plotly]]==
-<syntaxhighlight lang="python">
+<br />
-import numpy as np
+==[[Scrapy]]==
-import pandas as pd
-df = pd.DataFrame({'A':[1,2,np.nan],
+<br />
-                  'B':[5,np.nan,np.nan],
+==Using SQL in Jupyter==
-                  'C':[1,2,3]})
+Connecting to a database in Jupyter
-df
-# Output:
-      A       B     C
-   1.0     5.0     1
-   2.0     NaN     2
-   NaN     NaN     3
+https://pypi.org/project/ipython-sql/
-# By default, dropna() drops every row that contains a null value:
+https://stackoverflow.com/questions/454854/no-module-named-mysqldb
-df.dropna()
-df.dropna(axis=0) # Same as default
-# Output:
-      A       B     C
-   1.0     5.0     1
+https://stackoverflow.com/questions/5178292/pip-install-mysql-python-fails-with-environmenterror-mysql-config-not-found
-# To drop every column that contains a null value instead:
+https://docs.kyso.io/guides/sql-interface-within-jupyterlab
-df.dropna(axis=1)
+https://www.datacamp.com/community/tutorials/sql-interface-within-jupyterlab
-# If we want to display all the rows that have at least 2 non-null values:
+https://stackoverflow.com/questions/43641362/adding-syntax-highlighting-to-jupyter-notebook-cell-magic
-df.dropna(thresh=2)
-# Output:
-      A       B     C
-   1.0     5.0     1
-   2.0     NaN     2
+https://www.sqlshack.com/learn-jupyter-notebooks-for-sql-server/
-# Rows with at least 3 non-null values:
-df.dropna(thresh=3)
-# Output:
-      A       B     C
-   1.0     5.0     1
+Check the sources above. I think that the last time I installed it, all I had to do was based on the first three sources:
-# To fill null fields with a given value:
+ pip install ipython-sql
-df.fillna(value='FILL VALUE')
-# Output:
+ sudo apt install default-libmysqlclient-dev
-    A            B            C
-   1            5            1
+ pip install mysqlclient
-   2            FILL VALUE   2
-   FILL VALUE   FILL VALUE   3
+ sudo apt-get install python3-mysqldb
-# But many times what we want to do is to replace these null fields with, for example, the «mean» of the columns. We can do it this way:
+Then add SQL syntax highlighting to Jupyter as described in the corresponding source above.
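To illustrate what querying a database from Python looks like once a driver is installed, here is a minimal sketch. It is an assumption-laden stand-in: it uses the standard-library <code>sqlite3</code> module with an in-memory database instead of the MySQL setup described above, and the table <code>libraries</code> and its rows are made up for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the MySQL server (an assumption for this sketch).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE libraries (name TEXT, category TEXT)")
connection.executemany(
    "INSERT INTO libraries VALUES (?, ?)",
    [("NumPy", "arrays"), ("Pandas", "dataframes"), ("Plotly", "visualization")],
)

# Run a parameterized query and fetch the rows as a list of tuples.
rows = connection.execute(
    "SELECT name FROM libraries WHERE category != ? ORDER BY name",
    ("visualization",),
).fetchall()
print(rows)
connection.close()

# With ipython-sql installed, a notebook version would look roughly like:
#   %load_ext sql
#   %sql sqlite://
#   %sql SELECT name FROM libraries;
```

For the MySQL packages installed above, the connection string passed to <code>%sql</code> would instead be a SQLAlchemy-style URL; the exact user, host, and database name depend on the local setup.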
-df['A'].fillna(value=df['A'].mean())
-# Output:
-    1.0
-    2.0
-    1.5  # *
-Name: A, dtype: float64
-# * The Null field has been filled with the mean of the column
-</syntaxhighlight>
-<br />
-===GroupBy===
-<br />
-===Merging,Joining,and Concatenating===
-<br />
-===Operations===
-<br />
-===Data Input and Output===
 <br />