{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multidimensional Scaling (MDS)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "MDS methods project data into a lower-dimensional space while preserving, as well as possible,\n", "a distance between the data points.\n", "\n", "Many such methods exist (see for example http://en.wikipedia.org/wiki/Multidimensional_scaling, or http://scikit-learn.org/stable/_downloads/plot_lle_digits.py); we present only a few of them here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These methods are useful for (at least) two reasons:\n", "* they make it possible to represent, in a low-dimensional space, data originally described in a large number of dimensions\n", "* they make it possible to attach axes to data described only by a distance. One can then run a PCA on the result to interpret the data, fit regressions, or use algorithms designed only for the Euclidean case (such as $k$-means)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iris data" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "\n", "current_palette = sns.color_palette()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | sepal_length | sepal_width | petal_length | petal_width |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
| ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 |

150 rows × 4 columns
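
To make the idea from the introduction concrete (recovering coordinates from nothing but a distance matrix), here is a minimal sketch of classical (Torgerson) MDS written with NumPy only. It is not necessarily the method used in the rest of this notebook; the function name `classical_mds` and the toy data are hypothetical and chosen for illustration.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: embed points given only pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Eigendecomposition of the symmetric Gram matrix, largest eigenvalues first.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by rounding or non-Euclidean input.
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales


# Hypothetical toy data: 5 points in 4 dimensions, seen only through their distances.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 4))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# With as many components as the original dimension, the distances are reproduced.
coords = classical_mds(dist, n_components=4)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# A 2-D embedding keeps only the two leading axes and is generally approximate.
coords_2d = classical_mds(dist, n_components=2)
```

The embedding is defined only up to rotation and reflection, so `coords` need not equal `points`; what is preserved is the matrix of pairwise distances.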

| | pourcentage |
|---|---|
| 0 | 0.924619 |
| 1 | 0.053066 |
| 2 | 0.017103 |
| 3 | 0.005212 |
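
The `pourcentage` values above sum to 1 and decrease, which is what the share of variance carried by each principal axis of a PCA looks like. That reading is an assumption, since the code that produced the table is not shown here. Under that assumption, a minimal NumPy sketch of how such shares can be computed follows; `explained_variance_ratio` and the toy data are hypothetical names used for illustration.

```python
import numpy as np


def explained_variance_ratio(data):
    """Share of total variance carried by each principal axis of `data`."""
    data = np.asarray(data, dtype=float)
    # PCA works on centred data.
    centered = data - data.mean(axis=0)
    # Squared singular values are proportional to the variance along each axis;
    # NumPy returns them sorted in decreasing order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical toy data: 100 samples, 4 features with very different spreads.
rng = np.random.default_rng(0)
toy = rng.normal(size=(100, 4)) * np.array([5.0, 1.0, 0.5, 0.1])
ratios = explained_variance_ratio(toy)
```

When the first one or two shares dominate, as in the table above, a two-dimensional plot loses little of the overall variance.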

| | 0 | 1 |
|---|---|---|
| 0 | 2.818240 | 5.646350 |
| 1 | 2.788223 | 5.149951 |
| 2 | 2.613375 | 5.182003 |
| 3 | 2.757022 | 5.008654 |
| 4 | 2.773649 | 5.653707 |
| ... | ... | ... |
| 145 | 7.446475 | 5.514485 |
| 146 | 7.029532 | 4.951636 |
| 147 | 7.266711 | 5.405811 |
| 148 | 7.403307 | 5.443581 |
| 149 | 6.892554 | 5.044292 |

150 rows × 2 columns

| | 0 | 1 |
|---|---|---|
| 0 | -3.7 | 5.1 |
| 1 | -3.2 | 4.6 |
| 2 | -3.4 | 4.7 |
| 3 | -3.3 | 4.8 |
| 4 | -3.8 | 5.2 |
| ... | ... | ... |
| 145 | -5.3 | 10.5 |
| 146 | -4.4 | 9.4 |
| 147 | -5.0 | 10.2 |
| 148 | -5.7 | 11.1 |
| 149 | -4.8 | 9.9 |

150 rows × 2 columns
\n", "