The difference between conda install and pip install: feature comparison and more [Python]

2019/5/26

Until now, I had treated pip install only as a fallback for when conda install wasn't available. If a package I wanted wasn't in the Anaconda repository, I installed it with pip, and at one point the resulting conflicts corrupted my environment.

I thought Jupyter Notebook wouldn't start, and it turned out the Anaconda environment was corrupted

This was a good opportunity to look into the difference between pip and conda, so I am leaving my notes below.

Difference between conda and pip

What is Conda

In a nutshell, conda is the package manager and environment management system that comes standard with Anaconda / Miniconda.

Anaconda is a platform that provides packages for data science (if you came here to find out the difference between conda and pip, you already know this). It installs data science languages such as Python and R together with the packages required for statistical analysis and machine learning, so you can set up a working Python environment right away.
There is also Miniconda, which ships only the minimum configuration.

conda is meant to be installed through the Anaconda installer or the Miniconda installer. Even if you add conda to a plain Python + pip environment, you cannot use it the way you would in the Anaconda distribution.

What is pip

pip is the standard Python package installer and package management system, and it comes with a plain Python installation.

It downloads and installs packages from the Python Package Index (PyPI), the repository of packages for the Python programming language.

Differences in each function

The roles of conda and pip are summarized in a simple table.

Feature | conda | pip
Package installation and management | Possible | Possible
Switching Python version | Possible | Not possible (substitute with pipenv, pyenv)
Virtual environment management | Possible | Not possible (substitute with pipenv, virtualenv, venv)

Since conda is also an environment management system, you can build a virtual environment and set its Python version to 3.7, or switch it to 2.7.

With pip, you rely on separate tools instead, such as pyenv (Python version management) and venv or virtualenv (virtual environment management).
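As a minimal sketch of the venv side of this, the standard library's venv module can create an environment directly from Python. The directory name demo-env and the helper create_environment are my own placeholders, and I pass with_pip=False only to keep the example quick and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(target: Path) -> Path:
    """Create a virtual environment at `target` and return its config file path."""
    # with_pip=False skips bootstrapping pip; pass True for a usable environment.
    venv.create(target, with_pip=False)
    return target / "pyvenv.cfg"


# "demo-env" is a hypothetical name; a temporary directory keeps the sketch tidy.
with tempfile.TemporaryDirectory() as tmp:
    config = create_environment(Path(tmp) / "demo-env")
    created = config.exists()
    print("environment created:", created)
```

Note that this only isolates packages. The Python version of the new environment is the one that ran the script, which is why a tool like pyenv is still needed for switching versions.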

pipenv has also appeared recently, and it seems to have features that can almost replace conda.

conda install and pip install

The conda install and pip install commands look similar, but the way they install packages seems to be quite different. There is an easy-to-understand comparison table on the anaconda.com site, so I will quote it here (translated and modified to make it easier for me to understand).

Item | conda install | pip install
Package format | Binary | Wheel or source
Compilation | Not required | Required (when installing from source)
Package type | Other languages are also possible | Python only
Virtual environment management, version management | Possible | Not possible (substitute with virtualenv, venv)
Dependency check | Yes | None
Package download source | Anaconda repository, Anaconda Cloud | PyPI
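When both installers have been used in one environment, it can help to see which one installed each package. The sketch below reads the INSTALLER file that installers may write into a package's metadata. pip records "pip" there, and I am assuming conda-built packages usually record "conda", so treat the grouping as a hint rather than a guarantee. The function name group_by_installer is my own.

```python
from collections import defaultdict
from importlib import metadata


def group_by_installer():
    """Map installer name -> sorted list of distribution names in this environment."""
    groups = defaultdict(list)
    for dist in metadata.distributions():
        # Broken installs can lack a name, so fall back to a placeholder.
        name = dist.metadata.get("Name") or "unknown-name"
        # INSTALLER is optional metadata; read_text returns None when it is missing.
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        groups[installer].append(name)
    return {installer: sorted(names) for installer, names in groups.items()}


groups = group_by_installer()
for installer, names in sorted(groups.items()):
    print(f"{installer}: {len(names)} package(s)")
```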

・ conda supports languages other than Python

Anaconda / Miniconda is a cross-platform distribution that lets you install multiple programming languages such as Python, R, Ruby, Java, JavaScript, C / C++, and FORTRAN, along with their packages.

conda can install software packages written in any of these languages.

pip can only install Python packages.

・ Whether compilation is required

More than 1,000 packages that can be installed with the conda command are stored in a dedicated repository called Anaconda Cloud.

Since these packages are precompiled binaries, they can be downloaded and installed without a compiler.

Packages installed with pip may be wheels or source distributions, and source distributions have to be built on the client side.

This can cause problems depending on the environment, and it seems to be one of the common stumbling blocks when building an environment. The build may also require external dependencies.
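To make the wheel / source distinction concrete, here is a rough sketch that classifies a downloaded file by its name alone. It relies on the wheel naming convention, where the name ends in a Python tag, an ABI tag, and a platform tag. The function classify_distribution and the sample filenames are illustrative, not taken from any real download log.

```python
def classify_distribution(filename: str) -> str:
    """Guess from the file name whether installing it needs a compiler."""
    if filename.endswith(".whl"):
        # Wheel names end with -{python tag}-{abi tag}-{platform tag}.whl
        platform_tag = filename[: -len(".whl")].split("-")[-1]
        if platform_tag == "any":
            return "pure-Python wheel (no compiler needed)"
        return "platform wheel (prebuilt binary, no compiler needed)"
    if filename.endswith((".tar.gz", ".zip")):
        # Source distributions are built locally; native code needs a compiler.
        return "source distribution (may need a compiler)"
    return "unknown"


# Illustrative file names only.
for sample in (
    "numpy-1.16.3-cp37-cp37m-manylinux1_x86_64.whl",
    "requests-2.22.0-py2.py3-none-any.whl",
    "numpy-1.16.3.zip",
):
    print(sample, "->", classify_distribution(sample))
```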

・ Whether or not there is a dependency check

It seems that pip does not verify that all dependencies are satisfied simultaneously each time a package is installed. As a result, conflicts occur when installed packages depend on different versions of the same package.

conda handles this with a SAT solver that collects the metadata of all packages to understand their dependencies. At install time it works out the complex dependencies between packages and performs the appropriate updates and installations.
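The idea of satisfying all dependencies at the same time can be illustrated with a toy example. This is a brute-force search over a made-up package index, not conda's actual SAT solver, and the names app_a, app_b, and libx are hypothetical. Installing the newest app_b first would pull in libx 2.0, which app_a cannot use. Looking at every constraint together finds a combination that works for both.

```python
from itertools import product

# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "app_a": {"1.0": {"libx": {"1.0", "1.1"}}},
    "app_b": {"1.0": {"libx": {"1.1", "2.0"}}, "2.0": {"libx": {"2.0"}}},
    "libx": {"1.0": {}, "1.1": {}, "2.0": {}},
}


def is_consistent(selection):
    """Check that every selected package's dependencies are met by the selection."""
    for package, version in selection.items():
        for dependency, allowed in INDEX[package][version].items():
            if selection.get(dependency) not in allowed:
                return False
    return True


def resolve(packages):
    """Return the first consistent version choice, trying newer versions first."""
    names = sorted(packages)
    # Plain string sorting is only adequate for these toy version numbers.
    candidates = [sorted(INDEX[name], reverse=True) for name in names]
    for combination in product(*candidates):
        selection = dict(zip(names, combination))
        if is_consistent(selection):
            return selection
    return None  # no combination satisfies every constraint


solution = resolve(["app_a", "app_b", "libx"])
print(solution)
```

A real solver avoids enumerating every combination, but the principle of checking all constraints together is the same.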

・ Difference in execution speed

There was also information that matrix operations are faster with the NumPy installed by conda. It depends on the type of calculation, but the difference can apparently be more than double.

There are various implementations of BLAS (Basic Linear Algebra Subprograms), which actually performs the matrix operations called by NumPy, and one of them is Intel MKL (Math Kernel Library), developed by Intel. The BLAS called from the NumPy installed by Anaconda is MKL, but when NumPy is installed with pip, a BLAS called OpenBLAS is usually used, so there may be a difference in performance here.

Difference in speed between Anaconda's NumPy and pip's NumPy - Orizuru

BLAS is a library that performs basic matrix and vector calculations. There are yet more libraries underneath the NumPy library, and I don't fully understand them ...

"How much the calculation speed actually changes depending on the BLAS" and "How to find out which BLAS is used in your environment" are covered in detail on the following page. Please take a look if you are interested.

It seems that the calculation speed changes depending on the BLAS used by NumPy [Python]
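As a quick check in your own environment, the sketch below prints NumPy's build configuration with numpy.show_config(), which lists the BLAS / LAPACK libraries NumPy was built against, and times a single matrix product. The helper report_blas and the matrix size of 500 are my own choices, and one timing is only a rough indication, not a benchmark.

```python
import importlib.util
import time


def report_blas(size: int = 500):
    """Print NumPy's build configuration and time one matrix product.

    Returns the elapsed seconds, or None when NumPy is not installed.
    """
    if importlib.util.find_spec("numpy") is None:
        print("NumPy is not installed in this environment")
        return None
    import numpy as np

    np.show_config()  # look for "mkl" or "openblas" in the printed output
    a = np.random.rand(size, size)
    b = np.random.rand(size, size)
    start = time.perf_counter()
    a @ b  # the matrix product is delegated to the underlying BLAS
    return time.perf_counter() - start


elapsed = report_blas()
if elapsed is not None:
    print(f"500 x 500 matrix product took {elapsed:.4f} s")
```

Running the same script in a conda environment and in a pip-based one should show which BLAS each uses and give a first impression of the speed difference.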

That's the difference between conda and pip.

References

Understanding Conda and Pip
https://conda.io/en/latest/
Compare the speeds of Anaconda's NumPy and PyPI's NumPy
Stop Installing Tensorflow using pip for performance sake!