What about adding also conda environments to the mix #8

Open
tritemio opened this issue Jan 28, 2016 · 9 comments

Comments

@tritemio

Docker containers are great, but oftentimes the software is simple enough that the environment can be reproduced with conda. This approach also has the benefit of being multi-platform.

It is a slightly more fragile approach, but I think it is good enough in many cases and a big improvement over manual installation.

In the spirit of "not changing your tools" it makes sense to add conda to the mix.

@betatim
Member

betatim commented Jan 28, 2016

Good point. Also thinking about vagrant.

Could you elaborate on multi-platform? For me docker solves this very nicely, in the sense that it works on linux, windows and mac (the latter two need virtualbox but fine).

I don't use conda envs enough for sharing environments, but someone once told me that `conda env export > environment.yml` would not produce something that can be easily shared across platforms. The only data point I have to contribute is that several third-party conda recipes aren't perfect and break on platforms that aren't linux.
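
To make the portability concern concrete: an exported environment typically pins each package as `name=version=build`, and the build string is usually what ties the file to one platform. Below is a minimal Python sketch, not anything this project ships, that drops the build part from such specs. The function name `strip_build_strings` and the sample package pins are made up for illustration. To my knowledge, newer conda releases offer `conda env export --no-builds` for the same purpose, and neither approach helps with packages that simply do not exist on another platform.

```python
def strip_build_strings(dependencies):
    """Return conda-style specs with the build string removed.

    `dependencies` mirrors the `dependencies:` list of an exported
    environment.yml (hypothetical helper, for illustration only).
    """
    portable = []
    for spec in dependencies:
        if not isinstance(spec, str):
            # Nested entries such as {"pip": [...]} are passed through unchanged.
            portable.append(spec)
            continue
        parts = spec.split("=")
        # "name=version=build" becomes "name=version"; shorter specs stay as-is.
        portable.append("=".join(parts[:2]) if len(parts) == 3 else spec)
    return portable


# Illustrative pins only; real exports will list different packages and builds.
exported = ["python=3.5.1=0", "numpy=1.10.4=py35_0", "cython", {"pip": ["somepackage==1.0"]}]
print(strip_build_strings(exported))
```

The result still pins exact versions, so it only works across platforms when those versions are available everywhere.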

My favourite approach is to use conda inside docker. I have a shell script which does the equivalent of `source activate this-projects-env` but drops me into the docker container. This is one of those "trivial" tools that should be part of what this project produces and then publicises.
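
A rough sketch of what such a wrapper could look like, written in Python rather than shell so it is easy to test. This is not the actual script mentioned above; the function name `docker_shell_command`, the `/work` mount point, and the assumption that the image is named after the project are all hypothetical. It only builds the `docker run` command line and does not execute it.

```python
import os
import shlex


def docker_shell_command(image, project_dir=None, shell="bash"):
    """Build a `docker run` command that opens a shell inside `image`.

    The project directory is mounted at /work (an arbitrary choice) so
    that files edited on the host are visible inside the container.
    """
    project_dir = os.path.abspath(project_dir or os.getcwd())
    return [
        "docker", "run",
        "--rm",                        # remove the container on exit
        "-it",                         # interactive terminal
        "-v", f"{project_dir}:/work",  # mount the project into the container
        "-w", "/work",                 # start in the mounted directory
        image,
        shell,
    ]


# Print the command for a hypothetical image called "this-projects-env".
cmd = docker_shell_command("this-projects-env")
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to `subprocess.call(cmd)` would start the container, assuming docker is installed and the image exists locally.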

@tritemio
Author

In general, conda environments are platform-specific only if you depend on packages that are available on just one platform. Otherwise they should work across OS X, Windows and Linux.

As a simple but not completely trivial example, I created a demo on mybinder using only a conda environment, and the environment was not created on Linux:

https://github.com/Photon-HDF5/Photon-HDF5-Converter

I currently don't run virtual machines but work on Windows, OS X and Linux using only conda. It is a lightweight approach that works if you use mostly standard and home-built packages.

@tritemio
Author

Found this slide (and the next 2 going down) on using snakemake with conda:

http://slides.com/johanneskoester/snakemake-broad-2015#/23/3

@ctb
Member

ctb commented Feb 27, 2016

-1 on specifically saying we'll support conda in the prototype; +1 on saying that we're open to including it.

@betatim
Member

betatim commented Feb 27, 2016

From talking to people it seems the pragmatists say: conda is nice but (often) isn't enough so you need something like docker (mainly related to lots of software not having conda packages but you can install them on a linux OS pretty easily). So I would vote for us saying "docker is the starting point, inside it you (the scientists) can do as you please"

@tritemio
Author

On Sat, Feb 27, 2016 at 2:36 PM, Tim Head wrote:

> From talking to people it seems the pragmatists say: conda is nice but
> (often) isn't enough so you need something like docker (mainly related to
> lots of software not having conda packages but you can install them on a
> linux OS pretty easily). So I would vote for us saying "docker is the
> starting point, inside it you (the scientists) can do as you please"

I use conda environments on 3 platforms for python code+cython extensions
and did not encounter any fundamental problem. The only issue I had with
Anaconda is the occasional (temporary) breakage of packages so that you
have to revert to an old version. But this does not affect environments: if
it worked once it will keep working.

If you depend on C/C++ libraries not included in Anaconda then yes, there
are better tools. But for the majority of researchers Python + R are
enough. Aren't those people the main target of this proposal?

I think docker is great but I fear that making it absolutely necessary will
complicate the simple workflow. You need to set up virtual machines at the
very minimum unless you run Linux, which is unlikely for entry-level users.

For the current proposal I don't think this detail is an issue but, in
general, I would rather have the option to set up a "paper" using simple
conda environments.

@betatim
Member

betatim commented Feb 28, 2016

On Sun, Feb 28, 2016 at 1:40 AM Antonino Ingargiola wrote:

> On Sat, Feb 27, 2016 at 2:36 PM, Tim Head wrote:
>
>> From talking to people it seems the pragmatists say: conda is nice but
>> (often) isn't enough so you need something like docker (mainly related to
>> lots of software not having conda packages but you can install them on a
>> linux OS pretty easily). So I would vote for us saying "docker is the
>> starting point, inside it you (the scientists) can do as you please"
>
> I use conda environments on 3 platforms for python code+cython extensions
> and did not encounter any fundamental problem. The only issue I had with
> Anaconda is the occasional (temporary) breakage of packages so that you
> have to revert to an old version. But this does not affect environments: if
> it worked once it will keep working.
>
> If you depend on C/C++ libraries not included in Anaconda then yes, there
> are better tools. But for the majority of researchers python + R are
> enough. Aren't those people the main target of this proposal?

In my biased world view (coming from particle physics) nothing goes without
supporting large and/or custom C++ environments. A lot of people (for good
and bad reasons) write large parts of their paper pipeline in C++, and to
access our data stored in a custom binary format you need
[ROOT](//root.cern.ch). So, given that we will have to spawn docker
containers for people anyway, from an ops point of view I'd suggest we allow
people to modify those as well.

> I think docker is great but I fear that making it absolutely necessary will
> complicate the simple workflow. You need to set up virtual machines at the
> very minimum unless you run linux which is unlikely for entry-level users.

I fully agree that `conda create -n nobelprize python` is easier than
getting started with and executing things inside a container. This is why I
think we need to build some command-line tools/UI so that it becomes as easy.

@ctb
Member

ctb commented Feb 28, 2016

>> I think docker is great but I fear that making it absolutely necessary will
>> complicate the simple workflow. You need to set up virtual machines at the
>> very minimum unless you run linux which is unlikely for entry-level users.
>
> I fully agree that `conda create -n nobelprize python` is easier than
> getting started with and executing things inside a container. This is why I
> think we need to build some command-line tools/UI so that it becomes as easy.

Absolutely - I still expect to see many people using their own paper
environment and I would like to give them the tools to encapsulate
their paper execution and rendering environment in such a way that it
can be run in a clean environment. Docker can be our demo and cloud
environment but it should not be the only way you can run things!

@khinsen
Collaborator

khinsen commented Feb 29, 2016

+1

We need to start with something concrete, which is Docker containers. But we can add other options later. In fact, we have to, if we want this to last longer than the latest fad in computing technology.
