Python is a great tool to have available for all sorts of tasks, including data analysis and machine learning. It’s a great language to start off with if you’re a beginner, and there are loads of tutorials out there. So, if you’re a neophyte Pythonista, work through one of those first and come back here later.
Additionally, plenty of great developers have been working on tools that just get the job done, including pandas for wrangling your data (and turning it into something that looks like a spreadsheet), as well as scikit-learn for running anything from basic statistics to more complex learning algorithms on your data.
I’ve used Python for long enough to have made most of the mistakes you can make with it, and the best piece of advice I have for anyone getting started is to use a virtual environment. You see, Python has some built-in tools that let you download and use other people’s code so you can leverage their work in your own analyses. Most of the time, this happens without a problem. But sometimes, say when a developer updates their package in a way that breaks how you’re using it, you’ll want to stick with the old version until you can try out the upgrade. Virtual environments provide a sandbox that lets you keep different versions of Python modules separate so they can’t conflict with one another.
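To make the sandbox idea concrete, here is a minimal sketch that creates an environment from Python itself. It assumes the standard library's `venv` module rather than any particular third-party tool, and the function name `create_env` and the `demo-env` directory are made up for illustration.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment in env_dir and return the path to its interpreter."""
    # with_pip=False keeps this sketch fast and offline; use with_pip=True
    # in practice so you can install packages into the environment.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    # The interpreter lives in "Scripts" on Windows and "bin" elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    # A throwaway directory is used here so the example cleans up after itself.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "demo-env"
        python = create_env(env_dir)
        print("environment config:", env_dir / "pyvenv.cfg")
        print("environment interpreter:", python)
```

Anything installed with the environment's own interpreter stays inside that directory, which is what keeps one project's package versions from clashing with another's.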
In fact, I’d suggest you do this in almost all contexts.
- Are you starting a new project and have no idea where it’s heading? Use a virtual environment.
- Are you setting up a production server so you can deploy and run your Python code? Use a virtual environment.
- Are you writing a research paper that analyzes some data that you’ll eventually publish? Use a virtual environment and then share how it’s set up with other people so they can reproduce your results.
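For the last point, sharing how an environment is set up mostly comes down to listing which packages and versions are installed in it. The sketch below is one way to do that with the standard library's `importlib.metadata`; the function name `freeze` is just an illustrative choice, and the output mimics the familiar `name==version` style.

```python
from importlib import metadata


def freeze() -> list[str]:
    """Return sorted 'name==version' lines for the packages visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip damaged installs that have no recorded name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Run this with the environment's interpreter and save the output
    # alongside your analysis so others can install the same versions.
    print("\n".join(freeze()))
```

Run inside a virtual environment, this lists only what that environment can see, which is usually a much shorter and more meaningful list than a system-wide install would give.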
So how do you go about using a virtual environment? If you’re using macOS or a Linux distribution, one of my favorite tools is pyenv, which works quite seamlessly after you’ve installed some dependencies (like the tools that actually build Python). Here’s the original guide I started off with, and I still use it as a reference if I run into any issues. The thing I really love about pyenv is that it also lets you install and manage different Python versions.
On Windows, the experience is a bit different, but I think this guide is great for getting started. That article focuses on setting up virtual environments, but the earlier ones should help out with the installation process. Looking around, it seems that Python 3.3 introduced a tool that lets Windows users switch between different Python versions, which is very nice to have. I haven’t checked it out yet, but I look forward to trying it.
In essence, if you haven’t tried out virtual environments yet, get started as soon as you can. It’s worth the time invested in getting one set up and understanding a few things under the hood (like how the command-line PATH variable and PYTHONPATH work). All in all, I’ve never regretted setting one up for even the simplest of tasks and have almost always cursed myself when I didn’t use one.
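If you want to peek under the hood, a small script like the following shows what the current interpreter sees. It is only a sketch: the helper names are made up, and the `sys.prefix` comparison is the check that applies to environments created with the standard `venv` module.

```python
import os
import sys


def in_virtualenv() -> bool:
    """Report whether this interpreter is running inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the underlying Python install.
    return sys.prefix != sys.base_prefix


def path_entries() -> list[str]:
    """Split the PATH environment variable into its non-empty directories."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
    # Activating an environment typically puts its directory first on PATH.
    print("first PATH entries:", path_entries()[:3])
    # PYTHONPATH, when set, adds extra directories to the module search path.
    print("PYTHONPATH:", os.environ.get("PYTHONPATH", "<not set>"))
    print("module search path:", sys.path[:3])
```

Running it once outside and once inside an activated environment makes the difference between the two setups easy to see.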