Monday, April 15, 2013

Installing latest Python and Scrapy on Planetlab

This post covers several issues, not all of which are specific to PlanetLab. I followed many instruction references written for CentOS, so I believe the instructions apply there too.

Running the latest version of Python (>= 2.7) on PlanetLab:
PlanetLab nodes run Fedora 8, and yum update only installs Python 2.5, so we need to build Python from source. First, install the build dependencies.
sudo yum groupinstall "Development tools"
sudo yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel

Create a directory for installing Python in your home directory. Then download and extract the Python source.
mkdir ~/python
wget http://www.python.org/ftp/python/2.7.2/Python-2.7.2.tgz
tar zxfv Python-2.7.2.tgz
find ~/python -type d | xargs chmod 0755

Install from the source code.
cd Python-2.7.2
./configure --prefix=/home/user/python
make
sudo make install

Then edit the PATH environment variable so that "python" always refers to the copy installed in the home directory. Make sure the new directory is prepended: the default /usr/bin (containing the old Python) should have lower precedence than the one in our home directory.
vi ~/.bashrc
export PATH=/home/user/python/bin/:$PATH
source ~/.bashrc

NOTE: Sometimes you may need to log out and log back in before the change to the PATH variable takes effect.
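A quick sanity check that the shell now picks up the new interpreter (the paths below assume the /home/user/python prefix used earlier):
echo $PATH      # should begin with /home/user/python/bin/
python -V       # should report: Python 2.7.2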

Installing easy_install for the Python in the local home directory:
The easiest way to install "easy_install" is with the setuptools egg. Download the egg and run it using "sh".


su
wget https://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg
sh setuptools-0.6c11-py2.7.egg

Now "which python" and "which easy_install" should both point to the versions in the home directory.
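For reference, the output should look roughly like this (again assuming the /home/user/python prefix):
which python          # expect: /home/user/python/bin/python
which easy_install    # expect: /home/user/python/bin/easy_install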

Installing Scrapy on PlanetLab:
Scrapy depends on many packages (normally installed with yum), which makes it hard to run it against the local Python 2.7 we just installed. The trick is to set up a "virtualenv" (a Python virtual environment), which lets multiple Python installations coexist and lets us use a different installation for each project.

Install virtualenv, and create a project space.
easy_install virtualenv
virtualenv --distribute project_name
source project_name/bin/activate

After the last command, your shell prompt will look like "(project_name) user@server$", indicating you are now inside the virtual environment. There should also be a local folder named "project_name".
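Inside the virtualenv, python and easy_install resolve to the project's own copies, and you can leave and re-enter it at any time (paths assume the setup above):
which python                          # expect: ~/project_name/bin/python
deactivate                            # leave the virtualenv
source project_name/bin/activate      # re-enter before continuing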

Now install the Scrapy dependencies. We cannot simply run "easy_install Scrapy", because easy_install would then pull in the latest versions of the dependencies (like pyOpenSSL), which do not work on PlanetLab. When I say they do not work, I mean I could not make them work: installing the latest pyOpenSSL gave a gcc error about missing symbols, as in [2]. So we use a hack and install an earlier version. It is sufficient for Scrapy, so we don't break any functionality (at least I have not come across any case so far).

Install pyOpenSSL 0.12 (the latest is 0.13). Pinning the version makes easy_install fetch and build that release from source instead of the newest one:
easy_install "pyOpenSSL==0.12"
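To confirm the right version went in (the pyOpenSSL releases I have used expose their version as OpenSSL.__version__):
python -c "import OpenSSL; print OpenSSL.__version__"      # expect: 0.12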

Install the remaining dependencies. These are C libraries and headers rather than Python packages, so yum is fine here: when easy_install later builds Scrapy's XML/HTML parsing dependency (lxml) inside the virtualenv, it compiles against these headers for our Python 2.7.
sudo yum install libxml2-devel
sudo yum install libxslt-devel

Now install Scrapy.
easy_install Scrapy

To test that everything is installed properly, type "python" at the shell prompt, and at the Python prompt (">>>") type "import scrapy" to check that the import succeeds.
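The session should look roughly like this; if the import returns silently there is nothing left to fix (scrapy.__version__ is present in the releases I have used, so checking it is optional):
$ python
>>> import scrapy
>>> print scrapy.__version__      # prints whichever version easy_install picked up
>>> exit()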

References:
1. http://toomuchdata.com/2012/06/25/how-to-install-python-2-7-3-on-centos-6-2/
2. http://stackoverflow.com/questions/11084863/istalling-scrapy-openssl
3. https://pypi.python.org/pypi/pyOpenSSL/0.12
4. https://pypi.python.org/pypi/setuptools#rpm-based-systems
5. http://stackoverflow.com/questions/10927492/getting-gcc-failed-error-while-installing-scrapy
