Cloud infrastructure in a nutshell

1. Purpose of this guide

This guide provides a complete walk-through of how to install and configure a cloud manager and run distributed jobs through a simple queue manager.

The main components are built around:

2. Infrastructure

All of the following steps are presented for this kind of infrastructure:

 
3. Create the first virtual machine for the OpenNebula manager (nebula-manager-groupxx)

As a first step, we install the OpenNebula software on a persistent virtual machine created on the jnws024 hypervisor.

3.1 Start Virtual Machine Manager on your virtual desktop

Connect to your virtual desktop at the URL http://YOURHOST.ihep.ac.cn:8080/vnc.html?host=YOURHOST.ihep.ac.cn&port=8080

Open a Terminal (Applications >> System Tools >> Terminal) and install Virtual Machine Manager with "yum install virt-manager". Accept the installation of the required packages and follow the instructions from Virtual Machine Manager. At the end of the installation, restart the libvirtd daemon:

 

# service libvirtd restart

 

After this operation, you can start virt-manager:

 

# LANG=C virt-manager

 

 

Create a new connection (File >> Add connection ...) to jnws024 as user "root" (password ihep;test), and use this connection for the next steps.

 
3.2 Create a Virtual Machine

Click on the appropriate icon to create a new VM, give it a name according to your group number (nebula-manager-groupYOUR_GROUP_NUMBER), and choose Network Install.

 

Enter the URL: http://202.122.33.67/yum/scientific/6.5/x86_64/os/

 

Set the RAM to 2048 MB and 1 CPU.

 

Select a new disk image of 20 GB, entirely allocated.

 

Verify that all the parameters are correct, especially the bridged networking.

 

Click on Finish and switch to the console tab.

 
3.3 Start the installation of the operating system

Choose the English language and the US keyboard type, then choose manual configuration (use the arrow keys and the spacebar):

 

Enter the network parameters of your group:

 

Move on to the next screen and confirm.

 

Select Yes, discard any data

 

Give it a name: nebula-manager-groupYOUR_GROUP_NUMBER.ihep.ac.cn (remember to use the FQDN!)

 

Set the root password: ihep;test

 

Continue by pressing Next

 

Confirm the disk layout and press Next

 

Confirm you want to Write changes to disk …

 

The Basic Server layout is enough for us.

 

Wait a few minutes; when the installation is finished, click on Reboot.

 

Congratulations, your operating system is ready. After the reboot, try to access your new Nebula Manager virtual machine via ssh from your laptop (using PuTTY, MobaXterm, ...):

ssh -l root 192.168.64.XXX

 

Configure the NTP server: delete all the server lines in /etc/ntp.conf and add this one:

server ntp1.ihep.ac.cn
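If you prefer to do this from the command line rather than with an editor, a minimal sketch (assuming the stock /etc/ntp.conf, where the default entries all start with "server") is:

# sed -i 's/^server /#server /' /etc/ntp.conf

# echo "server ntp1.ihep.ac.cn" >> /etc/ntp.conf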

 

and adjust the date:

# service ntpd stop

# ntpdate ntp1.ihep.ac.cn

# service ntpd start

 

Disable SELinux immediately:

 

# echo 0 > /selinux/enforce

 

and make this modification persistent by opening the file /etc/selinux/config and changing the SELINUX=xxxxx line to:

 

SELINUX=disabled
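The same change can also be made non-interactively; a one-line sketch, assuming the standard /etc/selinux/config layout:

# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config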

 

3.4 Configure a global proxy to gain Internet access

 

Add these three lines at the end of the file /root/.bashrc on your nebula manager:

 

export http_proxy=202.122.33.53:3128

export ftp_proxy=202.122.33.53:3128

export https_proxy=202.122.33.53:3128

 

And test immediately; if the curl command output looks like this, it works:

 

# source .bashrc

# curl https://wtfismyip.com/text

202.122.33.53

 

3.5 Start the installation of OpenNebula on the nebula manager node

Detailed info can be found at http://docs.opennebula.org/4.8/design_and_installation/building_your_cloud/ignc.html

Add the OpenNebula repository in the file /etc/yum.repos.d/opennebula.repo:

[opennebula]

name=opennebula

baseurl=http://downloads.opennebula.org/repo/4.8/CentOS/6/x86_64

enabled=1

gpgcheck=0
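If you prefer to create the repository file from the shell, a heredoc with the same content (just a convenience, equivalent to editing the file) works as well:

# cat > /etc/yum.repos.d/opennebula.repo <<EOF
[opennebula]
name=opennebula
baseurl=http://downloads.opennebula.org/repo/4.8/CentOS/6/x86_64
enabled=1
gpgcheck=0
EOF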

 

Add the EPEL repository and accept the import of all GPG keys:

# yum install epel-release

 

and install the software, confirming the EPEL GPG key:

# yum install opennebula-server opennebula-sunstone opennebula-ruby

 

Complete the installation of the Ruby gems by selecting 0:

# /usr/share/one/install_gems

Select your distribution or press enter to continue without

installing dependencies.

 

0. CentOS/RedHat

1. Ubuntu/Debian

2. SUSE

       

0

...

...

...

Building native extensions.  This could take a while...

Successfully installed curb-0.8.8

Successfully installed builder-3.2.2

Successfully installed trollop-2.1.2

Successfully installed polyglot-0.3.5

Successfully installed treetop-1.6.2

Successfully installed parse-cron-0.1.4

Successfully installed multi_json-1.11.1

Successfully installed jmespath-1.0.2

Successfully installed aws-sdk-core-2.1.1

Successfully installed aws-sdk-resources-2.1.1

Successfully installed aws-sdk-2.1.1

Building native extensions.  This could take a while...

Successfully installed ox-2.2.0

16 gems installed

 

 

Start OpenNebula and verify the installation:

 

# su - oneadmin

$ one start

$ onevm list

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME

 

Write down the oneadmin password (it will be useful later):

 

$ cd .one

$ ll

total 4

-rw-r--r--. 1 oneadmin oneadmin 42 Jun 23 20:40 one_auth

$ cat one_auth

oneadmin:4689b636b7bde5565a0d95e865bc4609

 

4. Configure your host to be a hypervisor

First of all, we need to decide what kind of virtual network infrastructure we want. A simple solution is to use a virtual bridge to connect the VMs to the physical network.

 

Connect to your hypervisor via the virtual desktop http://YOURHOST.ihep.ac.cn:8080/vnc.html?host=YOURHOST.ihep.ac.cn&port=8080 (or via an ssh client from your laptop).

Disable NetworkManager permanently:

# service NetworkManager stop

# chkconfig NetworkManager off

 

Launch virt-manager:

 

# LANG=C virt-manager

 

 

And connect to localhost (add a new connection if necessary), NOT to jnws024!

 

Configure a virtual bridge (named bridged) on this host using virt-manager (Edit >> Connection details >> Network interfaces).

 

 

Add a network interface (+) and choose the Bridge type:

 

 

Click on Forward and complete the form as follows.

 

Click Configure next to IP settings and add the network parameters of your hypervisor.

 

Before clicking on Finish, ask the teacher to check your settings.

Complete the operation by pressing Finish.

Verify that the bridge is correctly configured with the brctl command, and set the AGEING parameter to 0 (if you skip this step you may experience random problems later; remember to re-run this command after any reboot of the hypervisor):

 

# brctl setageing bridged 0
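Because this setting does not survive a reboot, one option (a suggestion, not part of the original procedure) is to append the command to /etc/rc.local so it is re-applied at boot, and to check the bridge with brctl show:

# echo "brctl setageing bridged 0" >> /etc/rc.local

# brctl show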

 

4.1 Create a Virtual Machine via virt-manager on your hypervisor

Follow the steps used for the installation of the nebula-manager: create a Virtual Machine, use a network installation, give it the name templatevm, and install a base system (pay attention: unlike in step 3.2, we now want to use the QCOW2 disk image format, and the connection is localhost instead of jnws024!).

To specify a qcow2 image you need to click on 'Select managed or other existing storage', go into default under Storage Pools, and then click New Volume:

 

 

Click on Browse... and New Volume to create a qcow2 image, fill in the form as follows, click Finish and then Choose Volume.

 

Now you can continue the installation process as usual.

 

 

After the reboot, connect to this new virtual machine:

 

ssh -l root <IP_NEW_VM>

 

Configure the proxy as done before and install the OpenNebula packages required for contextualization (http://dev.opennebula.org/projects/opennebula/files):

 

# wget http://dev.opennebula.org/attachments/download/804/one-context_4.8.0.rpm

# yum localinstall one-context_4.8.0.rpm

 

Now we can shut down this virtual machine and use its disk as the "golden image" from which new virtual machines will be instantiated in the future. This will be done after a few more steps.

 

# init 0

 

4.2 Configure your hypervisor (jnws0XX) to become a suitable host for nebula

More info can be found at http://docs.opennebula.org/4.8/design_and_installation/building_your_cloud/ignc.html#step-5-node-installation

Configure the proxy server if needed and install some specific software from the OpenNebula repositories. First, add a new repository in the file /etc/yum.repos.d/opennebula.repo:

[opennebula]

name=opennebula

baseurl=http://downloads.opennebula.org/repo/4.8/CentOS/6/x86_64

enabled=1

gpgcheck=0

 

# yum install opennebula-node-kvm

 

Restart libvirtd (ignore errors on stopping):

 

# service libvirtd restart

 

Add the nebula manager IP address to the file /etc/hosts:

 

# echo 192.168.64.XX nebula-manager-groupxx >> /etc/hosts

 

and verify with the ping nebula-manager-groupxx command.

4.3 Configure SSH passwordless access between nebula-manager-groupXX and your hypervisor

Log in as "root" on nebula-manager-groupXX and create an RSA key pair for the oneadmin user (confirm the defaults):

 

# su - oneadmin

$ ssh-keygen

 

Add some ssh client options to the file /var/lib/one/.ssh/config:

 

ConnectTimeout 5

Host *

        StrictHostKeyChecking no

 

And set the right permissions:

$ chmod 700 /var/lib/one/.ssh/config

 

Disable the http proxy for the oneadmin user and choose an editor by adding these lines to the .bashrc file:

unset http_proxy

unset ftp_proxy

unset https_proxy

EDITOR=nano

export EDITOR

 

Make oneadmin trust its own public key:

$ cat /var/lib/one/.ssh/id_rsa.pub >> /var/lib/one/.ssh/authorized_keys

 

Create an archive of the ssh settings and copy it to your hypervisor:

 

$ tar -Pcvf /tmp/ssh_oneadmin.tar /var/lib/one/.ssh

/var/lib/one/.ssh/

/var/lib/one/.ssh/id_rsa.pub

/var/lib/one/.ssh/id_rsa

/var/lib/one/.ssh/authorized_keys

/var/lib/one/.ssh/known_hosts

/var/lib/one/.ssh/config

/var/lib/one/.ssh/id_dsa

/var/lib/one/.ssh/id_dsa.pub

 

$ scp /tmp/ssh_oneadmin.tar root@jnws0XX:/tmp/

$ ssh root@jnws0XX tar -Pxvf /tmp/ssh_oneadmin.tar

 

Test the passwordless connection:

$ ssh jnws0XX uname -a

Linux jnws0XX 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 04:15:08 CET 2013 x86_64 x86_64 x86_64 GNU/Linux

 

4.4 Mount the Datastore on nebula-manager-groupxx and on the hypervisor

On both nebula-manager-groupxx and your hypervisor jnws0XX, mount the NFS datastore at the same location:

# echo jnws024:/Datastore/groupxx /var/lib/one/datastores nfs mountvers=3,defaults,_netdev 0 0 >> /etc/fstab

# mkdir -p /var/lib/one/datastores

# mount -a

# cd /var/lib/one/datastores

# mkdir -p 0

# mkdir -p 1

# chown oneadmin. .

# chown oneadmin. 0 1
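A quick sanity check that the NFS datastore is really mounted and owned by oneadmin (run it on both machines; the sizes will differ in your setup):

# df -h /var/lib/one/datastores

# ls -ld /var/lib/one/datastores/0 /var/lib/one/datastores/1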

 

5. Configure the Datastore on nebula-manager-groupxx and complete the cloud manager setup

Edit the default datastore (ID=1) with the right parameters as user "oneadmin":

$ onedatastore update 1

BASE_PATH="/var/lib/one//datastores/"

CLONE_TARGET="SYSTEM"

DISK_TYPE="FILE"

DS_MAD="fs"

LN_TARGET="NONE"

TM_MAD="qcow2"

TYPE="IMAGE_DS"

 

Add your hypervisor jnws0XX under the control of nebula-manager-groupxx:

 

$ onehost create jnws0XX -i kvm -v kvm -n dummy

ID: 0

$ onehost list

  ID NAME            CLUSTER   RVM      ALLOCATED_CPU      ALLOCATED_MEM STAT  

   0 jnws0XX        -           0                  -                  - init

 

$ onehost list

  ID NAME            CLUSTER   RVM      ALLOCATED_CPU      ALLOCATED_MEM STAT  

   0 jnws0XX        -           0      0 / 1600 (0%)    0K / 11.7G (0%) on    

 

Configure a virtual network (http://docs.opennebula.org/4.8/user/virtual_resource_management/vgg.html): create a file named vnet.txt with this information, paying attention to the IP values and checking them against your own parameters:

NAME        = "Private Network"

DESCRIPTION = "A private network for VM inter-communication"

 

BRIDGE = "bridged"

 

# Context attributes

NETWORK_ADDRESS = "10.10.0.0"

NETWORK_MASK    = "255.255.255.0"

DNS             = "XXX.XXX.XXX.XXX"

GATEWAY         = "10.10.0.1"

 

#Address Ranges, only these addresses will be assigned to the VMs

AR=[

    TYPE = "IP4",

    IP   = "10.10.0.xx",

    SIZE = "15"

]

 

 

 

Use this file to define a virtual network for your cloud:

$ onevnet create vnet.txt

ID: 0

$ onevnet list

  ID USER            GROUP        NAME                CLUSTER    BRIDGE   LEASES

   0 oneadmin        oneadmin     Private Network     -          bridged       0

 

Add your first VM image (the one created earlier with virt-manager).

(http://docs.opennebula.org/4.8/user/virtual_resource_management/img_guide.html)

Copy the disk image from the standard libvirt path on your hypervisor to nebula-manager-groupxx:

$ scp root@jnws0XX:/var/lib/libvirt/images/templateVM.img /tmp/

 

Import the VM disk as an image. Create a file named slc65_img.one like this (if you use cat, terminate the input with Ctrl-D):

 

$ cat > slc65_img.one

NAME          = "SLC65"

PATH          = /tmp/templateVM.img

TYPE          = OS

DESCRIPTION   = "SLC65 base installation."

 

And use it to register the operating system image:

$ oneimage create slc65_img.one --datastore default

 

Wait for the copy to finish; you can monitor it with dstat:

 

$ dstat

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--

usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw

  2   1  95   1   0   0| 104k  208k|   0     0 |   0     0 | 208   841

  1  18   0  76   0   5|   0    48k|  45M   74M|   0     0 |5541  6016

  4  37   0  50   0   9|2112k  152k| 102M   13M|   0     0 |6803  9070

  3  32   0  53   0  11|   0     0 |  82M   51M|   0     0 |6446  8045

...
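You can also follow the image state with oneimage list: the image is usable once its state changes from locked to ready.

$ oneimage list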

 

Create your first template (http://docs.opennebula.org/4.8/user/virtual_resource_management/vm_guide.html). Create a file named vm.txt with this information, checking it against your own parameters (in particular, IMAGE_ID must match the ID of the image you just created, as shown by oneimage list):

 

NAME   = vm<YOUR GROUPNUMBER>

MEMORY = 1024

CPU    = 1

 

DISK = [ IMAGE_ID=7, DRIVER="qcow2", BUS="virtio", DEV_PREFIX="vd", TARGET="vda" ]

 

NIC = [ NETWORK = "Private Network", NETWORK_UNAME="oneadmin" ]

 

CONTEXT=[ HOSTNAME="vm<YOUR GROUPNUMBER>-$VMID.ihep.ac.cn",FILES="/var/lib/one/init.sh", NETWORK=YES ]

 

GRAPHICS = [

  TYPE    = "vnc",

  LISTEN  = "0.0.0.0"]

 

Use this file to define your first template:

$ onetemplate create vm.txt

 

And now let's try to instantiate your first VM in the cloud:

 

$ onetemplate instantiate 0

$ onevm list

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME

     0 oneadmin oneadmin test-vm-0       runn    0      0K jnws0XX     0d 00h00
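To find out which IP address was leased to the new VM (needed for ssh access), inspect it with onevm show and look at the NIC / context information (0 here is the VM ID reported by onevm list):

$ onevm show 0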

5.1 Enable the web interface - Sunstone

In order to enable the Sunstone web interface, install additional software on nebula-manager-groupxx (more info at http://docs.opennebula.org/4.8/administration/sunstone_gui/sunstone.html):

 

# /usr/share/one/install_gems sunstone

# yum install novnc

 

Edit /etc/one/sunstone-server.conf and modify the host line as follows:

:host: 0.0.0.0

 

Disable the firewall:

 

# iptables -F

# chkconfig iptables off

 

Start Sunstone as user oneadmin:

 

$ sunstone-server start

 

Connect to the Sunstone web interface from your laptop, using the oneadmin password saved earlier:

http://<IP ADDRESS OF YOUR NEBULA MANAGER>:9869/

login : oneadmin

password : 4689b636b7bde5565a0d95e865bc4609

 

6. Contextualization

Contextualization can be done via an external script executed on the VM during the boot procedure (more information at http://docs.opennebula.org/4.8/user/virtual_machine_setup/cong.html). In this script, software installation, custom configuration and many other tasks can be automated, without the need to create a new image and/or a new template. In this way, changes to the VM configuration can be applied much faster.

Create a custom init.sh script as user oneadmin on nebula-manager-groupxx to install CVMFS at the first boot (check the CVMFS_HTTP_PROXY value!):

#!/bin/bash

 

setup_cvmfs(){

    # fuse      

    yum -y install fuse

    yum clean all

    # cvmfs repo

    cd /etc/yum.repos.d/

    wget http://cvmrepo.web.cern.ch/cvmrepo/yum/cernvm.repo

    # repo key

    cd /etc/pki/rpm-gpg/

    wget http://cvmrepo.web.cern.ch/cvmrepo/yum/RPM-GPG-KEY-CernVM

    # install cvmfs stuff

    yum -y install cvmfs cvmfs-init-scripts cvmfs-auto-setup cvmfs-keys

    # configure cvmfs

    touch /etc/cvmfs/default.local

    cat > /etc/cvmfs/default.local <<EOF

CVMFS_REPOSITORIES=boss.cern.ch

CVMFS_HTTP_PROXY="http://10.10.0.99:8080"

# CVMFS_CACHE_DIR=/scratch/cvmfs/boss

CVMFS_CACHE_DIR=/var/cvmfs-cache

CVMFS_CACHE_BASE=/var/cvmfs-cache

# CVMFS_QUOTA_LIMIT=30720

CVMFS_QUOTA_LIMIT=5700

EOF

 

    touch /etc/cvmfs/config.d/boss.cern.ch.conf

    chmod ugo+x /etc/cvmfs/config.d/boss.cern.ch.conf  

    mkdir -p /var/cvmfs-cache

    chown cvmfs. /var/cvmfs-cache -R

    cat  > /etc/cvmfs/config.d/boss.cern.ch.conf <<EOF  

#!/bin/sh

repository_start() {

   [ ! -L /opt/boss ] && ln -s /cvmfs/boss.cern.ch /opt/boss

}

repository_stop() {

   [ -L /opt/boss ] && rm -f /opt/boss

}

EOF

    ln -s /cvmfs/boss.cern.ch /opt/boss

 

    cvmfs_config reload

    # restart autofs

    echo "Start cvmfs services"

    service autofs restart

    cvmfs_config probe

 

    # avoid automatic updates of cvmfs

    echo "exclude=cvmfs*" >> /etc/yum.conf

 

}

 

(

export http_proxy=202.122.33.53:3128
export ftp_proxy=202.122.33.53:3128

export https_proxy=202.122.33.53:3128

echo "remove selinux"

echo 0 > /selinux/enforce

echo "*************Setup CVMFS"

date

# cvmfs

setup_cvmfs

 

) 2>&1 | tee -a /var/log/context.log

 

 

Edit your template, adding the FILES option to the CONTEXT section:

$ onetemplate update 0

...

CONTEXT=[ HOSTNAME="vm<YOUR GROUPNUMBER>-$VMID.ihep.ac.cn",FILES="/var/lib/one/init.sh", NETWORK=YES ]

...

 

Instantiate a new VM and check whether CVMFS is working. Connect via ssh to the new VM and enter these commands:

# cd /opt/boss

# df -h .

Filesystem      Size  Used Avail Use% Mounted on

cvmfs2          5.6G   46M  5.6G   1% /cvmfs/boss.cern.ch

7. Install the Torque PBS server on nebula-manager-groupxx

In order to install the PBS server on nebula-manager-groupxx, you need to install some additional software:

# yum install libxml2-devel openssl-devel gcc gcc-c++ boost-devel

 

Now download the Torque PBS sources (http://wpfilebase.s3.amazonaws.com/torque/torque-2.5.13.tar.gz), then build and install:

# wget http://wpfilebase.s3.amazonaws.com/torque/torque-2.5.13.tar.gz

# tar -zxf torque-2.5.13.tar.gz

# cd torque-2.5.13/

# ./configure

# make

# make install

 

Add a line with the IP address of your nebula-manager-groupxx to /etc/hosts:

<IP ADDRESS OF YOUR NEBULA MANAGER> nebula-manager-groupxx.ihep.ac.cn nebula-manager-groupxx

 

Save all the configuration parameters in a file /var/spool/torque/server.conf, editing the server managers and server operators lines:

delete queue long

 

create queue long

set queue long queue_type = Execution

set queue long resources_max.cput = 72:00:00

set queue long resources_max.walltime = 72:00:00

set queue long Priority = 100

 

# equal to the max number of cores

# can be modified later: qmgr -c "set queue long max_running = XXX"

set queue long max_running = 6

 

set queue long enabled = True

set queue long started = True

 

# Set server attributes.

#

set server scheduling = True

set server managers = root@nebula-manager-groupxx.ihep.ac.cn

set server operators = root@nebula-manager-groupxx.ihep.ac.cn

set server log_events = 511

set server mail_from = adm

set server query_other_jobs = True

set server scheduler_iteration = 600

set server node_check_rate = 150

set server tcp_timeout = 6

set server node_pack = False
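Remember to replace the groupxx placeholders with your actual group number before loading the file; a hypothetical one-liner:

# sed -i 's/groupxx/group<YOUR GROUP NUMBER>/g' /var/spool/torque/server.conf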

 

Initialize the server database, start the PBS daemons, and then load the configuration:

# pbs_server -t create

 

# qterm -t quick

 

# pbs_server

# pbs_sched

# pbs_mom

 

 

# cat /var/spool/torque/server.conf | qmgr

Max open servers: 10239

qmgr obj=long svr=default: Unknown queue

 

Check the server status (the "Unknown queue" message above is expected on the first run, since the configuration begins by deleting a queue that does not exist yet):

# qmgr -c "list server"

Server nebula-manager-groupxx.ihep.ac.cn

        server_state = Active

        scheduling = True

        total_jobs = 0

        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0

        acl_hosts = nebula-manager

        managers = root@nebula-manager-groupxx

        operators = root@nebula-manager-groupxx

        log_events = 511

        mail_from = adm

        query_other_jobs = True

        scheduler_iteration = 600

        node_check_rate = 150

        tcp_timeout = 6

        node_pack = False

        pbs_version = 2.5.13

        next_job_number = 0

        net_counter = 2 2 2

 

Check the queue configuration (queue name: long):

# qmgr -c "list queue long"

Queue long

        queue_type = Execution

        Priority = 100

        total_jobs = 0

        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0

        max_running = 6

        resources_max.cput = 72:00:00

        resources_max.walltime = 72:00:00

        mtime = Mon Jul 13 21:36:56 2015

        enabled = True

        started = True

 

Check the queue status:

# qstat -q

 

server: nebula-manager

 

Queue            Memory CPU Time Walltime Node  Run Que Lm  State

---------------- ------ -------- -------- ----  --- --- --  -----

long               --   72:00:00 72:00:00   --    0   0  6   E R

                                               ----- -----

                                                   0     0

7.1 Modify init.sh to install the Torque PBS client automatically

Installation on the client requires some more work: create some users to run jobs, mount the shared /home (exported by the storage1 NFS server), install Torque PBS, and start the client components.

#!/bin/bash

 

 

setup_cvmfs(){

    # fuse      

    yum -y install fuse

    yum clean all

    # cvmfs repo

    cd /etc/yum.repos.d/

    wget http://cvmrepo.web.cern.ch/cvmrepo/yum/cernvm.repo

    # repo key

    cd /etc/pki/rpm-gpg/

    wget http://cvmrepo.web.cern.ch/cvmrepo/yum/RPM-GPG-KEY-CernVM

    # install cvmfs stuff

    yum -y install cvmfs cvmfs-init-scripts cvmfs-auto-setup cvmfs-keys

    # configure cvmfs

    touch /etc/cvmfs/default.local

    cat > /etc/cvmfs/default.local <<EOF

CVMFS_REPOSITORIES=boss.cern.ch

CVMFS_HTTP_PROXY="http://10.10.0.XX:8080"

# CVMFS_CACHE_DIR=/scratch/cvmfs/boss

CVMFS_CACHE_DIR=/var/cvmfs-cache

CVMFS_CACHE_BASE=/var/cvmfs-cache

# CVMFS_QUOTA_LIMIT=30720

CVMFS_QUOTA_LIMIT=5700

EOF

 

    touch /etc/cvmfs/config.d/boss.cern.ch.conf

    chmod ugo+x /etc/cvmfs/config.d/boss.cern.ch.conf  

    mkdir -p /var/cvmfs-cache

    chown cvmfs. /var/cvmfs-cache -R

    cat  > /etc/cvmfs/config.d/boss.cern.ch.conf <<EOF

#!/bin/sh

repository_start() {

   [ ! -L /opt/boss ] && ln -s /cvmfs/boss.cern.ch /opt/boss

}

repository_stop() {

   [ -L /opt/boss ] && rm -f /opt/boss

}

EOF

    ln -s /cvmfs/boss.cern.ch /opt/boss

 

    cvmfs_config reload

    # restart autofs

    echo "Start cvmfs services"

    service autofs restart

    cvmfs_config probe

 

    # avoid automatic updates of cvmfs

    echo "exclude=cvmfs*" >> /etc/yum.conf

 

}

 

setup_pbs() {

        # install software

        yum -y install libxml2-devel openssl-devel gcc gcc-c++ boost-devel

        wget http://wpfilebase.s3.amazonaws.com/torque/torque-2.5.13.tar.gz

        tar -zxf torque-2.5.13.tar.gz

        cd torque-2.5.13/

        ./configure

        make

        make install

        # add entry into /etc/hosts ... replace DNS

        echo "<IP ADDRESS OF YOUR NEBULA MANAGER> nebula-manager-groupxx.ihep.ac.cn nebula-manager-groupxx" >> /etc/hosts

        # config client

        cat > /var/spool/torque/mom_priv/config <<EOF

\$pbsserver      nebula-manager-groupxx

\$logevent       255

EOF

        # start pbs_mom

        /usr/local/sbin/pbs_mom       

}

 

function other()

{

    # switch off iptables ;(

    service iptables stop

    chkconfig iptables off

 

    # switch off selinux

    setenforce 0

    # remove autoupdate

    /sbin/service yum-autoupdate stop

    /sbin/chkconfig --del yum-autoupdate

    # setting hostname

   hostname $HOSTNAME

    # add entry to hosts

   echo $ETH0_IP $HOSTNAME >> /etc/hosts

        # make some batch user

        groupadd -g 500 pluto

        useradd -u 500 -g pluto -M pippo

        # mount /home

        echo "jnws024:/SharedHome/groupxx  /home nfs  mountvers=3,defaults,_netdev 0 0 " >> /etc/fstab

        mount -a

}

 

(

export http_proxy=202.122.33.53:3128
export ftp_proxy=202.122.33.53:3128

export https_proxy=202.122.33.53:3128

echo "remove selinux"

echo 0 > /selinux/enforce

echo "*************Setup CVMFS"

date

# disable autoupdate

other

# cvmfs

setup_cvmfs

# pbs

setup_pbs

# print env

env

) 2>&1 | tee -a /var/log/context.log

 

Instantiate a new VM and verify that it boots properly, that CVMFS is working, and that /home is NFS-mounted.
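Once logged in on the new VM, a quick check for both (a sketch; listing the repository also triggers the CVMFS automount):

# df -h /home

# ls /cvmfs/boss.cern.ch | head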

Add the new node to the /etc/hosts file on the server (nebula-manager-groupxx):

 

<IP ADDRESS OF YOUR NEW VM> vm<YOUR GROUPNUMBER>-<NAME OF YOUR NEW VM>

 

Add the node to the PBS server node list by adding a line to /var/spool/torque/server_priv/nodes:

vm<YOUR GROUPNUMBER>-<NAME OF YOUR NEW VM> np=1

 

 

Restart the Torque PBS daemons:

 

# killall -9 pbs_server

# killall -9 pbs_sched

# killall -9 pbs_mom

# pbs_server

# pbs_sched

# pbs_mom

 

Test the visibility of the Torque PBS client from the server side (if necessary, restart pbs_mom on the client):

 

# pbsnodes

vmXX

     state = free

     np = 1

     ntype = cluster

     status = rectime=1436821133,varattr=,jobs=,state=free,netload=225458219,gres=,loadave=0.00,ncpus=1,physmem=1020392kb,availmem=2956520kb,totmem=3117536kb,idletime=1160,nusers=0,nsessions=? 0,sessions=? 0,uname=Linux vm26 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 04:15:08 CET 2013 x86_64,opsys=linux

     gpus = 0

 

7.2 Play with Torque PBS

Mount /home on your nebula manager, and create a batch user to run jobs:

# echo "jnws024:/SharedHome/groupXX /home nfs  mountvers=3,defaults,_netdev 0 0 " >> /etc/fstab

# mount -a

# groupadd -g 500 pluto

# useradd -u 500 -g pluto -m pippo

 

As user pippo (e.g. after su - pippo), create an RSA key pair and trust the user's public key and the host key:

$ ssh-keygen

$ cd .ssh

$ cat id_rsa.pub > authorized_keys

$ chmod 644 authorized_keys

$ ssh pippo@nebula-manager-groupXX.ihep.ac.cn

The authenticity of host 'nebula-manager-groupXX.ihep.ac.cn (192.168.64.145)' can't be established.

RSA key fingerprint is 4d:44:b6:52:3b:35:c2:68:9f:b1:55:dd:07:de:b1:fa.

Are you sure you want to continue connecting (yes/no)? yes

 

Create a test job as user pippo: create a file named test.sh like this:

#!/bin/bash

sleep 100s

 

pwd

 

hostname

 

Give it the right permissions:

$ chmod 755 test.sh

 

 

Send many jobs of the same kind:

$ for var in `seq 1 10` ; do qsub -q long test.sh -j oe; sleep 1; done

12.nebula-manager-groupxx

13.nebula-manager-groupxx

14.nebula-manager-groupxx

15.nebula-manager-groupxx

16.nebula-manager-groupxx

17.nebula-manager-groupxx

18.nebula-manager-groupxx

19.nebula-manager-groupxx

20.nebula-manager-groupxx

21.nebula-manager-groupxx

and watch the queue:

$ qstat -n1

 

nebula-manager:

                                                                                        Req'd    Req'd       Elap

Job ID                        Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time

----------------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------

12.nebula-manager-groupxx       pippo       long     test.sh            9201   --     --     --        --  R  00:00:00   vmXX/0

13.nebula-manager-groupxx       pippo       long     test.sh            9206   --     --     --        --  R  00:00:00   vmXX/0

14.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

15.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

16.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

17.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

18.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

19.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

20.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --

21.nebula-manager-groupxx       pippo       long     test.sh             --    --     --     --        --  Q       --     --
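When the jobs finish, their output is copied back to the submission directory in files named after the script and job ID (stderr is merged into stdout because of the -j oe option); for example:

$ ls test.sh.o*

$ cat test.sh.o12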

 

Configure BOSS and experiment with more job submissions.