
Services and Computer Systems for the Internet. Academic year 2008-2009

Chapter 4. Jobs with dependencies in Condor

Table of contents

4.1. Jobs with dependencies
4.2. Proposed exercise

4.1. Jobs with dependencies

Sometimes it is necessary to run several jobs that depend on one another. The classic example starts with a task that splits the input data, continues with multiple tasks that each process one part of the data, and ends with a task that combines the results of the processing tasks. This kind of dependency is expressed as a DAG (Directed Acyclic Graph). In Condor, such jobs are executed by DAGMan (DAG Manager).

In the example we are going to build there is a task that prepares the data (setup), whose only purpose is to print a number. Next, several tasks (work1 and work2) process the data generated by the first task; in this case, they divide the number by two. Finally, another task (finalize) combines the results by adding them together.
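
Graphically, the dependencies of this example form a diamond:

            setup
           /     \
       work1     work2
           \     /
          finalize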

Before starting the exercise, let's create a new directory to hold it.

$> mkdir ~/condor/dag
$> cd ~/condor/dag

Now let's create the programs (finish the input of each cat command by pressing Ctrl-D):

$> cat > setup
#!/bin/sh
# Print a random number on standard output.
echo $RANDOM
$> cat > work
#!/bin/sh
# Read a number from standard input and print its half (integer division).
read num
expr $num / 2
$> cat > finalize
#!/bin/sh
# Add up the numbers stored in the files given as arguments.
sum=0
for f in "$@"
do
	num=`cat $f`
	sum=`expr $num + $sum`
done
echo $sum

We give them execute permission:

$> chmod +x setup
$> chmod +x work
$> chmod +x finalize

We check that everything works:

$> ./setup | ./work > work1.out
$> ./setup | ./work > work2.out
$> ./finalize work1.out work2.out
18364

The setup program prints a random number on standard output. The work program reads a number from standard input, divides it by two and prints the result on standard output. The finalize program receives a list of files as arguments, reads their contents (which must be numbers), adds them up and prints the result. (Note that $RANDOM is a bash feature; if /bin/sh on your system is not bash, setup will print an empty line.)

Next, let's create the files with the job descriptions, one for each task (setup, work1, work2 and finalize).

$> cat > setup.sub
Universe   = vanilla 
Executable = setup
output     = setup.out
log        = job.log
Error      = setup.error
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
Queue
$> cat > work1.sub
Universe   = vanilla 
Executable = work
input      = setup.out
output     = work1.out
error      = work1.error
log        = job.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = setup.out
Queue
$> cat > work2.sub
Universe   = vanilla 
Executable = work
input      = setup.out
output     = work2.out
error      = work2.error
log        = job.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = setup.out
Queue
$> cat > finalize.sub
Universe   = vanilla 
Executable = finalize
arguments  = work1.out work2.out
output     = finalize.out
error      = finalize.error
log        = job.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = work1.out,work2.out
Queue

Note

Note the transfer_input_files parameter: it is needed to transfer the files this task requires. Remember that we are now using the vanilla universe, which has no remote system-call mechanism, so input files must be transferred explicitly (or be reachable through a shared filesystem).

In addition, we have to create a file describing the dependencies among the tasks:

$> cat > job.dag
Job  setup  setup.sub 
Job  work1  work1.sub 
Job  work2  work2.sub        
Job  finalize  finalize.sub
PARENT setup  CHILD work1 work2
PARENT work1 work2 CHILD finalize
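
The Job lines bind each node name to its submit file, and the PARENT ... CHILD lines state that a child node may start only after all of its parents have finished. The DAG language has more directives than we use here; as a sketch (not needed for this exercise), a RETRY line asks DAGMan to resubmit a node automatically when it fails:

RETRY work1 3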

Now all that is left is to submit the job and wait for the results.

$> condor_submit_dag -f job.dag

Checking all your submit files for log file names.
This might take a while...
Done.
-----------------------------------------------------------------------
File for submitting this DAG to Condor           : job.dag.condor.sub
Log of DAGMan debugging messages                 : job.dag.dagman.out
Log of Condor library output                     : job.dag.lib.out
Log of Condor library error messages             : job.dag.lib.err
Log of the life of condor_dagman itself          : job.dag.dagman.log

Condor Log file for all jobs of this DAG         : /gpfs/home_gridis/ruf/condor/dag/job.log
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 145.
-----------------------------------------------------------------------

Let's watch the job run from the terminal:

$> watch -n 1 condor_q

To submit the job we used the condor_submit_dag command, which works like condor_submit except that it expects a file describing the dependencies among the tasks (the DAG description file). condor_submit_dag submits a job that runs the condor_dagman program, which takes care of executing the DAG.

The condor_dagman job runs in the scheduler universe. In this universe, jobs run on the submitting machine (they are executed by condor_schedd) and are never preempted.
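
Besides watching condor_q, you can follow DAGMan's own progress as it runs by tailing its debug log (shown in full later in this chapter); press Ctrl-C to stop tail:

$> tail -f job.dag.dagman.out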

Note

When the run finishes, press Ctrl-C to stop the watch program. The complete run can take about 5 minutes.

When the DAG run finishes, we can inspect the log:

$> cat job.log
000 (146.000.000) 01/31 11:25:22 Job submitted from host: <193.206.208.141:9632>
    DAG Node: setup
...
001 (146.000.000) 01/31 11:30:13 Job executing on host: <193.206.208.205:9680>
...
005 (146.000.000) 01/31 11:30:13 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	6  -  Run Bytes Sent By Job
	23  -  Run Bytes Received By Job
	6  -  Total Bytes Sent By Job
	23  -  Total Bytes Received By Job
...
000 (147.000.000) 01/31 11:30:22 Job submitted from host: <193.206.208.141:9632>
    DAG Node: work1
...
000 (148.000.000) 01/31 11:30:22 Job submitted from host: <193.206.208.141:9632>
    DAG Node: work2
...
001 (148.000.000) 01/31 11:30:33 Job executing on host: <193.206.208.214:9659>
...
005 (148.000.000) 01/31 11:30:33 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	6  -  Run Bytes Sent By Job
	38  -  Run Bytes Received By Job
	6  -  Total Bytes Sent By Job
	38  -  Total Bytes Received By Job
...
001 (147.000.000) 01/31 11:30:33 Job executing on host: <193.206.208.205:9680>
...
005 (147.000.000) 01/31 11:30:33 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	6  -  Run Bytes Sent By Job
	38  -  Run Bytes Received By Job
	6  -  Total Bytes Sent By Job
	38  -  Total Bytes Received By Job
...
000 (149.000.000) 01/31 11:30:42 Job submitted from host: <193.206.208.141:9632>
    DAG Node: finalize
...
001 (149.000.000) 01/31 11:30:53 Job executing on host: <193.206.208.205:9680>
...
005 (149.000.000) 01/31 11:30:53 Job terminated.
	(1) Normal termination (return value 0)
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
		Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
	6  -  Run Bytes Sent By Job
	98  -  Run Bytes Received By Job
	6  -  Total Bytes Sent By Job
	98  -  Total Bytes Received By Job
...

We can also inspect the results of the tasks:

$> cat setup.out
23129
$> cat work1.out
11564
$> cat work2.out
11564
$> cat finalize.out
23128

Note

As you will have noticed, the computation uses integer arithmetic.
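
The missing unit comes from the truncating integer division performed by expr; you can reproduce it by hand:

$> expr 23129 / 2
11564
$> expr 11564 + 11564
23128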

Check the contents of the files job.dag.condor.sub, job.dag.dagman.log and job.dag.dagman.out. They contain information about the execution of the DAG.

$> cat job.dag.condor.sub
# Filename: job.dag.condor.sub
# Generated by condor_submit_dag job.dag
universe        = scheduler
executable      = /opt/condor/bin/condor_dagman
getenv          = True
output          = job.dag.lib.out
error           = job.dag.lib.err
log             = job.dag.dagman.log
remove_kill_sig = SIGUSR1
# Note: default on_exit_remove expression:
# ( ExitSignal =?= 11 || (ExitCode =!= UNDEFINED && ExitCode >=0 && ExitCode <= 2))
# attempts to ensure that DAGMan is automatically
# requeued by the schedd if it exits abnormally or
# is killed (e.g., during a reboot).
on_exit_remove  = ( ExitSignal =?= 11 || (ExitCode =!= UNDEFINED && ExitCode >=0 && ExitCode <= 2))
copy_to_spool   = False
arguments       = -f -l . -Debug 3 -Lockfile job.dag.lock -Condorlog /gpfs/home_gridis/ruf/...
environment     = _CONDOR_DAGMAN_LOG=job.dag.dagman.out;_CONDOR_MAX_DAGMAN_LOG=0
queue
$> cat job.dag.dagman.out
1/31 11:25:09 ******************************************************
1/31 11:25:09 ** condor_scheduniv_exec.145.0 (CONDOR_DAGMAN) STARTING UP
1/31 11:25:09 ** /opt/condor/bin/condor_dagman
1/31 11:25:09 ** $CondorVersion: 7.1.1 Jul  1 2008 PRE-RELEASE-UWCS $
1/31 11:25:09 ** $CondorPlatform: I386-LINUX_CENTOS45 $
1/31 11:25:09 ** PID = 31186
1/31 11:25:09 ** Log last touched time unavailable (No such file or directory)
1/31 11:25:09 ******************************************************
1/31 11:25:09 Using config source: /opt/condor/etc/condor_config
1/31 11:25:09 Using local config sources:
1/31 11:25:09    /var/condor/condor_config.local
1/31 11:25:09 DaemonCore: Command Socket at <193.206.208.141:9642>
1/31 11:25:09 DAGMAN_SUBMIT_DELAY setting: 0
1/31 11:25:09 DAGMAN_MAX_SUBMIT_ATTEMPTS setting: 6
1/31 11:25:09 DAGMAN_STARTUP_CYCLE_DETECT setting: 0
1/31 11:25:09 DAGMAN_MAX_SUBMITS_PER_INTERVAL setting: 5
1/31 11:25:09 allow_events (DAGMAN_IGNORE_DUPLICATE_JOB_EXECUTION, DAGMAN_ALLOW_EVENTS) setting: 114
1/31 11:25:09 DAGMAN_RETRY_SUBMIT_FIRST setting: 1
1/31 11:25:09 DAGMAN_RETRY_NODE_FIRST setting: 0
1/31 11:25:09 DAGMAN_MAX_JOBS_IDLE setting: 0
1/31 11:25:09 DAGMAN_MAX_JOBS_SUBMITTED setting: 0
1/31 11:25:09 DAGMAN_MUNGE_NODE_NAMES setting: 1
1/31 11:25:09 DAGMAN_DELETE_OLD_LOGS setting: 1
1/31 11:25:09 DAGMAN_PROHIBIT_MULTI_JOBS setting: 0
1/31 11:25:09 DAGMAN_SUBMIT_DEPTH_FIRST setting: 0
1/31 11:25:09 DAGMAN_ABORT_DUPLICATES setting: 1
1/31 11:25:09 DAGMAN_ABORT_ON_SCARY_SUBMIT setting: 1
1/31 11:25:09 DAGMAN_PENDING_REPORT_INTERVAL setting: 600
1/31 11:25:09 DAGMAN_AUTO_RESCUE setting: 0
1/31 11:25:09 DAGMAN_MAX_RESCUE_NUM setting: 100
1/31 11:25:09 argv[0] == "condor_scheduniv_exec.145.0"
1/31 11:25:09 argv[1] == "-Debug"
1/31 11:25:09 argv[2] == "3"
1/31 11:25:09 argv[3] == "-Lockfile"
1/31 11:25:09 argv[4] == "job.dag.lock"
1/31 11:25:09 argv[5] == "-Condorlog"
1/31 11:25:09 argv[6] == "/gpfs/home_gridis/ruf/condor/dag/job.log"
1/31 11:25:09 argv[7] == "-Dag"
1/31 11:25:09 argv[8] == "job.dag"
1/31 11:25:09 argv[9] == "-Rescue"
1/31 11:25:09 argv[10] == "job.dag.rescue"
1/31 11:25:09 DAG Lockfile will be written to job.dag.lock
1/31 11:25:09 DAG Input file is job.dag
1/31 11:25:09 Rescue DAG will be written to job.dag.rescue
1/31 11:25:09 All DAG node user log files:
1/31 11:25:09   /gpfs/home_gridis/ruf/condor/dag/job.log (Condor)
1/31 11:25:09 Parsing 1 dagfiles
1/31 11:25:09 Parsing job.dag ...
1/31 11:25:09 Dag contains 4 total jobs
1/31 11:25:09 Truncating any older versions of log files...
1/31 11:25:09 Sleeping for 12 seconds to ensure ProcessId uniqueness
1/31 11:25:21 Bootstrapping...
1/31 11:25:21 Number of pre-completed nodes: 0
1/31 11:25:21 Registering condor_event_timer...
1/31 11:25:22 Submitting Condor Node setup job(s)...
1/31 11:25:22 submitting: condor_submit -a dag_node_name' '=' 'setup -a ...
1/31 11:25:22 From submit: Submitting job(s).
1/31 11:25:22 From submit: Logging submit event(s).
1/31 11:25:22 From submit: 1 job(s) submitted to cluster 146.
1/31 11:25:22   assigned Condor ID (146.0)
1/31 11:25:22 Just submitted 1 job this cycle...
1/31 11:25:22 Event: ULOG_SUBMIT for Condor Node setup (146.0)
1/31 11:25:22 Number of idle job procs: 1
1/31 11:25:22 Of 4 nodes total:
1/31 11:25:22  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:25:22   ===     ===      ===     ===     ===        ===      ===
1/31 11:25:22     0       0        1       0       0          3        0
1/31 11:30:17 Event: ULOG_EXECUTE for Condor Node setup (146.0)
1/31 11:30:17 Number of idle job procs: 0
1/31 11:30:17 Event: ULOG_JOB_TERMINATED for Condor Node setup (146.0)
1/31 11:30:17 Node setup job proc (146.0) completed successfully.
1/31 11:30:17 Node setup job completed
1/31 11:30:17 Number of idle job procs: 0
1/31 11:30:17 Of 4 nodes total:
1/31 11:30:17  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:30:17   ===     ===      ===     ===     ===        ===      ===
1/31 11:30:17     1       0        0       0       2          1        0
1/31 11:30:22 Submitting Condor Node work1 job(s)...
1/31 11:30:22 submitting: condor_submit -a dag_node_name' '=' 'work1 -a ...
1/31 11:30:22 From submit: Submitting job(s).
1/31 11:30:22 From submit: Logging submit event(s).
1/31 11:30:22 From submit: 1 job(s) submitted to cluster 147.
1/31 11:30:22   assigned Condor ID (147.0)
1/31 11:30:22 Submitting Condor Node work2 job(s)...
1/31 11:30:22 submitting: condor_submit -a dag_node_name' '=' 'work2 -a ...
1/31 11:30:22 From submit: Submitting job(s).
1/31 11:30:22 From submit: Logging submit event(s).
1/31 11:30:22 From submit: 1 job(s) submitted to cluster 148.
1/31 11:30:22   assigned Condor ID (148.0)
1/31 11:30:22 Just submitted 2 jobs this cycle...
1/31 11:30:22 Event: ULOG_SUBMIT for Condor Node work1 (147.0)
1/31 11:30:22 Number of idle job procs: 1
1/31 11:30:22 Event: ULOG_SUBMIT for Condor Node work2 (148.0)
1/31 11:30:22 Number of idle job procs: 2
1/31 11:30:22 Of 4 nodes total:
1/31 11:30:22  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:30:22   ===     ===      ===     ===     ===        ===      ===
1/31 11:30:22     1       0        2       0       0          1        0
1/31 11:30:37 Event: ULOG_EXECUTE for Condor Node work2 (148.0)
1/31 11:30:37 Number of idle job procs: 1
1/31 11:30:37 Event: ULOG_JOB_TERMINATED for Condor Node work2 (148.0)
1/31 11:30:37 Node work2 job proc (148.0) completed successfully.
1/31 11:30:37 Node work2 job completed
1/31 11:30:37 Number of idle job procs: 1
1/31 11:30:37 Event: ULOG_EXECUTE for Condor Node work1 (147.0)
1/31 11:30:37 Number of idle job procs: 0
1/31 11:30:37 Event: ULOG_JOB_TERMINATED for Condor Node work1 (147.0)
1/31 11:30:37 Node work1 job proc (147.0) completed successfully.
1/31 11:30:37 Node work1 job completed
1/31 11:30:37 Number of idle job procs: 0
1/31 11:30:37 Of 4 nodes total:
1/31 11:30:37  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:30:37   ===     ===      ===     ===     ===        ===      ===
1/31 11:30:37     3       0        0       0       1          0        0
1/31 11:30:42 Submitting Condor Node finalize job(s)...
1/31 11:30:42 submitting: condor_submit -a dag_node_name' '=' 'finalize -a ...
1/31 11:30:42 From submit: Submitting job(s).
1/31 11:30:42 From submit: Logging submit event(s).
1/31 11:30:42 From submit: 1 job(s) submitted to cluster 149.
1/31 11:30:42   assigned Condor ID (149.0)
1/31 11:30:42 Just submitted 1 job this cycle...
1/31 11:30:42 Event: ULOG_SUBMIT for Condor Node finalize (149.0)
1/31 11:30:42 Number of idle job procs: 1
1/31 11:30:42 Of 4 nodes total:
1/31 11:30:42  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:30:42   ===     ===      ===     ===     ===        ===      ===
1/31 11:30:42     3       0        1       0       0          0        0
1/31 11:30:57 Event: ULOG_EXECUTE for Condor Node finalize (149.0)
1/31 11:30:57 Number of idle job procs: 0
1/31 11:30:57 Event: ULOG_JOB_TERMINATED for Condor Node finalize (149.0)
1/31 11:30:57 Node finalize job proc (149.0) completed successfully.
1/31 11:30:57 Node finalize job completed
1/31 11:30:57 Number of idle job procs: 0
1/31 11:30:57 Of 4 nodes total:
1/31 11:30:57  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
1/31 11:30:57   ===     ===      ===     ===     ===        ===      ===
1/31 11:30:57     4       0        0       0       0          0        0
1/31 11:30:57 All jobs Completed!
1/31 11:30:57 Note: 0 total job deferrals because of -MaxJobs limit (0)
1/31 11:30:57 Note: 0 total job deferrals because of -MaxIdle limit (0)
1/31 11:30:57 Note: 0 total job deferrals because of node category throttles
1/31 11:30:57 Note: 0 total PRE script deferrals because of -MaxPre limit (0)
1/31 11:30:57 Note: 0 total POST script deferrals because of -MaxPost limit (0)
1/31 11:30:57 **** condor_scheduniv_exec.145.0 (condor_DAGMAN) EXITING WITH STATUS 0

4.2. Proposed exercise

To put your knowledge of Condor into practice, we propose running a job that computes a histogram of the words appearing in a book.

Before starting the new exercise, let's create a new directory to hold it.

$> mkdir ~/condor/histo
$> cd ~/condor/histo

With that organizational formality out of the way, let's download some books:

$> wget http://www.atc.uniovi.es/doctorado/6grid/data/quijote.txt
--12:07:44--  http://www.atc.uniovi.es/doctorado/6grid/data/quijote.txt
           => `quijote.txt'
Resolving www.atc.uniovi.es... 156.35.151.4
Connecting to www.atc.uniovi.es|156.35.151.4|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,130,803 (2.0M) [text/plain]

100%[=================================================================================>] 2,130,803    788.18K/s

12:07:48 (786.01 KB/s) - `quijote.txt' saved [2130803/2130803]

$> wget http://www.atc.uniovi.es/doctorado/6grid/data/biblia.txt
--12:51:14--  http://www.atc.uniovi.es/doctorado/6grid/data/biblia.txt
           => `biblia.txt'
Resolving www.atc.uniovi.es... 156.35.151.4
Connecting to www.atc.uniovi.es|156.35.151.4|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4,076,632 (3.9M) [text/plain]

100%[=================================================================================>] 4,076,632    907.05K/s    ETA 00:00

12:51:20 (795.48 KB/s) - `biblia.txt' saved [4076632/4076632]

The program that computes the histogram is written in Python:

$> cat > histo.py
#!/usr/bin/env python
import sys
from string import punctuation

if len(sys.argv) != 2:
    sys.exit("Usage: histo <file>")

# Read the whole file into memory.
file = open(sys.argv[1], "r")
filedata = file.read()
file.close()

# Split into words, strip surrounding punctuation, lower-case.
words = []
for word in filedata.split():
    words.append(word.strip(punctuation).lower())

# Count the occurrences of each word.
histogram = {}
for word in words:
    histogram[word] = histogram.get(word, 0) + 1

# Sort by count, most frequent first.
flist = []
for word, count in histogram.items():
    flist.append([count, word])
flist.sort()
flist.reverse()

# Print "word count", one pair per line.
for pair in flist:
    print pair[1], pair[0]

Note

Have a look at the program; it should not be hard to understand what it does.

Give it execute permission and check that it works correctly:

$> chmod +x histo.py
$> ./histo.py quijote.txt
que 20610
de 18196
y 18153
la 10360
a 9799
en 8205
el 8203
no 6224
los 4744
se 4690
con 4184
por 3894
las 3465
lo 3459
le 3398
su 3352
don 2647
del 2490
me 2344
como 2261
quijote 2175
sancho 2148
es 2104

...

In addition, let's write a program that lets us combine the histograms obtained from several books:

$> cat > comhisto.py
#!/usr/bin/env python
import sys

# Merge the histograms given as arguments; each input line is "word count".
histogram = {}
for filenum in range(1, len(sys.argv)):
    file = open(sys.argv[filenum], "r")
    for line in file.readlines():
        fields = line.split(' ')
        histogram[fields[0]] = histogram.get(fields[0], 0) + int(fields[1])
    file.close()

# Sort by combined count, most frequent first.
flist = []
for word, count in histogram.items():
    flist.append([count, word])
flist.sort()
flist.reverse()

# Print "word count", one pair per line.
for pair in flist:
    print pair[1], pair[0]
$> chmod +x comhisto.py
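
Before moving the workflow to Condor you can sanity-check the two programs locally; the output names used here match the ones requested below:

$> ./histo.py quijote.txt > quijote.histo.out
$> ./histo.py biblia.txt > biblia.histo.out
$> ./comhisto.py quijote.histo.out biblia.histo.out > comhisto.out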

Create the job description files needed to run three jobs: two that compute the histograms of the books quijote.txt and biblia.txt, and a third that combines them. The latter job depends on the other two, so you will need to use DAGMan. Configure the output of the jobs so that it is written to the files quijote.histo.out, biblia.histo.out and, for the combined histogram, comhisto.out.
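
As a starting point, the DAG description file could look like the sketch below; the submit-file names are only a suggestion, and you have to write those submit files yourself following the pattern of Section 4.1:

Job quijote   quijote.sub
Job biblia    biblia.sub
Job comhisto  comhisto.sub
PARENT quijote biblia CHILD comhisto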

To check that the jobs ran correctly you can try the following test:

$> grep '^dios ' quijote.histo.out
dios 524
$> grep '^dios ' biblia.histo.out
dios 4280
$> grep '^dios ' comhisto.out
dios 4804

Warning

When you finish the lab session, delete the files generated during the execution of the jobs with ~/condor/clean and keep a copy of the rest (you can use WinSCP).