Pato Díaz: 2012

Wednesday, May 2, 2012

Downloading all the images from a Tumblr blog

Warning:
The content of this article is presented for educational purposes only, as a demonstration of how to use GNU tools and regular expressions to process a large amount of information automatically.
The intellectual property of the images on the Tumblr site belongs to their respective owners.

A few days ago I decided to try downloading all the images from a Tumblr blog. Doing it by hand would take far too long, so I started looking for a way to automate the task.

I had already noticed that all the images I was downloading were hosted on domains that start with two digits followed by .media.tumblr.com, and that the file names start with tumblr_. For example, the image in the post http://ilovephotographyclub.tumblr.com/post/22189996450/via-on-my-way-to-heaven-by-farhadvm-on
is http://27.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_1280.jpg.

For this article I will use GNU tools, which are available on the *nix variants (Linux, FreeBSD, MacOS, etc.) as well as on Windows if we install Cygwin (which I strongly recommend).

We can use wget to retrieve the page and sed to parse its HTML and pull out the image addresses. Used with sed -rn, the expression '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*"(http:\/\/.*\.[a-zA-Z]{3,4})".*/\1/p' selects the lines that contain a URL we are interested in and prints only the URL itself.

Running the following:

$ wget -qO- http://ilovephotographyclub.tumblr.com/post/22189996450/via-on-my-way-to-heaven-by-farhadvm-on | sed -rn -e '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*"(http:\/\/.*\.[a-zA-Z]{3,4})".*/\1/p'
http://26.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_250.jpg
http://27.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_1280.jpg

and we end up with two lines: the links to the two versions of the image, one at low resolution and the other at a higher resolution.
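The same sed expression can be tried offline before pointing it at the live site. The following is a minimal sketch, assuming GNU sed (for the -r option); the function name extract_image_urls and the HTML fragment are made up for illustration and are not part of the original commands.

```shell
#!/bin/bash
# Hypothetical helper: reads HTML on stdin and prints the Tumblr image URLs found in it.
extract_image_urls() {
  sed -rn -e '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*"(http:\/\/.*\.[a-zA-Z]{3,4})".*/\1/p'
}

# Made-up HTML fragment standing in for a downloaded post page.
SAMPLE_HTML='<p>some text</p>
<img src="http://27.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_1280.jpg"/>
<a href="http://example.com/other.html">not an image</a>'

# Only the second line matches the address, so only its URL is printed.
printf '%s\n' "$SAMPLE_HTML" | extract_image_urls
```

Only the line that contains a media.tumblr.com address passes the filter; the substitution then replaces the whole line with the captured URL.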

Now we can simply download the images with:

$ wget http://26.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_250.jpg
$ wget http://27.media.tumblr.com/tumblr_m3chaqtcbC1roly7jo1_1280.jpg

So far we have only tested our idea of extracting the image links from a single post. Now let's see how to apply it to every post in the blog.

We will parse the blog's main page and retrieve from it the links to each individual post. I did not find a way to do this in just a few steps, especially after trying it on several blogs. The following command line returns the list of posts that belong to the blog we are processing (pages often contain references to other blogs where an image came from).

$ wget -qO- http://ilovephotographyclub.tumblr.com | sed -rn -e "/tumblr\.com\/post/ s/.*(\"http:\/\/.*\.tumblr\.com\/post.*\").*/\1/p" | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' | sort | uniq

The command retrieves the page, filters for links to posts, and then strips the redundant text that got through the first filter. The second sed also removes same-page references (anything from # onwards). Finally the list is sorted with sort so that uniq can return each link only once.
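To see what each stage of that pipeline contributes, it can be run against a small hand-written page. This is only a sketch, assuming GNU sed; the blog name myblog, the function name extract_post_links and the sample links are invented for the example.

```shell
#!/bin/bash
# Hypothetical helper wrapping the two sed filters plus sort | uniq.
extract_post_links() {
  sed -rn -e '/tumblr\.com\/post/ s/.*("http:\/\/.*\.tumblr\.com\/post.*").*/\1/p' \
    | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' \
    | sort | uniq
}

# Made-up page: one post linked twice (once with a #fragment), one other post,
# and one link that is not a post at all.
SAMPLE_PAGE='<a href="http://myblog.tumblr.com/post/222/second-post">two</a>
<a href="http://myblog.tumblr.com/post/111/first-post#notes">one</a>
<a href="http://myblog.tumblr.com/post/111/first-post">one again</a>
<a href="http://myblog.tumblr.com/about">about</a>'

printf '%s\n' "$SAMPLE_PAGE" | extract_post_links
```

The first sed keeps the quoted post URL, the second drops the quotes and the #notes fragment, and sort | uniq collapses the repeated link, leaving two lines.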

Now it would be interesting to apply this to all the posts in the blog, if we could find some kind of list of them. The blog's archive page gives access to the blog's history, but it shows only the latest posts in descending order; as you scroll down, JavaScript dynamically appends the older posts that were not there when the page loaded. If we fetch this page with wget we get only the initial page and not the whole archive, so it is not practical for our purposes.

Another way to reach the Tumblr archive is through numbered pages. They live under the page subfolder followed by the number of the page we want. E.g.: http://blog.tumblr.com/page/3 for page 3.

With this we can walk through all the pages with a counter and a loop until we reach the end. If we request a page beyond the last one that has content, the site returns a page with no post links at all: formatted but empty, so to speak.

We can no longer try this concept directly from the command line; we will need a script.

#!/bin/bash

PAGE_NUM=1
SALIR=0
BASE_URL="http://$1.tumblr.com"

while [ $SALIR -eq 0 ]; do
  SITE="$BASE_URL/page/$PAGE_NUM"
  echo "processing page $PAGE_NUM [$SITE]"
  POST_LIST=`wget -qO- "$SITE" | sed -rn -e "/$1\.tumblr\.com\/post/ s/.*(\"http:\/\/.*\.tumblr\.com\/post.*\").*/\1/p" | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' | sort | uniq`
  if [ -z "$POST_LIST" ]; then
    # no post links on this page: we went past the last page
    SALIR=1
  else
    for POST in $POST_LIST; do
      echo "$POST"
    done
    PAGE_NUM=$((PAGE_NUM + 1))
  fi
done

This script enters a loop in which we increment the page counter and retrieve the post links from each page. If no links can be retrieved, we have gone past the last page, so we leave the loop. The only thing the script does is walk the list and print the links. The blog name must be passed as a parameter. E.g.:

$ bash dwn_tumblr_test.sh ilovephotographyclub
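The stop condition can be exercised without touching the network by swapping the wget and sed pipeline for a stub. The sketch below assumes a pretend blog with two pages of posts; fetch_post_links and walk_pages are hypothetical names, and the links are made up.

```shell
#!/bin/bash
# Hypothetical stand-in for "wget -qO- ... | sed ...": a blog with two pages of posts.
fetch_post_links() {
  case "$1" in
    1) printf '%s\n' "http://myblog.tumblr.com/post/3" "http://myblog.tumblr.com/post/2" ;;
    2) printf '%s\n' "http://myblog.tumblr.com/post/1" ;;
    *) ;;  # any later page comes back without post links
  esac
}

# Walk the pages until one returns no links; print every link and then the page count.
walk_pages() {
  local page_num=1 post_list post
  while :; do
    post_list=$(fetch_post_links "$page_num")
    [ -z "$post_list" ] && break
    for post in $post_list; do
      echo "$post"
    done
    page_num=$((page_num + 1))
  done
  echo "pages with content: $((page_num - 1))"
}

walk_pages
```

The loop requests page 3, gets an empty list and stops, so the last line reports two pages with content.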

With all this in place, it is time to write a script that implements all the concepts we tested throughout the article.

#!/bin/bash

BLOG=$1
LOG="$1.log"
URL="http://$1.tumblr.com"
ARCHIVE="$URL/archive"
DUMP_DIR=$1


echo -e "Starting retrieval of blog $1\n" > "$LOG"
echo "URL: $URL" >> "$LOG"

echo -e "Starting retrieval of blog $1\n"
echo "URL: $URL"

# create the output folder
if [ -e "$1" ] && [ -d "$1" ]; then
  echo "using directory $PWD/$1" >> "$LOG"
  echo "using directory $PWD/$1"
else
  echo "directory $PWD/$1 does not exist, creating it." >> "$LOG"
  echo "directory $PWD/$1 does not exist, creating it."
  mkdir "$1" >> "$LOG"
fi

echo "" >> "$LOG"

PAGE_NUM=1 # the page number we are going to process
SALIR=0    # the loop keeps iterating while this variable is 0
while [ $SALIR -eq 0 ]; do
  PAGE_URL="$URL/page/$PAGE_NUM"
  echo "processing page $PAGE_NUM [$PAGE_URL]"
  echo "processing page $PAGE_NUM [$PAGE_URL]" >> "$LOG"

  POST_LIST=`wget -qO- "$PAGE_URL" | sed -rn -e "/$1\.tumblr\.com\/post/ s/.*(\"http:\/\/.*\.tumblr\.com\/post.*\").*/\1/p" | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' | sort | uniq`
  if [ -z "$POST_LIST" ]; then
    SALIR=1
  else # if [ ! -z "$POST_LIST" ] ...
    for POST in $POST_LIST; do
      # retrieve the list of image links in the post. there are usually several versions
      # of the posted image at different resolutions
      IMG_URL_LIST=`wget -qO- "$POST" | sed -rn -e '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*("http:\/\/.*media\.tumblr\.com\/tumblr_.*").*/\1/p' | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' | sort | uniq`

      # walk the list of images
      for IMG_URL in $IMG_URL_LIST; do
        echo "      url: $IMG_URL"

        # get the file name
        FILE_NAME=`basename "$IMG_URL"`
        echo "      filename: $FILE_NAME"

        # to save time, only download the file if it is not already in the output directory
        if [ -e "$DUMP_DIR/$FILE_NAME" ]; then
          echo "url: $IMG_URL #filename: $FILE_NAME  post:$POST" >> "$LOG"
          echo "# already exists"
        else
          echo ">> downloading"
          echo "url: $IMG_URL >filename: $FILE_NAME  post:$POST" >> "$LOG"
          wget -qO "$DUMP_DIR/$FILE_NAME" "$IMG_URL" >> "$LOG"
        fi
      done # for IMG_URL in $IMG_URL_LIST ...
    done # for POST in $POST_LIST ...
    PAGE_NUM=$((PAGE_NUM + 1))
  fi # if [ ! -z "$POST_LIST" ] ...
done # while [ $SALIR -eq 0 ] ...

We save the script to a file and run it, always passing the blog name as a parameter:

$ bash dwn_tumblr.sh ilovephotographyclub

When it finishes we will have a directory named after the blog containing the images, plus a file with the same name and a .log extension that details everything that was downloaded.

The whole example shown here was written and tested with Cygwin under Windows 7 x64.

Update 03/05/2012: In the last script, "| sort | uniq" was added on line 42 in order to remove duplicates. The line changed from:

IMG_URL_LIST=`wget -qO- $POST | sed -rn -e '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*("http:\/\/.*").*/\1/p' | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p'`

to

IMG_URL_LIST=`wget -qO- $POST | sed -rn -e '/="http:\/\/.*media\.tumblr\.com\/tumblr_.*"/ s/.*("http:\/\/.*").*/\1/p' | sed -rn -e 's/"([^"|^#]*)(["#].*)/\1/p' | sort | uniq`

On lines 54 and 58, "post: $POST" was added to the echo text in order to record which entry each image was retrieved from.

Thursday, April 19, 2012

Automata with GeneXus Ev1 on Linux and Windows, and II

In the previous chapter we created a sample program with some specific requirements, with the intention of running it periodically and automatically.

Now that our program works, let's get it running under Linux.

The idea is to create a shell script that calls our program, performs some checks and sends the resulting report by mail.

Preliminary checks

First of all we must make sure that our Linux machine can send mail and that a Java virtual machine is installed.

Mail

For this project we will use Sendmail, which is installed with almost all of the best-known Linux distributions. Configuring sendmail is beyond the scope of this article; there are plenty of articles about it on the web.

We can send a test message with the following command:

echo -e "Subject: ping \n\nthis goes on the body"  | sendmail -f micuenta@dominio.com otracuenta@gmail.com

Checking whether Java is installed

On Debian-based distributions we use the following command:

dpkg --get-selections | grep openjdk

On distributions that use RPM we use this:

rpm -qa | grep openjdk
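A distribution-independent alternative is to ask the shell whether the commands can be found at all. The sketch below checks the PATH rather than the package database, so it is a complement to the commands above and not a replacement; have_command is a hypothetical helper name, and sendmail often lives in /usr/sbin, which may not be in an ordinary user's PATH.

```shell
#!/bin/bash
# Returns success if the named command can be found in PATH.
have_command() {
  command -v "$1" > /dev/null 2>&1
}

# Report on the two tools the rest of the article depends on.
for tool in java sendmail; do
  if have_command "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not found in PATH"
  fi
done
```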

Copying files to the server

For our project we will create a chequeodesatendido folder in /opt, although that is not mandatory and it could go elsewhere, for example in /usr/local or /usr/lib.

Now we must copy our program to the Linux box; personally I use WinSCP. We do not actually need to copy every file the Deployment Wizard created, only the Shared and chequeodesatendido folders.

Preliminary tests

We now have everything we need to start testing our program. We create a file named test.sh and add the following code:

#!/bin/bash
# get current date with file name friendly format
FECHA=`date +"%Y-%m-%d_%H-%M"`

# output file
ARCHIVO_SALIDA=cr_$FECHA

# output folder
DIRECTORIO_SALIDA="$PWD/reportes"

GXCLASSPATH="Shared/.:Shared/gxclassp.jar:Shared/iText.jar:chequeodesatendido/chequeodesatendido_GXWS.jar"


HOY=`date +"%d/%m/%y"`
java -cp "$GXCLASSPATH" achequeodesatendido "$HOY" "$HOY" "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA" > "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log" 2>&1

In this script the FECHA variable captures both the system date and time, unlike the script we used under Windows, which only captured the date.

Note that the GXCLASSPATH list must be separated by ":" under Linux, whereas under Windows it is separated by ";".

The HOY variable is new in this script; it is simply the system date in the order and format our program expects.
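The two date formats are easier to compare with a fixed timestamp. This sketch assumes GNU date (for the -d option) and a made-up sample time; the scripts in this article use the current time instead.

```shell
#!/bin/bash
# Fixed timestamp so the output is reproducible (an assumption for the demo only).
SAMPLE="2012-04-19 14:05"

FECHA=$(date -d "$SAMPLE" +"%Y-%m-%d_%H-%M")   # file-name friendly: no slashes, spaces or colons
HOY=$(date -d "$SAMPLE" +"%d/%m/%y")           # day/month/two-digit year, as passed to the program

echo "FECHA=$FECHA"
echo "HOY=$HOY"
echo "report base name: cr_$FECHA"
```

FECHA comes out as 2012-04-19_14-05 and HOY as 19/04/12, so the first is safe to use in a file name while the second keeps the slashes the program expects.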

We run the test:

sh test.sh

As in the previous chapter, we should have a PDF in the reportes folder.

The final script

We create a script named run.sh and add the following code:

#!/bin/bash

# change to the base directory
cd /opt/chequeodesatendido

FECHA=`date +"%Y-%m-%d_%H-%M"`
ARCHIVO_SALIDA="cr_$FECHA"
DIRECTORIO_SALIDA="$PWD/reportes"
GXCLASSPATH="Shared/.:Shared/gxclassp.jar:Shared/iText.jar:chequeodesatendido/chequeodesatendido_GXWS.jar"

# who to notify
LISTA_NOTIFICACION="destino1@dominio.com"

# with more than one recipient, separate them with spaces
#LISTA_NOTIFICACION="destino1@dominio.com destino2@dominio.com destino3@dominio.com"
REMITENTE='verificador@dominio.com'
SEPARADOR="$$-$$-$$-"

# size of the folder
TAMANHO_ACTUAL=`du -h $DIRECTORIO_SALIDA`

enviar_mail() {
    local DIRECCION_CORREO
    local REPORTE
    local ERRORES
    local FECHA_LINDA

    # encode the report in base64 and load it into a variable. this will go
    # as an attachment to the message.
    REPORTE=`base64 $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.pdf`

    # load the log lines into another variable; these do not go as an
    # attachment but in the body of the message.
    ERRORES=`cat "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log"`

    # the lines can be filtered, for example to ignore warnings or debug output.
    #ERRORES=`cat $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log | grep "Errores:"`

    # date in a friendlier format
    FECHA_LINDA=`date +"%d/%m/%Y %H:%M"`

    # send one message to each address in the notification list.
    # a single message with the addresses in CC would also work.
    for DIRECCION_CORREO in $LISTA_NOTIFICACION; do
        /usr/sbin/sendmail -f $REMITENTE -t <<EOF
MIME-Version: 1.0
To: $DIRECCION_CORREO
From: $REMITENTE
Subject: Consistency check - $FECHA_LINDA - $ERRORES
Content-Type: multipart/mixed; boundary="$SEPARADOR"

--$SEPARADOR
Content-Type: text/plain; charset=UTF8; format=flowed
content-transfer-encoding: 8bit

$ERRORES

Report generated at $FECHA_LINDA
Attached file: $ARCHIVO_SALIDA.pdf

Current size of the directory: $TAMANHO_ACTUAL

--$SEPARADOR
Content-Type: application/pdf; name="$ARCHIVO_SALIDA.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.pdf"

$REPORTE

--$SEPARADOR
Content-Type: text/plain; name="$ARCHIVO_SALIDA.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.log"

`cat $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log`

--$SEPARADOR--
EOF
    done

}

# generate the report
HOY=`date +"%d/%m/%y"`
java -cp "$GXCLASSPATH" achequeodesatendido "$HOY" "$HOY" "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA" > "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log" 2>&1


# if the output file was generated, send it.
if [ -f $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.pdf ]; then
    echo "Found $ARCHIVO_SALIDA"
    enviar_mail
fi

Let's go through the script.

#!/bin/bash

# change to the base directory
cd /opt/chequeodesatendido

FECHA=`date +"%Y-%m-%d_%H-%M"`
ARCHIVO_SALIDA="cr_$FECHA"
DIRECTORIO_SALIDA="$PWD/reportes"
GXCLASSPATH="Shared/.:Shared/gxclassp.jar:Shared/iText.jar:chequeodesatendido/chequeodesatendido_GXWS.jar"

# who to notify
LISTA_NOTIFICACION="destino1@dominio.com"

# with more than one recipient, separate them with spaces
#LISTA_NOTIFICACION="destino1@dominio.com destino2@dominio.com destino3@dominio.com"
REMITENTE='verificador@dominio.com'
SEPARADOR="$$-$$-$$-"

# size of the folder
TAMANHO_ACTUAL=`du -h $DIRECTORIO_SALIDA`

Up to here there are only definitions of variables we will use later. SEPARADOR is an arbitrary text that we define to separate the sections of the message; I will explain that further on. Purely as an example, the TAMANHO_ACTUAL variable holds the disk space used by the reports, and we will include it in the body of the message.

Next comes the function that builds and sends the mail message.

enviar_mail() {
    local DIRECCION_CORREO
    local REPORTE
    local ERRORES
    local FECHA_LINDA

    # encode the report in base64 and load it into a variable. this will go
    # as an attachment to the message.
    REPORTE=`base64 $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.pdf`

    # load the log lines into another variable; these do not go as an
    # attachment but in the body of the message.
    ERRORES=`cat "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log"`

    # the lines can be filtered, for example to ignore warnings or debug output.
    #ERRORES=`cat $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log | grep "Errores:"`

    # date in a friendlier format
    FECHA_LINDA=`date +"%d/%m/%Y %H:%M"`

    # send one message to each address in the notification list.
    # a single message with the addresses in CC would also work.
    for DIRECCION_CORREO in $LISTA_NOTIFICACION; do
        /usr/sbin/sendmail -f $REMITENTE -t <<EOF
MIME-Version: 1.0
To: $DIRECCION_CORREO
From: $REMITENTE
Subject: Consistency check - $FECHA_LINDA - $ERRORES
Content-Type: multipart/mixed; boundary="$SEPARADOR"

--$SEPARADOR
Content-Type: text/plain; charset=UTF8; format=flowed
content-transfer-encoding: 8bit

$ERRORES

Report generated at $FECHA_LINDA
Attached file: $ARCHIVO_SALIDA.pdf

Current size of the directory: $TAMANHO_ACTUAL

--$SEPARADOR
Content-Type: application/pdf; name="$ARCHIVO_SALIDA.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.pdf"

$REPORTE

--$SEPARADOR
Content-Type: text/plain; name="$ARCHIVO_SALIDA.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.log"

`cat $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log`

--$SEPARADOR--
EOF
    done

}

Because our output file is binary (PDF) we cannot include it directly, since a mail message has to be made up of text. To get around this we encode the file in base64 and split the message into sections, which is what the SEPARADOR variable is for. The first section holds the body of the message; the following ones hold the attached files.

Let's look at this in more detail:

Among the message headers we have Content-Type: multipart/mixed; boundary="$SEPARADOR", which says that the message is made of several parts of mixed types, and that the 'boundary' between sections is the text in SEPARADOR.

The first --$SEPARADOR line marks the start of the first section. Note that every boundary line starts with two consecutive hyphens: "--".

The next lines are:

Content-Type: text/plain; charset=UTF8; format=flowed
content-transfer-encoding: 8bit

They define the section as plain text, using the UTF8 character set and an 8-bit transfer encoding (http://www.freesoft.org/CIE/RFC/1521/5.htm).
We add to the body text some of the data we had collected, ERRORES and TAMANHO_ACTUAL. The second --$SEPARADOR line starts another section, this time for our binary output file. The section header is:

Content-Type: application/pdf; name="$ARCHIVO_SALIDA.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.pdf"

We declare that the content type is application/pdf and give the name of the attachment, state that the transfer encoding is base64 and that the part is an attachment, and suggest a file name for saving it. Right after the header we place the text we stored in REPORTE.

Finally, a third boundary line opens the last section, followed by this header:

Content-Type: text/plain; name="$ARCHIVO_SALIDA.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="$ARCHIVO_SALIDA.log"

Since our log file is plain text, we simply "dump" it into the section with `cat $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log`, and the header specifies that it is an attachment.
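The structure of a multipart message is easier to inspect when it is printed instead of sent. The following is a minimal sketch that writes a two-part message to standard output; build_message, the demo boundary text, the placeholder address and the throwaway attachment are all assumptions for illustration. It also ends the message with a closing delimiter (the boundary followed by two more hyphens), which RFC 2046 uses to mark the end of the last part.

```shell
#!/bin/bash
# Hypothetical helper: prints a two-part MIME message to stdout instead of piping it to sendmail.
# $1 = recipient, $2 = body text, $3 = file to attach
build_message() {
  local boundary="demo-boundary-$$"
  local name
  name=$(basename "$3")
  cat <<EOF
MIME-Version: 1.0
To: $1
Subject: multipart demo
Content-Type: multipart/mixed; boundary="$boundary"

--$boundary
Content-Type: text/plain; charset=UTF-8

$2

--$boundary
Content-Type: application/octet-stream; name="$name"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="$name"

$(base64 "$3")

--$boundary--
EOF
}

# Demo with a throwaway file; the address is a placeholder.
TMP_FILE=$(mktemp)
printf 'hello attachment\n' > "$TMP_FILE"
build_message "someone@example.com" "See the attached file." "$TMP_FILE"
rm -f "$TMP_FILE"
```

Piping the output of build_message into sendmail -t would send it, but printing it first makes it easy to check that every section starts with the boundary and that the attachment really is base64 text.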

Now we come to the main code of our script:

# generate the report
HOY=`date +"%d/%m/%y"`
java -cp "$GXCLASSPATH" achequeodesatendido "$HOY" "$HOY" "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA" > "$DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.log" 2>&1


# if the output file was generated, send it.
if [ -f $DIRECTORIO_SALIDA/$ARCHIVO_SALIDA.pdf ]; then
    echo "Found $ARCHIVO_SALIDA"
    enviar_mail
fi

Despite being the main code, there is not much to explain here: we format the current date the way our program expects it as a parameter, call the program and redirect its output to the log file.

Finally we check whether the program generated the report and, if so, send it by mail with the function explained above.

We test our script:

sh run.sh

If all went well, we should receive a mail with our files attached.

Scheduling periodic execution

All that remains is to schedule a cron job so that our automaton runs periodically.

We run crontab -e and add the following line:
*/5 * * * * sh /opt/chequeodesatendido/run.sh

We save the job (if the editor is vi: press ESC, then :w + ENTER, and finally :q + ENTER to quit). From now on we should receive an e-mail every 5 minutes with the results of our program attached.

Well, that is it for this second installment of the series.

Automata with GeneXus Ev1 on Linux and Windows, and I

Wednesday, April 18, 2012

Automata with GeneXus Ev1 on Linux and Windows, and I

Spanish version

Today I begin a series of four posts in which I will try to give a glimpse of how to program an automaton in GeneXus Ev1, run it periodically on Linux or Windows, and send the resulting logs and/or reports by mail.

We used this technique in production at a company where I worked as a developer, to run a data integrity verification process.

This first installment covers creating the automaton with GeneXus Evolution 1 and the Java generator for Windows, although it could really be written in any language that can produce a PDF with the name and in the directory we specify.

In the second installment we will run it periodically under Linux using nothing but the standard GNU applications that come with most distributions.

In the third installment we will do the same under Windows, using some third-party applications to achieve the same effect as in Linux.

In the fourth and final installment I will modify the Linux script to use the same applications we use in Windows, as an exercise in adapting from one platform to another.

Requirements of the automaton

We need to create a program without a user interface that receives parameters from the command line, creates files in a directory passed as a parameter, and writes its output as text to the operating system's standard output.

Access to databases or other procedures will depend on each case in which we want to apply this technique, so they are optional as requirements.

Hands on

The main procedure

As a first step we create a KB in GeneXus. For this project we choose Java Environment as the Prototyping Environment and Win as the target. The rest of the details are not relevant, but to follow the example it is advisable to name the KB chequeodesatendido.

Now we create a procedure called chequeodesatendido, which will be the main procedure of our automaton. In its properties we change Main program to True and Call protocol to Command line.

We add some code:

Rules

parm(in:&date_ini, in:&date_end, in:&filename);

Source

if &date_ini.IsEmpty() or &date_end.IsEmpty() or &filename.IsEmpty() 
  msg("Faltan parametros.")
  msg("Se debe proveer fecha inicial, fecha final y nombre del archivo de salida")
endif

msg("date ini: " + &date_ini.ToFormattedString() + 
  " | date_end: " + &date_end.ToFormattedString() + 
  " | filename: " + &filename
)

reporte.call(&date_ini, &date_end, &filename)

As you can see the code is quite simple, just an example. The output of msg goes to the console, which under Linux can easily be redirected to a file; under Windows I have not yet found a way to do the same.

This is where we should run or call the code that performs some verification, correction, closing process, etc. on our data.

Report

We create a procedure called reporte and modify its properties. As it happens, the properties match those of the previous procedure: we set Main program to True and Call protocol to Command line.

Rules

parm(in:&date_ini, in:&date_end, in:&filename);
output_file(&filename, 'PDF');

Source

print printBlock1
return

Layout

For our example we simply create a band called printBlock1 and add the variables received as parameters. In production, this is where the report that will be sent as the result of the automaton's run should be generated.

Now we define chequeodesatendido as the Startup object and build the project.

Deploy it

The easiest way to package the files needed to run our project is to do a "deploy", so we run the Deployment Wizard.

On the first screen our two procedures appear in the Available mains list; we move them to the right, into Mains to deploy.

We go on to the second screen and touch nothing there, just click Next to reach the third screen. There we check the Transfer location files box, enter a directory where the wizard will place the files, and finally click Finish.

The GeneXus Web Start Deployment window now opens. We change VM: to Sun, specify a name in Application name: (I used chequeodesatendido again) and click Build Archives.

Now, if all went well, we should have all the files needed to run our program in the directory we specified on the third screen of the GeneXus Deployment Wizard. There should be a Shared folder and another one with the name we specified in Application name: in the GeneXus Web Start Deployment. We create a folder named reportes, which is where we will ask our automaton to put its reports.

Last actions

From here on, in theory, we are ready to test our automaton, but there are still a couple of details that the wizard did not cover, no idea why. For some reason the wizard does not copy the iText.jar package that is needed to generate the report, so we must copy it manually from the JavaModel folder of our KB to the Shared folder of our deployment.

Under Windows 7 (and I guess it is the same under Vista), the first time the program runs it tries to copy the file winjutil.dll to the JRE's bin folder, but fails because of permissions. There are two ways to solve this: the first is to run our project once as Administrator, the other is to copy the file from the KB's JavaModel folder to the JRE's bin folder.

Testing the automaton

We create a file named test.cmd and add the following code:

@echo off

rem replace the slashes "/" in the date with hyphens "-"
for /f "tokens=1-3 delims=/" %%a in ("%date%") do set FECHA=%%a-%%b-%%c
set ARCHIVO_SALIDA=cr_%FECHA%
set DIRECTORIO_SALIDA=%CD%\reportes
set GXCLASSPATH="shared/.;shared/gxclassp.jar;shared/iText.jar;chequeodesatendido/chequeodesatendido_GXWS.jar"

java -cp %GXCLASSPATH% achequeodesatendido "%FECHA%" "%FECHA%" "%DIRECTORIO_SALIDA%\%ARCHIVO_SALIDA%"

Run test.cmd; after that we should have a PDF file in the reportes folder.

That's all for the first installment; the second part will follow in a couple of days.

Monday, April 16, 2012

Automata with GeneXus Ev1 on Linux and Windows, and I

English version

Today I begin a series of four posts in which I will try to give an overview of how to program an automaton in GeneXus Ev1, run it periodically on Linux and Windows, and send the resulting logs and/or reports by mail.

We had used this technique in production at one of the companies where I worked as a developer, to run some data integrity verification processes.

In this first installment we will cover creating the automaton with GeneXus Evolution 1 and the Java generator for Windows, although it could really be written in any language that can give us a PDF with the name and in the directory we specify.

In the second installment we will make it run periodically under Linux using nothing but the standard GNU applications that come with most distributions.

In the third installment we will do the same under Windows, using some third-party applications to achieve the same effect as in Linux.

In the fourth and final installment we will modify the Linux script to use the same applications we used in Windows, as an exercise in adapting from one platform to another.

Requirements of the automaton

We need to create a program without a user interface that receives parameters from the command line, creates files in a directory passed as a parameter, and sends its output as text to the operating system's standard output.

Access to databases or other procedures will depend on each case in which we want to apply this technique, so they are optional as requirements.

Hands on

Main procedure

As a first step we must create a KB in GeneXus. For this project we choose Java Environment as the Prototyping Environment and Win as the target. The rest of the details are not relevant, but to follow the example it is advisable to name the KB chequeodesatendido.

Now we create a procedure called chequeodesatendido, which will be the main procedure of our automaton. In its properties we change Main program to True and Call protocol to Command line.

We add some code:

Rules

parm(in:&date_ini, in:&date_end, in:&filename);

Source

if &date_ini.IsEmpty() or &date_end.IsEmpty() or &filename.IsEmpty() 
  msg("Faltan parametros.")
  msg("Se debe proveer fecha inicial, fecha final y nombre del archivo de salida")
endif

msg("date ini: " + &date_ini.ToFormattedString() + 
  " | date_end: " + &date_end.ToFormattedString() + 
  " | filename: " + &filename
)

reporte.call(&date_ini, &date_end, &filename)

As you can see the code is quite simple, just an example. The output of msg goes to the console, which under Linux can easily be redirected to a file; under Windows I have not yet found a way to do the same.

This is where we should run or call the code that performs some verification, correction, closing process, etc. on our data.

Report

We create a procedure called reporte and modify its properties. As it happens, the properties match those of the previous procedure: we set Main program to True and Call protocol to Command line.

Rules

parm(in:&date_ini, in:&date_end, in:&filename);
output_file(&filename, 'PDF');

Source

print printBlock1
return

Layout

For our example we simply create a band called printBlock1 and add the variables we received as parameters, but this is where the report to be sent as the result of the automaton's run should be generated.

Now we define chequeodesatendido as the Startup object and build the project.

Deploying it

The easiest way to package the files needed to run our project is to do a "deploy", so we run the Deployment Wizard.

On the first screen our two procedures appear in the Available mains list; we move them to the right, into Mains to deploy.

We go on to the second screen and touch nothing there, just click Next to reach the third screen. There we check the Transfer location files box, enter a directory where the wizard will place the files, and finally click Finish.

The GeneXus Web Start Deployment screen now opens. We change VM: to Sun, specify a name in Application name: (I used chequeodesatendido again) and finally click Build Archives.

Now, if all went well, we should have all the files needed to run our program in the directory we specified on the third screen of the GeneXus Deployment Wizard. There should be a Shared folder and another one with the name we specified in Application name: in the GeneXus Web Start Deployment. We create the reportes folder, which is where we will ask our automaton to put its reports.

Last actions

From here on, in theory, we are ready to test our automaton, but there are still a couple of details that the wizard did not cover, no idea why. For some reason the wizard does not copy the iText.jar package that is needed for the report, so we must copy it manually from the JavaModel folder of our KB to the Shared folder of our deployment.

Under Windows 7 (and I suppose it is the same with Vista), the first time the program runs it will try to copy the file winjutil.dll to the JRE's bin folder, but it will fail because of permissions. There are two ways to solve this: the first is to run our project once as Administrator, the other is to copy the file from the JavaModel folder of our KB to the JRE's bin folder.

Testing our automaton

We create a file named test.cmd and add the following code:

@echo off

rem replace the slashes "/" in the date with the minus sign "-"
for /f "tokens=1-3 delims=/" %%a in ("%date%") do set FECHA=%%a-%%b-%%c
set ARCHIVO_SALIDA=cr_%FECHA%
set DIRECTORIO_SALIDA=%CD%\reportes
set GXCLASSPATH="shared/.;shared/gxclassp.jar;shared/iText.jar;chequeodesatendido/chequeodesatendido_GXWS.jar"

java -cp %GXCLASSPATH% achequeodesatendido "%FECHA%" "%FECHA%" "%DIRECTORIO_SALIDA%\%ARCHIVO_SALIDA%"

We run test.cmd, and afterwards we should have a PDF file in the reportes folder.
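The date handling in test.cmd can be sketched in Python as follows. This is only an illustration, not part of the deployment: the helper names build_report_path and build_command are hypothetical, the C:\demo directory is made up, and the sketch assumes that %date% expands to something like 02/05/2012, which depends on the Windows regional settings.

```python
import ntpath  # Windows-style path joining, usable on any platform


def build_report_path(date_text, base_dir):
    """Mirror test.cmd: swap "/" for "-" and build the output file path."""
    # "/" is not allowed in Windows file names, hence the replacement.
    fecha = date_text.replace("/", "-")
    return fecha, ntpath.join(base_dir, "reportes", "cr_" + fecha)


def build_command(date_text, base_dir):
    """Build the argument list that test.cmd passes to java."""
    fecha, salida = build_report_path(date_text, base_dir)
    classpath = ";".join([
        "shared/.",
        "shared/gxclassp.jar",
        "shared/iText.jar",
        "chequeodesatendido/chequeodesatendido_GXWS.jar",
    ])
    # Same order as in test.cmd: start date, end date, output file.
    return ["java", "-cp", classpath, "achequeodesatendido", fecha, fecha, salida]


if __name__ == "__main__":
    # Hypothetical date and working directory, for illustration only.
    print(build_command("02/05/2012", r"C:\demo"))
```

If the regional settings put a weekday or a different separator in %date%, both the batch file and this sketch would need adjusting.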

That's all for the first installment.

Automata with Genexus Ev1 on Linux and Windows, part II

Friday, April 13, 2012

Enumerate COM ports in Windows with Lazarus

Spanish version

Since my development machine only has USB ports, whenever I connect a serial device I usually have to open the Device Manager to see which COM port it was installed on.

I am developing an application that needs to access a serial device, so it has to be aware of the installed COM ports, either to indicate which one to use or to verify that the one already configured exists.

Looking around a bit so as not to reinvent the wheel, I found this page: http://www.lazarus.freepascal.org/index.php?topic=14313.0 which publishes three interesting functions.

The first function, GetSerialPortNames, is taken from the synaser package (http://synapse.ararat.cz/doku.php/download) and returns the COM ports installed on the operating system (in my case: COM3 and COM17). That is more or less what I need, but it only lists them without identifying them.

function GetSerialPortNames: string;
var
  reg: TRegistry;
  l, v: TStringList;
  n: integer;
begin
  l := TStringList.Create;
  v := TStringList.Create;
  reg := TRegistry.Create;
  try
{$IFNDEF VER100}
    reg.Access := KEY_READ;
{$ENDIF}
    reg.RootKey := HKEY_LOCAL_MACHINE;
    reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM');//, false);
    reg.GetValueNames(l);
    for n := 0 to l.Count - 1 do
      v.Add(reg.ReadString(l[n]));
    Result := v.CommaText;
  finally
    reg.Free;
    l.Free;
    v.Free;
  end;
end;

The second function, GetSerialPortRegNames, is a variant of the first that shows the installed devices themselves (in my case: \Device\ProlificSerial0 and \Device\USBSER000), which is not very clear either.

function GetSerialPortRegNames: string;
var
  reg: TRegistry;
  l  : TStringList;
  n: integer;
begin
  l := TStringList.Create;
//  v := TStringList.Create;
  reg := TRegistry.Create;
  try
{$IFNDEF VER100}
    reg.Access := KEY_READ;
{$ENDIF}
    reg.RootKey := HKEY_LOCAL_MACHINE;
    reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM');//, false);
    reg.GetValueNames(l);
//    for n := 0 to l.Count - 1 do
//      l[n]:= l[n]+'='+ reg.ReadString(l[n]);
    Result := l.CommaText;
  finally
    reg.Free;
    l.Free;
//    v.Free;
  end;
end;

The last function, GetComPortList, looks up the information in another part of the registry and gets the common name (FriendlyName) of the installed device, which is exactly what I want.

function GetComPortList(PortList: TStrings): integer;
var
  i,idx: integer;
  SerPortNum: integer;
  Reg: TRegistry;
  EnumList: TStrings;
begin
  result := -1;
 
  if not CheckMinOS(osWin2k) then
   exit;
 
  Reg := TRegistry.Create();
  EnumList := TStringList.Create;
  try
    // determine the number of ports
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Services\SerEnum\Enum') then
    begin
      SerPortNum := Reg.ReadInteger('Count');
 
      // cache the registry keys of the ports
      for i:=0 to SerPortNum-1 do
        EnumList.Add(Reg.ReadString(inttostr(i)));
      Reg.CloseKey;
 
      // read the data of each port
      for i:=0 to SerPortNum-1 do
      begin
        // get the port name (e.g. 'COM2')
        if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Enum\'+EnumList.Strings[i]+'\Device Parameters') then
          idx := PortList.Add(Reg.ReadString('PortName')+'=');
        Reg.CloseKey;
        // get the name as shown in Device Manager (e.g. 'USB Serial Port (COM2)')
        if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Enum\'+EnumList.Strings[i]) then
          PortList.ValueFromIndex[idx] := Reg.ReadString('FriendlyName');
        Reg.CloseKey;
      end;
    end;
  finally
    EnumList.Free;
    Reg.Free;
  end;
  result := PortList.Count;
end;

But there is a problem with this last function: it looks for the information only in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\SerEnum. To test it I plugged in two serial devices, a generic USB-to-RS232 adapter and a Blu Samba Q cellphone. One of them appears under the SerEnum key but the other appears under USBSER, which makes me assume that the service name depends on how the device driver was written and is therefore arbitrary, and so is its location in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services tree.

To solve my problem I took the first and the third functions as examples and wrote my own.

function GetSerialPortNamesExt: string;
var
  reg  : TRegistry;
  l,v  : TStringList;
  n    : integer;
  pn,fn: string;
 
  function findFriendlyName(key: string; port: string): string;
  var
    r : TRegistry;
    k : TStringList;
    i : Integer;
    ck: string;
    rs: string;
  begin
    r := TRegistry.Create;
    k := TStringList.Create;
 
    r.RootKey := HKEY_LOCAL_MACHINE;
    r.OpenKeyReadOnly(key);
    r.GetKeyNames(k);
    r.CloseKey;
 
    try
      for i := 0 to k.Count - 1 do
      begin
        ck := key + k[i] + '\'; // current key
        // looking for "PortName" stringvalue in "Device Parameters" subkey
        if r.OpenKeyReadOnly(ck + 'Device Parameters') then
        begin
          if r.ReadString('PortName') = port then
          begin
            //Memo1.Lines.Add('--> ' + ck);
            r.CloseKey;
            r.OpenKeyReadOnly(ck);
            rs := r.ReadString('FriendlyName');
            Break;
          end // if r.ReadString('PortName') = port ...
        end  // if r.OpenKeyReadOnly(ck + 'Device Parameters') ...
        // keep looking on subkeys for "PortName"
        else // if not r.OpenKeyReadOnly(ck + 'Device Parameters') ...
        begin
          if r.OpenKeyReadOnly(ck) and r.HasSubKeys then
          begin
            rs := findFriendlyName(ck, port);
            if rs <> '' then Break;
          end; // if not (r.OpenKeyReadOnly(ck) and r.HasSubKeys) ...
        end; // if not r.OpenKeyReadOnly(ck + 'Device Parameters') ...
      end; // for i := 0 to k.Count - 1 ...
      result := rs;
    finally
      r.Free;
      k.Free;
    end; // try ...
  end; // function findFriendlyName ...
 
begin
  v      := TStringList.Create;
  l      := TStringList.Create;
  reg    := TRegistry.Create;
  Result := '';
 
  try
    reg.RootKey := HKEY_LOCAL_MACHINE;
    if reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM') then
    begin
      reg.GetValueNames(l);
 
      for n := 0 to l.Count - 1 do
      begin
        pn := reg.ReadString(l[n]);
        fn := findFriendlyName('\System\CurrentControlSet\Enum\', pn);
        v.Add(pn + ' = '+ fn);
      end; // for n := 0 to l.Count - 1 ...
 
      Result := v.CommaText;
    end; // if reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM') ...
  finally
    reg.Free;
    l.Free; // l is created above and was never released
    v.Free;
  end; // try ...
end;
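The recursive lookup that GetSerialPortNamesExt performs can be illustrated without touching a real registry. The Python sketch below is only a model: the registry is simulated with nested dictionaries, the helper node and the key names USB, DEVICE_A, DEVICE_B and INSTANCE_1 are made up for the example, and only the port and device names come from my own machine.

```python
def node(values=None, subkeys=None):
    """Build one simulated registry key (hypothetical helper)."""
    return {"values": values or {}, "subkeys": subkeys or {}}


def find_friendly_name(key, port):
    """Depth-first search for the device whose PortName equals `port`."""
    for child in key["subkeys"].values():
        params = child["subkeys"].get("Device Parameters")
        if params is not None:
            # A device instance: compare its port and stop on a match.
            if params["values"].get("PortName") == port:
                return child["values"].get("FriendlyName", "")
        else:
            # Not a device instance: keep looking in its subkeys.
            found = find_friendly_name(child, port)
            if found:
                return found
    return ""


def serial_port_names_ext(serialcomm, enum_tree):
    """Pair every port listed in SERIALCOMM with its friendly name."""
    return ["%s = %s" % (port, find_friendly_name(enum_tree, port))
            for port in serialcomm.values()]


def instance(port, name):
    """A simulated device instance with its Device Parameters subkey."""
    return node({"FriendlyName": name},
                {"Device Parameters": node({"PortName": port})})


# Simulated HARDWARE\DEVICEMAP\SERIALCOMM: value name -> port name.
SERIALCOMM = {r"\Device\ProlificSerial0": "COM3",
              r"\Device\USBSER000": "COM17"}

# Simulated System\CurrentControlSet\Enum tree (key names are invented).
ENUM = node(subkeys={"USB": node(subkeys={
    "DEVICE_A": node(subkeys={"INSTANCE_1": instance(
        "COM3", "Prolific USB-to-Serial Comm Port (COM3)")}),
    "DEVICE_B": node(subkeys={"INSTANCE_1": instance(
        "COM17", "MTK6225 USB Modem Driver (COM17)")}),
})})

if __name__ == "__main__":
    for line in serial_port_names_ext(SERIALCOMM, ENUM):
        print(line)
```

Because the search starts from the ports listed in SERIALCOMM and walks the whole Enum tree, it does not depend on which service name the driver registered under.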

Thursday, April 12, 2012

Enumerate serial ports in Windows with Lazarus

English version

Since my development machine only has USB ports, whenever I connect a serial device I usually have to open the Device Manager to see which COM port it was installed on.

I am developing an application that needs to access a serial device, so it has to be aware of the installed COM ports, either to indicate which one to use or to verify that the one already configured exists.

Looking around a bit so as not to reinvent the wheel, I found this page: http://www.lazarus.freepascal.org/index.php?topic=14313.0 where three very interesting functions are published.

The first function, GetSerialPortNames, is taken from the synaser package (http://synapse.ararat.cz/doku.php/download) and returns the COM ports installed on the operating system (in my case: COM3 and COM17). That is more or less what I need, but it only lists them without identifying them.

function GetSerialPortNames: string;
var
  reg: TRegistry;
  l, v: TStringList;
  n: integer;
begin
  l := TStringList.Create;
  v := TStringList.Create;
  reg := TRegistry.Create;
  try
{$IFNDEF VER100}
    reg.Access := KEY_READ;
{$ENDIF}
    reg.RootKey := HKEY_LOCAL_MACHINE;
    reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM');//, false);
    reg.GetValueNames(l);
    for n := 0 to l.Count - 1 do
      v.Add(reg.ReadString(l[n]));
    Result := v.CommaText;
  finally
    reg.Free;
    l.Free;
    v.Free;
  end;
end;

The second function, GetSerialPortRegNames, is a variant of the first that shows the installed device itself (in my case: \Device\ProlificSerial0 and \Device\USBSER000), which is not very clear either.

function GetSerialPortRegNames: string;
var
  reg: TRegistry;
  l  : TStringList;
  n: integer;
begin
  l := TStringList.Create;
//  v := TStringList.Create;
  reg := TRegistry.Create;
  try
{$IFNDEF VER100}
    reg.Access := KEY_READ;
{$ENDIF}
    reg.RootKey := HKEY_LOCAL_MACHINE;
    reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM');//, false);
    reg.GetValueNames(l);
//    for n := 0 to l.Count - 1 do
//      l[n]:= l[n]+'='+ reg.ReadString(l[n]);
    Result := l.CommaText;
  finally
    reg.Free;
    l.Free;
//    v.Free;
  end;
end;

The last function, GetComPortList, looks up the information in another part of the registry and gets the common name (FriendlyName) of the installed port, which is exactly what I want.

function GetComPortList(PortList: TStrings): integer;
var
  i,idx: integer;
  SerPortNum: integer;
  Reg: TRegistry;
  EnumList: TStrings;
begin
  result := -1;

  if not CheckMinOS(osWin2k) then
   exit;

  Reg := TRegistry.Create();
  EnumList := TStringList.Create;
  try
    // determine the number of ports
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Services\SerEnum\Enum') then
    begin
      SerPortNum := Reg.ReadInteger('Count');

      // cache the registry keys of the ports
      for i:=0 to SerPortNum-1 do
        EnumList.Add(Reg.ReadString(inttostr(i)));
      Reg.CloseKey;

      // read the data of each port
      for i:=0 to SerPortNum-1 do
      begin
        // get the port name (e.g. 'COM2')
        if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Enum\'+EnumList.Strings[i]+'\Device Parameters') then
          idx := PortList.Add(Reg.ReadString('PortName')+'=');
        Reg.CloseKey;
        // get the name as shown in Device Manager (e.g. 'USB Serial Port (COM2)')
        if Reg.OpenKeyReadOnly('\System\CurrentControlSet\Enum\'+EnumList.Strings[i]) then
          PortList.ValueFromIndex[idx] := Reg.ReadString('FriendlyName');
        Reg.CloseKey;
      end;
    end;
  finally
    EnumList.Free;
    Reg.Free;
  end;
  result := PortList.Count;
end;

But there is a problem with this last function: it looks for the information only in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\SerEnum. To test it I connected two serial devices, a generic USB-RS232 adapter and a Blu Samba Q cellphone. One of them appears under the SerEnum key but the other appears under UsbSer, which makes me assume that the name of the service that controls them depends on how the device driver was written and is therefore arbitrary, and so is its location in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services tree.

To solve my problem I took the first and the third functions as examples and wrote my own.

function GetSerialPortNamesExt: string;
var
  reg  : TRegistry;
  l,v  : TStringList;
  n    : integer;
  pn,fn: string;

  function findFriendlyName(key: string; port: string): string;
  var
    r : TRegistry;
    k : TStringList;
    i : Integer;
    ck: string;
    rs: string;
  begin
    r := TRegistry.Create;
    k := TStringList.Create;

    r.RootKey := HKEY_LOCAL_MACHINE;
    r.OpenKeyReadOnly(key);
    r.GetKeyNames(k);
    r.CloseKey;

    try
      for i := 0 to k.Count - 1 do
      begin
        ck := key + k[i] + '\'; // current key
        // looking for "PortName" stringvalue in "Device Parameters" subkey
        if r.OpenKeyReadOnly(ck + 'Device Parameters') then
        begin
          if r.ReadString('PortName') = port then
          begin
            //Memo1.Lines.Add('--> ' + ck);
            r.CloseKey;
            r.OpenKeyReadOnly(ck);
            rs := r.ReadString('FriendlyName');
            Break;
          end // if r.ReadString('PortName') = port ...
        end  // if r.OpenKeyReadOnly(ck + 'Device Parameters') ...
        // keep looking on subkeys for "PortName"
        else // if not r.OpenKeyReadOnly(ck + 'Device Parameters') ...
        begin
          if r.OpenKeyReadOnly(ck) and r.HasSubKeys then
          begin
            rs := findFriendlyName(ck, port);
            if rs <> '' then Break;
          end; // if not (r.OpenKeyReadOnly(ck) and r.HasSubKeys) ...
        end; // if not r.OpenKeyReadOnly(ck + 'Device Parameters') ...
      end; // for i := 0 to k.Count - 1 ...
      result := rs;
    finally
      r.Free;
      k.Free;
    end; // try ...
  end; // function findFriendlyName ...

begin
  v      := TStringList.Create;
  l      := TStringList.Create;
  reg    := TRegistry.Create;
  Result := '';

  try
    reg.RootKey := HKEY_LOCAL_MACHINE;
    if reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM') then
    begin
      reg.GetValueNames(l);

      for n := 0 to l.Count - 1 do
      begin
        pn := reg.ReadString(l[n]);
        fn := findFriendlyName('\System\CurrentControlSet\Enum\', pn);
        v.Add(pn + ' = '+ fn);
      end; // for n := 0 to l.Count - 1 ...

      Result := v.CommaText;
    end; // if reg.OpenKeyReadOnly('HARDWARE\DEVICEMAP\SERIALCOMM') ...
  finally
    reg.Free;
    l.Free; // l is created above and was never released
    v.Free;
  end; // try ...
end;

The function looks up the COM ports listed in HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\SERIALCOMM, then searches recursively under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Enum\ for the common name (FriendlyName) of the device. It returns something like this: "COM3 = Prolific USB-to-Serial Comm Port (COM3)","COM17 = MTK6225 USB Modem Driver (COM17)"
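A caller usually wants the ports as separate items rather than as one string. As a rough sketch, a result like the one above can be split in Python with the csv module. The function name parse_port_list is hypothetical, and treating CommaText as CSV is an approximation that I assume holds when every item is wrapped in double quotes, as in this output.

```python
import csv


def parse_port_list(comma_text):
    """Split a CommaText-style result into a {port: friendly name} dict."""
    # csv.reader handles the double-quoted, comma-separated items.
    items = next(csv.reader([comma_text]), [])
    ports = {}
    for item in items:
        # Each item looks like "COM3 = Some device name (COM3)".
        port, _, name = item.partition(" = ")
        ports[port] = name
    return ports


if __name__ == "__main__":
    sample = ('"COM3 = Prolific USB-to-Serial Comm Port (COM3)",'
              '"COM17 = MTK6225 USB Modem Driver (COM17)"')
    for port, name in parse_port_list(sample).items():
        print(port, "->", name)
```

Inside Lazarus itself the same split can be obtained by assigning the string to the CommaText property of a TStringList.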