Professional Positions

  • 2015 – Present

    Post-doctoral Researcher

    CONICET, Institute of Software Engineering (ISISTAN)

  • 2011 – Present

    Graduate Teaching Assistant

    UNICEN University, Faculty of Exact Sciences

  • 2010 – 2014

    Ph.D. Candidate

    CONICET, Institute of Software Engineering (ISISTAN)

  • 2009 – 2010

    Java / Mobile Software Developer

    Q4Tech, Research & Development Department

Education & Training

  • Ph.D. 2015

    Doctorate in Computer Science

    UNICEN University - ISISTAN Research Institute

  • M.Eng. 2013

    Master in Systems Engineering

    UNICEN University - ISISTAN Research Institute

  • B.S.E. 2009

    Bachelor in Systems Engineering

    UNICEN University - Faculty of Exact Sciences

Fellowships, Awards and Grants

  • 2015
    CONICET Researcher Grant: Ingreso a Carrera de Investigador Científico (CIC)
    Selected to join the National Council on Scientific and Technical Research (CONICET) as an Assistant Researcher (designation pending) | From April 2016 to March 2021
  • 2015
    Best paper award at SoSyM/MODELS'15
    Best paper award granted by the journal Software & Systems Modeling, presented at the ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS’15) - Ottawa, ON, Canada
  • 2014
    CONICET Fellowship Grant: Beca Posdoctoral
    Postdoctoral research fellowship grant awarded by the National Council on Scientific and Technical Research (CONICET), Argentina | From April 2015 to March 2017
  • 2012
    CONICET Fellowship Grant: Beca Tipo II
    Postgraduate research fellowship grant awarded by the National Council on Scientific and Technical Research (CONICET), Argentina | From April 2013 to March 2015
  • 2009
    CONICET Fellowship Grant: Beca Tipo I
    Postgraduate research fellowship grant awarded by the National Council on Scientific and Technical Research (CONICET), Argentina | From April 2010 to March 2013.
  • 2009
    CIC Fellowship Grant: Beca de Perfeccionamiento BP10
    Postgraduate research fellowship grant awarded by the Scientific Research Commission (CIC), Argentina | Declined due to incompatibility with the CONICET award.

Fellow Researchers

Claudia Marcos

Associate researcher


Andrés Díaz-Pace

Independent researcher


Santiago Vidal

Assistant researcher


Matias Nicoletti

Doctoral student


Research Team

These are the people I work with every day. We are jointly conducting studies on several projects and share similar research interests. I encourage you to check their websites for further information.

Research Projects

  • Recomendación de Anomalías de Código relevantes a nivel Arquitectural

    Proyecto de Cooperación MINCYT-CAPES

    Funded by: Ministerio de Ciencia, Tecnología e Innovación Productiva de la República Argentina (MINCYT) and the Coordinación de Perfeccionamiento del Personal de Nivel Superior (CAPES) of Brazil

    Directed by: Ph.D. Andrés Díaz-Pace and Ph.D. Alessandro Garcia

    Dates: from Jan. 2014 to Dec. 2015

  • Estrategias para Mejorar la Evolución y el Mantenimiento de Sistemas

    Programa de Subsidios Proyectos de Investigación Científica y Tecnológica - Resolución N° 243/13

    Funded by: Comisión de Investigaciones Científicas (CIC)

    Directed by: Ph.D. Claudia Marcos

    Dates: from Jan. 2014 to Dec. 2015

  • EVOL: Monitoreo y Control de Calidad de Software

    Proyecto de Cooperación MINCYT-CONICYT

    Funded by: Ministerio de Ciencia, Tecnología e Innovación Productiva de la República Argentina (MINCYT) and the Comisión Nacional de Investigación Científica y Tecnológica (CONICYT) of Chile

    Directed by: Ph.D. Claudia Marcos and Ph.D. Alexandre Bergel

    Dates: from Jan. 2013 to Dec. 2014

  • Asistencia Inteligente a Usuarios en Aplicaciones de Escritorio, Web y Móviles

    Proyecto de Incentivos 03/C237

    Funded by: Secretaría de Políticas Universitarias (SPU) and Ministerio de Educación

    Directed by: Ph.D. Alfredo Teyseyre and Ph.D. Marcelo Campo

    Dates: from Jan. 2012 to Dec. 2014

    Description: The goal of this project is to adapt existing intelligent methods and techniques, define new methods and techniques, and evaluate them for integration into personal agents and recommender systems. Fundamentally, the use of these techniques will enable the construction of tools that provide effective assistance by adapting solutions to the platform and the problem domain, as well as to the individual characteristics of users (preferences, customs, habits, etc.). More precisely, these techniques and methods will be analyzed along two orthogonal dimensions (platform and study domain), because the assistance offered to users is more effective when both the requirements of the platform and those of the specific domains are taken into account.

    Platform/context:

    • Desktop: traditional applications that run on a desktop computer.
    • Web: applications that use the Internet or are accessed through a browser.
    • Mobile: applications that run on mobile devices such as smartphones or tablets.

    Specific study domains:

    Assistance in the software development process: software development is a complex and arduous task. In this context, intelligent assistance can be of great use to engineers, developers and managers in numerous activities (for example, navigating software repositories, debugging, refactoring, etc.), thus reducing development time and cost.

    Assistance and support in education: e-learning is an area attracting growing interest as a way of offering training to people who, for various reasons, cannot attend face-to-face courses. Basically, e-learning encompasses all forms of computer-supported teaching and learning, for example web systems or virtual worlds for learning. In particular, incorporating intelligent assistance in this area allows the system to adapt to the learning characteristics of each student, making the learning process more effective and motivating.

    Assistance in information discovery and analysis: nowadays, users must cope with the explosive and continuous growth of the available information. Recommender systems and information agents make it possible to discover and analyze existing information, allowing users to find relevant information more quickly.

  • Agentes Inteligentes aplicados a la Gestión de Documentación de Arquitecturas de Software en Redes de Trabajo

    PICT Project (PICT-2010-2247)

    Funded by: Agencia Nacional de Promoción Científica y Tecnológica

    Directed by: Ph.D. Andrés Díaz-Pace

    Dates: from Oct. 2011 to Sep. 2013

    Description: Software development normally involves the work of several groups of people who interact throughout the development stages. The success of a project depends on technical aspects as well as on the quality and effectiveness of the communication among the groups. During software development, the documentation of the architectural design plays a prominent role, since it serves as an artifact for communicating and discussing decisions among the system's stakeholders. Each of these stakeholders has their own goals and interests regarding the system. However, generating and keeping the architectural documentation up to date during a development, so that it is truly useful to all stakeholders, is a costly task whose value is not always perceived at the management level. This problem prevents architectural decisions from being fully exploited during development, and often leads to rework and quality deficiencies in the software. This project focuses on agent-based techniques for analyzing a work network (formed by different stakeholders) in order to improve the generation, use and maintenance of the architectural documentation within the network. Specifically, we will investigate mechanisms based on user profiles and social network analysis to adjust the structure and contents of the documentation to the context of the network and its members. Given the large volume of interactions in a development network and the need to satisfy the goals of multiple stakeholders, these mechanisms will be implemented as prototypes of intelligent agents that assist a human analyst in the management tasks involved.

  • Métodos Inteligentes para Asistencia Personalizada y Adaptativa a Usuarios

    Proyecto de Incentivos 03/C197

    Funded by: Secretaría de Políticas Universitarias (SPU) and Ministerio de Educación

    Directed by: Ph.D. Daniela Godoy and Ph.D. Marcelo Campo

    Dates: from Jan. 2009 to Dec. 2012

  • Legacy Systems Maintenance through Aspect-Oriented Programming

    Internal Project

    Directed by: Ph.D. Claudia Marcos

    Members: Alejandro Rago, Esteban S. Abait and Santiago A. Vidal

    Description:

    Crosscutting concerns hinder software evolution. Such concerns are features of a program that cannot be cleanly modularized given its current modular decomposition. Aspect-Oriented Software Development (AOSD) is a paradigm that aims to improve the separation of concerns in software and hence its maintainability and evolution. However, migrating an object-oriented software system to an aspect-oriented one is far from trivial. Due to the large size of typical implementations and the lack of sound documentation, there is a need for tools that automate the identification, quantification and refactoring of crosscutting concerns into aspects of the new system.

    The main purpose of this project is the development of tools and techniques that automate the identification of crosscutting concerns (aspect mining) and their encapsulation as aspects of the system (aspect refactoring). We are working in the following directions:

    • Development of aspect mining techniques based on dynamic analysis and crosscutting concern sorts.
    • Development of aspect refactoring tools to assist a software developer in the encapsulation of crosscutting concerns identified by aspect mining techniques.
    • Development of tools and techniques to ease the evolution of aspect-oriented software systems.
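As a rough illustration of the dynamic-analysis direction above, the following Python sketch (with a hypothetical trace format and threshold, not the project's actual tooling) flags methods invoked from many distinct callers — a classic scattering symptom of a crosscutting concern such as logging:

```python
from collections import defaultdict

def mine_candidate_aspects(traces, min_callers=3):
    """Flag callees invoked from many distinct callers (scattering symptom).

    traces: iterable of (caller, callee) pairs recorded during execution.
    Returns (callee, caller_count) pairs meeting the threshold,
    most scattered first.
    """
    callers_of = defaultdict(set)
    for caller, callee in traces:
        callers_of[callee].add(caller)
    candidates = [(callee, len(callers))
                  for callee, callers in callers_of.items()
                  if len(callers) >= min_callers]
    return sorted(candidates, key=lambda c: -c[1])

# Toy trace: Logger.log is called from three different business methods,
# so it surfaces as a crosscutting-concern candidate.
trace = [
    ("Account.open", "Logger.log"),
    ("Account.close", "Logger.log"),
    ("Transfer.run", "Logger.log"),
    ("Transfer.run", "Account.debit"),
]
print(mine_candidate_aspects(trace))  # → [('Logger.log', 3)]
```

Real aspect-mining techniques combine such fan-in heuristics with crosscutting-concern sorts and finer-grained trace analysis; this sketch only conveys the intuition.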


TeXTracT: a Web-based Tool for Building NLP-enabled Applications

Alejandro Rago, Facundo M. Ramos, Juan I. Velez, Andrés Díaz-Pace, Claudia Marcos
Conference Paper In Proceedings of the XVII Argentine Symposium on Software Engineering (ASSE'16), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'16) | Buenos Aires, Argentina, September 2016 | Publisher: SADIO | ISSN: 1850-2792

Abstract

Over the last few years, the software industry has shown an increasing interest in applications with Natural Language Processing (NLP) capabilities. Several cloud-based solutions have emerged with the purpose of simplifying and streamlining the integration of NLP techniques via Web services. These NLP techniques cover tasks such as language detection, entity recognition, sentiment analysis and classification, among others. However, the services provided are not always as extensible and configurable as a developer may want, preventing their use in industry-grade developments and limiting their adoption in specialized domains (e.g., for analyzing technical documentation). In this context, we have developed a tool called TeXTracT that is designed to be composable, extensible, configurable and accessible. In our tool, NLP techniques can be accessed independently and orchestrated in a pipeline via RESTful Web services. Moreover, the architecture supports the setup and deployment of NLP techniques on demand. The NLP infrastructure is built upon the UIMA framework, which defines communication protocols and uniform service interfaces for text analysis modules. TeXTracT has been evaluated in two case studies to assess its strengths and weaknesses.

Opportunities for Analyzing Hardware Specifications with NLP Techniques

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the 3rd Workshop on Design Automation for Understanding Hardware Designs (DUHDe'16) at the Design, Automation and Test in Europe Conference and Exhibition (DATE'16) | Dresden, Germany, March 2016

Abstract

Hardware design is a mature discipline that heavily relies on complex models to create the blueprints of a system and on special notations to describe the expected behavior of its components. However, hardware engineers frequently have to go through multiple specifications written in natural language to identify components, constraints and assertions and translate them into more formal expressions in order to enable automated verifications and consistency checks. For this reason, computer-assisted tools capable of processing and understanding hardware documentation can be of great help to assist and guide engineers in difficult and otherwise error-prone activities. In previous works, we have explored several Natural Language Processing (NLP) techniques for the analysis of requirements and architecture specifications, with promising results. In this article, we report on some interesting applications we developed for inspecting Software Engineering documentation and discuss how they could be applied to automated hardware design.

Identifying Duplicate Functionality in Textual Use Cases by Aligning Semantic Actions (SoSyM abstract)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS'15) | Ottawa, Ontario, Canada, September 2015 | Page 442 | Publisher: ACM/IEEE | ISBN: 978-1-4673-6908-4/15 | DOI: 10.1109/MODELS.2015.7338276

Abstract

Developing high-quality requirements specifications often demands a thoughtful analysis and an adequate level of expertise from analysts. Although requirements modeling techniques provide mechanisms for abstraction and clarity, fostering the reuse of shared functionality (e.g., via UML relationships for use cases), they are seldom employed in practice. A particular quality problem of textual requirements, such as use cases, is that of having duplicate pieces of functionality scattered across the specifications. Duplicate functionality can sometimes improve readability for end users, but hinders development-related tasks such as effort estimation, feature prioritization and maintenance, among others. Unfortunately, inspecting textual requirements by hand in order to deal with redundant functionality can be an arduous, time-consuming and error-prone activity for analysts. In this context, we introduce a novel approach called ReqAligner that helps analysts spot signs of duplication in use cases in an automated fashion. To do so, ReqAligner combines several text processing techniques, such as a use-case-aware classifier and a customized algorithm for sequence alignment. Essentially, the classifier converts the use cases into an abstract representation that consists of sequences of semantic actions, and then these sequences are compared pairwise in order to identify action matches, which become possible duplications. We have applied our technique to five real-world specifications, achieving promising results and identifying many sources of duplication in the use cases.
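The pairwise comparison of action sequences described in the abstract can be approximated with a standard sequence-alignment routine. This Python sketch (using the stdlib difflib and made-up (verb, object) action tuples — not ReqAligner's actual classifier or algorithm) finds contiguous runs of matching semantic actions shared by two use cases:

```python
from difflib import SequenceMatcher

# Hypothetical abstract representation: each use case step becomes a
# (verb, object) "semantic action" produced by a use-case-aware classifier.
uc_checkout = [("validate", "user"), ("select", "item"),
               ("compute", "total"), ("confirm", "order")]
uc_wishlist = [("validate", "user"), ("select", "item"),
               ("save", "list")]

def shared_action_runs(a, b, min_len=2):
    """Return contiguous runs of identical semantic actions present in
    both sequences — candidate duplicated functionality."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks() if m.size >= min_len]

print(shared_action_runs(uc_checkout, uc_wishlist))
# → [[('validate', 'user'), ('select', 'item')]]
```

A run of shared actions like this would be reported to the analyst as a duplication candidate, e.g. behavior worth factoring into an included use case.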

REAssistant: a Tool for Identifying Crosscutting Concerns in Textual Requirements (Tool Demonstration)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the Demo and Poster Session at the ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS'15) | Volume 1554 - Ottawa, Ontario, Canada, September 2015 | Pages 32-35 | Publisher: ACM/IEEE | ISSN: 1613-0073

Abstract

Use case modeling is very useful to capture requirements and communicate with the stakeholders. Use cases normally have textual specifications that describe the interactions between the system and external actors. However, since use cases are specified from a functional perspective, concerns that do not fit this decomposition criterion well are kept away from the analysts’ eye and might end up intermingled in multiple use cases. These crosscutting concerns (CCCs) are generally relevant for analysis, design and implementation activities and should be dealt with from early stages. Unfortunately, identifying such concerns by hand is a cumbersome and error-prone task, mainly because it requires a semantic interpretation of textual requirements. To ease the analysis of CCCs, we have developed an automated tool called REAssistant that is able to extract semantic information from textual use cases and reveal candidate CCCs, helping analysts to reason about them before making important commitments in the development. Our tool performs a series of advanced NLP analyses based on the UIMA framework. Analysts can define concern-specific queries in the tool to search for CCCs in the requirements via a flexible SQL-like language. In this article, we briefly discuss the technologies behind the tool and explain how an end user can interact with REAssistant to analyze CCCs in use case specifications. A short video explaining the main features of the tool can be found at https://youtu.be/i3kSJil_2eg. The REAssistant tool can be downloaded from https://code.google.com/p/reassistant.

An Automated Approach for Assisting the Analysis of Textual Requirements

Alejandro Rago
Report Doctorate Thesis in Computer Science | March 2015 | Advisors: Claudia Marcos and Andrés Díaz-Pace | UNICEN University - ISISTAN Research Institute

Abstract

The analysis of requirements is one of the most critical aspects of software projects today, mainly because requirements represent the functionality of a system and lay the foundation for its development. Specifying high-quality requirements is not trivial because it requires both a thoughtful analysis of the stakeholders’ needs and an appropriate structuring of the functionality. There are two particular problems in requirements documentation that analysts have to pay attention to. First, although requirements modeling techniques such as use cases provide mechanisms for abstraction and clarity, fostering the reuse of shared functionality (e.g., via UML relationships), these mechanisms are seldom employed in practice. Consequently, textual requirements usually contain duplicated descriptions of functionality. Duplicate functionality can sometimes improve readability for end users, but hinders development-related tasks such as effort estimation, feature prioritization and maintenance, among others. Second, those requirements that “shape” an architecture the most, called architecturally-significant requirements (ASRs), are often documented intertwined with core functionalities. This situation may hinder the design decisions made by architects, because some relevant information about those ASRs might be missing. In the long run, this situation might even have an impact on the resulting architecture. Usually, use-case specifications hold evidence of candidate ASRs that appears scattered throughout the documents. This evidence should be further explored by analysts or architects so as to better comprehend the ASRs of the system. Manually inspecting textual requirements to find the two problems above requires a meticulous revision of the system documentation, which often takes considerable time and effort.
Hence, the application of automated techniques based on text processing for identifying deficiencies in requirements can be very useful to reduce the analysts’ efforts and improve the quality of the documents. In this thesis, we present a semi-automated approach for assisting the analysis of textual requirements. Concretely, the approach aims at revealing duplicate functionality and uncovering evidence of ASRs in use case specifications. The approach applies a tandem of advanced natural language processing techniques (e.g., semantic analysis) for inspecting textual requirements. In particular, the approach can identify abstractions that are typical of use cases. The approach is divided into two tools, called ReqAligner and REAssistant, which are able to i) detect duplicate functionality by means of algorithms for aligning semantic actions and ii) identify ASRs using rules that codify search patterns, respectively. ReqAligner recommends refactorings to improve the modularity of the requirements, and REAssistant relies on the semantic knowledge of the text to infer hidden ASRs. Both tools are built upon the UIMA architecture and equipped with several text analysis techniques. The approach was evaluated in case studies, achieving promising results. On one hand, the ReqAligner tool recognized most duplications of functionality present in five real-world systems and correctly suggested refactorings to remove the redundancy in the text. On the other hand, the REAssistant tool identified the majority of the sentences affected by ASRs in three systems, making few mistakes in the process. Moreover, the performance of the semantic rules in REAssistant was comparable to that of human analysts and better than that of a third-party tool.

Improving Use Case Specifications by means of Refactoring (available online)

Claudia Marcos, Alejandro Rago, Andrés Díaz-Pace
Journal Paper IEEE Latin America Transactions [JCR® 2014 I.F. 0.326] | Volume 13, Issue 4, April 2015 | Pages 1135-1140 | Publisher: IEEE Region 9 | ISSN: 1548-0992 | DOI: 10.1109/TLA.2015.7106367

Abstract

This work presents a semi-automatic tool for use case refactoring called RE-USE. The tool discovers existing quality problems in use cases and suggests a prioritized set of candidate refactorings to functional analysts. The analyst then reviews the recommendation list and selects the most important refactoring. The tool applies the chosen refactoring and returns an improved specification. The tool’s effectiveness in detecting existing quality problems and recommending proper refactorings was assessed on a set of case studies involving real-world systems, obtaining encouraging results.

Identifying Duplicate Functionality in Textual Use Cases by Aligning Semantic Actions (available online)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Journal Paper Software & Systems Modeling [JCR® 2014 I.F. 1.408] | Publisher: Springer Verlag | ISSN: 1619-1366 (Print Version) - 1619-1374 (Online Version) | DOI: 10.1007/s10270-014-0431-3

Abstract

Developing high-quality requirements specifications often demands a thoughtful analysis and an adequate level of expertise from analysts. Although requirements modeling techniques provide mechanisms for abstraction and clarity, fostering the reuse of shared functionality (e.g., via UML relationships for use cases), they are seldom employed in practice. A particular quality problem of textual requirements, such as use cases, is that of having duplicate pieces of functionality scattered across the specifications. Duplicate functionality can sometimes improve readability for end users, but hinders development-related tasks such as effort estimation, feature prioritization and maintenance, among others. Unfortunately, inspecting textual requirements by hand in order to deal with redundant functionality can be an arduous, time-consuming and error-prone activity for analysts. In this context, we introduce a novel approach called ReqAligner that helps analysts spot signs of duplication in use cases in an automated fashion. To do so, ReqAligner combines several text processing techniques, such as a use-case-aware classifier and a customized algorithm for sequence alignment. Essentially, the classifier converts the use cases into an abstract representation that consists of sequences of semantic actions, and then these sequences are compared pairwise in order to identify action matches, which become possible duplications. We have applied our technique to five real-world specifications, achieving promising results and identifying many sources of duplication in the use cases.

An Approach for Automating Use Case Refactoring (selected paper from ASSE'13 conference)

Alejandro Rago, Paula Frade, Miguel Ruival, Claudia Marcos
Journal Paper Electronic Journal of SADIO [LatinIndex] | Volume 13, Issue 1, September 2014 | Publisher: SADIO | ISSN: 1514-6774

Una Comparación de Técnicas de NLP Semánticas para Analizar Casos de Uso (in Spanish)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the 2nd IEEE Biennial Congress of Argentina (ARGENCON'14) | San Carlos de Bariloche, Argentina, June 2014 | Pages 479-484 | Publisher: IEEE Argentina | ISSN: 1850-0870 | ISBN: 978-1-4799-4270-1 | DOI: 10.1109/ARGENCON.2014.6868539

Abstract

The inspection of documents written in natural language with computers has become feasible thanks to advances in Natural Language Processing (NLP) techniques. However, certain applications require a deeper semantic analysis of the text to produce good results. In this article, we present an exploratory study of semantic-aware NLP techniques for discovering latent concerns in use case specifications. For this purpose, we propose two NLP techniques, namely semantic clustering and semantically-enriched rules. After evaluating these two techniques and comparing them with a technique developed by other researchers, the results showed that semantic NLP techniques hold great potential for detecting candidate concerns. In particular, if these techniques are properly configured, they can help to reduce the efforts of requirements analysts and promote better quality in software development.

Towards Recovering Architectural Information from Images of Architectural Diagrams

Emmanuel Maggiori, Luciano Gervasoni, Matías Antúnez, Alejandro Rago, Andrés Díaz-Pace
Conference Paper In Proceedings of the XV Argentine Symposium on Software Engineering (ASSE'14), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'14) | Buenos Aires, Argentina, September 2014 | Publisher: SADIO | ISSN: 1850-2792

Abstract

The architecture of a software system is often described with diagrams embedded in the documentation. However, these diagrams are normally stored and shared as images, losing track of model-level architectural information and preventing software engineers from working on the architectural model later on. In this context, tools able to extract architectural information from images can be of great help. In this article, we present a framework called IMEAV for processing architectural diagrams and recovering information from them. We have instantiated our framework to analyze “module views” and evaluated this prototype on an image dataset. Results have been encouraging, showing good accuracy in recognizing modules, relations and textual features.

Improving Requirements with NLP Techniques (extended abstract & poster)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In 2nd School of International Joint Conference on Artificial Intelligence (IJCAI'14), co-located with the XV Argentine Symposium on Artificial Intelligence (ASAI'14), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'14) | Buenos Aires, Argentina, September 2014 | Publisher: SADIO | ISSN: 1850-2784

Assisting Requirements Analysts to Find Latent Concerns with REAssistant (available online)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Journal Paper Automated Software Engineering [JCR® 2014 I.F. 1.733] | Publisher: Springer Netherlands | ISSN: 0928-8910 (Print Version) - 1573-7535 (Online Version) | DOI: 10.1007/s10515-014-0156-0

Abstract

Textual requirements are very common in software projects. However, this format of requirements often keeps relevant concerns (e.g., performance, synchronization, data access, etc.) from the analyst’s view, because their semantics are implicit in the text. Thus, analysts must carefully review requirements documents in order to identify key concerns and their effects. Concern mining tools based on NLP techniques can help in this activity. Nonetheless, existing tools cannot always detect all the crosscutting effects of a given concern on different requirements sections, as this detection requires a semantic analysis of the text. In this work, we describe an automated tool called REAssistant that supports the extraction of semantic information from textual use cases in order to reveal latent crosscutting concerns. To enable the analysis of use cases, we apply a tandem of advanced NLP techniques (e.g., dependency parsing, semantic role labeling, and domain actions) built on the UIMA framework, which generates different annotations for the use cases. Then, REAssistant allows analysts to query these annotations via concern-specific rules in order to identify all the effects of a given concern. The REAssistant tool has been evaluated with several case studies, showing good results when compared to a manual identification of concerns and a third-party tool. In particular, the tool achieved a remarkable recall regarding the detection of crosscutting concern effects.
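To give a flavor of the rule-based querying of annotations described in the abstract (an illustrative sketch only — REAssistant's actual rules are expressed in an SQL-like language over UIMA annotations, and the labels below are invented), one can think of a concern rule as a predicate over the semantic annotations attached to each sentence:

```python
# Hypothetical annotation records: each sentence of a use case carries
# the semantic action labels produced by the NLP pipeline.
annotated_use_case = [
    {"id": 1, "text": "The system validates the user credentials.",
     "actions": ["validate"]},
    {"id": 2, "text": "The system logs the failed attempt.",
     "actions": ["log"]},
    {"id": 3, "text": "The system encrypts and stores the password.",
     "actions": ["encrypt", "store"]},
]

# Concern-specific rules: a concern affects a sentence when any of its
# trigger actions appears among the sentence's annotations.
concern_rules = {
    "security": {"validate", "encrypt"},
    "logging": {"log"},
}

def find_concern_effects(sentences, rules):
    """Return {concern: [ids of affected sentences]} for each rule that fires."""
    hits = {}
    for concern, triggers in rules.items():
        ids = [s["id"] for s in sentences if triggers & set(s["actions"])]
        if ids:
            hits[concern] = ids
    return hits

print(find_concern_effects(annotated_use_case, concern_rules))
# → {'security': [1, 3], 'logging': [2]}
```

The output shows the crosscutting shape of the problem: a single concern (security) cuts across several otherwise unrelated sentences, which is exactly what the tool surfaces for the analyst.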

Un Enfoque para Automatizar la Refactorización de Casos de Uso (in Spanish)

Alejandro Rago, Paula Frade, Miguel Ruival, Claudia Marcos
Conference Paper In Proceedings of the IX Argentine Symposium on Software Engineering (ASSE'13), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'13) | Cordoba, Argentina, September 2013 | Pages 183-197 | Publisher: SADIO | ISSN: 1850-2792

Abstract

Carrying out requirements elicitation and modeling activities is not a simple task. It requires a deep analysis of the clients' needs and demands a certain degree of experience from analysts. To communicate requirements successfully, the instruments provided by specification techniques (for example, relationships between use cases) must be exploited so as to avoid redundancy and promote reuse and abstraction of behavior. In practice, these instruments are not used as much as they should be, and analysts need to review the documents periodically to preserve the quality of the requirements. Unfortunately, inspecting the documents manually is a complex and arduous task. For this reason, in this article we present a semi-automatic approach for finding defects in use case specifications and resolving them by means of refactorings. The approach implements advanced heuristics to locate defects (e.g., duplicate behavior) and mechanisms that streamline their resolution. The approach was evaluated on real systems, obtaining promising preliminary results.

Tool Support for Identifying Crosscutting Concerns in Use Case Specifications

Alejandro Rago
Report Master Thesis in Systems Engineering | March 2013 | Advisors: Claudia Marcos and Andrés Díaz-Pace | UNICEN University - ISISTAN Research Institute

Abstract

The definition of software requirements is a crucial activity in software development. The analysis of requirements fundamentally serves to understand the stakeholders’ needs and to build the “right” system. Software requirements are commonly specified in natural language, because it simplifies the communication among stakeholders. Unfortunately, some requirements approaches still struggle to capture certain central concerns. For instance, use case specifications offer limited support for handling quality-related concerns. These kinds of concerns, called crosscutting concerns, generally remain understated and scattered across multiple use case documents. Crosscutting concerns often produce a number of negative effects in the development of a system. A typical alternative to overcome these problems is to have a requirements analyst review the specifications manually, searching for latent concerns. Since this review is an arduous and error-prone activity, several tools for automating the detection of crosscutting concerns have been developed with some success. However, these mining tools have some problems that can reduce their usefulness. For instance, tools often fail to detect relevant concerns due to their basic analysis of the text, and have poor visualization and editing support for analysts to interpret the results. Also, tools tend to be inflexible in terms of searching techniques and customization capabilities, which hinders their use in other software domains.
In this thesis, we have developed a flexible tool to discover latent crosscutting concerns in use case specifications. To do so, our research focused on the application of semantic analysis techniques to textual requirements. In particular, we have built a prototype tool called REAssistant. This tool was developed as a set of Eclipse plugins and offers flexible text processing features that rely on the UIMA framework. REAssistant is equipped with several implementations of text analysis algorithms. Furthermore, several editors and viewers are available within the Eclipse interface for analysts to refine and visualize the concerns found with the tool. We argue that our tool contributes features for an improved development and integration of techniques for processing and analyzing textual requirements.
We have exercised REAssistant's capabilities by developing two semantic techniques for finding latent concerns. The first technique combines clustering algorithms with measures of semantic relatedness between words. In this way, the technique can identify semantically related behaviors scattered across the use cases, and suggest those behaviors as candidate concerns to an analyst. The second technique combines a querying language with semantic information obtained from the textual use cases, allowing analysts to define and execute concern-specific queries to uncover particular crosscutting concerns. We have evaluated our tool in three case studies, and compared its results against those of a third-party tool and of human analysts. This evaluation produced promising results, making REAssistant a good approach for discovering and analyzing concerns in real-world projects.
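The first technique's core idea can be sketched as follows. This is an illustrative toy, not the thesis implementation: token overlap (Jaccard) stands in for the semantic relatedness measures between words, and the step texts and threshold are invented for the example.

```python
# Toy sketch: group use-case steps with related wording, then surface
# groups that span several use cases as candidate (crosscutting) concerns.
# Jaccard overlap is a stand-in for a semantic relatedness measure.

def jaccard(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def cluster_steps(steps, threshold=0.4):
    """Greedy single-pass clustering of (use_case, step_text) pairs."""
    clusters = []
    for uc, text in steps:
        for cluster in clusters:
            if any(jaccard(text, t) >= threshold for _, t in cluster):
                cluster.append((uc, text))
                break
        else:
            clusters.append([(uc, text)])
    # A cluster spanning more than one use case suggests a scattered,
    # possibly crosscutting, behavior worth showing to the analyst.
    return [c for c in clusters if len({uc for uc, _ in c}) > 1]

steps = [
    ("Login", "the system logs the event"),
    ("Purchase", "the system logs the purchase event"),
    ("Login", "the user enters a name"),
]
candidates = cluster_steps(steps)  # the two "logs the event" steps group
```

With a real relatedness measure (e.g., WordNet-based similarity), steps that use different but related words would also cluster, which is the point of the semantic technique.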

Text Analytics for Discovering Concerns in Requirements Documents

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the XIII Argentine Symposium on Software Engineering (ASSE'12), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'12) | La Plata, Argentina, September 2012 | Pages 185-198 | Publisher: SADIO | ISSN: 1850-2792
image

Abstract

Recent trends in the software engineering community advocate the improvement of textual requirements using (semi-)automated tools. In particular, the detection of incomplete or understated concerns at early development stages holds potential, due to the negative effects of untreated concerns on development. Assistive tools can be of great help for analysts to get a quick picture of the requirements and narrow down the search for latent concerns. In this article, we present a tool called REAssistant that supports the process of discovering concerns in textual specifications. To do so, the tool relies on the UIMA framework and EMF-based technologies to provide an extensible architecture for concern-related analyses. Currently, the tool is configured to process textual use cases by means of a number of text analytics modules that identify lexical, syntactic, and semantic entities in the specifications. We have conducted a preliminary evaluation of our tool in two case studies, obtaining promising results when compared to manual inspections and to another tool.

Uncovering Quality-attribute Concerns in Use Case Specifications via Early Aspect Mining

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Journal Paper Requirements Engineering [JCR® 2013 I.F. 1.147] | Volume 18, Issue 1, March 2013 | Pages 67-84 | Publisher: Springer London | ISSN: 0947-3602 (Print Version) - 1432-010X (Online Version) | DOI: 10.1007/s00766-011-0142-z
image

Abstract

Quality-attribute requirements describe constraints on the development and behavior of a software system, and their satisfaction is key for the success of a software project. Detecting and analyzing quality attributes in early development stages provides insights for system design, reduces risks, and ultimately improves the developers’ understanding of the system. A common problem, however, is that quality-attribute information tends to be understated in requirements specifications and scattered across several documents. Thus, making quality attributes first-class citizens usually becomes a time-consuming task for analysts. Recent developments have made it possible to mine concerns semi-automatically from textual documents. Building on these ideas, we present a semi-automated approach to identify latent quality attributes that works in two stages. First, a mining tool extracts early aspects from use cases, and then these aspects are processed to derive candidate quality attributes. This derivation is based on an ontology of quality-attribute scenarios. We have built a prototype tool called QAMiner to implement our approach. The evaluation of this tool in two case studies from the literature has shown interesting results. As the main contribution, we argue that our approach can help analysts to skim requirements documents and quickly produce a list of potential quality attributes for the system.

Can Quality-attribute Requirements be Identified from Early Aspects? QAMiner: a Preliminary Approach to Quality-attribute Mining

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the XII Argentine Symposium on Software Engineering (ASSE'11), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'11) | Cordoba, Argentina, September 2011 | Pages 192-203 | Publisher: SADIO | ISSN: 1850-2792
image

Abstract

Specifying good software requirements documents is a difficult task. Many software projects fail because of the omission or poor encapsulation of concerns. A practical way to mitigate these problems is to use advanced separation-of-concerns techniques, such as aspect orientation. However, these techniques do not completely address quality attributes. In this work, we present a novel approach to uncover quality-attribute requirements. The identification is performed in an automated fashion, relying on early aspects to guide it and using ontologies to model domain knowledge. Our tool was evaluated on two well-known systems, and its results were contrasted with architectural documents.

Classification of Domain Actions in Software Requirement Specifications

Alejandro Rago
Technical Paper Technical Report TR002-2011 | Tandil, Argentina, October 2011 | UNICEN University - ISISTAN Research Institute
image

Facilitando el Diseño de Software mediante una Mejor Separación de Concerns desde Etapas Tempranas (abstract paper in Spanish)

Alejandro Rago, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the XI Argentine Symposium on Software Engineering (ASSE'10), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'10) | Buenos Aires, Argentina, September 2010 | Page 613 | Publisher: SADIO | ISSN: 1850-2792
image

Early Aspect Identification from Use Cases using NLP and WSD Techniques

Alejandro Rago, Esteban Abait, Claudia Marcos, Andrés Díaz-Pace
Conference Paper In Proceedings of the Workshop on Early Aspects held at the 15th International Conference on Aspect-Oriented Software Development (AOSD’09) | Charlottesville, Virginia, USA, January 2009 | Pages 19–24 | Publisher: ACM New York, NY, USA ©2009 | ISBN: 978-1-60558-456-0 | DOI: 10.1145/1509825.1509830
image

Abstract

In this article, we present a semi-automated approach for identifying candidate early aspects in requirements specifications. This approach aims at improving the precision of the aspect identification process in use cases, and at solving some problems of existing aspect mining techniques caused by the vagueness and ambiguity of natural-language text. To do so, we apply a combination of text analysis techniques such as natural language processing (NLP) and word sense disambiguation (WSD). As a result, our approach is able to generate a graph of candidate concerns that crosscut the use cases, as well as a ranking of these concerns according to their importance. The developer then selects which concerns are relevant for his/her domain. Although some challenges remain, we argue that this approach can be easily integrated into a UML development methodology, leading to improved requirements elicitation.

Técnicas de NLP y WSD Asistiendo Al Desarrollo de Software Orientado a Aspectos (in Spanish)

Alejandro Rago, Claudia Marcos
Conference Paper In Proceedings of the X Argentine Symposium on Artificial Intelligence (ASAI'09), held at Jornadas Argentinas de Computación e Investigación Operativa (JAIIO'09) | Mar del Plata, Argentina, September 2009 | Pages 179-190 | Publisher: SADIO | ISSN: 1850-2784
image

Abstract

Aspect-Oriented Software Development (AOSD) provides systematic means for the identification, modularization, representation, and composition of crosscutting concerns in units called aspects. Aspect-oriented requirements engineering attempts to identify potential aspects in the earliest stages of a system's life cycle. Identifying aspects at this stage improves traceability between requirements and later artifacts, enables simpler change-impact estimation and, in particular, reduces the risk of unexpected changes in software products. This work presents an early aspect identification approach that performs a syntactic and semantic analysis of the system functionality specified by means of use cases. To identify the aspects, NLP (Natural Language Processing) techniques and WSD (Word Sense Disambiguation) algorithms are used.

A Semantic-aware Approach for Identifying Early Aspects (in Spanish)

Alejandro Rago
Report Bachelor Thesis in Systems Engineering | June 2009 | Advisor: Claudia Marcos | UNICEN University
image

Abstract

This work presents a new technique for identifying candidate aspects from requirements specifications. To this end, we analyzed the approaches presented in several publications, focusing on those that support the semi-automated identification of aspects in requirements specifications of both new and legacy systems, and compared them using a set of criteria established to assess the characteristics of each existing approach with respect to the others. With this knowledge, we determined the properties a technique should have in order to address the observed problems using the best-known strategies.
We developed a technique for identifying candidate aspects given a requirements specification expressed as use cases. We defined a process that tackles the detected defects, generally caused by the ambiguity and vagueness of natural-language text. The technique employed strategies such as text analysis with natural language processors, word sense disambiguation algorithms, sentence analysis using syntactic patterns, exploitation of semantic relations to group semantically related words, generation of concern navigation structures to support the identification and filtering of aspects, and ranking of aspects according to their importance.
To validate the development, we evaluated the tool on real case studies. Five system specifications were collected, all modeled with UML. Three of these systems came from internal projects of our faculty, while the remaining two were obtained from systems developed by IBM to showcase its Rational product line. Two other aspect identification techniques were evaluated alongside our proposed technique. The first is a simple yet very effective aspect mining technique. The second is a more elaborate technique that uses more advanced concepts to detect aspects. These case studies were analyzed in detail, explaining the rationale behind each of the aspects identified by the three evaluated techniques.
The results of applying the three techniques to the case studies were summarized using specific metrics, which made it possible to systematically draw the conclusions needed to assess the quality of each technique.

Análisis Semántico para la Identificación de Aspectos (in Spanish)

Alejandro Rago, Claudia Marcos
Conference Paper In Proceedings of Encuentro Chileno de Computación (ECC'08), held at the International Conference of the Sociedad Chilena de Ciencia de la Computación (SCCC'08) | Punta Arenas, Chile, October 2008 | Pages 6.1–6.10 | ISBN: 978-956-319-507-1

Abstract

This work presents a technique for identifying candidate aspects from a requirements specification. The technique was developed with the goal of solving the problems of existing proposals, mainly by improving identification precision. The proposed process defines an automated technique that resolves the detected defects, generally caused by the ambiguity and vagueness of natural-language text. The technique employs strategies such as text analysis with natural language processors, word sense disambiguation algorithms, sentence analysis using syntactic patterns, exploitation of semantic relations to group semantically related words, generation of concern navigation structures (to perform concern identification and filtering), and ranking of concerns according to their importance.

Current Teaching

  • Present 2012

    Advanced Separation of Concerns

    UNICEN University Course for Bachelor in Systems Engineering (Fourth Year & Optional)

    See more at the course webpage here

  • Present 2012

    Agile Software Development

    UNICEN University Course for Bachelor in Systems Engineering (Fourth Year & Optional)

    See more at the course webpage here

  • Present 2011

    Software Development Methodologies

    UNICEN University Course for Bachelor in Systems Engineering (Third Year & Mandatory)

    See more at the course webpage here

Thesis Advising

  • 2016 2014

    Juan Ignacio Vélez & Facundo Ramos

    Thesis for the degree of Bachelor in Systems Engineering | UNICEN University

    "Integración de Técnicas de Procesamiento de Lenguaje Natural a través de Servicios Web"

    Main advisor with Andrés Díaz-Pace

      Download the thesis report here

  • Present 2013

    Marcos Basso & Eduardo Solis

    Thesis for the degree of Bachelor in Systems Engineering | UNICEN University

    "Minería de Decisiones Arquitectónicas a partir de Documentos de Diseño"

    Co-advisor with Andrés Díaz-Pace

      Download the thesis proposal here

  • 2015 2012

    Rodrigo González & German Attanasio

    Thesis for the degree of Bachelor in Systems Engineering | UNICEN University

    "Detección Automática y Análisis de Trazabilidad entre Documentos de Requerimientos y Diseño"

    Main advisor with Claudia Marcos

      Download the thesis proposal here and the thesis report here

  • 2013 2011

    Paula Frade & Miguel Ruival

    Thesis for the degree of Bachelor in Systems Engineering | UNICEN University

    "Refactorización de Casos de Uso"

    Co-advisor with Claudia Marcos

      Download the thesis report here

  • 2011 2010

    Francisco Bertoni & Sebastian Villanueva

    Thesis for the degree of Bachelor in Systems Engineering | UNICEN University

    "Identificación y Trazabilidad de Atributos de Calidad en Requerimientos"

    Consultant (advised by Claudia Marcos and Andrés Díaz-Pace)

      Download the thesis report here

Teaching History

  • 2014 2012

    Advanced Separation of Concerns

    Position: Graduate Teaching Assistant

  • 2014 2012

    Agile Software Development

    Position: Graduate Teaching Assistant

  • 2014 2012

    Architecture-driven Software Development

    Position: Graduate Collaborator

  • 2014 2011

    Software Development Methodologies

    Position: Graduate Teaching Assistant

  • 2012 2009

    Aspect-oriented Software Development

    Position: Graduate Collaborator

Assets

  • Dataset of use case requirements for identifying domain actions

    This zip contains a dataset of sentences excerpted from the requirements of several software systems. There are individual ARFF files for each of the enriched and non-enriched datasets used for training and testing the classifiers.

      Download it here

  • Dataset of architectural diagrams (images) used for recovering model nodes and their relations

    This zip contains a dataset of architectural views encoded as images collected from several software systems. The dataset was used in the paper submitted to JAIIO'14 for evaluating IMEAV (Image Extractor for Architectural Views), a tool capable of identifying the underlying model of architectural views "frozen" in static images and able to persist that model information using a graph-based representation.

      Download it here

  • Empirical evaluation data of REAssistant

    This zip contains files from the CRS and HWS case studies. MSLite files are not included because project details cannot be disclosed due to confidentiality issues.

      Download it here

  • REAssistant distribution (October 2013)

      Eclipse Plugins
      NLP Models

Tools

  • image

    REAssistant

    This is an Eclipse toolset that supports the identification of software concerns in textual requirements specifications, mainly use cases.

    To do its work, REAssistant is based on three pillars:

    • an annotation-based representation of textual use cases,
    • a pipeline of Natural Language Processing (NLP) techniques and domain knowledge (about use cases),
    • concern-specific rules that, when executed on the use case representation, can extract concern-related information.

    In this way, REAssistant aims at exposing both candidate concerns and contextual information (typically, crosscutting relations) that might be overlooked by the analyst.
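The interplay of the three pillars can be sketched roughly as follows. This is a simplified stand-in, not REAssistant's actual code: the real tool builds UIMA annotations through an NLP pipeline, whereas here a toy tagger and a keyword table (both invented for illustration) play those roles.

```python
# Toy sketch of the pillar interplay: (1) annotate use-case text,
# (2) run concern-specific rules over the annotations, (3) report
# candidate concerns. Annotation scheme and rule table are hypothetical.

def annotate(sentence):
    """Toy NLP stage: wrap each token in a small annotation record."""
    annotations = []
    for token in sentence.split():
        # A naive verb heuristic; a real pipeline would use a POS tagger.
        pos = "VERB" if token.lower().endswith("s") else "NOUN"
        annotations.append({"text": token.lower(), "pos": pos})
    return annotations

CONCERN_RULES = {
    # rule name -> keywords whose presence flags a concern-related step
    "security": {"encrypts", "validates", "password"},
}

def match_concerns(annotations):
    """Execute the concern rules over the annotated representation."""
    found = set()
    for rule, keywords in CONCERN_RULES.items():
        if any(a["text"] in keywords for a in annotations):
            found.add(rule)
    return found

anns = annotate("The system validates the password")
print(match_concerns(anns))  # {'security'}
```

In the real tool the rules are richer (they can use syntactic and semantic annotations, not just keywords), but the flow of representation, pipeline, and rules is the same.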

    You can visit the website of the tool at Google Code, where we provide distributable versions of REAssistant as well as its source code.

  • image

    ReqAligner

    This tool is composed of a set of Eclipse plugins that helps analysts spot signs of duplication in use case specifications in an automated fashion and improve their modularity.

    In order to detect duplication effectively, ReqAligner combines several text processing techniques, such as a use-case-aware classifier and a customized sequence alignment algorithm. Essentially, the classifier converts the use cases into an abstract representation consisting of sequences of semantic actions; these sequences are then compared pairwise to identify matching actions, which point to possible duplications.
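The pairwise comparison step can be illustrated with a minimal sketch. This is not ReqAligner's algorithm: a longest-common-subsequence alignment stands in for its customized sequence alignment, and the action labels and threshold are invented for the example.

```python
# Illustrative sketch: use cases abstracted into sequences of semantic
# actions, aligned pairwise; a high ratio of aligned actions flags
# possible duplicated behavior. LCS stands in for the real alignment.

def lcs_actions(a, b):
    """Length of the longest common subsequence of two action sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if a[i] == b[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[m][n]

def duplication_score(a, b):
    """Ratio of aligned actions to the shorter sequence's length."""
    return lcs_actions(a, b) / min(len(a), len(b))

uc_login = ["VALIDATE(credentials)", "LOG(event)", "SHOW(home)"]
uc_purchase = ["VALIDATE(credentials)", "LOG(event)", "CHARGE(card)"]

if duplication_score(uc_login, uc_purchase) >= 0.5:
    print("possible duplicated behavior")  # 2 of 3 actions align
```

A gapped alignment (Needleman-Wunsch style) would additionally show *where* the sequences diverge, which is useful when proposing a refactoring.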

    You can visit the website of the tool at Google Code and see the source code of ReqAligner.

  • image

    QAMiner

    This is an Eclipse toolset that helps analysts to extract potential quality attributes from use cases, based on the early aspects detected by another tool.

    QAMiner works in two stages: the first stage takes a set of use cases and asks another tool, SAET, to generate a list of early aspects with crosscutting relations to the use cases; the second stage processes SAET's output against a predefined quality-attribute ontology in order to derive a list of candidate quality attributes. QAMiner goes through the early aspects and looks for words that match the concepts of a given quality attribute in the ontology. The quality attributes that receive the most matches are output as candidates for the system.
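The second-stage matching can be sketched as a word-count against the ontology. This is a hypothetical illustration: the ontology terms, aspect words, and flat-dictionary encoding below are invented stand-ins for QAMiner's quality-attribute scenario ontology.

```python
# Hypothetical sketch of ontology matching: count how many early-aspect
# words hit each quality attribute's concepts, then rank the attributes.
# Vocabulary invented for illustration; not QAMiner's real ontology.

QA_ONTOLOGY = {
    "security": {"authenticate", "encrypt", "authorize", "password"},
    "performance": {"latency", "throughput", "cache", "response"},
}

def rank_quality_attributes(early_aspect_words):
    """Score each quality attribute by concept matches, highest first."""
    scores = {}
    for qa, concepts in QA_ONTOLOGY.items():
        scores[qa] = len(set(early_aspect_words) & concepts)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

words = ["authenticate", "password", "cache", "validate"]
ranking = rank_quality_attributes(words)
print(ranking)  # security scores 2, performance scores 1
```

The real ontology is structured around quality-attribute scenarios rather than flat keyword sets, but the derivation of candidates by match counts follows this shape.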

    You can visit the website of the tool at Google Code and see the source code of QAMiner.

At the office

You can find me at my office located at ISISTAN Institute in UNICEN University.

I am at my office every day from 9:00 to 18:00, except when I am occupied with teaching activities.

Please send me an email or call me before coming to the office unannounced.

At the faculty

During the first semester, you can find me giving lectures in the classrooms of the Faculty of Ciencias Exactas as well as in the shared spaces of the campus at UNICEN University.

Our courses usually run from 11:00 to 16:00. Contact me by email or by phone to arrange an appointment in advance.

At the lab

During the second semester, you can find me giving lectures and doing practices with students at the laboratories located at the Faculty of Ciencias Exactas and the ISISTAN Research Institute here in UNICEN University.

Our courses usually run from 9:00 to 15:00, but consider calling or emailing in advance to arrange an appointment.