Universidad de Talca

Search Results

Now showing 1 - 2 of 2
  • Item
    Knowledge Discovery for Higher Education Student Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile
    Authors: Palacios, Carlos A.; Reyes Suarez, José A.; Bearzotti, Lorena A.; Leiva, Victor; Marchant, Carolina
    Data mining, which is closely related to knowledge discovery in databases and data science, is employed to extract useful information and detect patterns in often large data sets. In this investigation, we formulate models based on machine learning algorithms to extract relevant information for predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. We then use this information to support the process of knowledge discovery. We predict student retention at each of three levels, corresponding to the first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level at which dropout occurs. The machine learning algorithms used in this work include decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which random forest performs best. We find that the secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with conditions similar to the Chilean case, where dropout rates affect the efficiency of such institutions. The ability to predict dropout from students' data enables these institutions to take preventive measures and avoid dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.
  • Item
    YARS-PG: Property Graphs Representation for Publication and Exchange
    Authors: Szeremeta, Lukasz; Tomaszuk, Dominik; Angles, Renzo
    Graph serialization is a critical aspect of advancing graph-oriented systems and applications. Despite the importance of standardized serialization for property graphs, there is a lack of a universal format encompassing all essential features of graph database systems. This study introduces YARS-PG, a simple, extensible, and platform-independent serialization format tailored for property graphs. YARS-PG supports all the features permitted by current property graph-based database systems and is compatible with other graph-oriented databases and tools. We delineate the design requirements of YARS-PG by detailing both functional and non-functional aspects. Besides the basic features of property graph data, YARS-PG supports schema definition, metadata, metaproperties, variables, and graph definitions. Moreover, we discuss extensions of YARS-PG, demonstrating its flexibility through canonicalization techniques. Our comparative analyses with existing formats provide valuable insights, emphasizing the unique strengths that distinguish YARS-PG in the realm of graph data interchange. This paper serves as a definitive guide to YARS-PG, unraveling its complexities and showcasing its potential as a communication protocol, a data storage format, and a messaging specification.