Analysis of Academic Achievement in Higher-Middle Education in Mexico through Data Clustering Methods

In recent years, there has been a growing need to find new ways to analyze and process data from different sources, and data analysis methods are one such way. Given the importance of academic diagnoses, this paper presents an analysis of academic achievement in Language and Communication and in Mathematics among students from autonomous, public and private schools of Higher-Middle Education in Mexico, using data analysis methods. The data analyzed were records of the National Plan for the Evaluation of Learning, which is operated by the National Institute for the Evaluation of Education in coordination with the Secretariat of Public Education of Mexico. A range of achievement levels was observed: Insufficient and Elementary predominated in the evaluated population, while only a small number of students reached the acceptable levels, that is, Satisfactory and Outstanding. This reveals a notable difference between the levels reached by students, which leads some of them to delay or abandon their university studies because they obtain a certificate of completion without having the knowledge necessary to pass university entrance examinations.


Introduction
Nowadays, education is one of the key pillars for the social and economic development of a country. Students who currently attend compulsory education, such as primary, secondary and higher-middle school, will in the future form the labor and economic force of a region and, therefore, of a country [1]. However, obtaining satisfactory results requires quality education, which is achieved through educational systems that play a decisive role in improving educational quality [2]. An educational system may comprise the academic training of teachers, the educational contents found in study plans and programs, and the daily life of schools.
Since academic achievement is an important indicator of the quality of the education that educational systems provide, there is a need to know to what extent students achieve essential learning in different domains at the end of each educational level, in order to diagnose the performance and knowledge achieved by students. In the specific case of Higher-Middle Education, according to [3], some students obtain the completion certificate at the end of their studies without having the minimum knowledge necessary to pass the entrance exams of the country's universities; as a consequence, they delay or abandon their university studies.
Given this, diagnosing the knowledge acquired by school-age students is important because such diagnoses make it possible to articulate strategies to improve the academic level and to ensure homogeneous conditions for students to continue on to a university career [4]. Currently, one of these diagnoses is made through the National Plan for the Evaluation of Learning (PLANEA, for its acronym in Spanish) in Higher-Middle Education, which is operated by the National Institute for the Evaluation of Education (INEE, for its acronym in Spanish) in coordination with the Secretariat of Public Education (SEP, for its acronym in Spanish) of Mexico. The main purpose of PLANEA is to know the extent to which students manage to master a set of essential subjects at different times of their compulsory education [5]. In addition, the results offered by PLANEA aim to improve education through the following actions [4,6]:
• Inform society about the educational level in terms of student learning.
• Provide information of interest to educational authorities for the planning, programming, monitoring, and operation of the education system.
• Offer information to schools to help improve teaching and learning practices.
On the other hand, owing to the growth of data collection and the evolution of computing power, information is now stored in many different sources. This allows historical data to be used to explain the past, understand the present and predict future situations [1]. Consequently, there is an increasing need to find new ways to analyze and process existing data sources to obtain useful information and knowledge. However, the volume of data that these sources reach often makes manual analysis impractical. For this reason, specialized technologies have been developed to process the data and obtain information of interest in support of the decision-making process.
Given these conditions, there is interest in analyzing the results of the PLANEA 2017 evaluation in Higher-Middle Education, since there is a notable difference between the levels of academic achievement reached by students from one federative entity (state) to another in Mexico [7]. The purpose of this study is to identify significant elements and characteristics of academic achievement, in the domains of Language and Communication (reading comprehension) and Mathematics, among students of public and private institutions of the country through data analysis methods. The data patterns obtained could be useful as an information tool for parents, students, teachers, principals, educational authorities and society in general.

Background
In Mexico, Higher-Middle Education has acquired greater responsibilities in both coverage and the quality of the education it imparts to its students, since the relevance of education at that level and the impact that it will have on the development of the country are now evident [8]. In this context, it is important to mention that Mexico seeks to make Higher-Middle Education compulsory and to strengthen the selection procedures for entry to and graduation from that school level [9].
DOI: http://dx.doi.org/10.5772/intechopen.84744
Consequently, it is important to describe the role of PLANEA in Higher-Middle Education, which is designed to offer specific information on the academic achievement of schools and their students. Properly used, this plan is a powerful tool for improving the quality of education.
The evaluation carried out by PLANEA in Higher-Middle Education is aimed at students throughout the Mexican Republic who are in the last school period (semester, year, or any other variant defined by the educational institution) and are enrolled in a campus or educational institution, whether autonomous, state, federal or private. The areas of competence that PLANEA currently evaluates at this level of education are Language and Communication and Mathematics, which have the following characteristics [4,6]:
• Language and Communication. In this domain, the students' abilities to reflect on, interpret, analyze and use written texts are explored through the identification of their structure, functions, and elements, with the purpose of exercising communicative competence and allowing students to participate actively in society.
• Mathematics. In this domain, the students' abilities to identify, apply, synthesize, interpret and evaluate their environment mathematically are explored, making use of their creativity and logical and critical thinking, which allows them to solve different quantitative problems.
In the case of Language and Communication, the indicators associated with reading comprehension competences are measured; therefore, the evaluation topics focus on the processes associated with reading, such as extraction of information, interpretation, and reflection on the nature of language and its use as a tool of logical thinking. Among the indicators evaluated in this domain are [4,7]:
• Identification, ordering, and interpretation of ideas, data, and explicit and implicit concepts in a text, considering the context in which it was generated and in which it is received.
• Evaluation of a text by comparing its content with previous and new knowledge.
• Identification of the normative use of the language, considering the intention and the communicative situation.
• Analysis of a precise, coherent and creative argument.
• Relation of ideas and concepts in coherent and creative compositions, with clear introductions, development and conclusions.
• Sequence evaluation, or logical relationship in the communication process.
• Identification and interpretation of the general idea and possible development of a written message, drawing on previous knowledge and the cultural context.

Education Systems Around the World
For Mathematics, the aim is to encourage the development of creativity and logical-critical thinking in students, on the premise that a student can then better argue and structure their ideas and reasoning. Given the standardization sought in the evaluation process and the use of multiple-choice items, the exercises do not require calculators or specialized formulas. Among the indicators evaluated in this domain are [4,7]:
• Interpretation of mathematical models through the application of arithmetic, algebraic, geometric and variational procedures for the understanding and analysis of real and hypothetical situations.
• Solving mathematical problems, applying different approaches.
• Interpretation of data obtained through mathematical procedures and contrast with established models or real situations.
• Analysis of relationships between two or more variables of a social or natural process to determine their behavior.
• Quantification and mathematical representation of magnitudes of space and of the physical properties of the objects that surround the student.
• Reading of tables, graphs, maps, diagrams, and texts with mathematical and scientific symbols.
The first PLANEA evaluation in Higher-Middle Education was held in March 2015, the second in April 2016, and the third in April 2017. These evaluations were administered to students in the last school year of the higher-middle level in public and private schools of the country. Table 1 shows the number of schools and students that participated in the three editions of the PLANEA evaluation in Higher-Middle Education.
Specifically, the aspects evaluated are aimed at measuring the academic achievement, highlighting the knowledge that a student of the upper-middle level must have to continue their academic life. Therefore, PLANEA constitutes a general diagnosis that can support self-directed intentions, enrollment in extracurricular activities, planning campaigns within schools, and other actions.
In order to guarantee that the PLANEA evaluation is carried out under homogeneous conditions throughout the country and to contribute to the reliability of the results obtained, several measures were implemented to strengthen the procedure for applying the test, such as [10]: (a) training on regulatory and operational aspects; (b) integration of personnel files in order to verify that staff meet the required profile; (c) use of optical reading to obtain fast information about the application and frequent incidents; and (d) use of a digital monitoring system to track the main activities scheduled before, during and after the application, to mention a few.

Materials and methods
To analyze the academic achievement of students of Higher-Middle Education in Mexico, a combined qualitative and quantitative approach was used. For this, data from the National Plan for the Evaluation of Learning, operated by the National Institute for the Evaluation of Education and the Secretariat of Public Education of Mexico, were used. For the analysis of results, variables relevant to the current context of educational evaluation in Mexico were used.

Data source
As a data source, records were used from the National Plan for the Evaluation of Learning database (PLANEA), specifically data from schools of Higher-Middle Education, public, federal and state, and private schools recognized by the Secretariat of Public Education of Mexico.
Access to the version of the data source was made through the institutional PLANEA page (http://planea.sep.gob.mx/ms/base_de_datos_2017). PLANEA's main aim is to know to what extent students manage to master a set of essential learning at different times of their compulsory education [7,11,12], in this case at the end of Higher-Middle Education, in two areas of competence: (a) Language and Communication, and (b) Mathematics.
In 2017, PLANEA used, as an evaluation instrument, an exam consisting of 100 multiple-choice items divided between two competencies: (a) 50 for Language and Communication and (b) 50 for Mathematics. The test is applied in 50-minute sessions distributed over 2 days. It is a diagnostic test, not a selection test for admission to Higher Education institutions. The achievement levels describe the performance that a student in the last school year can obtain as a result in the PLANEA evaluation. These levels represent the tasks and cognitive processes that students should master when they graduate from Higher-Middle Education in the areas of Language and Communication, and Mathematics. These proficiency levels serve not only to identify the academic achievement of students but also to give an overview of the performance of schools in general.

Academic achievement levels
Based on the foregoing, PLANEA groups academic achievement into four levels that provide information about the key learning that students must acquire and the extent to which they have mastered it [13]. These levels go from I to IV in progressive order, that is, the lowest level is I (insufficient) and the highest is IV (outstanding). These levels of academic achievement constitute an important reference for the detailed analysis of the results [4]. The levels are cumulative, that is, students who have acquired the learning of a certain level also have that of the previous levels. For example, those located in Level II (elementary) already have the Level I learning (insufficient); those in Level III (satisfactory) have that of Levels II and I, and so on.
PLANEA in Higher-Middle Education is designed to offer parents, students, teachers, principals, educational authorities and society in general specific information about the academic achievement of schools and, properly used, constitutes an instrument that could contribute to improving the quality of education. The four levels of academic achievement have the following characteristics [7]:
• Level I (Insufficient). The students who are located in this level have insufficient knowledge of the key learning included in the curricular referents. This reflects greater difficulties in continuing their academic career.
• Level II (Elementary). The students who are located in this level have an elementary knowledge of the key learning included in the curricular referents.
• Level III (Satisfactory). The students who are located in this level have a satisfactory knowledge of the key learning included in the curricular referents.
• Level IV (Outstanding). The students who are located in this level have an outstanding knowledge of the key learning included in the curricular referents.
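The ordered, cumulative structure of the four levels can be illustrated with a short sketch. It is not part of PLANEA's own tooling; the names `AchievementLevel`, `levels_attained` and `is_acceptable` are hypothetical and chosen only for illustration. The notion of "acceptable" follows the chapter's grouping of Satisfactory and Outstanding.

```python
from enum import IntEnum


class AchievementLevel(IntEnum):
    """PLANEA achievement levels, ordered from lowest (I) to highest (IV)."""
    INSUFFICIENT = 1   # Level I
    ELEMENTARY = 2     # Level II
    SATISFACTORY = 3   # Level III
    OUTSTANDING = 4    # Level IV


def levels_attained(level):
    """Return the given level plus every lower one, since levels are cumulative."""
    return [lower for lower in AchievementLevel if lower <= level]


def is_acceptable(level):
    """Satisfactory and Outstanding are the levels described as acceptable."""
    return level >= AchievementLevel.SATISFACTORY


if __name__ == "__main__":
    for level in AchievementLevel:
        names = [l.name for l in levels_attained(level)]
        print(level.name, "includes", names, "acceptable:", is_acceptable(level))
```

Using an integer-backed enumeration keeps the progressive order explicit, so comparisons such as "at least Level III" become ordinary numeric comparisons.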

General procedure of PLANEA
For the execution of PLANEA, the National Institute for the Evaluation of Education and the Secretariat of Public Education have the support of the State Evaluation Areas of each federal entity [10,12]. In the first instance, the principals of the educational campuses are notified approximately 8 days in advance so that they can organize the application groups and implement strategies to ensure the participation of students.
The evaluation seeks to minimize any modification to normal school activities. Since the test only applies to students in the last grade, classes and school activities are not suspended for the rest of the students.
A Coordinator-Applicator participates in each school and, together with the external applicator (if applicable), meets with the principal to explain the application logistics in detail. The Coordinator-Applicator transfers the evaluation … On the day of the evaluation, external observers (parents, community leaders and businesspeople, among others) attend to verify that the test is carried out in accordance with established regulations. These observers do not intervene in the evaluation process.
The principals of all the participating schools answer, over the Internet, a context questionnaire intended to obtain information about the characteristics of the school. For their part, the students evaluated also answer a context questionnaire that gathers information about school climate and sociocultural aspects.

Clustering methods
Clustering is the quintessential descriptive analysis task in data mining. It consists of generating 'natural' clusters from the data [14]. A cluster consists of one or more data vectors; in turn, each vector comprises several attributes (variables). The aim of this method is to divide a heterogeneous data set into homogeneous sub-clusters based on the similarities of their records [15]. There are two main types of clustering [16]: (a) hierarchical, which recursively builds a tree-shaped structure, and (b) partitional, which organizes records into k clusters. Partitional methods have advantages when a large amount of data is involved, since constructing a tree is then complex.
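As an illustration of a partitional method, the following is a minimal sketch of k-means written with the Python standard library only. The chapter does not state which partitional algorithm was applied, so the choice of k-means is an assumption, and the function names and the six two-attribute records (percentage of students in Level I for each domain) are hypothetical values invented for the example, not PLANEA data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))


def k_means(records, k, iterations=100, seed=0):
    """Partition records into k clusters; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(records, k)  # initial centroids drawn from the data
    labels = None
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(r, centroids[c]))
            for r in records
        ]
        if new_labels == labels:  # no record changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical records: (% Level I in Language, % Level I in Mathematics)
    schools = [(20.0, 50.0), (22.0, 55.0), (18.0, 52.0),
               (65.0, 85.0), (60.0, 88.0), (68.0, 83.0)]
    labels, centroids = k_means(schools, k=2)
    print(labels)
    print(centroids)
```

Each record is a data vector of attributes, and the algorithm alternates between assigning records to the nearest centroid and recomputing centroids, which avoids building the tree that hierarchical methods require.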

Results and discussion
Derived from the PLANEA data analysis, a data view was obtained. The main consideration was to determine how many and which are the appropriate variables for the study. Table 3 shows the structure of the data view consisting of 17 significant variables and 16,380 records.

Results at national level
As a result of the analysis, it was observed that in Language and Communication (Figure 1), at the national level, one-third of the students (33.9%) who are about to finish Higher-Middle Education were located in Level I (insufficient). About 3 out of every 10 students were located in Level II (elementary, 28.1%) and a similar share in Level III (satisfactory, 28.7%), while only 9 out of 100 students (9.2%) were located in Level IV (outstanding).
Students located in Level I were not able to identify the author's position in opinion articles, essays or critical reviews; nor were they able to explain the information of a simple text in words other than those of the text itself. Students located in Level II were able to identify the main ideas that support the proposal of a brief opinion article, to discriminate and relate timely and reliable information, and to organize it based on a purpose. Students in Level III recognized the purpose of an opinion article, its argumentative connectors and its constituent parts; in addition, they identified the differences between objective information, opinion, and the author's evaluation; they also identified the different ways in which written language is used according to the communicative purpose and used strategies to understand what they read. Level IV students selected and organized pertinent information from an argumentative text, identified the author's position, interpreted information from argumentative texts, such as critical reviews and opinion articles, and inferred the paraphrase of an expository text, such as a popular science article.
In Mathematics (Figure 2), about two out of every three students were placed in Level I (insufficient, 66.2%); approximately 2 out of 10 were located in Level II (elementary, 23.3%); in Level III, only 8 out of every 100 students (8%) achieved a satisfactory level; and in Level IV, fewer than 3 students out of every 100 (2.5%) achieved outstanding proficiency.
Students located in Level I had difficulty performing operations with fractions and operations that combine unknowns or variables (represented by letters), as well as establishing and analyzing relationships between two variables. Students located in Level II expressed, in mathematical language, situations where a value is unknown or the relations of proportionality between two variables, and solved problems that involved proportions between quantities, for example, the calculation of percentages. Students located in Level III used mathematical language to solve problems that required the calculation of unknown values and to analyze situations of proportionality. Those located in Level IV mastered the rules for transforming and operating with mathematical language (for example, the laws of signs); they expressed in mathematical language the relationships that exist between two variables of a situation or phenomenon; and they determined some of their characteristics, for example, deducing the equation of a straight line from its graph.
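The national distributions reported above can be checked and summarized with a few lines of arithmetic. The sketch below uses only the percentages quoted in this section; the `summarize` helper and its key names are hypothetical, and "acceptable" again means Levels III and IV combined.

```python
# National percentages per level (I, II, III, IV) as quoted in the text.
national = {
    "Language and Communication": (33.9, 28.1, 28.7, 9.2),
    "Mathematics": (66.2, 23.3, 8.0, 2.5),
}


def summarize(shares):
    """Total share, share below Satisfactory (I + II), and acceptable share (III + IV)."""
    return {
        "total": round(sum(shares), 1),
        "below_satisfactory": round(shares[0] + shares[1], 1),
        "acceptable": round(shares[2] + shares[3], 1),
    }


if __name__ == "__main__":
    for domain, shares in national.items():
        print(domain, summarize(shares))
```

The Language and Communication shares add up to 99.9 rather than 100 because the published figures are rounded to one decimal place.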

Results at state level
In Language and Communication, the entities with an average score significantly lower than the national average were Chiapas, Tabasco, Guerrero, and Michoacán (Figure 3). The entities with an average score higher than the national average were Mexico City, Aguascalientes, Jalisco, Baja California, Querétaro, Yucatán, Colima, and Nuevo León. Chiapas was the entity with the highest percentage of students in Level I (66.1%), while Mexico City was the entity with the lowest percentage of students in Level I (17.8%). Likewise, Mexico City had the highest percentage of students in Level IV (15.9%). There is a significant difference between the highest score (Mexico City) and the lowest score (Chiapas). The states with the highest percentages of students in Level IV, aside from Mexico City, were Nuevo León, Yucatán, Jalisco, Baja California, Querétaro, Aguascalientes, Colima, Hidalgo, Sonora and Puebla.
In Mathematics (Figure 4), the entities with an average score lower than the national average were Chiapas, Tabasco, Guerrero, Michoacán, and Tamaulipas. The entities with an average score significantly higher than the national average were Aguascalientes, Jalisco, Querétaro, Baja California, Colima, and Nuevo León. There is a significant difference between the highest score (Aguascalientes) and the lowest score (Chiapas). The entity with the highest percentage of students in Level I was Chiapas (85.6%), while Aguascalientes was the state with the lowest percentage of students in Level I (53.3%). When comparing the Mathematics results of all the states, Nuevo León had the highest percentage of students in Level IV (5.1%), followed by Colima (4.3%) and Coahuila (4%); the rest were below these percentages.
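To make the size of the state-level differences concrete, the gap in Level I shares between the extreme entities can be expressed in percentage points. The sketch uses only the four figures quoted above; the dictionary layout and the `level_one_gap` helper are hypothetical.

```python
# Highest and lowest Level I shares by domain, as quoted in the text.
level_one_extremes = {
    "Language and Communication": {"Chiapas": 66.1, "Mexico City": 17.8},
    "Mathematics": {"Chiapas": 85.6, "Aguascalientes": 53.3},
}


def level_one_gap(shares_by_state):
    """Difference, in percentage points, between the largest and smallest share."""
    values = shares_by_state.values()
    return round(max(values) - min(values), 1)


if __name__ == "__main__":
    for domain, shares in level_one_extremes.items():
        print(domain, level_one_gap(shares), "percentage points")
```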
The differences in scores observed from one entity to another could be linked to the heterogeneity of the service that educational institutions provide. Similarly, there may also be differences between students who attend the same type of school. On the one hand, there are public institutions, which serve an important segment of Higher-Middle Education; on the other hand, there are private institutions, which serve the population that was not accepted in public institutions or chose this type of education, some with periodic payments and others with high costs that, in turn, usually provide better conditions in their educational offer. In addition, the differences in academic achievement may be conditioned by the students' socioeconomic level, because it is a source of accumulated educational opportunities.

Results by control type
In Language and Communication (Figure 5a), students of autonomous schools performed better than students from federal, private and state institutions: only 20.4% of them were placed in Level I, in contrast to 28.2%, 27.5% and 41.9%, respectively. The highest percentage of students in the highest (outstanding) achievement level corresponds to autonomous institutions (17.4%), followed by private schools (16.1%), federal schools (9.2%) and finally state schools (4.8%). In Mathematics (Figure 5b), students of state schools had the lowest performance of all types of administrative control, with 73% in Level I (insufficient). At the other extreme, in the highest level (IV, outstanding), no type of educational institution exceeds 6%; private (5.1%) and autonomous (4.8%) schools were the best performers. This situation highlights the low educational levels in Mathematics of students in Higher-Middle Education.
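The ordering of control types described above can be reproduced by sorting the quoted Language and Communication percentages. This is a sketch on the four pairs of figures given in the text; the dictionary keys and the `rank_by` helper are hypothetical names.

```python
# Language and Communication shares by type of administrative control,
# as quoted in the text (percent of students in Level I and Level IV).
language_by_control = {
    "autonomous": {"level_I": 20.4, "level_IV": 17.4},
    "federal": {"level_I": 28.2, "level_IV": 9.2},
    "private": {"level_I": 27.5, "level_IV": 16.1},
    "state": {"level_I": 41.9, "level_IV": 4.8},
}


def rank_by(data, key, descending=False):
    """Return control types sorted by the chosen percentage."""
    return sorted(data, key=lambda control: data[control][key], reverse=descending)


if __name__ == "__main__":
    print("Smallest Level I share first:", rank_by(language_by_control, "level_I"))
    print("Largest Level IV share first:",
          rank_by(language_by_control, "level_IV", descending=True))
```

Both rankings place autonomous schools first and state schools last, while private and federal schools occupy the middle positions.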

Conclusions
Education is one of the key pillars for the social and economic development of a country. Therefore, in order to obtain satisfactory results, a quality education is needed; which is achieved through educational systems that have a decisive role in the improvement of educational quality.
Academic achievement analysis offers timely information that could be useful for identifying the successes and challenges found in the learning of the contents of the areas evaluated, thus contributing to the improvement of the educational system.
The results of PLANEA 2017 provide an overview of the levels of academic achievement in Language and Communication and Mathematics. The results indicate that there is inequity among the students who attend the different educational centers. If a periodic follow-up is carried out, it will be possible to know whether the gaps are narrowing. The results confirm the low educational levels, at the national level, of the students of Higher-Middle Education of the National Educational System. In Language and Communication, 34% of students were located in Level I (the lowest), and in Mathematics, 66% were located in Level I. These students have not consolidated the key learning that was evaluated in the PLANEA 2017 test, such as inferring implicit content in different types of text or making inferences from a mathematical model. At the other extreme, in Language and Communication, only 9% of students are in Level IV (the highest), and in Mathematics only 3%.
Results of academic achievement reflect various social, cultural and economic factors, from students' school habits, attitudes, and values to the conditions of educational institutions and the socioeconomic context in which students live, as well as the diversity of educational institutions within the National Education System. Therefore, the improvement of educational achievement requires differentiated attention in each entity, type of service and type of administrative control.
Undoubtedly, the academic achievement of students of Higher-Middle Education is linked to the results of previous educational levels. To address this situation, it is necessary to reduce knowledge gaps, opportunities and general conditions of teaching and learning, from the beginning of compulsory education.
In this sense, the results show that there is a huge challenge ahead for Higher-Middle Education. This challenge implies that coordinated efforts of many actors from previous educational levels are also required so that all students can fully exercise their right to receive a quality education.
In the case of Language and Communication, one of the initiatives may be to promote the reading of different kinds of texts and their critical analysis. Support from other subjects in carrying out similar activities, including choosing articles, stories or books that students propose, helps to exercise cognitive processes that will be refined to develop reading competence.
For Mathematics, in addition to emphasizing the practice of exercises and activities, it is advisable to multiply the occasions on which the student must solve contextualized problems of progressively greater difficulty. The complexity of the exercises depends on the number of variables that need to be considered and the type of language needed to represent the situations. These aspects are enhanced when the contents are taught through problems in everyday scenarios, in contrast to direct situations or exercises that merely involve carrying out operations.
© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.