Economic growth and tourism performance in Latin America and the Caribbean: a comparative analysis by clustering techniques and causality tests

Given the growing interest in the dynamics of tourism and its implications for economic growth, this paper explores the case of Latin America and the Caribbean. Drawing on the concept of economic regimes and on time series symbolization, we perform a cluster analysis and then estimate the causal relationship between economic growth and tourism performance for 22 countries of Latin America and the Caribbean over the period 1995-2015. Results show the existence of two clusters of countries with similar dynamic behaviour within each cluster. The estimations show a positive, unidirectional causal relationship from economic growth to the tourism industry in the low-performing tourism cluster, and a positive, bidirectional causal relationship in the high-performing one.
JEL codes: C33; C36; Z32.


INTRODUCTION
In recent years, academic research has shown growing interest in the tourism industry, especially in the joint dynamics of tourism and economic growth. In 2002, the study by Balaguer and Cantavella formalized the idea of tourism as a driver of growth (the Tourism-Led Growth Hypothesis), which emerged as an extension of the well-known Export-Led Growth Hypothesis. Following these contributions, the volume of research on this industry grew, seeking to test the validity of this theory empirically, either through individual case studies or through studies of regions or groups of countries.
When a study focuses on groups of countries, panel data techniques are typically applied. These techniques assume homogeneous behaviour, or similar dynamics, across the panel units. This assumption might not be appropriate, since a common dynamic does not necessarily hold in all the countries under analysis. In Latin America and the Caribbean, countries seem to follow different dynamics, not only between them but also within them, because of their continuous development process and their attempts to improve economic and social conditions. Considering a single panel that includes all countries might therefore not be the best strategy.
This paper uses panel data for Latin American and Caribbean countries to study the dynamic relationship between tourism industry performance and economic growth. Beforehand, and taking into account the dissimilar behaviour of the countries, we carry out a cluster analysis in order to obtain a better fit of the panel data and more reliable subsequent estimates. The cluster analysis matters because of the many fluctuations that characterise this region, which make it implausible to explain the behaviour of these countries with a single panel structure. Clustering techniques therefore offer a better way of capturing the relationship between tourism and economic growth, at least in the short term.
When measuring tourism industry performance, absolute levels (for example, the number of visitors to each country) might not be the best choice, because such indicators reflect country size in terms of area and population. As an example, based on World Bank data, in 2015 Brazil received more than double the number of tourists received by Uruguay (6,306,000 tourist arrivals in Brazil against 2,773,000 in Uruguay). However, when population is taken into account, the picture changes: Uruguay received more than 27 times as many arrivals per inhabitant as Brazil (0.82 arrivals per inhabitant in Uruguay against 0.03 in Brazil). So, when considering countries of different sizes, it is appropriate to use an indicator that better captures the performance of the industry.
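The contrast between absolute and per-inhabitant arrivals can be checked with a few lines of arithmetic. The sketch below only uses the four figures quoted in the text for Brazil and Uruguay; the variable names are illustrative.

```python
# Figures quoted in the text (World Bank data, 2015).
arrivals = {"Brazil": 6_306_000, "Uruguay": 2_773_000}
arrivals_per_inhabitant = {"Brazil": 0.03, "Uruguay": 0.82}

# Absolute arrivals favour the larger country ...
absolute_ratio = arrivals["Brazil"] / arrivals["Uruguay"]

# ... while arrivals per inhabitant favour the smaller one.
per_capita_ratio = (
    arrivals_per_inhabitant["Uruguay"] / arrivals_per_inhabitant["Brazil"]
)

print(f"Brazil / Uruguay, absolute arrivals: {absolute_ratio:.2f}")
print(f"Uruguay / Brazil, arrivals per inhabitant: {per_capita_ratio:.1f}")
```

The ranking of the two countries reverses once arrivals are scaled by population, which is the reason for working with a per-inhabitant indicator.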
In what follows, Section 2 presents a brief literature review. Section 3 explains the data sources, the methodological strategy, and the empirical results obtained from the economic regimes approach, the time series symbolization, and the causality analysis. Finally, Section 4 draws some conclusions and offers final comments.

LITERATURE REVIEW
Since 2002, many articles have been published with the aim of testing the Tourism-Led Growth Hypothesis proposed by Balaguer & Cantavella. These applications cover individual country case studies as well as groups of countries, and use different data sources and methodologies. Some of them are Castro-Nuño et al., 2013; Jiménez García et al., 2015; Brida et al., 2016; Brida et al., 2017; Seetanah et al., 2017; Chingarande & Saayman, 2018; Li et al., 2018; Comerio & Strozzi, 2019; Fonseca & Sánchez-Rivero, 2020. Among those applied to Latin American and Caribbean countries, some found a unidirectional relationship from tourism to economic growth. Many of them study a specific country and use techniques such as Granger causality and cointegration tests based on the Vector Error Correction Model (VECM) (Ramirez, 2006; Brida et al., 2008; Brida et al., 2008; Croes & Vanegas, 2008; Brida & Risso, 2009; Brida et al., 2009; Brida & Monterubbianesi, 2010; Jackman & Lorde, 2010; Brida et al., 2010; Brida et al., 2011; Rosero-Barzola & Zúñiga-Contreras, 2014). The indicator used to measure tourism differs across studies: some use the level of international arrivals, while others use tourist expenditure or tourism receipts. However, their results tend to coincide: a causal relationship from tourism to economic growth, which supports the Tourism-Led Growth Hypothesis.
In addition, another group of these studies found that the relationship is bidirectional (Lorde et al., 2011; Amaghionyeodiwe, 2012; Jackman, 2012; Ridderstaat et al., 2013; Cruz-Chavez et al., 2016). That is, not only does tourism performance cause economic growth, but the latter also causes the former. The measures of tourism and economic growth and the methodology applied in these cases do not differ from those described above.
Also, since this study is applied to the Latin American and Caribbean context, we treat the group of countries as a panel, which calls for specific panel data methodology. In the literature on this region (both studies that use a panel of this region exclusively and studies whose panel includes it), we again find that some studies support the Tourism-Led Growth Hypothesis while others find bidirectional relationships between tourism and economic growth. However, the methodology used in these cases is more heterogeneous. Apergis & Payne (2012), Risso (2018), and Lee & Chang (2018) are the only ones that use the conventional Granger causality test, although Lee & Chang, as well as Mitra (2009), also consider a causality test that allows for heterogeneous panels (the Dumitrescu & Hurlin causality test). Some of the most recent papers introduce a non-linear and non-parametric approach to the dynamic relationship between tourism and economic growth (Brida et al., 2015; Brida et al., 2016; Kumar & Stauvermann, 2016; Wu et al., 2016; Chiang et al., 2017; Gül & Özer, 2018; Karimi, 2018; Bella, 2018; Zhang & Cheng, 2019; Eyuboglu & Eyuboglu, 2019). Finally, other studies use other techniques, such as the Generalized Method of Moments, Dynamic OLS, and the Arellano & Bond estimator (Gardella & Aguayo, 2002; Eugenio-Martín & Martín-Morales, 2004; Muslija et al., 2017).
Within this framework, our study focuses on the case of Latin America and the Caribbean: a developing region with many macroeconomic fluctuations. We expect to find some causal relationships between tourism and economic growth, which could inform strategies and economic policy aimed at boosting economic growth, the tourism industry, and economic development. However, given the region's characteristic fluctuations and the panel data approach, it would not be surprising if these relationships were less clear than those reported in the reference literature for developed regions or individual case studies.

Data
To analyse the dynamics between tourism and economic growth, we consider data from 22 Latin American and Caribbean countries over 21 years. The indicators used are international tourism arrivals per inhabitant and the growth rate of Gross Domestic Product per inhabitant. For the tourism measure, we use time series from the World Bank's development indicators; to measure economic growth, we use data from the Maddison Project Database (2018). In sum, we have data from 1995 to 2015 for: Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominica, Ecuador, Guatemala, Haiti, Honduras, Jamaica, Mexico, Nicaragua, Panama, Paraguay, Peru, Dominican Republic, Saint Lucia, Trinidad and Tobago, Uruguay, and Venezuela.

Time series symbolization
Drawing on Daw et al. (2003), we propose turning the two-dimensional time series into a one-dimensional symbolic time series. This methodology, developed below, is motivated by the fact that, when working with time series of more than one dimension, it is often unknown whether any relationship exists between the measurement units of the variables, and in many cases none exists. This prevents the use of Euclidean or similar metrics, since these presuppose that the axes share the same measurement units.
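To this end, we adopt the concept of economic regimes: from the annual means of tourism ($\bar{T}$) and economic growth ($\bar{G}$) across all countries, we obtain four regimes. Each regime represents the qualitative state of a country's joint performance in tourism and economic growth in a given year. Denoting by $(T, G)$ the pair of tourism and growth values of a country in a given year, the regimes are defined as:

$$
\begin{aligned}
R_1 &= \{(T, G) : T \le \bar{T},\ G \le \bar{G}\} && \text{Low tourism and low economic growth} \\
R_2 &= \{(T, G) : T \le \bar{T},\ G > \bar{G}\} && \text{Low tourism and high economic growth} \\
R_3 &= \{(T, G) : T > \bar{T},\ G > \bar{G}\} && \text{High tourism and high economic growth} \\
R_4 &= \{(T, G) : T > \bar{T},\ G \le \bar{G}\} && \text{High tourism and low economic growth}
\end{aligned}
$$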
Having defined the spatial partition, we convert the two-dimensional time series into a one-dimensional symbolic time series as follows: each country is assigned a symbol (1, 2, 3, or 4) for each year, depending on the region in which it is located; that is, a country receives symbol j in a given year if it belongs to the j-th regime in that year. This generates a new time series that takes values from 1 to 4 and represents the qualitative dynamics of each country. Without losing interpretative quality, we are then in a position to measure the distances between countries over the period and to cluster the most homogeneous ones.
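The symbolization step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the toy panel are hypothetical, the regime numbering follows the order in which the regimes are listed above, the thresholds are assumed to be cross-country means computed year by year, and observations equal to a mean are assumed to fall on the "low" side.

```python
def regime(tourism, growth, mean_tourism, mean_growth):
    """Return the regime symbol (1-4) for one country-year observation."""
    low_tourism = tourism <= mean_tourism  # ties count as "low" (assumption)
    low_growth = growth <= mean_growth
    if low_tourism and low_growth:
        return 1  # low tourism, low growth
    if low_tourism:
        return 2  # low tourism, high growth
    if not low_growth:
        return 3  # high tourism, high growth
    return 4      # high tourism, low growth


def symbolize(panel):
    """Map {country: [(tourism, growth), ...]} to {country: [symbol, ...]}.

    Thresholds are the cross-country means of each year (assumption).
    """
    countries = list(panel)
    n_years = len(panel[countries[0]])
    symbols = {c: [] for c in countries}
    for t in range(n_years):
        mean_t = sum(panel[c][t][0] for c in countries) / len(countries)
        mean_g = sum(panel[c][t][1] for c in countries) / len(countries)
        for c in countries:
            tourism, growth = panel[c][t]
            symbols[c].append(regime(tourism, growth, mean_t, mean_g))
    return symbols


# Hypothetical two-year panel: (arrivals per inhabitant, growth rate in %).
toy_panel = {
    "A": [(0.05, 1.0), (0.06, 3.0)],
    "B": [(0.80, 4.0), (0.90, 1.0)],
    "C": [(0.10, 4.5), (0.07, 0.5)],
}
toy_symbols = symbolize(toy_panel)
print(toy_symbols)
```

Each country ends up with one symbol per year, so the two original series collapse into a single sequence that can be compared across countries without relying on the units of either variable.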

Cluster analysis
For this analysis, a concept of dynamic distance between countries is incorporated (Tang et al., 1997; Molgedey & Ebeling, 2000; Piccardi, 2004). Once the symbolic time series are obtained, a binary symbolic distance d(i, j) between countries i and j is defined by comparing their symbols year by year. According to this metric, lower distances indicate greater similarity between the symbolic time series, that is, more similar dynamic behaviour between the countries. Then, following the Nearest Neighbour Search (NNS) criterion (Mantegna, 1999; Mantegna & Stanley, 2000), an algorithm is iterated that ends in a configuration of clusters with similar dynamic behaviour. Proceeding agglomeratively, we begin with 22 clusters, where each country is its own cluster. In the first step, the countries with the minimum distance d(i, j) are grouped. Having obtained a first group, the algorithm is repeated on the clusters obtained in the previous step.
Figure 1 illustrates the hierarchical tree associated with this problem. The figure shows all countries and the distances at which they are grouped. There is a first cluster (C1) in which the countries show economic growth that fluctuates around its mean, but with low performance in tourism. These are, in general, comparatively large and populous countries where the tourism industry is not sufficiently developed. In addition, seven countries belong to the second cluster (C2), which has a high-performing tourism industry in terms of international arrivals per inhabitant. These are, in general, less populous countries with smaller areas, usually with many attractions and an economic policy focused on tourism (most of them are Caribbean islands). Note that Panama does not belong to any cluster, because its behaviour during the period is not similar to that of the other countries; its dynamics cannot be matched with those observed in either cluster.
The two clusters display distinct behaviour. On the one hand, there is a low-performing cluster in terms of tourism, whose countries receive on average 8 tourists per 100 inhabitants. In contrast, the other cluster performs well in tourism, receiving on average 73 tourists per 100 inhabitants. So, there is evidence of different behaviour in terms of tourism among Latin American and Caribbean countries. Thus, treating the two groups separately when estimating the relationship between tourism and growth seems to be a good alternative.
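The distance and the nearest-neighbour grouping described in this section can be sketched as follows. The exact form of the distance is not reproduced here, so the sketch assumes a simple count of the years in which two symbolic series differ; the `threshold` cut-off, the function names, and the toy series are likewise hypothetical and stand in for reading the clusters off the hierarchical tree.

```python
def symbolic_distance(s_i, s_j):
    """Number of years in which two symbolic series differ (assumed form)."""
    return sum(a != b for a, b in zip(s_i, s_j))


def single_linkage(series, threshold):
    """Agglomerative nearest-neighbour clustering.

    Starts with one cluster per country and repeatedly merges the two clusters
    whose closest members are nearest, stopping once that distance exceeds
    `threshold` (hypothetical cut-off, not taken from the paper).
    """
    clusters = [{name} for name in series]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    symbolic_distance(series[x], series[y])
                    for x in clusters[a]
                    for y in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] |= clusters[b]  # merge the nearest pair of clusters
        del clusters[b]
    return clusters


# Hypothetical symbolic series for five countries over five years.
toy = {
    "A": [1, 1, 2, 1, 2],
    "B": [1, 1, 2, 2, 2],
    "C": [3, 4, 3, 3, 4],
    "D": [3, 4, 3, 4, 4],
    "E": [2, 3, 1, 4, 1],
}
groups = single_linkage(toy, threshold=1)
print(groups)
```

In this toy example two pairs of countries merge, while the fifth stays on its own because its series is far from every other one, which mirrors the way a country can remain outside both clusters.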

Econometric estimations
For the quantitative application of the relationship between tourism and economic growth, we considered the international arrivals per inhabitant as an indicator of tourism performance and the real GDP per inhabitant as a measure of the economic activity. Additionally, we considered the real effective exchange rate as a control or explanatory variable, based on its links with the considered variables and the relevant literature on this subject. This new indicator is obtained from CEPALSTAT. However, the availability of complete data in terms of countries and years is a limitation of this study. Therefore, to get a balanced panel structure we obtained a subsample of countries from each cluster based on the availability of data for this last indicator. Thus, the following countries were considered for Cluster 1: Bolivia, Brazil, Chile, Colombia, Ecuador, Guatemala, Honduras, Mexico, Nicaragua, Paraguay, and Peru; and for Cluster 2: Costa Rica, Dominica, Jamaica, Dominican Republic, Trinidad and Tobago, and Uruguay.
For the study of the causal relationship and its econometric estimation, the two clusters obtained are considered separately (Panama is excluded because it is treated as an outlier). First, the order of integration of each series is determined. To this end, the presence of unit roots is examined with a set of tests that arise from different methodologies and assumptions (Levin-Lin-Chu, Breitung, and Im-Pesaran-Shin). Results are shown in the appendix.
For both groups, the series of international arrivals per inhabitant, the real exchange rate, and GDP are found to be integrated of order one. Although one of the tests gives evidence that the process is stationary, the same cannot be concluded from all the tests applied. Additionally, the Im-Pesaran-Shin test assesses the unit root processes individually for each unit in the panel and is therefore more specific in its conclusion. Thus, the processes are I(1), with a stationary first difference.
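Since the series are treated as I(1), the analysis works with their first differences. A minimal sketch of the transformation, assuming it is applied to the natural logarithm of each series (the helper name and the sample values are hypothetical), is:

```python
import math


def log_diff(series):
    """First difference of the natural logarithm: ln(x_t) - ln(x_{t-1})."""
    return [
        math.log(curr) - math.log(prev)
        for prev, curr in zip(series, series[1:])
    ]


# Hypothetical series in levels (for example, real GDP per inhabitant).
levels = [100.0, 102.0, 105.06, 104.0]
growth = log_diff(levels)
print([round(g, 4) for g in growth])
```

Each differenced value approximates the period-to-period growth rate, and the transformed series has one observation fewer than the original.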
Having determined the order of integration of the processes studied, and their stationary transformation where they are non-stationary, a central question concerns the existing short- and long-term relationships. To this end, causal relationships are studied by two methods. On the one hand, Granger causality in its most traditional sense considers a single process underlying the available panel data; in particular, it assumes that the coefficients are equal across all units in the panel (here two panels are studied: C1 and C2). On the other hand, Dumitrescu-Hurlin causality extends this methodology and makes it more flexible, allowing the parametric structure of the panel units (the countries, in this case) to differ, given a pair of stationary processes. Table 1 shows the pairwise causal relationships between GDP, international arrivals per inhabitant, and the real effective exchange rate (all variables in first differences of their logarithms). From these causality relationships, two main conclusions are drawn:
• Considering Granger causality in its traditional sense, where a homogeneous parametric structure is assumed across the panel units, no causal links are observed between tourism and economic activity. However, the assumption that the associated parameters are the same for all countries does not seem plausible in this exercise, where countries with different dynamic behaviour are considered. Therefore, this restriction is relaxed and the causal relationships are examined with the method proposed by Dumitrescu and Hurlin (2012), where the parametric structure is free to vary across the units of the panel, in this case the countries of each group.
• From the Dumitrescu and Hurlin causality test, it is found that for C1 there is a unidirectional causal relationship from the level of activity to the tourism sector, while for C2 this relationship is bidirectional.
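To illustrate how the Dumitrescu-Hurlin approach lets coefficients differ across countries, the sketch below computes its standardised statistic from country-level Wald statistics: the individual statistics are averaged and then standardised as Z-bar = sqrt(N / (2K)) (W-bar - K), where N is the number of countries and K the lag order. The Wald values, the lag order, and the function names are hypothetical, the two-sided normal p-value is a simplifying choice, and the finite-sample version of the statistic is not shown.

```python
import math


def dh_zbar(wald_stats, lag_order):
    """Standardised Dumitrescu-Hurlin statistic from individual Wald statistics.

    W-bar is the cross-section average of the N country-level Wald statistics;
    Z-bar = sqrt(N / (2K)) * (W-bar - K), where K is the lag order.
    """
    n = len(wald_stats)
    w_bar = sum(wald_stats) / n
    z_bar = math.sqrt(n / (2 * lag_order)) * (w_bar - lag_order)
    return w_bar, z_bar


def two_sided_p_value(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative country-level Wald statistics (not taken from the paper).
hypothetical_wald = [2.8, 0.9, 3.5, 1.7, 4.1, 2.2]
w_bar, z_bar = dh_zbar(hypothetical_wald, lag_order=1)
p_value = two_sided_p_value(z_bar)
print(f"W-bar = {w_bar:.3f}, Z-bar = {z_bar:.3f}, p-value = {p_value:.4f}")
```

Because each Wald statistic comes from a separate country-level regression, the null of no causality can be rejected even when the causal link is present in only some countries, which is what distinguishes this test from the homogeneous Granger version.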
Given the causal relationships found, it is of interest to examine their estimated magnitude, to assess whether it differs between the groups and which group's tourism benefits more from GDP expansions. For the estimation, we apply the dynamic panel estimator of Arellano and Bond (1991), which estimates a linear model by the generalized method of moments. This overcomes the endogeneity between the dynamic component and the stochastic error term that affects classic panel estimators (OLS, fixed effects, random effects), and yields estimates that are robust to heteroscedasticity. Models of this type are proposed for estimation, and Table 2 summarizes the estimates obtained by applying the method to the data for both clusters. For C1 we observe that economic activity has a positive effect on arrivals per inhabitant: given an increase of 1% in real GDP per capita, arrivals per capita increase by 0.52%, a result that is significant at the 95% confidence level. In the case of C2, the same estimated effect on average arrivals per inhabitant is 0.30%, somewhat lower than that obtained for C1. One possible interpretation is that in more developed countries with better tourism performance, improvements in the level of activity do increase arrivals, but the increase is more attenuated than in countries with a lagging tourism industry, where the scope for receiving more arrivals and for improvements that strengthen the industry is greater. In sum, there is a causal relationship between tourism and the level of activity in the economies of Latin America and the Caribbean, in which growth is part of the process that determines arrivals per inhabitant.
However, the magnitude of this effect differs depending on the cluster of countries considered, and it favours the development of the sector in the countries with poor tourism performance. For the first cluster, an increase of 1% in real GDP has a short-term upward impact of 0.52% on average arrivals. For C2 the relationship is bidirectional: an increase of 1% in the level of activity increases average arrivals per inhabitant by 0.31%, while, in the other direction, an increase of 1% in tourist arrivals per capita generates an increase in real GDP of 0.40%. Thus, while countries with a more developed tourism industry generate feedback between tourism and growth, creating favourable dynamics for their development, those that lag furthest behind in tourism need to grow economically in order to allocate resources to tourism and improve their performance in international arrivals.
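The reading of these coefficients as percentage responses can be made explicit. The sketch below treats each estimate as a short-run elasticity from a model in logarithms, as the text does, and computes the exact response to a 1% change; the function name is hypothetical and the coefficients are the ones quoted above.

```python
def pct_response(elasticity, pct_change):
    """Exact percentage response implied by a log-log coefficient."""
    return ((1 + pct_change / 100) ** elasticity - 1) * 100


c1_arrivals = pct_response(0.52, 1.0)  # GDP -> arrivals per inhabitant, C1
c2_arrivals = pct_response(0.31, 1.0)  # GDP -> arrivals per inhabitant, C2
c2_gdp = pct_response(0.40, 1.0)       # arrivals per inhabitant -> GDP, C2

print(f"C1 arrivals: {c1_arrivals:.3f}%")
print(f"C2 arrivals: {c2_arrivals:.3f}%")
print(f"C2 GDP:      {c2_gdp:.3f}%")
```

For a change as small as 1%, the exact response is almost identical to the coefficient itself, which is why the estimates can be read directly as percentages.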
Having estimated by the Arellano-Bond method the short-term causal effects in the relationship between economic growth and tourism performance, it is necessary to test for the absence of autocorrelation in the regression residuals. At the 99% confidence level, the hypothesis of no serial correlation is not rejected.
Thus, the estimates reveal short-term relationships between tourism and growth. These relationships are positive, but they differ between the two clusters of countries not only in magnitude but also in the direction in which they operate. There is therefore evidence in favour of not placing all Latin American and Caribbean countries in the same panel, and of forming groupings that allow a better determination of the existing relationships between tourism and economic growth, at least in the short term.

CONCLUDING REMARKS
In recent years, there has been growing interest in the study of tourism because of the economic flows it generates in the goods and services market, the capital market, and the labour market. In particular, value generation in tourism can influence the level of activity of economies. In 2002, Balaguer and Cantavella formalized this idea as the Tourism-Led Growth Hypothesis (TLGH). Since then, various authors have studied the dynamic link between tourism and economic growth, measuring these variables with different indicators. In this study, international arrivals per capita and economic growth, measured as real product per capita, were used for 22 countries in Latin America and the Caribbean between 1995 and 2015.
Latin America and the Caribbean is made up of developing countries, which implies highly fluctuating dynamic behaviour over time, both within each country and among countries. This casts doubt on how plausible it is to consider all the countries of Latin America and the Caribbean jointly in a single panel that describes a common behaviour. For this reason, this work developed a methodology for symbolizing time series that generates one-dimensional series, to which clustering techniques are applied in order to find clusters of countries with "homogeneous" dynamic behaviour. We found two clusters of countries that do not seem to converge in their dynamics: a first group (C1), with a lagging tourism sector, made up of most Latin American countries; and a second group (C2), formed mostly by Caribbean island nations together with countries such as Costa Rica and Uruguay, all of which have a strong tourism industry that is one of the central engines of their economic activity.
Considering both clusters, conventional quantitative techniques are applied to study the existing dynamic relationships in terms of the level of activity and tourism in the countries. Causal relationships are studied, and the magnitudes of short-term relationships are estimated. For this, the causality test of Dumitrescu-Hurlin (2012) shows that while for C1 the causal relationship is unidirectional from economic growth to tourism, in C2 the relationship is bidirectional. From the estimator proposed by Arellano-Bond (1991), it is found that, in the short term, an increase of 1% in real GDP increases arrivals per capita by 0.52% and 0.31% for C1 and C2, respectively. Additionally, an increase of 1% in arrivals per inhabitant would generate an increase in the level of activity of 0.40% on average, for C2.
As a final remark, we highlight the fact that tourism is measured in per capita terms. This has the advantage of accounting for the size of the countries, which gives a better approximation of tourism performance in countries that differ widely in population and area. However, other indicators could be considered for the quantitative study of tourism, such as net international arrivals or international tourism expenditure. Also, in terms of symbolization and clustering, the proposed partition based on the annual means of both indicators can be modified by repeating the exercise with another partition (median, quantiles, etc.). Regarding the quantitative study, three possible extensions would be interesting to pursue. First, other specifications of the proposed dynamic models, or one of the extensions of the Arellano-Bond (1991) method, could be used. Second, as mentioned, the estimated relationships are short-term, so it could be interesting to investigate possible long-term relationships within the groups through cointegration analysis for dynamic panels. Finally, the exercise could be repeated, and the magnitudes of the relationships evaluated, with other control variables that replace or complement the one considered in this study.
To finish, some policy implications can be discussed. In particular, most countries in the low-performing cluster (C1) have a large area, so domestic tourism could play an important role to the detriment of international tourism. This fact must be taken into account, and the results and conclusions should therefore be interpreted with caution. Conversely, analogous comments apply to C2.
Geographical location is another determinant, since tourism decisions depend on consumer preferences. In this respect, international trade and business agreements between countries can be an alternative that helps attract tourists and at the same time increases their current expenditure, for example through benefits directed to the tourism industry, tax reductions, or incentives for consumption and investment flows. Successful countries focus their efforts on building a tourism industry that enhances growth and development. These efforts may be applied less, or at least with a greater lag, in larger countries. Factors such as the size of the country and the net flow of people make such measures easier to apply in countries like the Caribbean islands considered here, and apparently more costly to apply in consolidated and larger countries. This idea arises from the fact that geographical location can contribute but does not seem to be enough, at least in terms of the variable considered, to reach a high level of tourism (consider Brazil and Colombia, for example, two eccentric countries of great attractiveness that nevertheless do not manage to position themselves as leaders of regional tourism).
Regional development is an influential factor due to the levels of insecurity faced by some of the countries in the region, which generally coincide with the countries that make up the low-tourist group. Social and political stability seems to be a relevant factor when choosing a tourist destination.
Based on the above, some main guidelines both justify the results obtained and put them in perspective, particularly for countries with "low" levels of tourism: strengthening international agreements and relations that promote the development of the sector and encourage international flows; fostering the social and political stability that creates a sustainable environment for tourist activity, which seems to be a good way to strengthen the sector; and adopting targeted policies that seek to boost the industry and promote tourist flows, which seem to be successful tools applied by the Caribbean islands. It should be borne in mind that the use of variables measured in per capita terms qualifies the results, because it emphasises per capita tourism levels and international tourism (which is what is considered in this exercise) over domestic tourism.
The main limitation of this work is the restricted availability of time series data, because the collection of tourism data on arrivals or expenditure does not go back far in time. In addition, the sample contains few units (countries) in the panel because of the requirement of a balanced panel structure. The availability of more data would allow more robust estimates and the application of other techniques that are only feasible with larger panels.

ACKNOWLEDGEMENTS:
Our research was supported by the UdelaR-CSIC, project GIDE. A preliminary version of this paper was presented at the 12th Workshop (Online) "Tourism: Economics and Management Tourists as Consumers, Visitors and Travelers", 24-25 September 2020. The authors thank the reviewers of the preliminary version of the paper and the workshop participants for their comments and suggestions.
Verónica Segarra is grateful for the financial support received from Comisión Sectorial de Investigación Científica (CSIC), Universidad de la República, Uruguay.