2.1. Methods
In experimental approaches, treatment assignment can be randomized and, therefore, a comparison of outcomes for the treated and control groups can provide statistically valid estimates of treatment effects. However, land ownership is not randomly assigned across farms, because owning land is a voluntary choice. An estimate of the effect of land ownership might therefore be confounded by correlation between economic outcomes and the factors affecting the decision to own land.
This study applies a framework with two potential outcomes to overcome the problem of self-selection bias: $Y_1$, an outcome of farms with land ownership (treated farms), and $Y_0$, an outcome of farms without land ownership (control farms). The observed outcome for any individual farm $i$ can be written as $Y_i = T_i Y_{1i} + (1 - T_i) Y_{0i}$, where $T \in \{0,1\}$ indicates the treatment status, with $T = 1$ if a farm has land ownership. The gain/loss of an individual farm $i$ with land ownership is $\Delta_i = Y_{1i} - Y_{0i}$. Because we cannot observe both outcomes for an individual farm, estimating the individual farm treatment effect $\Delta_i$ is not possible, and we have to concentrate on (population) average treatment effects (ATE), as shown in Equation (1) [19]:
$$\Delta_{ATE} = E(\Delta) = E(Y_1) - E(Y_0) \tag{1}$$
The most widely used evaluation parameter is the “average treatment effect on the treated” (ATT), which, in our context, represents the expected difference in economic performance and viability outcomes with and without land ownership for the farms that have land ownership. This is expressed algebraically in Equation (2):
$$\Delta_{ATT} = E(\Delta \mid T = 1) = E(Y_1 \mid T = 1) - E(Y_0 \mid T = 1) \tag{2}$$
In practice, it is impossible to observe $E(Y_0 \mid T = 1)$ in Equation (2). A farm either does or does not have land ownership; the treatment states are mutually exclusive. Estimating the ATT associated with land ownership by comparing the means $E(Y_1 \mid T = 1)$ and $E(Y_0 \mid T = 0)$ would be erroneous because of selection bias.
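The source of the error can be made explicit with a standard decomposition of the naive comparison of group means. The sketch below assumes that $Y_1$, $Y_0$, and $T$ are as defined above; the labels "ATT" and "selection bias" are added here for readability.

```latex
\begin{align}
E(Y_1 \mid T = 1) - E(Y_0 \mid T = 0)
  &= \underbrace{E(Y_1 \mid T = 1) - E(Y_0 \mid T = 1)}_{\text{ATT}}
   + \underbrace{E(Y_0 \mid T = 1) - E(Y_0 \mid T = 0)}_{\text{selection bias}}
\end{align}
```

The naive difference equals the ATT only when the second term is zero, that is, when farms with land ownership would, on average, have had the same outcome as farms without land ownership had they not owned land.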
Within social science research, several approaches are used to address selection bias when evaluating policy in agriculture. Using an instrumental variable approach, [20] analyzed the relationships between land ownership, access to finance, and female entrepreneurial performance in Eswatini, Lesotho, and Zimbabwe, and revealed that land ownership is important for female entrepreneurial performance in terms of sales levels. Using the difference-in-difference propensity score matching estimator, [21] found that agri-environmental schemes (AES) designed for arable land overcompensate farmers and fail to comply with World Trade Organization (WTO) rules.
An increasing number of studies [8,12,22,23,24,25] have used the propensity score matching (PSM) estimator in agricultural contexts to pair observations in the treatment and control groups based on the propensity score $P(X)$, which is the probability of having land ownership given covariates $X$, by assuming that $(Y_0, Y_1) \perp T \mid X$, where $\perp$ denotes independence. This assumption is often called the conditional independence assumption (CIA), which requires that all of the variables driving self-selection are observable to researchers [17].
It is also assumed that the probability of being treated (given covariates $X$) falls strictly between zero and one, $0 < P(T = 1 \mid X) < 1$, to ensure overlap or common support in the distributions of all covariates $X$ between farms with and without land ownership. This overlap condition ensures that farms with and without land ownership are similar enough in their characteristics to enable proper matching. Under the CIA and the overlap assumption, the PSM estimator for the ATT can be written as shown in Equation (3):
$$\Delta_{ATT}^{PSM} = E_{P(X) \mid T = 1}\{E[Y_1 \mid T = 1, P(X)] - E[Y_0 \mid T = 0, P(X)]\} \tag{3}$$
The CIA also requires the inclusion of all observed covariates $X$ that simultaneously affect the probability of having land ownership and the potential outcomes in the propensity score estimation. Moreover, land ownership should not affect these variables. This study uses a combination of guidelines from economic theory, previous research studies, and statistical methods to select a set of qualified covariates $X$, as suggested by the literature [19].
We followed [26,27] by splitting the full sample into three subgroups of farms according to rice planted area (small, midsize, and large farms) to evaluate whether the impacts of land ownership on the economic performance and viability of rice farming are heterogeneous across farm types and to lessen the possibility of mismatching. It is worth noting that, unlike in developed economies such as the United States and the European Union, there is no official farm typology in Thailand. In addition, PSM theoretically requires large samples with substantial overlap between treatment and control groups [19,26]. Based on the data distribution and to avoid the overlap problem, we defined a small farm as having a rice planted area ≤ 1.20 hectares, a midsize farm as having a rice planted area between 1.20 and 2.75 hectares, and a large farm as having a rice planted area > 2.75 hectares. Several studies also define subgroups using the data distribution [8,28]. Moreover, we re-estimated the models with different cut-off points for the rice planted area and found results that differed only slightly, quantitatively and qualitatively, from the main findings.
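The size classification above can be sketched as a small helper. The thresholds come from the text; the function name, the group labels, and the treatment of non-positive areas as invalid are assumptions made for illustration only.

```python
# Thresholds (hectares of rice planted area) as stated in the text.
SMALL_MAX_HA = 1.20
MIDSIZE_MAX_HA = 2.75


def farm_size_group(rice_area_ha):
    """Return 'small', 'midsize', or 'large' for a rice planted area in hectares.

    Hypothetical helper: a farm of exactly 2.75 ha is treated as midsize,
    because the text defines large farms as strictly greater than 2.75 ha.
    """
    if rice_area_ha <= 0:
        raise ValueError("rice planted area must be positive")
    if rice_area_ha <= SMALL_MAX_HA:
        return "small"
    if rice_area_ha <= MIDSIZE_MAX_HA:
        return "midsize"
    return "large"
```

Varying `SMALL_MAX_HA` and `MIDSIZE_MAX_HA` and re-running the estimation on each subgroup corresponds to the cut-off sensitivity check described above.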
This study conducted a post-matching balancing test to ensure that the covariate balancing property was satisfied. This test compares the characteristics of farms with and without land ownership before matching and evaluates whether any significant differences in the characteristics of the two farm groups remain after matching. Once the post-matching balancing test was satisfied, the matching of farms with and without land ownership based on estimated propensity scores was used to derive the impact of land ownership on the economic performance and viability of rice farming. In addition to imposing common support, this study addresses the problem of limited overlap in the covariate distributions between farms with and without land ownership using the trimming approach proposed by [29].
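As an illustration of what imposing common support can look like, the sketch below applies a simple minimum-maximum rule to estimated propensity scores. This rule is an assumption chosen for illustration; it is not necessarily the rule used in this study, and it does not reproduce the trimming approach of [29]. The function names are hypothetical.

```python
def common_support_bounds(treated_scores, control_scores):
    """Return (low, high): the range of propensity scores covered by both groups."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    if low > high:
        raise ValueError("the two groups have no overlapping propensity scores")
    return low, high


def keep_on_support(scores, bounds):
    """Keep only the scores that fall inside the common support region."""
    low, high = bounds
    return [s for s in scores if low <= s <= high]


# Made-up scores for illustration only.
treated = [0.30, 0.55, 0.80, 0.95]
control = [0.05, 0.20, 0.45, 0.85]
bounds = common_support_bounds(treated, control)
treated_kept = keep_on_support(treated, bounds)
control_kept = keep_on_support(control, bounds)
```

In this made-up example, the treated farm with a score of 0.95 and the control farms with scores of 0.05 and 0.20 fall outside the shared range and are dropped before matching.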
We used several matching algorithms as robustness checks. We first used nearest neighbor matching with five matching partners (NN5) and ten matching partners (NN10), and kernel matching algorithms¹, because there were a large number of comparable untreated observations (farms without land ownership) in the subgroups. The Gaussian kernel function was used for kernel matching. The optimal bandwidth for the kernel function was selected using the rule of thumb suggested by [30]. We also used radius matching with a caliper, first recommended by [31]², to increase the matching quality. However, as discussed in [32], it is difficult to know a priori what choice of tolerance level is reasonable. We used calipers of 0.01 and 0.02 in this study.
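The three families of matching algorithms named above differ only in how they build the counterfactual outcome for each treated farm. The following is a minimal sketch, assuming matching with replacement on an already estimated propensity score; it is not the Stata routine used in this study, and the function names and toy data are hypothetical. The bandwidth is passed in as a parameter rather than derived from the rule of thumb in [30].

```python
import math


def nn_att(treated, controls, k):
    """ATT from k-nearest-neighbor matching (with replacement) on the score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    """
    effects = []
    for p_t, y_t in treated:
        nearest = sorted(controls, key=lambda c: abs(c[0] - p_t))[:k]
        y0_hat = sum(y for _, y in nearest) / len(nearest)
        effects.append(y_t - y0_hat)
    return sum(effects) / len(effects)


def radius_att(treated, controls, caliper):
    """ATT from radius matching: average all controls within the caliper."""
    effects = []
    for p_t, y_t in treated:
        within = [y for p_c, y in controls if abs(p_c - p_t) <= caliper]
        if within:  # treated farms with no control inside the caliper are dropped
            effects.append(y_t - sum(within) / len(within))
    if not effects:
        raise ValueError("no treated unit has a control within the caliper")
    return sum(effects) / len(effects)


def kernel_att(treated, controls, bandwidth):
    """ATT from Gaussian kernel matching: all controls, weighted by distance."""
    effects = []
    for p_t, y_t in treated:
        weights = [math.exp(-0.5 * ((p_c - p_t) / bandwidth) ** 2)
                   for p_c, _ in controls]
        total = sum(weights)
        y0_hat = sum(w * y for w, (_, y) in zip(weights, controls)) / total
        effects.append(y_t - y0_hat)
    return sum(effects) / len(effects)


# Made-up (score, outcome) pairs for illustration only.
treated = [(0.50, 10.0), (0.70, 12.0)]
controls = [(0.47, 8.0), (0.51, 9.0), (0.69, 11.0), (0.90, 20.0)]
att_nn1 = nn_att(treated, controls, k=1)
att_nn2 = nn_att(treated, controls, k=2)
att_radius = radius_att(treated, controls, caliper=0.02)
att_kernel = kernel_att(treated, controls, bandwidth=0.005)
```

A tighter caliper or a smaller number of neighbors uses closer matches at the cost of fewer comparison farms, which is why several specifications are reported as robustness checks.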
The quality of matching outcomes was also evaluated for each algorithm on the basis of the percent reduction in the pseudo-R² and the mean standardized bias. Lastly, this study constructed two potential outcomes, the rice yield and the informal debt of farm households, to capture, respectively, the economic performance and the viability of rice farming, both of which are affected by land ownership. We used Stata version 15 for all estimation procedures.
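For reference, the standardized bias of a single covariate is commonly computed as the difference in group means, expressed as a percentage of the square root of the average of the two group variances. The sketch below assumes this common definition, with the scaling term taken from the unmatched samples and reused after matching; the function names and the age values are hypothetical.

```python
import math
import statistics


def pooled_sd(treated_vals, control_vals):
    """Square root of the average of the two sample variances."""
    v_t = statistics.variance(treated_vals)
    v_c = statistics.variance(control_vals)
    return math.sqrt((v_t + v_c) / 2)


def standardized_bias(treated_vals, control_vals, sd):
    """Percent standardized difference in means, scaled by a given sd."""
    diff = statistics.mean(treated_vals) - statistics.mean(control_vals)
    return 100 * diff / sd


# Made-up ages of household heads, for illustration only.
age_treated = [50, 54, 58]
age_control_all = [40, 44, 48]      # all controls, before matching
age_control_matched = [50, 54, 55]  # matched controls only

sd = pooled_sd(age_treated, age_control_all)
bias_before = standardized_bias(age_treated, age_control_all, sd)
bias_after = standardized_bias(age_treated, age_control_matched, sd)
```

The mean standardized bias is then the average of the absolute values of this quantity across all covariates, computed before and after matching.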
2.2. Data
The household-level dataset was constructed mainly from the 2013 agricultural census conducted by Thailand’s National Statistical Office. The dataset contains 62,686 observations collected over one crop year. After excluding farms that did not grow rice, 38,980 observations remained in the dataset. The variables used in this study were extracted and constructed from several sources, including the 2013 agricultural census, the Office of Agricultural Economics, the Royal Irrigation Department, and the Meteorological Department. The variables include the potential outcomes (i.e., rice yield and informal debt); the status of land ownership (i.e., full and weak land ownership); operator characteristics (i.e., gender, age, education level, marital status, and membership of institutions); farm characteristics (i.e., percent of agricultural labor, working in agriculture, hiring labor on a farm, source of income, ratio of rice area to total land area, rice harvested area, and integrated agriculture); and location characteristics of farms (i.e., amount of rainfall, temperature, whether the farm is located in a municipal area, and irrigation system).
Table 1 summarizes the variables that were used in the models and their definitions.
This study classifies land ownership into two types to better understand the role of land ownership. The first type, “full land ownership”, takes a value of 1 if a farm household reports holding land certificates consisting of the title deed and NS3, and a value of 0 if the farm household reports other types of land. Holders of the title deed and NS3 own the land and can sell it to other people. The second type, “weak land ownership”, extends the set of land certificates from the title deed and NS3 to include SPK401, NK, NS2, and SK1. Generally, the SPK401, NK, NS2, and SK1 certificates give farm households the right to use the land, but ownership of the land is not vested in the farm households. In the case of SPK401, for example, farm households are not allowed to sell their land to other people. The land can only be transferred to the heirs of the farm household, and each farmer cannot hold more than eight hectares of land.
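The two treatment indicators can be sketched as follows. The certificate names come from the text; the string labels, the function name, and the assumption that each farm household reports a single certificate type are hypothetical and do not reflect the actual census coding.

```python
# Certificates that confer full ownership, as described in the text.
FULL_CERTS = frozenset({"title deed", "NS3"})
# Additional certificates counted under the broader "weak" definition.
WEAK_EXTRA_CERTS = frozenset({"SPK401", "NK", "NS2", "SK1"})


def ownership_indicators(certificate):
    """Return (full, weak) 0/1 indicators for one reported certificate type."""
    full = 1 if certificate in FULL_CERTS else 0
    weak = 1 if certificate in (FULL_CERTS | WEAK_EXTRA_CERTS) else 0
    return full, weak
```

By construction, every farm coded 1 under full land ownership is also coded 1 under weak land ownership, so the second indicator defines a strictly broader treated group.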
Table 2 presents the mean values of the selected variables for farms with and without full land ownership in each farm subgroup. We also test the mean difference between farms with and without full land ownership.
Table 3 presents the same information using the broader definition of land ownership, namely weak land ownership. We observe that rice yields of farms without full or weak land ownership are greater than those of farms with full or weak land ownership at the 5 percent level of significance across all farm types. The informal debt of farms with full land ownership is lower than that of farms without full land ownership in the small and mid-size farm subgroups at the 1 percent level of significance. Under the definition of weak land ownership, mid-size and large farms with weak land ownership have lower informal debt than those without weak land ownership at the 1 and 5 percent levels of significance, respectively.
The test of mean difference is also performed for the factors that determine the potential outcomes. We observe that the mean values of several explanatory variables differ with statistical significance. For example, the proportions of male farmers and of farms hiring permanent labor are lower among farms with full and weak land ownership than among those without. Conversely, the mean age of the household head is greater for farms with full and weak land ownership than for those without. These findings show that we cannot estimate the impacts of land ownership on the economic performance and viability of rice farming by simply comparing means between farms with and without land ownership without addressing potential selection bias.