Evaluation of Nursing Facility Locations Using the Specialization Coefficient of the Population Aging Rate

In Japan, the aging population is increasing especially rapidly, the lack of nursing facilities has become a serious social issue, and political measures against it are continuously being enacted. Although the number and capacity of nursing facilities are increasing, the utilization rate of such facilities remains at the same level, and the lack of facilities has not improved. Against this background, the present study uses geographic information systems (GIS) and public open data to quantitatively evaluate the current situation of nursing facility locations in urban areas within Japan. The model of the p-median problem, which is used to obtain the optimal location of facilities, was modified, and a method to evaluate the current shortage or overage of nursing facilities by area was proposed. As the evaluations are conducted using quantitative data, the evaluation results are also quantitative, making the method an effective indicator for evaluating the locations of nursing facilities. Additionally, the specialization coefficient of the population aging rate and the distance between nursing facilities and areas were calculated based on public open data. Therefore, the evaluation method has high temporal as well as spatial reproducibility.


Introduction
While the aging population is increasing in many advanced countries around the world, the increase is especially rapid in Japan, where the lack of nursing facilities has become a serious social issue. According to the Ministry of Health, Labour and Welfare (MHLW) [1,2], although the admission capacity of special nursing homes was approximately 498,000 as of 2014, the number of people requesting admission was 524,000. This grave situation highlights the severity of the lack of certified care workers as well as nursing facilities. In order to increase the number of certified care workers, the MHLW released a "Basic guideline concerning measures to increase workers in social welfare services" and is making efforts to secure personnel. Although the number of people registering as certified care workers is increasing, the lack of workers remains an issue.
Meanwhile, although the number and capacity of nursing facilities are increasing, the utilization rate of such facilities remains at the same level, and the lack of facilities has not improved. Although subsidies are provided by the national and local governments for the construction of nursing facilities, the amount cannot be greatly increased because more childcare and medical facilities are needed as well, making it difficult to expect a great increase in nursing facilities in the future. Therefore, as a solution to the lack of facilities, the construction of new nursing facilities should be prioritized in areas with greater needs. To make this possible, it is first necessary to accurately identify the areas that lack nursing facilities.
Additionally, there are various facility types, and the choice of location shows certain characteristics according to the type. For example, commercial establishments such as convenience stores are located in busy, highly populated areas in order to attract more customers and increase profit. On the other hand, educational and public facilities such as schools tend to be located where many people can access them fairly. Similar to the latter case, instead of being concentrated in certain areas, nursing facilities must be located where everyone in need can access them in a fair manner. Against this background, the present study uses geographic information systems (GIS) and public open data to quantitatively evaluate the current situation of nursing facility locations in urban areas within Japan.

Related work
The present study is related to A. studies concerning the sufficiency degree of nursing facilities, B. studies concerning care services provided from nursing facilities, and C. studies concerning facility location problems. The following will introduce the major preceding studies in the above three study areas and discuss the originality of the present study in comparison with the others.
In A. studies concerning the sufficiency degree of nursing facilities, many studies have been conducted, as this topic attracts attention especially in Japan. As mentioned in Section 1, the reason for this is that the aging population is increasing especially rapidly and the lack of nursing facilities has become a serious social issue in Japan in recent years. Yamada et al. (2008) [3] identified the regional difference in demand for nursing facilities by means of interviews. Yamamoto et al. (2015) [4] considered elements such as the positioning of various equipment and nursing modality within the recovery rehabilitation ward as spatial structure and examined the reality of activities of residents in their leisure time. Acker et al. (2015) [5] assessed the nursing facility characteristics, quality ratings, and the views of facility administrators about the implications of an increasing number of foreign-born employees in Washington State in the USA. Takase et al. (2016) [6] identified the reality of the operation of home care and organized the travel distance of caretakers as well as the service zone using the location and road information of users, caretakers, and offices in Japan. Fujita et al. (2017) [7] organized objections concerning the construction of nursing facilities and examined the changes in social awareness toward such conditions over time in Japan. Hunnicutt et al. (2018) [8] examined and quantified geographic variation in the initiation of commonly used opioids and prescribed dosage strength among older nursing home residents in the USA. Tahara et al. (2018) [9] clarified the attitude of social welfare facility staff regarding the acceptance of evacuees when natural disasters occur and elucidated factors that obstruct the acceptance of such evacuees.
For B. studies concerning care services provided from nursing facilities, many studies have been conducted, especially in the USA. Yun et al. (2010) [10] developed and validated an algorithm to identify the use of nursing facility services and differentiate short- from long-term care using Medicare claim data. Walsh et al. (2012) [11] analyzed potentially avoidable hospitalizations (PAHs) for dually eligible beneficiaries receiving long-term or post-acute care services to inform the development of health policies and better care programs and outcomes for this population. Cherubini et al. (2012) [12] examined resident and facility characteristics associated with hospitalization in a cohort of older nursing home residents in Italy. Onder et al. (2012) [13] assessed nursing home residents in Europe, focusing on the Services and Health for Elderly in Long TERm care (SHELTER) study. King et al. (2013) [14] examined how skilled nursing facility (SNF) nurses transition the care of individuals admitted from hospitals, the barriers they experience, and the outcomes associated with variation in the quality of transitions. Neuman et al. (2014) [15] measured the association between SNF performance measures and hospital readmissions among Medicare beneficiaries receiving post-acute care at SNFs. Fry et al. (2018) [16] used robot cats to reduce the total number of falls in a facility, applying quality improvement methods (strategy for improvement, design, setting, participants, interventions, measurements, and evaluation). Yamaguchi (2018) [17] identified the relationships between received quality of care and information sharing among workers in nursing facilities for the elderly.
For C. studies concerning facility location problems, Segawa et al. (1996) [18] developed a system that can simulate factors related to childcare facility improvements, such as the extension of childcare hours and the location of new childcare facilities. Nagashima et al. (2014) [19] proposed an algorithm that derives the best location for electric vehicle (EV) power stations by means of the mean field approximation. Ozgen et al. (2014) [20] combined a two-phase possibilistic linear programming approach and a fuzzy analytical hierarchical process approach to optimize two objective functions (minimum cost and maximum qualitative factor benefit) in a four-stage (suppliers, plants, distribution centers, and customers) supply chain network in the presence of vagueness. Munemasa et al. (2015) [21] used the linear relaxation method to propose a method that derives the best solution for minimizing travel costs for the urban model made up of residential and business areas. Zhang et al. (2016) [22] investigated a facility location problem incorporating service competition and disruption risks, developing a new binary bilevel linear programming (BBLP) model. Ohdate et al. (2017) [23] considered relocation of facilities for the management of public facilities and categorized them based on building, function, and location to create an evaluation method for them. Nagai et al. (2017) [24] proposed an agent-based urban model in which the relationship between a central urban area and a suburban area was simply expressed. Usui et al. (2018) [25] theoretically investigated the relationship between the continuous walking distance distribution and the density of resting places.
On the other hand, in Japan in recent years, there are distinctive preceding studies that adopted an economic method in C. studies concerning facility location problems. Tanaka et al. (2015) [26] applied the quintile share ratio (QSR), which is an indicator showing the degree of bias in income, to the facility locational analysis for linear cities. Additionally, with QSR as a reference, Tanaka et al. (2016) [27] used the median share ratio (MSR), which is an equity measure, to develop a facility location evaluation model in a linear city with one or two facilities, as well as a uniformly distributed population. Furuta et al. (2017) [28] used a method that generalized the QSR and proposed a solution to optimize multiple facility locations in cases where the demand and candidate facility locations are discrete.
Regarding studies related to A, although the studies identified the awareness of local residents, the awareness and behavior of users, and the actual condition of operation of nursing facilities, the location of the facilities was not considered. Regarding studies related to B, although the studies investigated the care services provided from nursing facilities, the location of the facilities was not considered. Regarding studies related to C, Nagashima et al. (2014) [19] and Munemasa et al. (2015) [21], respectively, considered EV power stations and business-and-residential distribution as their study subject and proposed a method to derive efficient locations. Segawa et al. (1996) [18] conducted simulations with the assumption that the facilities will be relocated; however, as there are currently many existing nursing facilities in cities of Japan, such simulations cannot propose realistic solutions for these facilities. Additionally, although Tanaka et al. (2016) [27] and Furuta et al. (2017) [28] focused on equity in the facility location evaluation method, their methods have only been applied to virtual cities and not to any actual cities. Therefore, with the results of the preceding studies mentioned above as a reference, the present study demonstrates its originality by considering the lack of nursing facilities, which has become a serious social issue, and quantitatively evaluating current facility locations.

Evaluation framework and process
For the evaluation method of nursing facility locations, PostgreSQL Ver. 9.6.1 and ArcGIS Pro Ver. 2.0 of Environmental Systems Research Institute (ESRI) were used. The evaluation framework and process are as follows.

3.1.1 Creating the distribution maps of the aging population and nursing facilities as well as a road network map
The distribution maps of the aging population and nursing facilities as well as a road network map are created in digital map form using GIS. These three types of digital maps are superimposed, and the road node closest to each nursing facility and to the representative point of each area (divided by town and street) is set on the road network map.

Calculating the specialization coefficient of the population aging rate and adding it to the road network map
The specialization coefficient of the population aging rate is calculated using the aging population and the total population data of each area, and the results are added to the road network map.

Calculating the shortest route using A* algorithm
By applying the data obtained in section 3.1.2 to the A* algorithm, which is explained in the following section, the shortest route between each nursing facility and each area is calculated.

3.1.4 Evaluating nursing facility location using the specialization coefficient of the population aging rate
Using the distribution maps of the aging population obtained in section 3.1.1 and the shortest route calculated in section 3.1.3, the evaluation points are calculated for each area. The age group being evaluated in the present study is those over the age of 75. There are many cases where those over 65 can use nursing facilities. However, according to Hashimoto (2015) [29], the life expectancy and health span in Japan have become higher in recent years, with the latter being 71.19 for men and 74.21 for women. Therefore, assuming most users of nursing facilities are over 75, the target age group was set to those over 75.
As for the evaluation target area, in order to calculate evaluation points according to each area, GIS is used to display the distribution of the aging population on the digital map.

Distribution map of nursing facilities
GIS is used to display the distribution of nursing facilities on the digital map. While nursing facilities include the admission type, the commuter type, and other related facilities, the present study considers only admission-type nursing facilities.

Creating a road network map
As for the distance between nursing facilities and each area in the present study, the road network distance is used instead of the linear distance. This is because the linear distance may be extremely short compared to the distance when traveling along the roads to the nursing facilities, and this may cause the estimate of the travel distance of users to be shorter than it actually is.
First, GIS is used to display the road network map of the evaluation target area in digital map form. Next, the node closest to each nursing facility and to the representative point of each area is set on the digital map. The representative point is the central point of the area, and the nodes are the intersections and endpoints of the roads. Nodes are set in this way because distances are calculated between nodes. In the present study, the node set for the representative point of an area is the start node, and the node set for a nursing facility is the end node. The distance is weighted so that the road distance through areas with a large demand becomes longer, while the road distance through areas with a small demand becomes shorter, as shown in the weighting pattern diagram in Figure 1. If a user from a certain area selects a nursing facility that requires him or her to go through a transit area with a large demand, there is a high chance of competition among users, as those from the transit area will most likely select the same nursing facility. Therefore, the weighting is conducted as mentioned above on the assumption that the psychological distance for such users will be longer.
The specialization coefficient of the population aging rate is used as the weighting coefficient. The specialization coefficient of the population aging rate is the value indicating how high the population aging rate is in a certain area. Besides the specialization coefficient of the population aging rate, the population aging rate and the aging population can both be considered as a weighting coefficient. However, because the population aging rate is the ratio with the value always being 1 or less, the distance is always calculated to be shorter. Additionally, the difference in the aging population among the different areas is substantial enough to greatly affect the distance. On the other hand, the specialization coefficient of the population aging rate does not make the value too big or too small; hence, it is suitable as a weighting coefficient.

Calculating the specialization coefficient of the population aging rate
The specialization coefficient of the population aging rate in each area is calculated. In order to calculate the specialization coefficient, the population aging rate must be calculated first. The population aging rate and the specialization coefficient of the population aging rate are calculated using Eqs. (1) and (2), respectively:
$A_i = \frac{p_{75i}}{p_i} \times 100$ (1)
where $A_i$ is the population aging rate of area i (%), $p_{75i}$ is the population of those over 75 in area i (people), and $p_i$ is the total population of area i (people).
$SC_i = \frac{A_i}{A}$ (2)
where $SC_i$ is the specialization coefficient of the population aging rate in area i, $A_i$ is the population aging rate of area i (%), and $A$ is the population aging rate of all areas (%).
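As a minimal sketch of the calculation in Eqs. (1) and (2), the following Python functions compute the population aging rate and the specialization coefficient for a few areas. The area identifiers and population figures are made up for illustration and are not data from the study. The rate of all areas A is assumed here to be computed from the pooled populations of all areas, which the text does not state explicitly.

```python
def aging_rate(pop_75_plus, pop_total):
    """Population aging rate A_i in percent, following Eq. (1)."""
    return 100.0 * pop_75_plus / pop_total


def specialization_coefficients(areas):
    """Return {area_id: SC_i}, following Eq. (2).

    `areas` maps a hypothetical area id to a pair
    (population aged 75 and over, total population).
    Assumption: A is the aging rate of the pooled population of all areas.
    """
    total_75 = sum(p75 for p75, _ in areas.values())
    total_all = sum(p for _, p in areas.values())
    overall_rate = aging_rate(total_75, total_all)
    return {
        area_id: aging_rate(p75, p) / overall_rate
        for area_id, (p75, p) in areas.items()
    }


# Made-up populations for three areas (not data from the study).
example_areas = {"a": (100, 1000), "b": (300, 1000), "c": (200, 2000)}
sc = specialization_coefficients(example_areas)
```

A coefficient above 1 indicates an area whose aging rate is higher than that of all areas combined, and a coefficient below 1 indicates the opposite.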

Adding the specialization coefficient of the population aging rate to the road network map
The specialization coefficient of the population aging rate calculated in Eq. (2) is applied to the road network map using PostgreSQL. This is done by multiplying the distance between nodes by the specialization coefficient of the population aging rate. As a rule, if both nodes are in the same area, as in nodes (i) and (ii) of Figure 2, the specialization coefficient of that area is used. However, if the nodes are in different areas, as in nodes (ii) and (iii), the average of the specialization coefficients of the two areas is used.
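The weighting rule described above can be sketched as follows. The study performs this step in PostgreSQL; the Python function below is only an illustration of the rule, and the function name, area labels, and coefficient values are hypothetical.

```python
def weighted_length(length_m, area_u, area_v, sc):
    """Weighted length of the road segment between two nodes.

    Same area: length multiplied by that area's coefficient.
    Different areas: length multiplied by the mean of the two coefficients.
    """
    if area_u == area_v:
        factor = sc[area_u]
    else:
        factor = (sc[area_u] + sc[area_v]) / 2.0
    return length_m * factor


# Hypothetical coefficients for two neighbouring areas.
sc_example = {"X": 1.2, "Y": 0.8}
same_area = weighted_length(100.0, "X", "X", sc_example)
cross_area = weighted_length(100.0, "X", "Y", sc_example)
```

In this example, a 100 m segment inside area X becomes longer after weighting, while a segment crossing from X to Y keeps roughly its original length because the two coefficients average to 1.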

Calculating the shortest route using A* algorithm
The shortest route between each nursing facility and each area is calculated using the A* algorithm. A representative search method for finding the shortest route is the Dijkstra method (Dijkstra, 1959) [30]. However, in the present study, latitude and longitude are used for the coordinates of the representative points of each area as well as the nursing facilities, so an estimate of the shortest route between two points is available in advance. Therefore, the A* algorithm, an improved version of the Dijkstra method that can efficiently calculate the shortest routes, is used.
In general, when considering the shortest route which starts at the start node, goes through node n, and ends at the end node, the route is expressed as shown in Eq. (3):
$f^*(n) = g^*(n) + h^*(n)$ (3)
where $f^*(n)$ is the shortest route distance (m), $g^*(n)$ is the shortest distance from the start node to n (m), and $h^*(n)$ is the shortest distance from n to the end node (m).
Though $f^*(n)$ can be easily obtained if the values of $g^*(n)$ and $h^*(n)$ are already known, in reality, the values of $g^*(n)$ and $h^*(n)$ are impossible to obtain beforehand. Therefore, $f^*(n)$ is replaced with the estimate $f(n)$; this method is called the A* algorithm, and it is expressed as shown in Eq. (4):
$f(n) = g(n) + h(n)$ (4)
where $f(n)$ is the estimate value of the shortest route, $g(n)$ is the estimate value of the shortest route from the start node to n, and $h(n)$ is the estimate value of the shortest route from n to the end node.
An example of actually searching for the shortest route using the A* algorithm on a computer is shown in Figure 3. In this example, the shortest route from coordinates (2,2) to (5,5) is obtained. The gray cells are set as impassable, while F, G, and H correspond to $f(n)$, $g(n)$, and $h(n)$ of Eq. (4). Here, G is the actual travel distance from (2,2) to the current cell, and the distance for moving one cell over in any direction is counted as 1. Additionally, as the estimate value of the shortest distance from the current cell to (5,5), H is set to the larger of the difference in the x-coordinates and the difference in the y-coordinates of the two points.
The algorithm process is as follows:
1. Calculate the values of F, G, and H for the cells surrounding the first cell.
2. Move to the cell with the lowest value of F.
3. Calculate the values of F, G, and H for the cells surrounding the cell moved to.
4. Among the cells newly calculated in process 3, if the value of F is the same as or lower than that of the current cell, the search moves to that cell. If the value of F is higher than that of the current cell, the search on this route is discontinued. Then, among the cells already calculated, if the values of F, G, and H can be calculated and the value of F is the same or almost the same as that of the current cell, the search moves to that cell.

The algorithm processes 3 and 4 are repeated until the destination cell is reached.
The cells leading to the destination cell at this time form the shortest route, and the value of F obtained at the end is the distance of the shortest route. The shortest route in this example is indicated by the gray dotted line arrow, and the distance of the shortest route is 5. Though 1 was set as the distance for moving one cell over in the example of Figure 3, the distance can be changed for each move. In the present study, the distance of G is set as the product of the actual travel distance and the specialization coefficient of the population aging rate mentioned in Section 3.2.2.
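The grid search described above can be sketched in Python as follows. This is a standard priority-queue formulation of A*, not a transcription of the step list above, and the obstacle layout is made up, so it does not reproduce the route or the distance shown in Figure 3. The function names, the `size` parameter, and the `step_cost` hook are hypothetical.

```python
import heapq


def chebyshev(a, b):
    """H: the larger of the x-coordinate and y-coordinate differences."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def a_star(start, goal, blocked, size, step_cost=lambda cur, nxt: 1.0):
    """Return (distance, path) on a size x size grid with cells 1..size.

    Moves go to any of the 8 neighbouring cells; `blocked` holds the
    impassable cells. `step_cost` is a hypothetical hook for a per-move cost.
    Returns None if the goal cannot be reached.
    """
    open_heap = [(chebyshev(start, goal), 0.0, start)]  # entries are (F, G, cell)
    best_g = {start: 0.0}
    parent = {start: None}
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return g, path[::-1]
        if g > best_g[cell]:
            continue  # a cheaper route to this cell was already found
        x, y = cell
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (x + dx, y + dy)
                if nxt == cell or nxt in blocked:
                    continue
                if not (1 <= nxt[0] <= size and 1 <= nxt[1] <= size):
                    continue
                new_g = g + step_cost(cell, nxt)
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    parent[nxt] = cell
                    f_next = new_g + chebyshev(nxt, goal)
                    heapq.heappush(open_heap, (f_next, new_g, nxt))
    return None


# Made-up layouts: an empty grid, and one with a single impassable cell.
free_dist, free_path = a_star((2, 2), (5, 5), set(), 6)
dist, path = a_star((2, 2), (5, 5), {(3, 3)}, 6)
```

When the per-move cost is multiplied by coefficients that can fall below 1, H as written here may overestimate the remaining cost, in which case A* is no longer guaranteed to return the shortest route. Scaling H by the smallest coefficient in the network is one common way to keep the estimate from exceeding the true cost.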

Evaluation of nursing facility locations using the specialization coefficient of the population aging rate
The p-median problem is a facility location problem in which the total distance from the users to their closest facility is minimized, and this problem can be modeled as shown in Eq. (5). This model obtains the optimal location to reduce the load of users in all areas to the extent possible by changing the value of $X_{ij}$:
$\min \sum_i \sum_j w_i d_{ij} X_{ij}$ (5)
where $X_{ij} \in \{0, 1\}$ is the allocation for facility j in area i, $w_i$ is the demand in area i, and $d_{ij}$ is the distance from area i to facility j (m).
In the present study, the evaluation method is developed based on the p-median problem. Eq. (5) is a model that obtains the suitable location to reduce the travel load of users in all areas as much as possible, which differs from the purpose of the present study. Therefore, in order to match the purpose of the present study, Eq. (5) is changed into Eq. (6):
$Z_i = \sum_j w_i d_{ij}$ (6)
where $Z_i$ is the evaluation point for area i, $w_i$ is the demand in area i, and $d_{ij}$ is the distance from area i to facility j (m).
$X_{ij}$ of Eq. (5) is removed so that the facility locations stay the same, and by calculating the evaluation points for each area, the facility location situation in each area can be quantitatively understood. Additionally, as the original contribution of the present study, the distance $d_{ij}$ is weighted by the specialization coefficient of the population aging rate as mentioned in Section 3.2.2. With $D_{ij}$ as the weighted distance, Eq. (6) can be expressed as Eq. (7):
$Z_i = \sum_j w_i D_{ij}$ (7)
where $Z_i$ is the evaluation point for area i, $w_i$ is the demand in area i, and $D_{ij}$ is the weighted distance from area i to facility j (m).
The evaluation points of each area are calculated by applying the aging population of each area and the distances of the shortest routes to nursing facilities obtained in Sections 3.2.1 and 3.2.2, and the results are displayed on the digital map using GIS.
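A minimal sketch of the evaluation step is given below. It assumes that Eq. (7) sums the product of the demand of an area and its weighted shortest-route distance over all facilities, which is one reading of the definitions of Z i, w i, and D ij. The function names, area and facility labels, and all numbers are hypothetical and are not results of the study.

```python
def evaluation_points(demand, weighted_dist):
    """Evaluation point of each area under the assumed reading of Eq. (7).

    `demand[i]` is the population aged 75 and over in area i (w_i).
    `weighted_dist[i][j]` is the weighted shortest-route distance (m)
    from area i to facility j (D_ij).
    """
    return {
        area: sum(demand[area] * d for d in weighted_dist[area].values())
        for area in demand
    }


def top_areas(points, k=5):
    """Areas with the k highest evaluation points, highest first."""
    return sorted(points, key=points.get, reverse=True)[:k]


# Made-up demand and weighted distances for two areas and two facilities.
demand = {"a": 100, "b": 300}
weighted_dist = {
    "a": {"f1": 500.0, "f2": 1500.0},
    "b": {"f1": 1000.0, "f2": 200.0},
}
points = evaluation_points(demand, weighted_dist)
ranking = top_areas(points, k=1)
```

Under this reading, an area scores high when many people aged 75 and over live far, in weighted distance, from the existing facilities, so ranking areas by these points gives candidates for areas lacking facilities.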

Selection of evaluation target area and data processing

4.1 Selection of evaluation target area
For the evaluation target area in the present study, Chofu City, Tokyo, was selected. Chofu City is located in the suburban area of Tokyo as shown in Figure 4.
Chofu City was selected as the evaluation target area because the aging population (age 65 or over) has already exceeded the youth population (under age 15), and the aging population is expected to increase in the future. According to the survey on the aging population in Chofu City [31], 23,545 people, equivalent to approximately 10% of the total population, are aged 75 and over, which is the age range targeted in the present study. In the present study, the evaluation is conducted for each of the 105 areas within Chofu City.

Data overview
The utilized data and the utilization method of the data in the present study are shown in Table 1.

Distribution map of aging population and nursing facilities
The distributions of the aging population and nursing facilities are shown in Figures 5 and 6, respectively. As shown in Figure 5, the southeastern part has a high aging population, and there are old housing complexes in this area. In Figure 6, only admission-type nursing facilities are shown, as mentioned in Section 3.2.1, and they are distributed throughout Chofu City except for the central area.

Table 1. Utilized data and utilization method of data.
Figure 7 shows the evaluation results when using the specialization coefficient of the population aging rate. Areas with particularly high evaluation points are outlined by bold black lines. As evident in Figure 7, in addition to the areas with a high aging population mentioned in Section 4.2.2, evaluation points were also high in the eastern areas. On the other hand, evaluation points in the northern part of Chofu City were generally low.
[...] black lines in the same manner as Figure 7, and among the areas outlined, four areas were the same as those outlined in Figure 7.

5.3 Comparing the evaluation results with and without the specialization coefficient
Figure 9 shows the difference between the evaluation results with and without the use of the specialization coefficient of the population aging rate. Areas with evaluation points higher in the former results are black, while areas with points higher in the latter are gray. Additionally, areas with particularly large or small differences are outlined with black bold lines. Regarding gray areas with high evaluation points when not using the specialization coefficient of the population aging rate, the evaluation points when using the specialization coefficient of the population aging rate were even higher, highlighting the lack of nursing facilities. With 32 areas marked gray and 25 marked black, it was made evident that the evaluation results when using the specialization coefficient of the population aging rate were lower overall.

Extraction of areas lacking nursing facilities
Based on the evaluation results of Section 5.1 (the evaluation results for cases with the specialization coefficient), the five areas with the highest evaluation points are extracted as areas lacking facilities. The five areas outlined in light gray in Figure 7 have a particularly high population of those 75 and over, and it can be said that there is currently a lack of nursing facilities. Additionally, as the population of those aged 65-74 is also high in these areas, further shortages of nursing facilities can be expected.

Conclusion
The conclusion of the present study can be summarized in the following three points:
1. In the present study, the model of the p-median problem used to obtain the optimal location of facilities was modified, and a method to evaluate the current shortage or overage of nursing facilities by area was proposed. By using the A* algorithm to calculate the distance between nursing facilities and areas, the shortest routes were efficiently calculated. Additionally, the specialization coefficient of the population aging rate is appropriate for the distance weighting because the users of nursing facilities are limited to the aging population, and the evaluation method using this coefficient is therefore specialized for nursing facilities.
2. Regarding the evaluation method in the present study, as evaluations are conducted using quantitative data such as the specialization coefficient of the population aging rate and the distance between nursing facilities and areas, the evaluation results are also quantitative, making it an effective indicator for evaluating the locations of nursing facilities. Additionally, as the current location situation of nursing facilities is evaluated by the area, the comparison of the shortage or overage of nursing facilities in each area as well as the extraction of areas lacking facilities can be easily conducted. Furthermore, as the evaluation results are clearly displayed on the digital map using GIS, the shortage or overage of facilities can be visually understood.
Figure 9. The difference between evaluation results with and without the specialization coefficient.
3. In the present study, the specialization coefficient of the population aging rate and the distance between nursing facilities and areas were calculated based on public open data such as the National Census and OpenStreetMap. As evaluations are conducted based on public information, by obtaining population data and geospatial data similar to the present study, evaluations can be conducted using data in other areas as well as for the past and future. Therefore, the evaluation method in the present study has a high temporal reproducibility as well as spatial reproducibility. For example, by using the "future population estimate by region in Japan" of the National Institute of Population and Social Security Research as future data, the shortage or overage of nursing facilities in the future can be evaluated.
As research topics for the future, the improvement of the evaluation method for cases where there are multiple facilities within a certain range as well as the application of the evaluation method of the present study in the evaluation of other facility locations and other areas can be considered.