Hybrid casebase maintenance approach for modeling large scale case-based reasoning systems
 Malik Jahan Khan^{1},
 Hussain Hayat^{1} and
 Irfan Awan^{2}
https://doi.org/10.1186/s13673-019-0171-z
© The Author(s) 2019
 Received: 5 November 2018
 Accepted: 25 February 2019
 Published: 13 March 2019
Abstract
Case-based reasoning (CBR) is a nature inspired paradigm of machine learning capable of continuously learning from past experience. Each newly solved problem and its corresponding solution is retained in its central knowledge repository called the casebase. With regular use of the CBR system, the casebase cardinality keeps growing. This results in a performance bottleneck, as the number of comparisons of each new problem with the existing problems also increases with the casebase growth. To address this performance bottleneck, different casebase maintenance (CBM) strategies are used so that the growth of the casebase is controlled without compromising the utility of the knowledge maintained in the casebase. This research work presents a hybrid casebase maintenance approach which utilizes the benefits of both case addition and case deletion strategies to maintain the casebase in online and offline modes respectively. The proposed maintenance method has been evaluated using a simulated model of an autonomic forest fire application, and its performance has been compared with the existing approaches on a large casebase of the simulated case study.
Keywords
 Case-based reasoning
 Lazy machine learning
 Soft computing
 Casebase maintenance
Introduction
Case-based reasoning (CBR) is one of the widely used lazy machine learning methods, inspired by natural learning behavior towards solving a new problem [1, 9, 22, 50]. In lazy machine learning methods, training of the model evolves with time and with the presentation of more data. As new data is presented, it continues to improve the learning curve of the system. CBR is implemented as a learning layer on top of the problem domain. Each new problem instance which needs to be resolved through this learning method is presented as a case. It is matched with the cases present in the knowledge repository called the casebase. The set of nearest neighbors of a new problem is extracted using a selected similarity measure. The solutions of the retrieved nearest neighbors of the case at hand are combined appropriately to derive the new solution, which may undergo fine tuning if the need arises. Finally, the new case, comprising the problem and its solution, is added to the existing casebase. This leads to a continuous increase in the casebase cardinality, which helps to improve its problem solving capability. On the other hand, the time needed to compute the solution of a new problem increases and the overall performance of the decision support system is compromised [28, 44]. To address this performance bottleneck of a CBR system, the existing casebase needs to be maintained without compromising its problem solving capability [28]. This exercise is known as casebase maintenance (CBM).

Efficiency: Time to solve a problem.

Competence: Problem count that can be solved.

Quality: Accuracy of the proposed solution.

Continuous timing.

Conditional timing.

Ad hoc timing.

Case addition strategy.

Case deletion strategy.
Most of the case deletion strategies have conditional timing. For instance, Smyth and Keane proposed a case deletion strategy in which new cases are added blindly [45]. When the casebase cardinality reaches the swamping limit, harmful and useless cases are removed from the case repository.

In case addition methodologies with continuous or ad hoc timing, cases can only be added to the casebase. There are no criteria for deleting cases from the casebase, which could lead to retrieval time bottlenecks with the ongoing increase in the casebase cardinality.

In addition and deletion policies with conditional timing, the whole process of maintenance tends to recur quite often due to the blind addition of completely resolved cases. This results in performance bottlenecks.

The proposed approach is capable of controlling the casebase cardinality at a user preferred threshold.

It acquires knowledge and preserves the casebase competence using online and offline case maintenance mechanisms.

The resultant casebase from the proposed approach is more compact than, and equally competent as, the existing benchmarks. Its average problem solving time is up to 65% lower than that of the standard CBR cycle, without compromising on its performance.

Over time, it strengthens the existing stronger cases through updating their utility and periodically removes the weaker cases.
Case based reasoning (CBR)
CBR cycle
CBR is a nature inspired problem solving methodology. It uses already solved problems to solve a new one. Its working principle is reasoning by remembering. This principle implies that the reasoning used to solve a target problem is remembered. The first working principle of CBR is that similar problems have similar solutions, i.e., to solve a new case, the existing cases and their solutions from the case repository are used. The second principle is that the kinds of problems which an agent faces tend to repeat. Thus, there is similarity between future and current problems. Therefore, it is worthwhile to remember and reuse [22]. This leads to construction of the casebase, which contains completely resolved problems called cases.
For the purpose of solving complex problems, the applications of CBR are distributed directly or indirectly across almost every field of knowledge. It has been effectively used for the adaptation problem in medical CBR systems [5], ecommerce [39], data mining [25], autonomic systems [19], travel planning [6], software engineering [7, 18] and an enormous range of other applications [34, 48]. Furthermore, CBR can also be used to mine big and sparse data [37].
Retrieve phase
Distance function  Formula 

Euclidean distance  \(dis(i,j) = \sqrt{\sum \limits _{k=1}^{m}\left(w_k(p_{ik} - p_{jk})\right)^2}\) 
Manhattan distance  \(dis(i,j) = \sum \limits _{k=1}^{m}w_k\left| p_{ik} - p_{jk}\right|\) 
Canberra distance  \(dis(i,j)= \sum \limits _{k=1}^{m} w_k \frac{\left| p_{ik} - p_{jk}\right| }{p_{ik}+p_{jk}}\) 
Squared chord distance  \(dis(i,j) = \sum \limits _{k=1}^{m}w_k\left(\sqrt{p_{ik}} - \sqrt{p_{jk}}\right)^2\) 
Minkowski distance  \(dis(i,j)= \left[\sum \limits _{k=1}^{m}w_k \left| p_{ik} - p_{jk}\right| ^q\right] ^{\frac{1}{q}}\) 
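The distance functions listed above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names are hypothetical, cases are assumed to be equal-length sequences of non-negative numeric attributes, and skipping zero-denominator terms in the Canberra distance is an assumption that the table does not specify.

```python
import math

def euclidean(p, q, w):
    # Weighted Euclidean distance: sqrt(sum((w_k * (p_k - q_k))^2))
    return math.sqrt(sum((wk * (a - b)) ** 2 for a, b, wk in zip(p, q, w)))

def manhattan(p, q, w):
    # Weighted Manhattan distance: sum(w_k * |p_k - q_k|)
    return sum(wk * abs(a - b) for a, b, wk in zip(p, q, w))

def canberra(p, q, w):
    # Weighted Canberra distance; terms with a zero denominator are skipped
    # (an assumption made for this sketch).
    return sum(wk * abs(a - b) / (a + b)
               for a, b, wk in zip(p, q, w) if a + b != 0)

def squared_chord(p, q, w):
    # Weighted squared chord distance; attributes must be non-negative.
    return sum(wk * (math.sqrt(a) - math.sqrt(b)) ** 2
               for a, b, wk in zip(p, q, w))

def minkowski(p, q, w, order):
    # Weighted Minkowski distance of the given order (q in the table).
    return sum(wk * abs(a - b) ** order
               for a, b, wk in zip(p, q, w)) ** (1.0 / order)
```

With unit weights, the Minkowski distance of order 1 coincides with the Manhattan distance and the one of order 2 with the Euclidean distance, which gives a quick sanity check for any implementation.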
Reuse and revise phases
The retrieved nearest cases are used to calculate the solution. In classification problems, the solution of the nearest retrieved cases will most likely be returned, because similar cases frequently share the same solution. On the other hand, for problems other than classification, one might have to come up with an adaptation criterion for devising the solution [31].
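For a classification problem, the retrieve and reuse steps can be sketched as follows. This is a minimal illustration under stated assumptions: a case is a hypothetical `(attributes, solution)` pair, retrieval uses unweighted Euclidean distance, and reuse is a simple majority vote over the retrieved solutions.

```python
import math
from collections import Counter

def retrieve(casebase, problem, k):
    # Return the k stored cases whose attributes are closest to the problem.
    return sorted(casebase, key=lambda case: math.dist(case[0], problem))[:k]

def reuse(neighbors):
    # Majority vote over the solutions of the retrieved cases.
    votes = Counter(solution for _, solution in neighbors)
    return votes.most_common(1)[0][0]

casebase = [
    ((0.0, 0.0), "no_fire"),
    ((0.0, 1.0), "no_fire"),
    ((5.0, 5.0), "fire"),
    ((6.0, 5.0), "fire"),
    ((5.0, 6.0), "fire"),
]
proposed = reuse(retrieve(casebase, (5.0, 4.0), k=3))
```

The labels `fire` and `no_fire` are placeholders chosen for this sketch, not values taken from the case study.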
Retain phase
Finally, a decision is made on whether the new completely resolved case should be retained in the existing knowledge. The outcome of possible repair and evaluation triggers learning from the outcome of the proposed solution. It is important that the casebase stays as small as possible while still capturing the problem domain clearly. Due to the retain phase, each new success is added to the casebase and its size grows continuously, which may result in a performance bottleneck. The casebase needs to be optimized to make sure that its cardinality does not grow beyond an acceptable limit and, at the same time, does not shrink so much that it affects the span of the problem domain. Casebase maintenance takes care of this important factor.
Casebase maintenance (CBM)
 1. Case data, i.e., it is more complicated than to just learn from completely resolved case.
 2. Retrieval process time, i.e., the complexity of retrieval process increases in terms of time as size of knowledge base increases.
Quality
The correctness of the derived solution is one of the major aspects which represents the performance of a CBR system. However, with the increase in size of a typical CBR system, this factor is not affected since more knowledge is always better to produce better results. But on the other hand, while maintaining a casebase by adding or removing cases, this factor may get compromised. Therefore, an acceptable criterion of adding or deleting cases should be deployed to avoid the removal of important cases and to generate a satisfactory solution of desired quality [43].
Competence
The number of successful solutions of target problems proposed by a casebase is termed as its competence. There is a strong relation between casebase competence and its cases. Coverage and reachability are two important factors on which the casebase competence is dependent [43].
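Coverage and reachability are commonly defined as sets: the coverage of a case is the set of problems it can solve, and the reachability of a problem is the set of cases that can solve it. The sketch below computes both over the casebase itself. The predicate `solves(case, target)` is a hypothetical stand-in for "retrieving and adapting `case` solves `target`", and cases are assumed to be hashable.

```python
def competence_sets(cases, solves):
    # coverage[c]: the cases that c can solve.
    coverage = {c: {t for t in cases if solves(c, t)} for c in cases}
    # reachability[t]: the cases that can solve t.
    reachability = {t: {c for c in cases if t in coverage[c]} for t in cases}
    return coverage, reachability

# Toy example: one-dimensional cases, where a case solves any target within
# distance 1 of it (an assumption made only for illustration).
cases = [0, 1, 5]
coverage, reachability = competence_sets(cases, lambda c, t: abs(c - t) <= 1)
```

In this toy casebase, cases 0 and 1 cover each other, while case 5 is covered only by itself.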
Efficiency
The average time taken to solve a new problem is defined as the efficiency of a CBR system. In a classical CBR system, the size of the knowledgebase tends to increase over time due to continuous retention of cases. When the casebase cardinality gets large enough, this leads to a serious time complexity bottleneck in the case retrieval process. Since the performance of a CBR system cannot be compromised, a maintenance strategy is needed. One way to overcome this problem is to decrease the size of the casebase using an appropriate CBM strategy.
Related work
The condensed nearest neighbor (CNN) method is an approach used in instance-based learning and nearest neighbor methods for editing training data. The CNN method generates an edited set of training examples which is consistent with the unedited original training data [15]. CNN is further extended as the reduced edited nearest neighbor method in [8], which removes noisy cases, i.e., cases that belong to a class different from that of the majority of their nearest neighbors [8].
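The condensing idea can be sketched as follows: a store is grown from the first example, and any training example that the store misclassifies under the 1-nearest-neighbor rule is added, until a full pass adds nothing. This is a minimal sketch assuming `(attributes, label)` pairs and unweighted Euclidean distance; it is not the exact procedure used in the experiments.

```python
import math

def nearest_label(store, point):
    # Label of the stored example closest to the point (1-NN rule).
    return min(store, key=lambda item: math.dist(item[0], point))[1]

def condensed_nn(training):
    store = [training[0]]
    changed = True
    while changed:
        changed = False
        for point, label in training:
            if (point, label) in store:
                continue  # already kept; also guards against endless loops
            if nearest_label(store, point) != label:
                store.append((point, label))
                changed = True
    return store

training = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
            ((1.0,), "b"), ((1.1,), "b")]
store = condensed_nn(training)
```

Here the store ends up with one example per class, and every training example is still classified correctly by its nearest stored neighbor.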
Smyth and McKenna introduced an addition policy for creating a compact competent casebase [46]. They combined relative coverage (RC) with the condensed nearest neighbor (CNN) method: all cases were arranged in descending order of their RC value, and the CNN algorithm was then applied to create a compact competent casebase. The approximate running time of their approach is \(O(n^2)\), where n is the number of existing cases [46].
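Relative coverage is usually defined as \(RC(c) = \sum_{c' \in Coverage(c)} 1/|Reachability(c')|\), so that a case earns more credit for covering cases that few others can reach. The sketch below computes it and produces the descending ordering that would be fed to CNN. The coverage and reachability dictionaries are assumed inputs and the function names are hypothetical.

```python
def relative_coverage(coverage, reachability):
    # RC(c): each covered case t contributes 1 / |reachability[t]|.
    return {c: sum(1.0 / len(reachability[t]) for t in covered)
            for c, covered in coverage.items()}

def order_by_relative_coverage(coverage, reachability):
    # Cases sorted from highest to lowest relative coverage.
    rc = relative_coverage(coverage, reachability)
    return sorted(coverage, key=lambda c: rc[c], reverse=True)

coverage = {"x": {"x", "y"}, "y": {"y"}, "z": {"z"}}
reachability = {"x": {"x"}, "y": {"x", "y"}, "z": {"z"}}
rc = relative_coverage(coverage, reachability)
ordering = order_by_relative_coverage(coverage, reachability)
```

In this toy example, case x scores highest because it is the only case that covers itself and it also shares coverage of y.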
Leake and Wilson proposed a case addition procedure which adds cases on the basis of the performance benefit (PB) that they provide by their retention in the casebase. Relative performance (RP) measure and CNN were used for performance guided maintenance. The running time of this approach is \(O (n^2)\) [23].
In [35], Muñoz-Avila presented a case retention technique which decides whether the retrieved cases provided useful guidance. If so, the new case does not need to be retained in the casebase. However, if the guidance of the retrieved cases is harmful, then the new case needs to be stored in the casebase [35].
In [51], a new caseselection policy has been presented which is based on a streaming criteria for addition. A loss function is used to make a decision on usefulness of the case at hand.
The eager case retention policy is inherited from the CBR problem solving cycle [1, 24, 35]. It is a permissive policy in which every new completely resolved case is retained in the casebase. Due to this very permissive case retention criterion, the policy quickly leads to a very large casebase and results in performance bottlenecks.
In [16], a case retention policy is proposed in which, if the retrieved cases are extended to provide the solution during the adaptation process, the new case is added to the existing casebase. However, if parts of the retrieved cases are revised to provide the solution, then the new problem does not need to be retained in the casebase.
A casedeletion procedure has been defined to categorize the cases into four different categories on the basis of their coverage and reachability values in [45]. The computational complexity of this approach is \(O (n^2)\) for n existing cases [45]. Moreover, it uses a learning heuristic algorithm to update the case categories whenever a new case is learned.
In [42], a deletion policy has been presented to keep a compact casebase. This approach uses mixture of different methods to decide the deletion step. These methods include feature weighting, outlier detection and clustering.
In [32], Markovitch proposed a random deletion technique which is completely independent of domain knowledge. In this technique, cases are randomly selected and deleted from the casebase once it has reached some predefined limit. This policy works surprisingly well, but it can also be very destructive, because important cases can be deleted, resulting in an unrecoverable loss of casebase competence.
Another deletion policy suggests deleting cases based on their retrieval frequency [33]. This policy calculates the retrieval frequency of all the cases and then deletes those cases which are not accessed frequently.
The ad hoc timing deletion policy conducts a number of tests on all cases of the casebase to detect redundancy and inconsistency [38]. It asks administrators to approve the deletion of the detected cases.
Once the casebase cardinality reaches the swamping limit, cases are treated and deleted. Initially, all the auxiliary cases will be deleted. Then the support cases with relatively lower competence metric will be removed, while retaining the most significant case from each group based on the competence metric value. Afterward, all the intracategory spanning cases will be removed. This process will continue until each case covers only itself among the existing cases in its category.
There also exist some other algorithms like evolutionary algorithms and clustering using random forests which can also be used to maintain a CBR system as mentioned in [4, 17, 27–29].

The existing approaches either focus on case addition or case deletion.

In case addition policies, only new completely resolved case is added which looks important according to the existing casebase. However, the importance of that new case may degrade with the addition of new cases.

These addition policies are unable to delete cases, which amplifies the gradual increase in the casebase cardinality and will eventually degrade the performance of a CBR system.

On the other hand, the case deletion techniques encourage blind addition of new cases and after reaching a swamping limit, unimportant cases are deleted from the casebase.

Due to the blind retention of new cases in deletion approaches, the casebase cardinality grows swiftly resulting in frequent need for complex maintenance. It results in degradation of the CBR process.
Proposed hybrid CBM approach

Auxiliary cases The competence is not affected by the removal of auxiliary cases. If the coverage provided by a case is absorbed by one of its reachable cases, then that case is an auxiliary case. For instance, in Fig. 5, cases a, b and c are auxiliary cases, since the coverage they provide is subsumed by their reachable cases such as case y.

Pivotal cases If a case is only reachable by itself, then that case is a pivotal case. The competence of a system degrades by the deletion of a pivotal case. In Fig. 5, case x is a pivotal case since it is reachable by only itself.

Spanning cases The cases with an overlapping coverage space with other cases within the problem space are categorised as spanning cases. In Fig. 5, case y is a spanning case as it links together the problem space between case x and cases a, b and c.

Support cases The cases which exist in the form of groups and act as spanning cases within the group are known as support cases. In Fig. 5, cases 1, 2 and 3 are support cases and as a whole they represent a support group. The removal of a single member of a support group does not harm competence at all. However, deletion of a whole group is equivalent to deletion of a pivotal case.
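The four categories above can be approximated from coverage and reachability sets. The sketch below is a simplified reading of the definitions, not the authors' exact procedure: a case with no other case reaching it is pivotal, a case sharing identical coverage with another reaching case is treated as support, a case whose coverage is a proper subset of a reaching case's coverage is auxiliary, and the remaining cases are spanning. The order of these checks is an assumption.

```python
def categorize(coverage, reachability):
    categories = {}
    for c in coverage:
        others = reachability[c] - {c}  # other cases that can solve c
        if not others:
            categories[c] = "pivotal"
        elif any(coverage[o] == coverage[c] for o in others):
            categories[c] = "support"
        elif any(coverage[c] < coverage[o] for o in others):
            categories[c] = "auxiliary"  # coverage subsumed by another case
        else:
            categories[c] = "spanning"
    return categories

# Hypothetical coverage sets, loosely modelled on the description above.
coverage = {
    "x": {"x", "y"},
    "y": {"y", "a"},
    "a": {"a"},
    "s1": {"s1", "s2"},
    "s2": {"s1", "s2"},
}
reachability = {c: {o for o in coverage if c in coverage[o]} for c in coverage}
categories = categorize(coverage, reachability)
```

The case identifiers and coverage sets here are invented for illustration and do not reproduce Fig. 5.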
The significance of the proposed approach over existing techniques is its ability to manage the growth and development of the casebase while enhancing the robustness of the CBR system. The growth of casebase is controlled by an online algorithm which adds only the most competent new cases and the casebase cardinality is maintained by an offline deletion algorithm which deletes the least competent cases.
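The interplay of the two modes can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: the class name, the `solves` predicate, the utility counter and the rule "add a new case only if no stored case already solves it" are all assumptions chosen to show how an online addition step and a threshold-triggered offline deletion step could fit together.

```python
class HybridCasebase:
    def __init__(self, solves, max_size):
        self.solves = solves      # hypothetical predicate solves(case, problem)
        self.max_size = max_size  # user-preferred cardinality threshold
        self.utility = {}         # stored case -> number of problems it solved

    def learn(self, new_case):
        # Online step: reinforce a stored case that already solves the new
        # one, otherwise retain the new case. Returns True if it was added.
        for case in self.utility:
            if self.solves(case, new_case):
                self.utility[case] += 1
                return False
        self.utility[new_case] = 1
        return True

    def prune(self):
        # Offline step: when the threshold is exceeded, delete the cases
        # with the lowest utility. Returns the deleted cases.
        excess = len(self.utility) - self.max_size
        if excess <= 0:
            return []
        weakest = sorted(self.utility, key=self.utility.get)[:excess]
        for case in weakest:
            del self.utility[case]
        return weakest

cb = HybridCasebase(lambda c, p: abs(c - p) <= 1, max_size=2)
added = [cb.learn(v) for v in (0, 0.5, 5, 10, 5.5)]
removed = cb.prune()
```

In this toy run, cases 0 and 5 are reinforced by nearby problems, so the offline step removes the never-reused case 10 to bring the casebase back to the threshold.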
Case study, experimental settings and initial results
Case study: Simulated model of autonomic forest fire application
Data partitioning
Distributions  Casebase  Problem space 

Distribution 1  3000  7000 
Distribution 2  5000  5000 
Distribution 3  7000  3000 
For the RCCNN and footprint deletion maintenance techniques, the casebase was allowed to learn 1000 cases for each data partitioning before triggering maintenance. However, in the case of the hybrid algorithm, only the most competent cases were allowed to be learnt. Therefore, this learning limit was reduced for the hybrid algorithm: 500 cases were allowed to be learnt when the casebase cardinality was set to 3000 and the problem space was 7000, 400 cases when the casebase cardinality and the problem space were both 5000, and 300 cases for the 70:30 data partitioning. The results obtained by applying each technique to the different distributions of the dataset are presented in different experimental settings.
Optimal nearest neighbors
To compute the solution, the set of nearest neighbors (NN) is retrieved. The size of this set, denoted by k, is determined by varying the number of nearest neighbors from 1 to the casebase cardinality. The value of k resulting in maximum accuracy is taken as the optimal number of nearest neighbors. It is computed through an exhaustive search based on accuracy. The actual value of k does not hinder the performance. This value of k is determined for each CBM technique on each data partition.
Optimal value of NN’s for simple CBR cycle
Optimal value of NN’s for RCCNN
Optimal NN’s for CBR cycle
Training  Testing  Optimal NN size  Accuracy (%)  Precision (%)  Recall (%) 

3000  7000  11  95.9  97.8  87.7 
5000  5000  13  96.2  98.2  87.8 
7000  3000  12  96.3  98  88.5 
Optimal NN’s for RCCNN
Training  Testing  Optimal NN  Accuracy (%)  Precision (%)  Recall (%) 

3000  7000  8  95.2  97.3  85 
5000  5000  13  95.8  97.6  86 
7000  3000  8  95.6  97  86.3 
Optimal value of NN’s for footprint deletion
Optimal NN’s for footprint deletion
Training  Testing  Optimal NN’s  Accuracy (%)  Precision (%)  Recall (%) 

3000  7000  10  95.4  97.9  85.9 
5000  5000  10  96.1  98.5  87.5 
7000  3000  12  96.2  97.8  88.3 
The exact performance at the optimal k for each partitioning is given in Table 5.
Optimal value of NN’s for the proposed hybrid approach
Optimal NN’s for proposed hybrid approach
Training  Testing  Optimal NN’s  Accuracy (%)  Precision (%)  Recall (%) 

3000  7000  7  95.6  97.2  87.2 
5000  5000  11  95.8  97.8  86.7 
7000  3000  9  96  97.5  88.1 
Selection of the optimal k in all variants of the casebase maintenance activity is done through a brute-force approach. It searches all possible candidates exhaustively and selects the one with the highest accuracy. The optimal value may be different for a different problem domain or dataset, in which case this step needs to be executed independently.
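The brute-force selection of k can be sketched as below. It is a minimal illustration under stated assumptions: cases are `(attributes, label)` pairs, retrieval uses unweighted Euclidean distance with majority voting, and ties in accuracy are resolved in favour of the first (smallest) candidate, which is a choice made for this sketch.

```python
import math
from collections import Counter

def knn_predict(train, point, k):
    # Majority label among the k training cases closest to the point.
    nearest = sorted(train, key=lambda case: math.dist(case[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k):
    # Fraction of test cases whose predicted label matches the actual one.
    hits = sum(knn_predict(train, point, k) == label for point, label in test)
    return hits / len(test)

def best_k(train, test, candidates):
    # Exhaustive search: evaluate every candidate and keep the most accurate.
    return max(candidates, key=lambda k: accuracy(train, test, k))

train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((10.0,), "b"), ((11.0,), "b")]
test = [((0.5,), "a"), ((10.5,), "b")]
k_opt = best_k(train, test, range(1, len(train) + 1))
```

Evaluating every k from 1 to the casebase cardinality costs one full pass over the test set per candidate, which is why this step is best run offline.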
Benchmarking with stateoftheart approaches
In this section, a comparison of proposed approach with existing standard CBM techniques has been presented in terms of theoretical complexity, average problem solving time and performance.
Experiment 1: Comparison of theoretical complexities
Comparison of theoretical complexities
Technique  Activating time  Online complexity  Offline complexity 

CBR cycle  Continuous  O(n)  N/A 
RCCNN  Conditional  N/A  \(O(n^2)\) 
FPD  Conditional  N/A  \(O(n^2)\) 
Hybrid  Continuous + conditional  \(O(n^2)\)  \(O(n^2)\) 
Experiment 2: Comparison of physical time
Experiment 3: Empirical comparison in terms of performance
Accuracy captures the correct classification rate of the applied algorithm. Precision captures the fraction of true positives out of the total positive predictions. Recall captures the fraction of true positives out of the total actual positives. The hybrid approach has clearly shown high performance on the simulated case study across all three performance measures. It has shown up to 96% accuracy, 97% precision and 88% recall in different data splits. At the same time, it has yielded the lowest average problem solving time among the benchmarks. Taken together, these two performance angles show that the rise in theoretical complexity of the proposed algorithm is insignificant, because it keeps the casebase compact and the average problem solving time reduces significantly.
From the above experiments, it is clear that the hybrid approach maintains performance while improving the efficiency of the CBR system. When the casebase cardinality was set to 3000, the proposed maintenance architecture reduced the average problem solving time by almost 65%, whereas footprint deletion reduced it by 55% and RCCNN by 57%. A similar trend was observed when the casebase cardinality was 5000 and 7000.
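The three measures can be computed from actual and predicted labels as in the following sketch. The function name and the example labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(actual, predicted, positive):
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # TP / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # TP / actual positives
    return accuracy, precision, recall

actual = ["fire", "fire", "fire", "none", "none"]
predicted = ["fire", "fire", "none", "fire", "none"]
acc, prec, rec = classification_metrics(actual, predicted, positive="fire")
```

With two true positives, one false positive and one false negative, this example gives an accuracy of 0.6 and a precision and recall of 2/3 each.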
Conclusion and future directions
To address the retrieval inefficiency of existing casebase maintenance techniques, a hybrid CBM approach has been proposed in this work, which uses a case addition algorithm on a continuous basis in online mode and a deletion algorithm which operates on a conditional basis in offline mode. The results obtained suggest that the average problem solving time is much lower than that of other standard approaches, while the overall competence and the quality of the solution are not affected. The computational complexity of the proposed approach for online maintenance is relatively high. However, this bottleneck does not hinder the performance of the proposed approach because of its conditional offline and nonfrequent nature.
In future, the computational complexity of the online maintenance will be addressed. This could be achieved by exploring different data representation mechanisms, including fuzzy representation. Moreover, if different parts of the data in the casebase are grouped by deploying an appropriate clustering algorithm as an offline option, then the performance and retrieval efficiency are expected to improve further. Another possible future direction is to figure out compact and competent casebases in distributed environments. Synchronisation of distributed components of the casebase and their online maintenance may be addressed in future. Multiobjective criteria may be designed in future to shrink larger casebases into a set of appropriate representatives, such as centroids of well-defined clusters. Appropriate representatives may be augmented carefully through various data augmentation techniques, so that the quality of the casebase is improved and its overall problem solving capability in the respective domain is enhanced.
Declarations
Authors’ contributions
MJ identified the research problem, designed the research study and methodology, formulated proposed algorithms and contributed significantly in writing the manuscript. HH contributed in implementation of algorithms, analysis of results and writing the manuscript. IA contributed in formulating the algorithms, choosing the benchmarks and writing the manuscript. All authors read and approved the final manuscript.
Acknowledgements
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.
Funding
Authors acknowledge the internal funding support received from Namal College Mianwali to complete the research work.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 1. Aamodt A, Plaza E (1994) Case-based reasoning: foundational issues, methodological variations, and system approaches. In: AI Communications, vol 7:1. IOS Press, New York, pp 39–59
 2. Abdel-Aziz A, Cheng W, Strickert M, Hüllermeier E (2013) Preference-based CBR: a search-based problem solving framework. In: International conference on case-based reasoning. Springer, Berlin, pp 1–14
 3. Abdel-Aziz A, Hüllermeier E (2015) Case base maintenance in preference-based CBR. In: International conference on case-based reasoning. Springer, Berlin, pp 1–14
 4. Aydadenta H, Adiwijaya (2018) A clustering approach for feature selection in microarray data classification using random forest. J Inform Proces Syst 14(5):1167–1175
 5. Begum S, Ahmed MU, Funk P, Xiong N, Folke M (2011) Case-based reasoning systems in the health sciences: a survey of recent trends and developments. IEEE Trans Syst Man Cybern C 41(4):421–434
 6. Büyüközkan G, Ergün B (2011) Intelligent system applications in electronic tourism. Expert Syst Appl 38(6):6586–6598
 7. Costa DM, Teixeira EN, Werner CML (2018) Odysseyprocesscase: a case-based software process line approach. In: Proceedings of Brazilian symposium on software quality, pp 170–9
 8. Cummins L, Bridge D (2011) On dataset complexity for case base maintenance. In: Case-based reasoning research and development, pp 47–61
 9. Cunningham P (1998) CBR: Strengths and weaknesses. In: Proceedings of 11th international conference on industrial and engineering applications of artificial intelligence and expert systems. Springer, Berlin, pp 517–23
 10. Cunningham P (2009) A taxonomy of similarity mechanisms for case-based reasoning. IEEE Trans Knowl Data Eng 21(11):1532–1543
 11. Fan ZP, Li YH, Wang X, YangLiu (2014) Hybrid similarity measure for case retrieval in cbr and its application to emergency response towards gas explosion. Expert Syst Appl 41(5):2526–2534
 12. Feuilltre H, Auffret V, Castro M, Breton HL, Garreau M, Haigron P (2017) Study of similarity measures for case-based reasoning in transcatheter aortic valve implantation. In: Proceedings of international conference on computing in cardiology
 13. Hao F, Sim DS, Park DS, Seo HS (2017) Similarity evaluation between graphs: a formal concept analysis approach. J Inform Proces Syst 13(5):1158–1167
 14. Haouchine MK, Chebel-Morello B, Zerhouni N (2008) Competence-preserving case-deletion strategy for case-base maintenance. In: ECCBR’08, vol 1, pp 171–184
 15. Hart P (1968) The condensed nearest neighbor rule (corresp.). IEEE Trans Inform Theory 14(3):515–516
 16. Ihrig LH, Kambhampati S (1996) Design and implementation of a replay framework based on a partial order planner. In: AAAI/IAAI, vol. 1, Citeseer, pp 849–854
 17. Juarez JM, Craw S, Lopez-Delgado JR, Campos M (2018) Maintenance of case bases: current algorithms after fifty years. In: Proceedings of international joint conference on artificial intelligence, pp 5457–5468
 18. Khan MJ (2014) Applications of case-based reasoning in software engineering: a systematic mapping study. IET Softw 8(6):258–268
 19. Khan MJ, Awais MM, Shamail S, Awan I (2011) An empirical study of modeling self-management capabilities in autonomic systems using case-based reasoning. Simul Model Practice Theory 19(10):2256–2275
 20. Khoshgoftaar TM, Seliya N, Sundaresh N (2006) An empirical study of predicting software faults with case-based reasoning. Softw Qual J 14(2):85–111
 21. Leake D, Wilson D (1998) Categorizing case-base maintenance: dimensions and directions. In: Advances in case-based reasoning, pp 196–207
 22. Leake DB (1996) CBR in context: the present and future. In: Case-based reasoning, experiences, lessons & future directions
 23. Leake DB, Wilson DC (2000) Remembering why to remember: performance-guided case-base maintenance. In: European workshop on advances in case-based reasoning, Springer, Berlin, pp 161–172
 24. Lenz M, Bartsch-Spörl B, Burkhard HD, Wess S (2003) Case-based reasoning technology: from foundations to applications, vol 1400. Springer, Berlin
 25. Liao SH, Chu PH, Hsiao PY (2012) Data mining techniques and applications - a decade review from 2000 to 2011. Expert Syst Appl 39(12):11303–11311
 26. Liao TW, Zhang Z, Mount CR (1998) Similarity measures for retrieval in case-based reasoning systems. Appl Artif Intell 12(4):267–288
 27. Lupiani E, Craw S, Massie S, Juarez JM, Palma JT (2013) A multi-objective evolutionary algorithm fitness function for case-base maintenance. In: International conference on case-based reasoning, Springer, Berlin, pp 218–232
 28. Lupiani E, Juarez JM, Palma J (2014) Evaluating case-base maintenance algorithms. Knowl Based Syst 67:180–194
 29. Lupiani E, Massie S, Craw S, Juarez JM, Palma J (2016) Case-base maintenance with multi-objective evolutionary algorithms. J Intell Inform Syst 46(2):259–284
 30. Mair C, Kadoda G, Lefley M, Phalp K, Schofield C, Shepperd M, Webster S (2000) An investigation of machine learning based prediction systems. J Syst Softw 53(1):23–29
 31. Mantaras D, Lopez R, David M, Derek B, Leake D, Smyth B, Craw S, Faltings B, Maher ML, Cox MT, Forbus K et al (2005) Retrieval, reuse, revision and retention in case-based reasoning. Knowl Eng Rev 20(03):215–240
 32. Markovitch S, Scott PD (1988) The role of forgetting in learning. In: Machine learning: proceedings of the fifth intl. conf, pp 459–465
 33. Minton S (1990) Quantitative results concerning the utility of explanation-based learning. Artif Intell 42(2–3):363–391
 34. Montani S, Jain LC et al (2010) Successful case-based reasoning applications, vol 305. Springer, Berlin
 35. Muñoz-Avila H (1999) A case retention policy based on detrimental retrieval. In: International conference on case-based reasoning. Springer, Berlin, pp 276–287
 36. Myllymäki P, Tirri H (1994) Massively parallel case-based reasoning with probabilistic similarity metrics. In: Topics in case-based reasoning. pp 144–154
 37. Perner P (2014) Mining sparse and big data by case-based reasoning. Procedia Computer Sci 35:19–33
 38. Racine K, Yang Q (1996) On the consistency management of large case bases: the case for validation. In: To appear in AAAI technical report, verification and validation workshop
 39. Segovia J, Szczepaniak PS, Niedzwiedzinski M (2013) E-commerce and intelligent methods, vol 105. Physica
 40. Shepperd M (2003) Case-based reasoning and software engineering. In: Managing software engineering knowledge. Springer, Berlin, pp 181–198
 41. Smiti A, Elouedi Z (2011) Overview of maintenance for case based reasoning systems. Int J Comput Appl 32(2):49–56
 42. Smiti A, Elouedi Z (2014) Wcoiddg: an approach for case base maintenance based on weighting, clustering, outliers, internal detection and dbscan-gmeans. J Comput Syst Sci 80(1):27–38
 43. Smyth B, McKenna E (1999) Footprint-based retrieval. In: International conference on case-based reasoning. Springer, Berlin, pp 343–357
 44. Smyth B, Cunningham P (1996) The utility problem analysed. In: European workshop on advances in case-based reasoning, Springer, Berlin, pp 392–399
 45. Smyth B, Keane MT (1995) Remembering to forget. In: Proceedings of the 14th international joint conference on artificial intelligence. Citeseer, pp 377–382
 46. Smyth B, McKenna E (1999) Building compact competent case-bases. In: ICCBR
 47. Sriwanna K, Boongoen T, Iam-On N (2017) Graph clustering-based discretization of splitting and merging methods (graphs and graphm). Hum Centric Comput Inform Sci 21(7):1–39
 48. Sun J, Zhu Z, Zhang Y, Zhao Y, Zhai Y (2018) Research on personalized recommendation case base and data source based on case-based reasoning. In: Proceedings of international conference on cloud computing and security. pp 114–123
 49. Ullah K, Mahmood T, Jan N (2018) Similarity measures for t-spherical fuzzy sets with applications in pattern recognition. Symmetry 10(6):1–14
 50. Watson I, Marir F (1994) Case-based reasoning: a review. Knowl Eng Rev 9(4):327–354
 51. Zhang Y, Zhang S, Leake D (2016) Case-base maintenance: a streaming approach. In: Proceedings of international conference on case-based reasoning. Springer, Berlin, pp 222–231
 52. Zhu J, Yang Q (1999) Remembering to add: competence-preserving case-addition policies for case-base maintenance. In: IJCAI, vol 99. pp 234–241