The results of this study indicate that the multiple-terminologies mode retrieved resources that the mono-terminology mode did not. The added value of multiple-terminologies information retrieval in terms of coverage was +15% for the first run of the method (16,283 resources returned by the multiple-terminologies search mode vs. 14,159 by the mono-terminology search mode) and +17% for the second run of the method on general queries. This can improve health information retrieval in CISMeF, in portals such as PubMed and, more generally, in any catalogue or portal based on multiple terminologies, such as the National Guideline Clearinghouse (NGC, URL: http://www.guideline.gov/), which has recently also shifted to a multiple-terminologies approach (URL: http://www.guideline.gov/content.aspx?id=15096&search=pain).
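The coverage gain quoted above is a relative difference between the two result counts. The following minimal sketch reproduces the first-run figure from the two counts given in the text; the function name `relative_gain` is introduced here for illustration only and is not part of the CISMeF system.

```python
def relative_gain(multi: int, mono: int) -> float:
    """Relative coverage gain (in percent) of the multiple-terminologies
    search over the mono-terminology search."""
    if mono <= 0:
        raise ValueError("mono-terminology count must be positive")
    return 100.0 * (multi - mono) / mono


# Counts reported in the text for the first run of the method.
first_run_gain = relative_gain(16_283, 14_159)
print(f"{first_run_gain:.1f}%")  # prints 15.0%
```

The result counts for the second run are not given in this section, so the +17% figure cannot be recomputed in the same way.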
After this evaluation, the CISMeF team therefore considered the results sufficient to implement the multiple-terminologies information retrieval algorithm in the Doc’CISMeF search engine (as an optional choice). For example, for the query “spine” (French: “rachis”), the mono-terminology information retrieval algorithm returned 213 resources in April 2012 (URL: http://doccismef.chu-rouen.fr/servlets/Simple?Mot=rachis&aff=4&tri=20&datt=1&cis=cis&msh=msh&pha=pha&debut=0) and the multiple-terminologies information retrieval algorithm returned 238 resources (URL: http://doccismef.chu-rouen.fr/servlets/Simple?Mot=rachis&aff=4&tri=20&datt=1&atc=atc&cca=cca&cif=cif&cim=cim&cip=cip&cis=cis&cla=cla&drc=drc&fma=fma&lpp=lpp&mdr=mdr&med=med&msh=msh&ncc=ncc&orp=orp&pha=pha&sno=sno&uni=uni&vcm=vcm&art=art&wps=wps&toutes=toutes&debut=0). Despite discrepancies between the experts’ ratings, the overall result is encouraging: the three experts rated 53.8%, 68.3% and 47.7% of the results as good, respectively (see Table 4). On average, good results came first (56.6%), followed by intermediate results (24.7%) and lastly bad ones (18.7%). The difference between the resources returned by the mono-terminology search and the multiple-terminologies search is less pronounced for three-word queries, because it is difficult to find a correlation between the user query and the multiple-terminologies indexing terms. For example, it is more difficult to obtain a good mapping between the user query “treatment of the breast cancer” and the resource index because no descriptor in any terminology of the CISMeF information system expresses this query. For the second run of the method, 70.44% of the top 20 returned resources were rated as having good relevance.
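As a quick consistency check on the first-run ratings reported above, the average share of good results can be recomputed from the three per-expert values. This is only a sketch: the variable names are illustrative, and the per-expert shares for intermediate and bad results are not given in this section, so only the reported averages are used for those.

```python
from statistics import mean

# Share of results rated "good" by each of the three experts (from the text).
good_by_expert = [53.8, 68.3, 47.7]
average_good = round(mean(good_by_expert), 1)

# Averages reported in the text for the three rating classes.
reported = {"good": 56.6, "intermediate": 24.7, "bad": 18.7}
total = round(sum(reported.values()), 1)

print(average_good)  # prints 56.6
print(total)         # prints 100.0
```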
In contrast, to highlight the added value of our approach, consider the user query “mrkh”, which gives a better result with the multiple-terminologies information retrieval algorithm than with the mono-terminology one because the term “mrkh” does not belong to the MeSH thesaurus. We created a CISMeF synonym “mrkh” for the MedDRA term “Mayer-rokitansky-kuster-hauser syndrome” and linked the two terms to provide semantic interoperability between health terminologies. Using both terms therefore made the information retrieval process more efficient.
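The effect of such a synonym link can be illustrated with a toy lookup. This is a minimal sketch under stated assumptions, not the Doc’CISMeF implementation: the `SYNONYMS` and `INDEX` tables, the document identifiers and the `search` function are all hypothetical, and real query processing in CISMeF is more elaborate than exact string matching.

```python
# Hypothetical synonym table: query string -> (terminology, preferred term).
SYNONYMS = {
    "mrkh": ("MedDRA", "Mayer-rokitansky-kuster-hauser syndrome"),
}

# Hypothetical inverted index: (terminology, term) -> resource identifiers.
INDEX = {
    ("MedDRA", "Mayer-rokitansky-kuster-hauser syndrome"): {"doc_17", "doc_42"},
    ("MeSH", "spine"): {"doc_3"},
}


def search(query: str, terminologies: set[str]) -> set[str]:
    """Return resources indexed with a term matching the query,
    restricted to the terminologies enabled for this search."""
    q = query.strip().lower()
    results: set[str] = set()
    # 1. Direct match on a preferred term of an enabled terminology.
    for (terminology, term), docs in INDEX.items():
        if terminology in terminologies and term.lower() == q:
            results |= docs
    # 2. Match through a synonym linked to a term of an enabled terminology.
    linked = SYNONYMS.get(q)
    if linked is not None and linked[0] in terminologies:
        results |= INDEX.get(linked, set())
    return results


print(search("mrkh", {"MeSH"}))                    # prints set()
print(sorted(search("mrkh", {"MeSH", "MedDRA"})))  # prints ['doc_17', 'doc_42']
```

With only MeSH enabled the query finds nothing, because the synonym points to a MedDRA term; enabling MedDRA makes the linked resources reachable.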
The main limitation of the study was the small number of evaluated queries. The study therefore constitutes a proof of concept of the proposed model and its implementation. The integration of new medical terminologies in CISMeF (for example the Foundational Model of Anatomy or the Human Phenotype Ontology) and the improvement of resource indexing (manual and automatic) would permit a broader study and would certainly yield more meaningful results.
In addition, given the indexers’ limited knowledge of the new terminologies integrated in CISMeF, the rate of manual indexing with multiple terminologies was still rather low compared with indexing with the MeSH thesaurus alone. Nonetheless, 5,164 manually indexed resources out of 37,263 (13.8%) were already indexed with at least one terminology besides the MeSH (e.g. ATC, n=4,616; CCAM, n=326; SNOMED, n=39), mainly with the ATC for the creation of the PSIP Drug Information Portal.
To the best of our knowledge, this study is the first to evaluate multiple-terminologies information retrieval on a health website. This multiple-terminologies information retrieval approach could be applied to any Web portal currently using the MeSH, and in particular to MEDLINE/PubMed, as newly included citations are now automatically indexed with MetaMap, which provides multiple-terminologies indexing.
In conclusion, the strategic decision of the CISMeF team has made possible the evolution from a mono-terminological world to a multiple-terminological universe through the integration of the main health terminologies available in French in the CISMeF information system. The contribution of this new universe is to overcome the relative weakness of the MeSH thesaurus and to improve health information retrieval.